Contents

Main topics in multivariate statistics

Exploratory methods
  Graphics for multivariate data
  Principal component analysis (PCA)
  Possible uses of PCA
  Factor analysis: idea
  Factor analysis: model
  Factor analysis
  Linear discriminant analysis
  Cluster analysis
  Multidimensional scaling

More formal methods
  Normal distribution theory
  Tests of significance for multivariate data
  Canonical correlation
  Remaining topics we did not cover
Main topics in multivariate statistics

We have data on several variables, there is some interdependence between the variables, and none of them is clearly the main variable of interest.

Methods that are mostly of an exploratory nature:
- Graphics for multivariate data
- Principal component analysis (PCA)
- Factor analysis
- Linear discriminant analysis (LDA)
- Cluster analysis
- Multidimensional scaling
- ...
More formal topics:
- Normal distribution theory
- Tests of significance for multivariate data
- Multivariate analysis of variance (MANOVA)
- Canonical correlation analysis
- ...
Exploratory methods
Graphics for multivariate data
Goal: visualize multivariate data.

We covered:
- Scatterplot matrix: pairs()
- Star plots and segment plots: stars()
- Conditioning plots: coplot()
- Bi-plot of first two principal components: biplot()

Other techniques:
- Interactive 3-dimensional plots
- Plots based on multidimensional scaling (more about this later)
- ...
Possible uses of PCA

Interest in the first principal component:
- Example: how to combine the scores on 5 different examinations into a total score? Since the first principal component maximizes the variance, it spreads out the scores as much as possible.

Interest in the 2nd to pth principal components:
- When all measurements are positively correlated, the first principal component is often some kind of average of the measurements (e.g., size of birds, severity index of psychiatric symptoms).
- Then the other principal components give important information about the remaining pattern (e.g., shape of birds, pattern of psychiatric symptoms).
Interest in the first few principal components:
- Dimension reduction: summarize the data with a smaller number of variables, losing as little information as possible.
- Can be used for graphical representations of the data (bi-plot).

Use PCA as input for regression analysis:
- Highly correlated explanatory variables are problematic in regression analysis.
- One can replace them by their principal components, which are uncorrelated by definition.
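The ideas above can be made concrete with a small sketch. This is a minimal, hand-rolled PCA on hypothetical two-variable data (the data values are invented for illustration, not from the course): center the data, form the sample covariance matrix, and take the eigenvector with the largest eigenvalue as the first principal axis. The variance of the resulting scores equals the largest eigenvalue, which is the "maximizes the variance" property used in the examination-score example.

```python
import math

# Hypothetical toy data: scores on two examinations (invented values).
data = [(80.0, 85.0), (60.0, 58.0), (90.0, 95.0), (70.0, 65.0), (75.0, 77.0)]
n = len(data)

# Center the data.
mx = sum(x for x, _ in data) / n
my = sum(y for _, y in data) / n
centered = [(x - mx, y - my) for x, y in data]

# Sample covariance matrix (2 x 2).
sxx = sum(x * x for x, _ in centered) / (n - 1)
syy = sum(y * y for _, y in centered) / (n - 1)
sxy = sum(x * y for x, y in centered) / (n - 1)

# Eigenvalues of the symmetric 2 x 2 matrix [[sxx, sxy], [sxy, syy]].
tr, det = sxx + syy, sxx * syy - sxy * sxy
disc = math.sqrt(tr * tr / 4 - det)
lam1, lam2 = tr / 2 + disc, tr / 2 - disc  # lam1 >= lam2

# First principal axis: unit eigenvector for lam1 (valid here since sxy != 0).
vx, vy = sxy, lam1 - sxx
norm = math.hypot(vx, vy)
vx, vy = vx / norm, vy / norm

# First principal component scores: projections onto the first axis.
scores = [vx * x + vy * y for x, y in centered]

# The variance of the scores equals the largest eigenvalue.
var_scores = sum(s * s for s in scores) / (n - 1)
print(round(var_scores, 4), round(lam1, 4))
```

In R, the same computation is what prcomp() does (for any p), with the scores in the returned object and the eigenvalues as squared standard deviations.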
Factor analysis: idea

Idea:
- In the social sciences (e.g., psychology), it is often not possible to measure the variables of interest directly (e.g., intelligence, social class). Such variables are called latent variables or common factors.
- Researchers examine such variables indirectly, by measuring variables that can be measured and that are believed to be indicators of the latent variables of interest (e.g., examination scores on various tests).
- We want to relate the latent variables of interest to the measured variables.
Factor analysis: model

Model: x = Λf + u, i.e., x_i = Σ_{j=1}^{k} λ_{ij} f_j + u_i, where
- x = (x_1, ..., x_p)' are the observed variables (random)
- f = (f_1, ..., f_k)' are the common factors (random)
- u = (u_1, ..., u_p)' are the specific factors (random)
- λ_{ij} are the factor loadings (constants)

Note: f_1, ..., f_k are not observed.

Main goal: estimate the factor loadings.
Factor analysis
Assumptions:
- E(x) = 0 (if this is not the case, simply subtract the mean vector)
- E(f) = 0, Cov(f) = I
- E(u) = 0, Cov(u_i, u_j) = 0 for i ≠ j
- Cov(f, u) = 0

Estimation:
- Under the above assumptions, Cov(x) = Σ = ΛΛ' + Ψ, where Ψ = Cov(u) is diagonal.
- Two estimation methods: principal factor analysis and maximum likelihood.

Factor loadings are non-unique; factor rotation can be used to ease interpretation.
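The covariance decomposition Cov(x) = ΛΛ' + Ψ can be checked numerically. The sketch below uses hypothetical loadings for a one-factor model with three observed variables (the numbers are invented, chosen so the implied variances are 1, i.e., standardized variables); it builds the implied covariance matrix from Λ and Ψ.

```python
# Hypothetical loadings for p = 3 observed variables and k = 1 common factor.
lam = [0.9, 0.7, 0.5]        # factor loadings (the single column of Lambda)
psi = [0.19, 0.51, 0.75]     # specific variances (diagonal of Psi)

p = len(lam)
# Implied covariance matrix: Sigma = Lambda Lambda' + Psi.
sigma = [[lam[i] * lam[j] + (psi[i] if i == j else 0.0) for j in range(p)]
         for i in range(p)]

for row in sigma:
    print([round(v, 2) for v in row])
```

Note how the off-diagonal entries (the correlations between observed variables) are fully determined by the loadings: observed variables correlate only through the common factor. Estimation goes the other way, recovering Λ and Ψ from the sample covariance matrix (e.g., factanal() in R).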
Linear discriminant analysis

Maximum likelihood:
- Suppose the exact distributions of the populations π_1, ..., π_g are known.
- Then the maximum likelihood discriminant rule is to allocate an observation x to the population which gives the largest likelihood to x, i.e., to the population with the highest density at the point x.
- If the exact distributions are unknown, but we know the shape of the distributions, then we can first estimate their parameters and then use the above rule. This is the sample maximum likelihood discriminant rule.

For two groups from two multivariate normal distributions with the same covariance matrix, Fisher's linear discriminant analysis equals the maximum likelihood rule.
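A minimal sketch of the maximum likelihood discriminant rule, using two one-dimensional normal populations for simplicity (the means and common standard deviation are hypothetical): allocate x to whichever population has the higher density at x.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical populations (one-dimensional, equal variance).
mu1, mu2, sigma = 0.0, 4.0, 1.5

def ml_allocate(x):
    # ML rule: allocate to the population with the higher density at x.
    return 1 if normal_pdf(x, mu1, sigma) >= normal_pdf(x, mu2, sigma) else 2

print(ml_allocate(1.0), ml_allocate(3.0))
```

With equal variances the rule reduces to "allocate to the closest mean", so the cut-off is the midpoint (mu1 + mu2) / 2; this is the one-dimensional analogue of the equivalence with Fisher's rule stated above.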
Cluster analysis
We have multivariate data without group labels.

We want to see if there are clusters in the data, i.e., groups of observations that are homogeneous and separated from the other groups. This is sometimes called unsupervised learning.

Methods we discussed:
- Hierarchical clustering
- k-means clustering
- Model-based clustering

Possible applications:
- Marketing: find groups of customers with similar behavior
- Biology: classify plants or animals
- Internet: cluster text documents
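Of the methods listed, k-means is the easiest to sketch. A minimal version on hypothetical one-dimensional data (invented values with two obvious groups): alternately assign each point to its nearest center and move each center to the mean of its assigned points.

```python
# Hypothetical 1-D data with two obvious groups (toy example, not course data).
data = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]

def kmeans_1d(points, centers, iters=20):
    """Plain k-means: assign each point to the nearest center, then
    move each center to the mean of its assigned points."""
    for _ in range(iters):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[idx].append(p)
        # Keep a center unchanged if it received no points.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

centers, clusters = kmeans_1d(data, centers=[0.0, 6.0])
print(sorted(round(c, 2) for c in centers))
```

The same algorithm (in any dimension, with Euclidean distance) is what kmeans() in R runs; in practice one restarts from several initial centers because the result depends on the starting values.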
Multidimensional scaling
Not discussed in class.

Goal: construct a map from a distance matrix, where the map should represent the distances between the objects as accurately as possible.

Possible applications:
- Psychology/sociology: subjects say how similar/different pairs of objects are. Multidimensional scaling then creates a picture showing the overall relationships between the objects.
- Can be used to aid clustering.

See overhead slides and R code.
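Classical (metric) multidimensional scaling can be sketched in a few lines. This toy example uses a hypothetical distance matrix for three objects that actually lie on a line at positions 0, 3 and 7, so a one-dimensional map can reproduce the distances exactly: double-center the squared distances to get B = -1/2 J D² J, then take the dominant eigenvector (here via power iteration) scaled by the square root of its eigenvalue as the coordinates.

```python
import math

# Hypothetical distance matrix: three objects on a line at 0, 3 and 7,
# so the pairwise distances are 3, 4 and 7.
D = [[0.0, 3.0, 7.0],
     [3.0, 0.0, 4.0],
     [7.0, 4.0, 0.0]]
n = len(D)

# Double-center the squared distances: B = -1/2 * J D^2 J.
sq = [[D[i][j] ** 2 for j in range(n)] for i in range(n)]
row = [sum(sq[i]) / n for i in range(n)]
col = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
tot = sum(row) / n
B = [[-0.5 * (sq[i][j] - row[i] - col[j] + tot) for j in range(n)]
     for i in range(n)]

# One-dimensional solution: dominant eigenvector of B by power iteration.
v = [1.0, 0.0, 0.0]
for _ in range(200):
    w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in w))
    v = [x / norm for x in w]
lam = sum(v[i] * sum(B[i][j] * v[j] for j in range(n)) for i in range(n))

# Recovered 1-D coordinates (unique only up to sign and shift).
coords = [vi * math.sqrt(lam) for vi in v]
print([round(abs(coords[i] - coords[j]), 6) for i, j in [(0, 1), (1, 2), (0, 2)]])
```

In R, cmdscale() performs the same construction (with a full eigendecomposition rather than power iteration) and is the usual starting point for the plots mentioned above.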
Canonical correlation
We study the relationship between a group of variables Y_1, ..., Y_p and another group of variables X_1, ..., X_q by searching for linear combinations a_i'X and b_i'Y that are most highly correlated.

- The size of Corr(a_i'X, b_i'Y) tells us about the strength of the relationship between X and Y.
- The loadings in a_i and b_i tell us about the type of relationship between X and Y.
- One can test whether the true canonical correlation is different from zero (not discussed in class).
- Possible application: find clusters among the variables (instead of among the observations).
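In the simplest special case, with a single variable in each group (p = q = 1), the first canonical correlation reduces to the absolute Pearson correlation between the two variables, since the "linear combinations" are just the variables themselves. A minimal sketch on hypothetical toy data:

```python
import math

# Hypothetical toy data: one X-variable and one Y-variable (p = q = 1),
# so the first canonical correlation is just |Corr(X, Y)|.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

n = len(x)
mx, my = sum(x) / n, sum(y) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)
print(round(abs(r), 4))  # first canonical correlation in this 1-by-1 case
```

For general p and q the coefficient vectors a_i and b_i come from an eigenvalue problem on the between- and within-group covariance matrices (cancor() in R).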