Lecture 8: Factor analysis (FA)

1 Lecture 8: Factor analysis (FA)
Rationale and use of FA; the underlying model (what is a factor analysis anyway?); communalities and specificities; factor scores and loadings; the use of and rationale for rotations; orthogonal and oblique rotations; component retention, significance, and reliability.
Bio 8100s Applied Multivariate Biostatistics 2001

2 What is factor analysis?
From a set of p variables X1, X2, …, Xp, we try to find (“extract”) a set of factors F1, F2, …, Ff that underlie the observed variability in X1, X2, …, Xp. The hope (sometimes faint) is that most of the variability in the original set of p variables will be accounted for by f < p factors.

3 The uses of factor analysis
Study correlations among a large number of variables by grouping variables into “factors” such that variables within each factor are more highly correlated with one another than with variables in other factors. Summarize many variables by a few factors. Interpret each factor according to the meaning of the measured variables.

4 The factor model
For a set of p variables X1, X2, …, Xp, the model is

$$X_i = a_{i1}F_1 + a_{i2}F_2 + \dots + a_{if}F_f + e_i, \qquad i = 1, \dots, p,$$

where the Xi are standardized variables, the aij are the factor loadings, the Fj are the common factors, and ei is a factor specific to variable Xi. The common factors are taken to be uncorrelated with one another, with unit variance, and uncorrelated with the specific factors. Note: in the factor model, observables (X) are linear functions of unobservables (F)!
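A minimal simulation sketch of this model, assuming NumPy and a hypothetical 4 × 2 loading matrix `A` (the values are illustrative, not taken from the lecture). The specific variances are chosen so that each simulated variable has unit variance, in line with the standardization assumed in the model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, f = 5000, 4, 2

# Hypothetical loadings a_ij (rows = variables, columns = common factors).
A = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.9],
              [0.2, 0.6]])

# Specific variances chosen so that Var(X_i) = 1.
spec_var = 1.0 - (A ** 2).sum(axis=1)

F = rng.standard_normal((n, f))                     # uncorrelated common factors
E = rng.standard_normal((n, p)) * np.sqrt(spec_var) # specific factors e_i
X = F @ A.T + E                                     # X_i = sum_j a_ij F_j + e_i

print(X.var(axis=0))  # each entry should be close to 1
```

Only `X` would be seen in practice; `F` and `E` are the unobservables that the analysis tries to reason about.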

5 The factor model (cont’d)
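Since the Xi are standardized and the common factors are uncorrelated with unit variance, we have

$$\mathrm{Var}(X_i) = 1 = \sum_{j=1}^{f} a_{ij}^2 + \mathrm{Var}(e_i) = \mathrm{Com}(X_i) + \mathrm{Var}(e_i).$$

Com(Xi) is the communality of Xi (that portion of its variance that is “explained” by the common factors), and Var(ei) is the specificity of Xi (the portion unexplained by the common factors).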
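As a small numerical illustration, assuming NumPy and the same kind of hypothetical loading matrix (values invented for the example), the communality of each variable is the row sum of squared loadings and the specificity is what remains of the unit variance:

```python
import numpy as np

# Hypothetical loadings (rows = variables, columns = factors).
A = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.9],
              [0.2, 0.6]])

communality = (A ** 2).sum(axis=1)  # Com(X_i) = sum_j a_ij^2
specificity = 1.0 - communality     # Var(e_i) for standardized X_i

for i, (c, s) in enumerate(zip(communality, specificity), start=1):
    print(f"X{i}: communality = {c:.2f}, specificity = {s:.2f}")
```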

6 The factor model (cont’d)
Since the specific factors are uncorrelated with one another and with the common factors, for two different variables Xi and Xj we have

$$r_{ij} = \sum_{k=1}^{f} a_{ik}\,a_{jk}.$$

So the correlation between Xi and Xj is the sum, over factors, of the products of their factor loadings. Hence, variables Xi and Xj can be highly correlated only if they have high loadings on the same factors.
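The relationship between loadings and correlations can be checked numerically. The sketch below assumes NumPy and a hypothetical loading matrix; the off-diagonal entries of `A @ A.T` are the correlations implied by the model.

```python
import numpy as np

# Hypothetical loadings: X1 and X2 load on factor 1, X3 and X4 on factor 2.
A = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.9],
              [0.2, 0.6]])

R_implied = A @ A.T                # entry (i, j) is sum_k a_ik * a_jk
np.fill_diagonal(R_implied, 1.0)   # standardized variables have unit variance

print(np.round(R_implied, 2))
```

With these values, X1 and X2 share high loadings on the first factor and have an implied correlation of 0.58, whereas X1 and X3 load on different factors and have an implied correlation of only 0.17.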

7 The geometry of factors
Variables (Xi) are linear functions of factors (Fj). Hence, the factors are vectors in the Euclidean space defined by the set of variables {Xi}. Because the Fj are uncorrelated, these vectors meet at right angles.
[Figure: two factors (F1, F2) in the 3-D Euclidean space defined by {X1, X2, X3}]

8 A comparison of PCA and FA models
PCA model: the unobservables (Zj) are linear functions of the observables (Xi). Therefore PC scores are determinate (given X and a, each Z is unique), and we can compute PC scores for each observation. FA model: the observables (Xi) are linear functions of the unobservables (Fj). Therefore factors are indeterminate (given X and a, no F is unique), and it makes no sense to “compute” factor scores for individual observations.
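The contrast can be sketched in code, assuming NumPy, randomly generated data, and a hypothetical one-factor loading vector `A_fa` (none of these come from the lecture). PC scores follow directly from the data and the eigenvectors. In the FA equation, by contrast, any choice of factor value can be absorbed by the specific term, so the equation alone does not pin down F.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((200, 3))
X[:, 1] += X[:, 0]                       # induce some correlation

# Standardize, then PCA on the correlation matrix.
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
R = np.corrcoef(Xs, rowvar=False)
eigvals, V = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]

Z = Xs @ V                               # PC scores: unique given X and V

# FA: x = A F + e has p equations but f + p unknowns (F and e).
A_fa = np.array([[0.8], [0.7], [0.6]])   # hypothetical loadings
x = Xs[0]
reconstructions = []
for F in (np.array([0.5]), np.array([-1.0])):
    e = x - A_fa @ F                     # specific part absorbs the rest
    reconstructions.append(A_fa @ F + e) # both choices reproduce x exactly
```

This only illustrates the counting argument; in practice, software reports estimated factor scores under additional assumptions.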

9 Estimating factor loadings
Estimation of the loadings for each factor can be accomplished through several different methods (e.g. least-squares estimation, maximum likelihood estimation, iterated principal axis, etc.). The extracted factors may differ depending on the method of estimation.
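As one illustration of these methods, here is a rough sketch of iterated principal axis factoring, assuming NumPy. The function name `principal_axis`, its parameters, and the test loadings are hypothetical, and the initial communalities use squared multiple correlations, which is a common choice but not necessarily the one used in the lecture's software.

```python
import numpy as np

def principal_axis(R, n_factors, n_iter=200, tol=1e-6):
    """Iterated principal axis factoring of a correlation matrix R (sketch)."""
    R = np.asarray(R, dtype=float)
    # Initial communalities: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        R_reduced = R.copy()
        np.fill_diagonal(R_reduced, h2)          # communalities on the diagonal
        eigvals, eigvecs = np.linalg.eigh(R_reduced)
        idx = np.argsort(eigvals)[::-1][:n_factors]
        lam = np.clip(eigvals[idx], 0.0, None)   # guard against small negatives
        A = eigvecs[:, idx] * np.sqrt(lam)       # loadings
        h2_new = (A ** 2).sum(axis=1)            # updated communalities
        converged = np.max(np.abs(h2_new - h2)) < tol
        h2 = h2_new
        if converged:
            break
    return A, h2

# Correlation matrix implied by hypothetical "true" loadings.
A_true = np.array([[0.8, 0.1], [0.7, 0.2], [0.6, 0.1],
                   [0.1, 0.8], [0.2, 0.7], [0.1, 0.6]])
R = A_true @ A_true.T
np.fill_diagonal(R, 1.0)

A_hat, h2 = principal_axis(R, n_factors=2)
resid = R - A_hat @ A_hat.T
np.fill_diagonal(resid, 0.0)                     # off-diagonal residuals only
```

The estimated loadings need not match `A_true` column for column, since loadings are only determined up to rotation; what should agree are the reproduced off-diagonal correlations.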

10 Factor loadings
Factor loadings (aij) are the (theoretical) correlations of each variable with the extracted factors. For each variable, the squared loadings summed over all factors equal the proportion of the variable’s variance explained by all factors, i.e., the communality. For each factor, the squared loadings summed over all variables give the variance explained by that factor.
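The column-wise sum can be sketched as follows, assuming NumPy and a hypothetical loading matrix. Dividing by the number of variables gives the proportion of total standardized variance attributed to each factor.

```python
import numpy as np

# Hypothetical loadings (rows = variables, columns = factors).
A = np.array([[0.8, 0.1],
              [0.7, 0.2],
              [0.1, 0.9],
              [0.2, 0.6]])
p = A.shape[0]

var_by_factor = (A ** 2).sum(axis=0)   # sum over variables, one value per factor
prop_by_factor = var_by_factor / p     # share of the total variance p

print(var_by_factor)    # variance explained by each factor
print(prop_by_factor)   # proportion of total variance per factor
```

The total over factors equals the total of the communalities over variables, since both are the sum of all squared loadings.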

11 More on factor loadings in factor analysis
Sometimes factors include variables with similar loadings, which form a “natural” group. To assist in interpretation, we may want to choose another frame of reference (a rotated set of factor axes) that emphasizes the differences among these groups. Rotation of factors follows the same procedure and rationale as for PCA.
[Figure: factor plot; axis label FACTOR(2)]
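One widely used orthogonal rotation is varimax. The sketch below assumes NumPy and uses the common SVD-based iteration; the function name `varimax`, its parameters, and the unrotated loadings are hypothetical, and the lecture does not say which rotation produced its plots.

```python
import numpy as np

def varimax(A, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix A (sketch)."""
    p, k = A.shape
    T = np.eye(k)                      # rotation matrix, starts as identity
    d = 0.0
    for _ in range(max_iter):
        L = A @ T
        col_ss = np.diag((L ** 2).sum(axis=0))
        U, s, Vt = np.linalg.svd(A.T @ (L ** 3 - (gamma / p) * L @ col_ss))
        T = U @ Vt                     # product of orthogonal matrices
        d_new = s.sum()
        if d != 0.0 and d_new / d < 1.0 + tol:
            break
        d = d_new
    return A @ T, T

# Hypothetical unrotated loadings: every variable loads on both factors.
A_unrot = np.array([[0.7,  0.5],
                    [0.6,  0.5],
                    [0.6, -0.5],
                    [0.7, -0.4]])

A_rot, T = varimax(A_unrot)
print(np.round(A_rot, 2))
```

Because `T` is orthogonal, the rotation changes how variance is shared among factors but leaves each variable's communality unchanged.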

12 [Figure: factor plots; axis label FACTOR(2)]

13 A final word: PCA versus FA
In practice, PCA solutions to data reduction problems often are very similar to FA solutions to the same problem. “The choice of common factors or components methods often makes virtually no difference to the conclusions of a study.” (N. Cliff, Analyzing multivariate data. New York: Harcourt, Brace Jovanovich.)

