Factor analysis. Caroline van Baal, March 3rd 2004, Boulder.


1 Factor analysis Caroline van Baal, March 3rd 2004, Boulder

2 Phenotypic Factor Analysis
(Approximate) description of the relations between different variables
– Compare to Cholesky decomposition
Testing of hypotheses on relations between different variables by comparing different (nested) models
– How many underlying factors?

3 Factor analysis and related methods
Data reduction
– Consider 6 variables:
– Height, weight, arm length, leg length, verbal IQ, performance IQ
– You expect the first 4 to be correlated, and the last 2 to be correlated, but do you expect high correlations between the first 4 and the last 2?

4 Data analysis in non-experimental designs using latent constructs
Principal Components Analysis
Triangular Decomposition (Cholesky)
Exploratory Factor Analysis
Confirmatory Factor Analysis
Structural Equation Models

5 Exploratory Factor Analysis
Account for covariances among observed variables in terms of a smaller number of latent, common factors
Includes error components for each variable
x = P * f + u
x = observed variables
f = latent factors
u = unique factors
P = matrix of factor loadings

6 [Path diagram: one factor (Factor 1: IQ, "g"; marked 1) loading on 12 subtests: SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]

7 [Path diagram: two factors (Factor 1: verbal, Factor 2: performance; each marked 1) loading on 12 subtests: SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]

8 EFA equations
C = P * D * P' + U * U'
C = observed covariance matrix (nvar by nvar, symmetric)
P = factor loadings (nvar by nfac, full)
D = correlations between factors (nfac by nfac, standardized)
U = specific influences, errors (nvar by nvar, diagonal)
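The EFA covariance equation can be sketched numerically. This is a minimal illustration in Python with NumPy, not part of the original Mx material; the function name `implied_covariance` and all numbers in P, D and U are hypothetical values chosen for the example.

```python
import numpy as np

def implied_covariance(P, D, U):
    """Return the model-implied covariance C = P D P' + U U'."""
    P = np.asarray(P, dtype=float)
    D = np.asarray(D, dtype=float)
    U = np.asarray(U, dtype=float)
    return P @ D @ P.T + U @ U.T

# Hypothetical example: nvar = 4 variables, nfac = 2 correlated factors.
P = np.array([[0.8, 0.0],
              [0.7, 0.0],
              [0.0, 0.6],
              [0.0, 0.9]])           # factor loadings (nvar by nfac)
D = np.array([[1.0, 0.3],
              [0.3, 1.0]])           # factor correlations (nfac by nfac)
U = np.diag([0.6, 0.5, 0.8, 0.4])    # unique influences (diagonal)

C = implied_covariance(P, D, U)
print(np.round(C, 3))
```

In this sketch the covariance between variables 1 and 3 arises only through the factor correlation (0.8 * 0.3 * 0.6), because neither variable loads on the other's factor.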

9 Exploratory factor analysis
No prior assumption on number of factors
All variables load on all latent factors
Factors are either all correlated or all uncorrelated
Unique factors are uncorrelated
Underidentification
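The underidentification of an exploratory model with two uncorrelated factors can be illustrated with a rotation: any orthogonal rotation of the loading matrix gives exactly the same implied covariance matrix. The sketch below uses Python with NumPy and hypothetical loadings; `rotate_loadings` is a helper invented for this example. It also shows that one particular rotation makes one loading exactly 0, which is why fixing a single loading to 0 removes the indeterminacy.

```python
import numpy as np

def rotate_loadings(P, angle):
    """Apply an orthogonal rotation (angle in radians) to a two-factor loading matrix."""
    T = np.array([[np.cos(angle), -np.sin(angle)],
                  [np.sin(angle),  np.cos(angle)]])
    return P @ T

# Hypothetical loadings of 5 variables on 2 uncorrelated factors.
P = np.array([[0.7, 0.2],
              [0.6, 0.3],
              [0.5, 0.4],
              [0.2, 0.7],
              [0.1, 0.8]])
U = np.diag([0.6, 0.6, 0.6, 0.5, 0.5])

C_original = P @ P.T + U @ U.T

# An arbitrary rotation changes the loadings but not the implied covariances.
P_rotated = rotate_loadings(P, 0.5)
C_rotated = P_rotated @ P_rotated.T + U @ U.T
same_fit = np.allclose(C_original, C_rotated)
print("same implied covariance after rotation:", same_fit)

# The rotation that sets the loading of variable 1 on factor 2 to zero.
angle_fix = np.arctan2(P[0, 1], P[0, 0])
P_fixed = rotate_loadings(P, angle_fix)
print(np.round(P_fixed, 3))
```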

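10 [Path diagram: two factors (Factor 1: verbal, Factor 2: performance; each marked 1) loading on 12 subtests: SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA; one loading is marked "Fix to 0"]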

11 Confirmatory factor analysis
An initial model is constructed, because:
– its elements are described by a theoretical process
– its elements have been obtained from a previous analysis in another sample
The model has a specific number of factors
Variables do not have to load on all factors
Measurement errors may correlate
Some latent factors may be correlated, while others are not

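12 [Path diagram: two factors (Factor 1: verbal, Factor 2: performance; each marked 1) and 12 subtests: SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]

13 [Path diagram: two factors (Factor 1: verbal, Factor 2: performance; each marked 1) and 12 subtests: SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]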

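14 [Path diagram: three factors labelled VC, FD and PO, and 12 subtests: SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]

15 [Path diagram: three factors labelled VC, FD and PO, and 12 subtests: SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA]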

16 CFA equations
x = P * f + u
x = observed variables, f = latent factors
u = unique factors, P = factor loadings
C = P * D * P' + U * U'
C = observed covariance matrix
P = factor loadings
D = correlations between factors
U = diagonal matrix of errors

17 Structural equation models
The factor model x = P * f + u is sometimes referred to as the measurement model
The relations between latent factors can also be modeled
This is done in the covariance structure model, or the structural equation model
Higher order factor models

18 [Path diagram: 12 subtests (SIM, INF, VOC, COD, COM, ARI, DIG, BLC, MAZ, PIC, PIA, OBA), first order factors F1, F2, F3 (labelled VC, FD, PO) and a 2nd order factor "g"]
Second order factor model: C = P*(A*I*A'+B*B')*P' + U*U'
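The second order equation can be read as the first order model with the factor correlation matrix replaced by A*I*A' + B*B'. The sketch below, in Python with NumPy, assumes that A holds the loadings of the first order factors on "g", I is a 1 by 1 identity matrix (variance of "g"), and B is a diagonal matrix of residual influences on the first order factors. These interpretations and all numbers are assumptions made for illustration; the example uses 6 variables rather than the 12 subtests of the diagram.

```python
import numpy as np

# Hypothetical first order loadings: 6 variables on 3 factors.
P = np.array([[0.8, 0.0, 0.0],
              [0.7, 0.0, 0.0],
              [0.0, 0.6, 0.0],
              [0.0, 0.7, 0.0],
              [0.0, 0.0, 0.9],
              [0.0, 0.0, 0.5]])
U = np.diag([0.6, 0.7, 0.8, 0.7, 0.4, 0.8])   # unique influences

# Hypothetical second order part.
A = np.array([[0.8], [0.7], [0.6]])            # loadings of F1..F3 on "g"
I = np.eye(1)                                   # variance of "g" fixed to 1
B = np.diag(np.sqrt(1.0 - A[:, 0] ** 2))        # residuals, scaled so each factor has variance 1

F = A @ I @ A.T + B @ B.T                       # implied covariance among first order factors
C = P @ F @ P.T + U @ U.T                       # implied covariance among observed variables

print(np.round(F, 3))
print(np.round(C, 3))
```

With B scaled this way, F is a correlation matrix whose off-diagonal elements are products of the second order loadings, for example 0.8 * 0.7 between F1 and F2.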

19 Five steps characterize structural equation models
Model specification
Identification
– E.g., if a factor loads on 2 variables only, multiple solutions are possible, and the factor loadings have to be equated
Estimation of parameters
Testing of goodness of fit
Respecification
K.A. Bollen & J. Scott Long: Testing Structural Equation Models, 1993, Sage Publications
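The identification example (a factor that loads on 2 variables only) can be checked numerically. The sketch below, in Python with NumPy, assumes a single factor that is uncorrelated with anything else; the function `two_indicator_cov` and the parameter values are hypothetical. Two different sets of loadings give the same covariance matrix, because only the product of the two loadings is determined by the data.

```python
import numpy as np

def two_indicator_cov(a, b, u1, u2):
    """Implied 2 by 2 covariance for one factor with loadings a, b and unique loadings u1, u2."""
    P = np.array([[a], [b]])
    U = np.diag([u1, u2])
    return P @ P.T + U @ U.T

# Solution 1: equal loadings.
C1 = two_indicator_cov(0.6, 0.6, 0.8, 0.8)

# Solution 2: different loadings with the same product (0.9 * 0.4 = 0.36),
# unique loadings adjusted so that both variances stay at 1.
C2 = two_indicator_cov(0.9, 0.4, np.sqrt(1 - 0.9 ** 2), np.sqrt(1 - 0.4 ** 2))

indistinguishable = np.allclose(C1, C2)
print("both solutions imply the same covariance matrix:", indistinguishable)
```

Equating the two loadings, as the slide suggests, picks one solution out of this family.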

20 Practice! IQ and brain volumes (MRI)
3 brain volumes
– Total cerebellum, Grey matter, White matter
2 IQ subtests
– Calculation, Letters / numbers
Brain and IQ factors are correlated
Datafile: mri-IQ-all-twinA-5.dat

21 Script: phenofact.mx
BEGIN MATRICES ;
P FULL NVAR NFACT free ; ! factor loadings
D STAND NFACT NFACT !free ; ! correlations between factors
U DIAG NVAR NVAR free ; ! subtest specific influences
M Full 1 NVAR free ; ! means
END MATRICES ;
BEGIN ALGEBRA ;
C = P*D*P' + U*U' ; ! variance covariance matrix
END ALGEBRA ;
Means M /
Covariances C /

22 In exploratory factor analysis, if nfact = 2, one of the factor loadings has to be fixed to 0 to make it an identified model:
fix P 1 2
In confirmatory factor analysis, specify a brain and an IQ factor:
SPECIFY P
101 0
102 0
103 0
0 204
0 205
0 206
(If a factor loads on 2 variables only, it is not possible to estimate both factor loadings. Equate them, or fix one of them to 1.)

23 Phenotypic correlations: MRI-IQ, Dutch twins (A), n=111/296 pairs
                brain cereb   brain grey   brain white   IQ calc   IQ L/n
Cerebellum      1
Grey            .63           1
White           .61           .55          1
Calculation     .23           .25          .26           1
Letter/numb.    .30 .19 .46 1 [one of the four correlations in this row is missing from the transcript]

24 What is the fit of a 1 factor model?
– C = P * P' + U*U', P = 5x1 full, U = 5x5 diagonal
What is the fit of a 2 factor model?
– Same, P = 5x2 full with 1 factor loading fixed to 0
– (Reduction: fix first 3 factor loadings of factor 2 to 0)
Data suggest 2 latent factors: a brain (first 3) and an IQ factor (last 2): what is the evidence for this model?
– Same, P = 5x2 full with 5 factor loadings fixed to 0
Can the 2 factor model be improved by allowing a correlation between these 2 factors?
– C = P * D * P' + U*U', P = 5x2 full matrix (5 fixed), D = stand 2x2 matrix, U = 5x5 diagonal matrix
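When comparing these nested models, it helps to know how many degrees of freedom each one leaves. The sketch below, in plain Python, counts observed variances and covariances minus free parameters. It is only a counting rule under stated assumptions: the means are left out because they are saturated in the script (one free mean per variable), and a non-negative count is necessary but not sufficient for identification. The function name and the dictionary of models are introduced here for illustration.

```python
def degrees_of_freedom(n_var, n_free):
    """Observed variances and covariances minus free parameters."""
    n_stats = n_var * (n_var + 1) // 2
    return n_stats - n_free

n_var = 5  # 3 brain volumes + 2 IQ subtests

# Free parameters = free loadings + unique influences (+ factor correlation).
models = {
    "1 factor": 5 + 5,
    "2 factors, exploratory (1 loading fixed to 0)": 9 + 5,
    "2 factors, confirmatory, correlated (5 loadings fixed to 0)": 5 + 5 + 1,
}

df = {name: degrees_of_freedom(n_var, n_free) for name, n_free in models.items()}
for name, value in df.items():
    print(f"{name}: df = {value}")
```

The difference in degrees of freedom between two nested models is the number of degrees of freedom for a likelihood ratio (chi-square difference) test between them.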

25 Principal Components Analysis
SPSS, SAS, Mx (functions \eval, \evec)
Transformation of the data, not a model
Is used to reduce a large set of correlated observed variables (x_i) to (a smaller number of) uncorrelated (orthogonal) components (c_i)
x_i is a linear function of c_i

26 PCA path diagram
[Path diagram: components c1 to c5 (variances D) connected by loadings P to observed variables x1 to x5]
S = observed covariances = P * D * P'

27 PCA equations
Covariance matrix: S = P * D * P' (S, P and D are all q by q)
P = full q by q matrix of eigenvectors
D = diagonal matrix of eigenvalues
P is orthogonal: P * P' = I (identity)
Criteria for number of factors: Kaiser criterion, scree plot, % variance
Important: models not identified!
[Path diagram: components c1 to c5 and observed variables x1 to x5]
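The PCA equations can be verified with an eigendecomposition. The sketch below uses Python with NumPy (`numpy.linalg.eigh`) rather than the Mx functions \eval and \evec, and the correlation matrix S is a hypothetical example, not data from the slides. It checks S = P * D * P' and P * P' = I, and applies the Kaiser criterion (eigenvalues greater than 1).

```python
import numpy as np

# Hypothetical correlation matrix for q = 4 variables.
S = np.array([[1.0, 0.6, 0.1, 0.1],
              [0.6, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.5],
              [0.1, 0.1, 0.5, 1.0]])

# eigh is meant for symmetric matrices and returns eigenvalues in ascending order.
eigenvalues, P = np.linalg.eigh(S)
order = np.argsort(eigenvalues)[::-1]      # reorder from largest to smallest
eigenvalues = eigenvalues[order]
P = P[:, order]
D = np.diag(eigenvalues)

reconstructed = P @ D @ P.T                # should equal S
n_kaiser = int(np.sum(eigenvalues > 1.0))  # Kaiser criterion
explained = eigenvalues / eigenvalues.sum()  # proportion of variance per component

print(np.round(eigenvalues, 3))
print("components with eigenvalue > 1:", n_kaiser)
```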

28 Correlations: satisfaction, n=100
         Var 1 work   Var 2 work   Var 3 work   Var 4 home   Var 5 home   Var 6 home
Var 1    1
Var 2    .65          1
Var 3    .65          .73          1
Var 4    .14 .16 1 [one of the three correlations in this row is missing from the transcript]
Var 5    .15          .18          .24          .66          1
Var 6    .14          .24          .25          .59          .73          1

29 [Path diagram: two factors, work and home, with indicators Var 1 to Var 6; path labels in the figure: "+", "+" and six 0s]

30 PCA: Factor loadings (eigenvalues 2.89 & 1.79)
                Factor 1   Factor 2
Var 1 (work)    .65        .56
Var 2 (work)    .72        .54
Var 3 (work)    .74        .51
Var 4 (home)    .63        -.56
Var 5 (home)    .71        -.57
Var 6 (home)    .71        -.53
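The reported eigenvalues can be cross-checked against the loadings. The sketch below, in plain Python, assumes the usual convention that PCA loadings are eigenvectors scaled by the square root of their eigenvalue, so the sum of squared loadings in each column should reproduce that component's eigenvalue, up to rounding of the two-decimal loadings.

```python
# Loadings from the table: (Factor 1, Factor 2) for Var 1 to Var 6.
loadings = [
    (0.65, 0.56),
    (0.72, 0.54),
    (0.74, 0.51),
    (0.63, -0.56),
    (0.71, -0.57),
    (0.71, -0.53),
]

# Sum of squared loadings per component.
eigenvalues = [sum(row[j] ** 2 for row in loadings) for j in range(2)]

# With 6 standardized variables the total variance is 6.
proportion = [value / len(loadings) for value in eigenvalues]

for j, (value, share) in enumerate(zip(eigenvalues, proportion), start=1):
    print(f"Factor {j}: eigenvalue {value:.2f}, share of variance {share:.2f}")
```

Under this convention the two components together account for roughly 78% of the variance of the six variables.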

31 Triangular decomposition (Cholesky)
[Path diagram: latent factors y1 to y5, each marked 1, and observed variables x1 to x5]
One operationalization of all PCA outcomes
Model is just identified!
Model is saturated (df=0)

32 Triangular decomposition
S = Q * Q' ( = P# * P#', where P# = P * √D)
Q (5 by 5) =
f11  0    0    0    0
f21  f22  0    0    0
f31  f32  f33  0    0
f41  f42  f43  f44  0
f51  f52  f53  f54  f55
Q is a lower triangular matrix
This is not a model! This is a transformation of the observed matrix S. Fully determinate!
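Both factorizations of S can be computed side by side. The sketch below uses Python with NumPy on a hypothetical 3 by 3 correlation matrix: `numpy.linalg.cholesky` returns the lower triangular Q, and the eigendecomposition gives P# = P * √D. The two matrices differ, but each reproduces S exactly.

```python
import numpy as np

# Hypothetical positive definite correlation matrix.
S = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])

# Triangular (Cholesky) decomposition: S = Q Q' with Q lower triangular.
Q = np.linalg.cholesky(S)

# Principal components: S = P D P', so P# = P * sqrt(D) also satisfies S = P# P#'.
eigenvalues, P = np.linalg.eigh(S)
P_sharp = P @ np.diag(np.sqrt(eigenvalues))

print(np.round(Q, 3))
print(np.round(P_sharp, 3))
print("Q Q' equals S:", np.allclose(Q @ Q.T, S))
print("P# P#' equals S:", np.allclose(P_sharp @ P_sharp.T, S))
```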

33 Saturated model, number of latent factors
Script: phenochol.mx
BEGIN MATRICES ;
Q LOWER NVAR NVAR free ; ! factor loadings
M FULL 1 NVAR free ; ! means
END MATRICES ;
BEGIN ALGEBRA ;
C = Q*Q' ; ! variance covariance matrix
K = \stnd(C) ; ! correlation matrix
X = \eval(K) ; ! eigenvalues (i.e., variance of latent factors)
Y = \evec(K) ; ! eigenvectors (i.e., regression coefficients)
END ALGEBRA ;
Means M /
Covariances C /

