Published byΣτυλιανός Μανιάκης Modified over 5 years ago
1
Applied Psychometric Strategies Lab Applied Quantitative and Psychometric Series
Falynn Thompson, MS
Introduction to Confirmatory Factor Analysis (CFA)
October 10, 2018

Notes: Hi everyone, welcome to our first APS talk. Today I will give an introductory talk on confirmatory factor analysis. Please hold your questions until the end of the talk. Afterward, if you want to stick around, we will play a review game.
2
Research questions (RQs) and research hypotheses (RHs)
What are some example questions (RQs) and hypotheses (RHs) one can propose within a CFA framework?
RQ: Does the hypothesized model fit the data? For example, does the five-factor model (Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness to Experience) fit the data from the NEO instrument?
RH: The hypothesized model will fit the data. For example, the five-factor model (Extraversion, Agreeableness, Conscientiousness, Neuroticism, and Openness to Experience) will fit the data from the NEO instrument.

Notes: Personality is hypothesized to have five factors, called the Big Five. The 30 indicators of the NEO instrument are hypothesized to measure these five factors of personality.
3
What is a CFA?
CFA is a procedure used to verify a hypothesized measurement model
A measurement model describes the relationship between latent variables, or factors (unobserved), and indicators (observed variables)
Latent factors are unobserved variables that represent the construct of interest
Indicators are manifest variables: item responses or scores on an instrument

Notes: CFA is a procedure used to test a hypothesized measurement model. To test such a model you need a strong understanding of the underlying theory, because the model is based on that theory. It is called a measurement model because it shows the relationship between latent variables and indicators. Latent variables are variables that cannot be observed directly, such as self-efficacy and depression. For instance, if you want to measure depression, you could use observed variables that the literature identifies as indicators of depression, such as loss of interest, restless sleep, and social isolation. Such observed variables can be items on a scale or scores on an instrument. CFA is not exploratory factor analysis (EFA): in EFA no hypothesized measurement structure is provided, and the model is driven by the data.
4
What’s the difference between EFA and CFA?
Focus
  Exploratory Factor Analysis (EFA): factor structure emerging from the data
  Confirmatory Factor Analysis (CFA): factor structure confirmation (theory driven)
Vocality
  EFA: indicators are free to load on all factors (i.e., multivocal)
  CFA: each indicator is specified to load on a specific factor in the model, typically without cross-loadings (i.e., univocal)
RQ
  EFA: What’s the dimensionality of the variables in my scale?
  CFA: Does the hypothesized measurement model fit the data?

Notes: Let’s look at a few differences between an EFA and a CFA. Again, an EFA model is based on the data, whereas a CFA model is based on theory. Vocality differs between the two. In an EFA, items are able to load on more than one factor; in a CFA model they typically are not, and each item loads on one specific factor. The questions you ask also differ: for an EFA you ask about the dimensionality of the variables in your scale, and for a CFA you ask whether the hypothesized measurement model fits the data. That model could have one factor or eight factors; again, it depends on theory. EFA is data driven because you have no prior idea of the number of factors in the model or of the relationship between the factors and the indicators, so the structure “emerges” from the data and the indicators load freely on all the factors in the model.

Sources:
Clark, L. A., & Watson, D. (1995). Constructing validity: Basic issues in objective scale development. Psychological Assessment, 7, 309–319.
Tabachnick, B. G., Fidell, L. S., & Osterlind, S. J. (2001). Using multivariate statistics (4th ed.). New York: Harper & Row.
5
Conceptual representation of a forced two-factor EFA solution and two-factor CFA
[Two path diagrams, each with indicators y1 to y6, error terms e1 to e6, and factors F1 and F2. Left: forced two-factor EFA, with every indicator loading on both factors. Right: two-factor CFA.]

Notes: Here is a simple example comparing a two-factor EFA model and a two-factor CFA model. In the one on the left, the indicators load on both factors because we simply are not sure of the relationship between the factors and the indicators.
6
What information do we learn from a CFA?
Latent factor (η; eta)
Indicator (y)
Factor pattern loading (λ; lambda)
Latent factor covariance (φ; phi)
Error variances (uniquenesses) (ε; epsilon)

[Path diagram: two latent factors, η1 and η2, with variances φ11 and φ22 and covariance φ12; indicators y1 to y6, each with its own error term ε1 to ε6; the loadings shown are a fixed 1, λ12, λ13, λ25, and λ26.]

Notes: These are the standard Greek symbols we use, although not everyone uses them. The factor pattern loadings are represented by lambda. You can see them located between the factors and the items. A factor loading shows the effect of the latent variable on an indicator. If you are wondering, the subscripts in lambda sub 1 2 just mean factor 1, item 2. Next is phi, the latent factor covariance or correlation. That is the double-headed arrow between the two factors. Error variances, or uniquenesses, are represented by epsilon. They are the unexplained variability of the indicators. The latent factor is represented by eta. Conceptually it plays the role of a person's score on the factor, although it is unobserved rather than a computed total score. The indicator in this model is represented by y, but it can be any letter.

Factor loadings are regression coefficients. They tell us the amount of expected change in an indicator for every one-unit change in the latent variable. It is regression at the latent level. Factor correlations show us the degree and direction of the relationship between two or more factors. This information is useful when validating a measure with a different population: if the loading is poor for one item, we know not to include it. The indicator error variance (uniqueness) is the variance of the indicator that is unique to that indicator and not shared with the other indicators. It is the part of the indicator's variance that the factor does not explain.
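The symbols on this slide can be collected into the usual CFA measurement equations. The block below is a sketch in the slide's notation, not something shown in the talk. The matrix names Λ, Φ, Θ, and Σ are standard assumptions that do not appear on the slide, and the layout (y1 to y3 on η1, y4 to y6 on η2, one loading per factor fixed to 1) is inferred from the diagram labels.

```latex
% Each indicator i is a linear function of its factor j plus a unique part
y_i = \lambda_{ji}\,\eta_j + \varepsilon_i

% Model-implied covariance matrix of the six indicators
\Sigma = \Lambda \Phi \Lambda^{\top} + \Theta

\Lambda =
\begin{bmatrix}
1            & 0 \\
\lambda_{12} & 0 \\
\lambda_{13} & 0 \\
0            & 1 \\
0            & \lambda_{25} \\
0            & \lambda_{26}
\end{bmatrix},
\qquad
\Phi =
\begin{bmatrix}
\phi_{11} & \phi_{12} \\
\phi_{12} & \phi_{22}
\end{bmatrix},
\qquad
\Theta = \operatorname{diag}\bigl(\operatorname{Var}(\varepsilon_1), \dots, \operatorname{Var}(\varepsilon_6)\bigr)
```

Rows of Λ are indicators and columns are factors; the subscripts keep the slide's convention of factor first, item second. The zeros are what make the indicators univocal.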
7
What are the steps for performing a CFA?
1. Data preparation*
2. Model specification
3. Model identification
4. Estimation of model parameters
5. Examine model fit
6. Respecification of the model?*
   If the theoretical/hypothesized model is rejected, a researcher may modify it to fit the data
   Some will say this is good (if theory driven)
   Others will say this is bad practice because it exploits the data (it capitalizes on chance findings, so results may be sample specific and not replicable)
7. Examine parameters and report
   Do the values “make sense”?

Notes: The first step is data preparation: make sure your data are ready to be analyzed. This can include coding variables and cleaning up the data. Next come model specification, model identification, estimation of model parameters, and examination of model fit, which I will talk more about in the following slides. Respecification of the model happens when the hypothesized model is rejected because it does not fit the data well, so you modify the model to improve its fit. This step is debated: some will say it is good if theory backs up the modified model, and others will say it is bad practice because it exploits the data or capitalizes on chance findings without theory to support the change. Lastly, examine the parameters to make sure the values make sense.
8
Let’s work with an example
Data come from the publicly available National Financial Well-Being Survey (2015)
Data used here come from Sample 1, which is made up of adults age 62 and under
N = 5,000, but 3 respondents did not complete the 10-item scale, so the data are analyzed on 4,997 cases

Notes: The data we used come from the National Financial Well-Being Survey. The sample consists of adults aged 62 and under. Three respondents had incomplete scales, which gave us a total of 4,997 participants.
9
10-item Financial Well-Being (FWB) Scale
Item: Name
FWB1_1: I could handle a major unexpected expense
FWB1_2: I am securing my financial future
FWB1_3R: Because of my money situation... I will never have the things I want in life
FWB1_4: I can enjoy life because of the way I’m managing my money
FWB1_5R: I am just getting by financially
FWB1_6R: I am concerned that the money I have or will save won’t last
FWB2_1R: Giving a gift would put a strain on my finances for the month
FWB2_2: I have money left over at the end of the month
FWB2_3R: I am behind with my finances
FWB2_4R: My finances control my life
10
What were the response options used for items on the FWB Scale?
6 items (FWB1_1 to FWB1_6) used the following response options:
(1) Not at all, (2) Very Little, (3) Somewhat, (4) Very Well, (5) Completely
4 items (FWB2_1 to FWB2_4) used the following response options:
(1) Never, (2) Rarely, (3) Sometimes, (4) Often, (5) Always
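The data preparation step for this scale can be sketched as follows. This is an illustration only, under two assumptions that the slides do not confirm: that the "R" suffix marks reverse-worded items that still need to be flipped on the 1 to 5 scale, and that the three incomplete cases are removed listwise. The function names and the two example respondents are hypothetical; check whether the public file already stores the R items reverse-scored before applying anything like this.

```python
# Item names as listed on the scale slide.
ITEMS = ["FWB1_1", "FWB1_2", "FWB1_3R", "FWB1_4", "FWB1_5R", "FWB1_6R",
         "FWB2_1R", "FWB2_2", "FWB2_3R", "FWB2_4R"]


def reverse_score(value, low=1, high=5):
    """Flip a response on a low..high scale (1<->5, 2<->4, 3 stays 3)."""
    return low + high - value


def prepare(rows):
    """Drop incomplete cases, then reverse-score items whose name ends in R.

    ASSUMPTION: the R suffix means the raw responses still need flipping.
    """
    cleaned = []
    for row in rows:
        if any(row.get(item) is None for item in ITEMS):
            continue  # listwise deletion of an incomplete case
        cleaned.append({
            item: reverse_score(row[item]) if item.endswith("R") else row[item]
            for item in ITEMS
        })
    return cleaned


# Two made-up respondents; the second one skipped an item.
rows = [
    {item: 2 for item in ITEMS},
    {**{item: 4 for item in ITEMS}, "FWB2_4R": None},
]
prepared = prepare(rows)
```

After this step every item points in the same direction, so higher values mean higher financial well-being on all ten indicators.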
12
Model Specification
The measurement model must be specified in advance so there is a single unique solution
The specification is based on theory or previous literature
What things need to be specified for my model?
  The number of factors (latent variables)
  Should you allow factors to be correlated?
  Which indicators (variables, items) load on each factor?
In our example, we hypothesize that financial well-being can be represented by one factor

Notes: You need to have an idea of the number of factors in the model and of the relationship between the observed and latent variables. The specification also includes the uniqueness (error) terms, which show the amount of variance in each indicator that is not explained by the factor. For comparison, in EFA you can use rotation to bring the solution to the simplest structure with the highest possible loadings. With two or more factors in an EFA there are too many possible solutions to consider, so rotation is used to find the simplest one. In CFA we are not looking at infinitely many solutions, but at one model of how the items load on the latent variables. The items either load well or they do not; we are not manipulating the loadings to find the best fit, so there is one solution.
13
Model Identification
A model is identified if it has a unique best-fit solution
The number of known pieces of information (the variances and covariances in the covariance matrix) must be at least as large as the number of parameters to be estimated
Identification of CFA models relies on the number of indicators per factor

Notes: A necessary condition for identification is that the number of known pieces of information is not smaller than the number of unknowns in the model. When it is strictly larger, the model is over-identified and its fit can be tested.
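The counting rule above can be made concrete with a short sketch. It assumes a one-factor model scaled by fixing one loading to 1, with uncorrelated errors, and it counts only variances and covariances (no means or thresholds). The function names are hypothetical and not part of any package.

```python
def known_information(n_indicators):
    """Number of unique variances and covariances among the indicators."""
    return n_indicators * (n_indicators + 1) // 2


def one_factor_free_parameters(n_indicators):
    """Free parameters in a one-factor CFA with one loading fixed to 1.

    (p - 1) free loadings + 1 factor variance + p error variances.
    ASSUMPTION: no correlated errors.
    """
    return (n_indicators - 1) + 1 + n_indicators


def degrees_of_freedom(n_indicators):
    """Knowns minus unknowns; negative means the model is not identified."""
    return known_information(n_indicators) - one_factor_free_parameters(n_indicators)


df_fwb = degrees_of_freedom(10)    # 55 knowns - 20 unknowns
df_three = degrees_of_freedom(3)   # 6 knowns - 6 unknowns
df_two = degrees_of_freedom(2)     # 3 knowns - 4 unknowns
```

For the 10-item FWB scale this gives 35 degrees of freedom, which agrees with the χ2(35) reported in the example write-up. Three indicators give exactly zero (just-identified), and two indicators on a single isolated factor fall short, which is the logic behind the indicator rules on the next slide.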
14
What are some identification guidelines?
Three-indicator rule for uncorrelated factors
Two-indicator rule for correlated factors
Both rules assume uncorrelated errors and no multivocal items
In reality, we should have multiple indicators per factor!

Notes: Here are some guidelines that can show you whether or not a model is identified. Typically, identified models have three indicators per factor when the factors are uncorrelated….
15
Conceptual representation of a one-factor CFA model for FWB
[Path diagram: a single FWB factor with the scale items as indicators, each with its own error term (e1 to e6 for the FWB1 items and e1 to e4 for the FWB2 items).]
16
Example: Two-Factor Linear CFA Model
[Path diagram: two factors, FWB1 and FWB2. FWB1 is measured by FWB1_1, FWB1_2, FWB1_3R, FWB1_4, FWB1_5R, and FWB1_6R (errors e1 to e6); FWB2 is measured by FWB2_1R, FWB2_2, FWB2_3R, and FWB2_4R (errors e1 to e4).]
Do I need to fix a loading on each factor to make it identified?
17
One final detail: Factor metric
The latent variables (factors and uniquenesses) are unobserved, so they have no inherent measurement scale
The metric for a factor is determined by
  Fixing the loading of one item on each factor to 1 (the Mplus default)
  OR
  Fixing the variance of each latent factor to 1, so that the factor is in a standardized metric

Notes: Latent variables do not have a measurement scale or metric of their own, so the metric for a factor is set either by fixing the loading of one item on each factor to 1 or by fixing the variance of the factor to 1.
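The two ways of setting the factor metric describe the same model, which a small numerical sketch can show. The loadings, factor variance, and error variances below are made-up values for a three-indicator, one-factor model, and the function names are hypothetical. The point is only that both scalings imply the same covariance matrix.

```python
import math


def implied_cov(loadings, factor_var, uniquenesses):
    """Model-implied covariance matrix of a one-factor model.

    Entry (i, j) is loading_i * loading_j * factor_var, plus the error
    variance of indicator i on the diagonal.
    """
    p = len(loadings)
    return [
        [loadings[i] * loadings[j] * factor_var
         + (uniquenesses[i] if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]


def to_unit_variance(loadings, factor_var):
    """Convert marker-variable scaling to factor-variance-of-1 scaling."""
    scale = math.sqrt(factor_var)
    return [loading * scale for loading in loadings], 1.0


# Made-up marker-variable solution: first loading fixed to 1.
marker_loadings = [1.0, 0.5, 1.5]
marker_factor_var = 4.0
uniquenesses = [1.0, 0.5, 2.0]

unit_loadings, unit_factor_var = to_unit_variance(marker_loadings, marker_factor_var)

cov_marker = implied_cov(marker_loadings, marker_factor_var, uniquenesses)
cov_unit = implied_cov(unit_loadings, unit_factor_var, uniquenesses)
```

Here the unit-variance loadings are the marker loadings multiplied by the factor's standard deviation, and the error variances are unchanged. Model fit is therefore the same under either choice; only the units of the loadings differ.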
18
Step 1: Model Specification
Recall that the Mplus default is to fix the first factor loading to 1, so in the output the factor loading of FWB1_1 will be 1. This identifies the metric of Factor 1.

Notes: Mplus is set up to fix the loading of the first indicator to 1. If we want a standardized factor instead, we fix the factor variance to 1.
20
Model Estimation: Estimators
Maximum Likelihood (ML)
  For interval (continuous) indicators
  Assumes multivariate normality
Robust Maximum Likelihood (MLR)
  Useful when multivariate normality is not tenable
Mean- and Variance-Adjusted Weighted Least Squares (WLSMV)
  For ordinal (categorical) indicators
21
Example Mplus Input File for 1-factor CFA
23
How do I assess model fit?
Exact Fit
  Chi-square (χ2): desire p > .05
Approximate Fit Indices
  Standardized root mean square residual (SRMR): values ≤ .08 are acceptable, but values above .10 indicate poor fit
  Root mean square error of approximation (RMSEA): values ≤ .06 are acceptable
  Tucker-Lewis index (TLI) and comparative fit index (CFI): values ≥ .96 are acceptable

Goodness-of-fit guidelines follow Hu, L. T., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1–55.

Notes: A goodness-of-fit index is a numerical summary of the difference between the observed values and the values expected under a statistical model (Maydeu-Olivares & García-Forero, 2010).
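To show where three of these indices come from, here is a sketch that computes RMSEA, CFI, and TLI from chi-square values using the common textbook formulas. The chi-square values, degrees of freedom, and sample size in the example are made up, and the function names are hypothetical. Software such as Mplus may apply estimator-specific corrections (for example with MLR or WLSMV), so these formulas are not guaranteed to reproduce its output.

```python
import math


def rmsea(chi2, df, n):
    """Root mean square error of approximation (N - 1 version)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative fit index: improvement over the baseline model."""
    model_misfit = max(chi2_model - df_model, 0.0)
    worst_misfit = max(chi2_model - df_model, chi2_baseline - df_baseline, 0.0)
    if worst_misfit == 0.0:
        return 1.0
    return 1.0 - model_misfit / worst_misfit


def tli(chi2_model, df_model, chi2_baseline, df_baseline):
    """Tucker-Lewis index, based on chi-square to df ratios."""
    ratio_baseline = chi2_baseline / df_baseline
    ratio_model = chi2_model / df_model
    return (ratio_baseline - ratio_model) / (ratio_baseline - 1.0)


# Made-up values for illustration only (not results from the talk).
example_rmsea = rmsea(chi2=70.0, df=35, n=501)
example_cfi = cfi(70.0, 35, 1045.0, 45)
example_tli = tli(70.0, 35, 1045.0, 45)
```

With these made-up inputs, RMSEA is about .045, CFI is .965, and TLI is .955, so the same model can pass one cutoff and miss another. That is one reason to report several indices rather than a single one.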
24
Asparouhov & Muthén (2018, May 2, SRMR in Mplus)
Exact fit is defensible if the chi-square is not significant, regardless of SRMR
Approximate fit is tenable if the chi-square is significant, SRMR ≤ .08, and the standardized residuals are all small (|rres| < .10; Kline, 2016)
Poor fit is concluded if the chi-square is significant and SRMR > .08
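The three rules on this slide can be written as a small decision function. This is a sketch of the slide's wording only, not code from Asparouhov and Muthén or from Mplus; the function name, argument names, and return labels are hypothetical. The slide does not say what to conclude when the chi-square is significant, SRMR is acceptable, and some residual is large, so that case is returned with its own label.

```python
def classify_fit(chi2_p, srmr, max_abs_residual,
                 alpha=0.05, srmr_cutoff=0.08, residual_cutoff=0.10):
    """Apply the slide's exact / approximate / poor fit rules in order."""
    if chi2_p > alpha:
        # Chi-square not significant: exact fit, regardless of SRMR.
        return "exact fit"
    if srmr > srmr_cutoff:
        return "poor fit"
    if max_abs_residual < residual_cutoff:
        return "approximate fit"
    # Significant chi-square, acceptable SRMR, but a large residual:
    # the three rules on the slide do not cover this combination.
    return "not covered by the three rules"


# Made-up inputs for illustration.
case_exact = classify_fit(chi2_p=0.20, srmr=0.12, max_abs_residual=0.30)
case_approx = classify_fit(chi2_p=0.001, srmr=0.05, max_abs_residual=0.08)
case_poor = classify_fit(chi2_p=0.001, srmr=0.09, max_abs_residual=0.05)
case_open = classify_fit(chi2_p=0.001, srmr=0.05, max_abs_residual=0.12)
```

The last case matters in practice: a single residual slightly above the cutoff calls for judgment rather than an automatic verdict.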
25
Mplus Output for 1-factor CFA Model Fit Results
26
Mplus Output for 1-factor CFA Local Fit Results
One large residual out of 36 is not terrible
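Checking local fit amounts to scanning the residual correlation matrix for entries at or above the cutoff. A minimal sketch, assuming the residual correlations are available as a symmetric matrix and using the .10 cutoff from Kline (2016) mentioned earlier; the matrix values, item subset, and function name are made up for illustration.

```python
def large_residuals(residuals, names, cutoff=0.10):
    """Return (item_a, item_b, value) for each large residual correlation.

    Only the lower triangle is scanned, so each pair is reported once.
    """
    flagged = []
    for i in range(len(names)):
        for j in range(i):
            if abs(residuals[i][j]) >= cutoff:
                flagged.append((names[i], names[j], residuals[i][j]))
    return flagged


# Made-up residual correlations for three items (not output from the talk).
names = ["FWB1_1", "FWB1_2", "FWB1_4"]
residuals = [
    [0.00, 0.03, -0.12],
    [0.03, 0.00, 0.05],
    [-0.12, 0.05, 0.00],
]
flagged = large_residuals(residuals, names)
n_pairs = len(names) * (len(names) - 1) // 2
```

Reporting the flagged pairs alongside the total number of pairs gives the kind of statement made on this slide, where one large residual among many is judged tolerable.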
28
Is respecification of the model needed? No.
30
Mplus Output for 1-factor CFA: Standardized Factor Loadings
31
Example APA style write-up of CFA data analysis plan
A confirmatory factor analysis (CFA) was conducted to test the one-factor measurement model of the 10-item Financial Well-Being Scale (FWB). The CFA was assessed for exact fit with the WLSMV χ2 and for approximate fit with SRMR, according to the guidelines put forth by Asparouhov and Muthén (2018). Specifically, exact fit was concluded if the χ2 was not significant (p > .05). Otherwise, approximate fit was concluded if SRMR ≤ .08 and the absolute standardized residual correlations were small (i.e., there were no large residuals). According to Kline (2016), small residual correlations may be defined as < .10.

OR…

Using Asparouhov and Muthén’s (2018) paper on model fit using χ2 and SRMR, the results showed approximate fit between the model and the observed data.
32
Example APA style write-up of CFA Results
Results showed the one-factor FWB model had approximate fit to the data, χ2(35) = , p < .001, SRMR = . Furthermore, all but one of the absolute standardized residual correlations were small (< .10). The one exception was .12, which is only slightly above the cutoff and can still be considered small. Standardized factor pattern loadings ranged from .66 to .83.

OR…

Using Asparouhov and Muthén’s (2018) paper on model fit using χ2 and SRMR, the results showed approximate fit between the model and the observed data.
33
Example Mplus Input File for 2-factor CFA
34
Mplus Output for 2-factor CFA Model Fit Results
35
Mplus Output for 2-factor CFA Local Fit Results
36
Mplus Output for 2-factor CFA: Standardized Factor Loadings
37
Mplus Output for 2-factor CFA: Factor Correlation
38
How much darn data do I need to collect for a CFA?
Generally, large samples are required, but sample size determination is complex
Some authors recommend a fixed minimum number of cases
Other research suggests that sample size requirements should be based on model complexity
Others suggest at least 5 times the number of estimated parameters
Large samples are necessary to improve the stability of estimates, reduce error, and improve replicability
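The "5 times the number of estimated parameters" guideline is simple arithmetic once the parameters are counted. A sketch for the one-factor model, assuming one loading fixed to 1 and uncorrelated errors; the function names are hypothetical, and the rule is only one of the heuristics listed above, not a guarantee of adequate power.

```python
def one_factor_free_parameters(n_indicators):
    """(p - 1) free loadings + 1 factor variance + p error variances."""
    return (n_indicators - 1) + 1 + n_indicators


def minimum_n_by_ratio(n_parameters, cases_per_parameter=5):
    """Smallest sample size satisfying a cases-per-parameter rule of thumb."""
    return cases_per_parameter * n_parameters


fwb_parameters = one_factor_free_parameters(10)   # 20 parameters
fwb_minimum_n = minimum_n_by_ratio(fwb_parameters)
```

Under these assumptions the 10-item, one-factor model would need at least 100 cases by this rule, far fewer than the 4,997 available in the example.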