1 What is? Structural Equation Modeling (A Very Brief Introduction) Patrick Sturgis University of Surrey
2 What is SEM? SEM is not one statistical technique. It integrates a number of different multivariate techniques into one model fitting process. It is essentially an integration of: –Measurement theory –Factor analysis –Regression –Simultaneous equation modeling –Path analysis
3 SEM is essentially Path Analysis using Latent Variables
4 What are Latent Variables? Most/all variables in the social world are not directly observable. This makes them latent or hypothetical constructs. We measure latent variables with observable indicators, e.g. questionnaire items. We can think of the variance of an observable indicator as being partially caused by: –The latent construct in question –Other factors (error)
5 True score and measurement error: x = t + e, where x is the measured score, t is the true score (the true point on the continuum) and e is the error. Random error: mean of errors = 0. Systematic error: mean of errors ≠ 0.
6 The True Score Equation: X = t + e (observed item = true score + error). Problem: with one indicator, the equation is unidentified. We can't separate true score and error. Can be expressed diagrammatically.
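A short worked step may help show why a single indicator is not enough. It assumes that true score and error are uncorrelated, a standard classical test theory assumption that the slide does not state explicitly:

```latex
\operatorname{Var}(X) = \operatorname{Var}(t) + \operatorname{Var}(e),
\qquad \text{assuming } \operatorname{Cov}(t, e) = 0 .
```

Only Var(X) is observed, so there is one known quantity and two unknowns; any split of the observed variance between true score and error fits the data equally well.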
7 Identifying True Score & Error. This means we need multiple indicators of each latent variable. With multiple indicators we can use Factor Analysis to estimate these parameters. Factor analysis transforms correlated observed variables into uncorrelated components. We can then use a subset of components to summarise the observed relationships.
8 A Common Factor Model. [Diagram: a latent construct measured by Indicator 1, Indicator 2, Indicator 3 and Indicator 4; the factor loading values are not legible in this transcript.] Indicators become conditionally independent. Factor loadings = regression of indicators on factor.
9 Factor Analysis. So the factor loading is the standardised regression coefficient of the indicator on the latent variable. Squaring the factor loading gives us the proportion of the item's variance explained by the latent variable (factor). This can be considered as the true score component of the item. 1 minus the proportion of variance explained by the factor gives us the residual or error variance. Thus, the variance of the factor contains only the true score component of each item.
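The relationship between loadings, explained variance and error variance can be sketched with a small simulation. This is an illustration under stated assumptions, not the presenter's own example: the loadings 0.8, 0.7 and 0.6, the sample size, and the function names are all hypothetical, and the model is a single standardised factor with three indicators. Under that model the correlation between two indicators equals the product of their loadings, which is what lets multiple indicators identify the loadings.

```python
import numpy as np

def simulate_one_factor(loadings, n, seed=0):
    """Simulate standardised indicators x_i = loading_i * t + e_i."""
    rng = np.random.default_rng(seed)
    lam = np.asarray(loadings, dtype=float)
    t = rng.standard_normal(n)  # latent true score, variance 1
    # Error variance is 1 - loading^2, so each indicator has variance 1.
    e = rng.standard_normal((n, lam.size)) * np.sqrt(1.0 - lam**2)
    return t[:, None] * lam + e

def loadings_from_correlations(r):
    """Recover three loadings from a 3x3 indicator correlation matrix.

    With one factor, r_ij = lam_i * lam_j, so lam_1 = sqrt(r12 * r13 / r23).
    """
    l1 = np.sqrt(r[0, 1] * r[0, 2] / r[1, 2])
    l2 = np.sqrt(r[0, 1] * r[1, 2] / r[0, 2])
    l3 = np.sqrt(r[0, 2] * r[1, 2] / r[0, 1])
    return np.array([l1, l2, l3])

true_loadings = [0.8, 0.7, 0.6]          # hypothetical values
x = simulate_one_factor(true_loadings, n=200_000)
r = np.corrcoef(x, rowvar=False)          # observed correlations only
est_loadings = loadings_from_correlations(r)
explained = est_loadings**2               # true score share of each item
error_variance = 1.0 - explained          # residual share of each item

print("estimated loadings:", est_loadings.round(3))
print("variance explained:", explained.round(3))
print("error variance:    ", error_variance.round(3))
```

Note that the latent score itself is never used in the estimation step; only the correlations among the observed indicators are.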
10 Benefits of Latent Variables. Most social concepts are complex and multi-faceted. Using single measures will not adequately cover the full conceptual map. Systematic error biases descriptive and causal inferences. Stochastic error in dependent variables leaves estimates unbiased but less efficient. Stochastic error in independent variables attenuates estimates of associational effect sizes.
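The two claims about stochastic error can be checked with a minimal simulation. The setup is hypothetical: a true slope of 0.5, unit-variance random error added to either variable, and a helper named `ols_slope`. With these choices the noisy predictor has reliability 0.5, so the slope should be roughly halved, while noise in the outcome should leave the slope about the same.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
true_slope = 0.5                                   # hypothetical effect size

true_x = rng.standard_normal(n)
y = true_slope * true_x + rng.standard_normal(n)

def ols_slope(x, y):
    """Bivariate OLS slope: cov(x, y) / var(x)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Random error in the independent variable (reliability = 1 / (1 + 1) = 0.5).
noisy_x = true_x + rng.standard_normal(n)
# Random error in the dependent variable.
noisy_y = y + rng.standard_normal(n)

slope_clean = ols_slope(true_x, y)
slope_noisy_x = ols_slope(noisy_x, y)      # attenuated towards zero
slope_noisy_y = ols_slope(true_x, noisy_y) # unbiased, larger residual variance

print("no measurement error:", round(slope_clean, 3))
print("error in X:          ", round(slope_noisy_x, 3))
print("error in Y:          ", round(slope_noisy_y, 3))
```

The loss of efficiency from error in Y shows up as a larger standard error rather than a shifted estimate, so it is only visible in smaller samples or across repeated draws.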
11 Remember SEM is essentially Path Analysis using Latent Variables We now know about latent variables, what about path analysis?
12 Path Analysis. Sewall Wright, a biologist, developed the fundamental ideas of path analysis in the 1920s. The diagrammatic representation of a theoretical model. Standardised notation. Estimation of a series of regression models to decompose effects: –Direct –Indirect –Total
13 Direct, Indirect and Total Effects. Example: effect of exam nerves on exam performance. [Path diagram: exam stress, exam preparation, physical/mental anxiety and exam performance, linked by paths marked positive (+) or negative (-).]
14 Effect Decomposition. We can break down effects of X on Y into direct, indirect and total. [Path diagram: X1, Y1 and Y2 linked by paths a, b and c.] Direct = a. Indirect = b × c. Total = a + (b × c).
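The decomposition can be sketched as a series of ordinary regressions. The path layout here is an assumption, since the diagram does not survive in the transcript: a is taken to be the direct path X1 → Y2, b the path X1 → Y1, and c the path Y1 → Y2. The coefficient values and the `ols` helper are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
a, b, c = 0.3, 0.5, 0.4        # hypothetical path coefficients

x1 = rng.standard_normal(n)
y1 = b * x1 + rng.standard_normal(n)            # X1 -> Y1 (path b)
y2 = a * x1 + c * y1 + rng.standard_normal(n)   # X1 -> Y2 (a), Y1 -> Y2 (c)

def ols(predictors, outcome):
    """Return OLS slopes (intercept dropped) for the given predictors."""
    design = np.column_stack([np.ones(len(outcome))] + list(predictors))
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1:]

(b_hat,) = ols([x1], y1)          # regression of Y1 on X1
a_hat, c_hat = ols([x1, y1], y2)  # regression of Y2 on X1 and Y1
(total_hat,) = ols([x1], y2)      # regression of Y2 on X1 alone

direct = a_hat
indirect = b_hat * c_hat
total = direct + indirect

print("direct:  ", round(direct, 3))
print("indirect:", round(indirect, 3))
print("total:   ", round(total, 3), "vs. Y2 on X1 alone:", round(total_hat, 3))
```

For linear models estimated by OLS on the same sample, the slope from regressing Y2 on X1 alone equals direct + indirect, which gives a convenient check on the arithmetic.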
15 Standard Symbols for Path Analysis: latent variable; observed variable; residual or error term; causal effect; covariance path.
16 So when a path diagram includes latent variables… it becomes a SEM. [Path diagram: gender, class, attitude and behaviour, each with three observed indicators (O1–O12) and their error terms (e1–e12), plus two further error terms e13 and e14; several paths are fixed to 1.]
17 Simultaneous Equations. We might estimate this as 3 separate models: (1) a factor model, saving out a factor score variable; (2) a regression of Y on X; (3) a regression of Y on X and Z. In SEM, we estimate the equations simultaneously.
18 Estimation & Model Fit. A variety of estimators is available, but predominantly maximum likelihood (ML). Model fit can be tested by comparing the likelihoods of the specified and baseline models. This tests the significance of the difference in likelihood between the specified and observed variance-covariance matrices. With large n, no model fits! Adjusted indices (RMSEA, CFI, etc.). Perhaps more useful for testing nested models, where the comparator is more substantively meaningful.
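The remark that no model fits with large n can be illustrated by holding the misfit constant and varying only the sample size. This is a sketch, not the presenter's procedure: it uses the usual ML discrepancy function F = ln|Σ| + tr(SΣ⁻¹) − ln|S| − p, the test statistic (n − 1)F, and one common form of the RMSEA. The matrices `S` and `Sigma` and the value `df = 1` are invented purely for illustration and do not come from a fitted model.

```python
import numpy as np

def ml_discrepancy(S, Sigma):
    """ML fit function: ln|Sigma| + tr(S Sigma^-1) - ln|S| - p."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - p

def chi_square(S, Sigma, n):
    """Likelihood ratio test statistic against the observed matrix."""
    return (n - 1) * ml_discrepancy(S, Sigma)

def rmsea(chi2, df, n):
    """One common form of the RMSEA; it adjusts for df and sample size."""
    return float(np.sqrt(max(chi2 - df, 0.0) / (df * (n - 1))))

# Hypothetical observed (S) and model-implied (Sigma) correlation matrices
# that differ by 0.02 in each off-diagonal element.
S = np.array([[1.00, 0.50, 0.40],
              [0.50, 1.00, 0.35],
              [0.40, 0.35, 1.00]])
Sigma = np.array([[1.00, 0.48, 0.42],
                  [0.48, 1.00, 0.33],
                  [0.42, 0.33, 1.00]])
df = 1                 # hypothetical degrees of freedom
critical_value = 3.84  # chi-square critical value for df = 1 at the 5% level

results = {}
for n in (200, 2_000, 20_000):
    chi2 = chi_square(S, Sigma, n)
    results[n] = (chi2, rmsea(chi2, df, n))
    print(n, round(chi2, 2), "reject" if chi2 > critical_value else "retain",
          "RMSEA =", round(results[n][1], 3))
```

The same small discrepancy is retained in the smallest sample and rejected in the largest, because the test statistic grows in proportion to n while the misfit itself does not change.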
19 Other things we can do with SEM… Panel data models (tomorrow!); categorical endogenous variables; multiple group models; latent variable interactions; model missing data; complex sample data; multi-level SEM.