# Applied Structural Equation Modeling for Dummies, by Dummies February 22, 2013 Indiana University, Bloomington Joseph J. Sudano, Jr., PhD Center for.

## Presentation on theme: "Applied Structural Equation Modeling for Dummies, by Dummies February 22, 2013 Indiana University, Bloomington Joseph J. Sudano, Jr., PhD Center for."— Presentation transcript:

Applied Structural Equation Modeling for Dummies, by Dummies February 22, 2013 Indiana University, Bloomington Joseph J. Sudano, Jr., PhD Center for Health Care Research and Policy Case Western Reserve University at The MetroHealth System Adam T. Perzynski, PhD

Thanks So Much!! Acknowledgements: Bill Pridemore PhD
Adam Perzynski PhD David W. Baker MD Randy Cebul MD Fred Wolinsky PhD No conflicts of interest (but I wish there were some major financial ones!)

Presentation Outline Conceptual overview. What is SEM?
Basic idea underpinning SEM Major applications Shared characteristics among SEM techniques Terms, nomenclature, symbols, vocabulary Basic SEM example Sample size, other issues and model fit Software and texts

What Is Structural Equation Modeling?
SEM: very general, very powerful multivariate technique. Specialized versions of other analysis methods. Major applications of SEM: Causal modeling or path analysis. Confirmatory factor analysis (CFA). Second order factor analysis. Covariance structure models. Correlation structure models. SEM is a very general and very powerful multivariate technique that includes specialized versions of other analysis methods as special cases. I’m going to assume that you have some basic understanding of statistical concepts such as standard deviation, variance, covariance and correlations, and analytic techniques such as ordinary least squares, because SEM is basically a specialized version or versions of other analysis methods that you are probably aware of. Causal modeling or path analysis, which hypothesizes causal relationships among variables and test the causal models with linear equation systems. These models can involve either observed variables, latent variables, or both. Confirmatory factor analysis, an extension of factor analysis in which specific hypotheses about the structure of the factor loadings and intercorrelations are tested. Second order factor analysis, an extension of factor analysis in which the correlation matrix of the common factors is itself factor analyzed to provide second order factors. Classic example is the SF-36. Covariance structure models, which hypothesizes that a covariance matrix has a particular form. For example, you can test the hypothesis that a set of variables all have equal variances with this procedure. Classic example is testing for measurement invariance across language groups or sex in personality inventories or subjective measures of health and wellbeing. Correlation structure models, which hypothesizes that a correlation matrix has a particular form. A classic example is the hypothesis that the correlation matrix has the structure of a circumplex (or circular ordering), for example circumplex models of personality and emotions or the circumplex model of marital and family systems.

Advantages of SEM Compared to Multiple Regression
More flexible modeling Uses CFA to correct for measurement error Attractive graphical modeling interface Testing models overall vs. individual coefficients More flexible assumptions (particularly allowing interpretation in the face of multicollinearity). Uses confirmatory factor analysis (CFA) to reduce measurement error by having multiple indicators per latent variable. Most software packages have these nifty graphical interfaces that can save models in gif tif and other formats to insert into documents. SEM allows one to test entire models and to test them overall, versus focusing on individual coefficients.

Test models with multiple dependent variables Ability to model mediating variables Ability to model error terms Other advantages include the ability to test models with multiple dependent variables and the ability to model mediating variables and error terms as well.

Test coefficients across multiple between-subjects groups Ability to handle difficult data Longitudinal with auto-correlated error Multi-level data Non-normal data Incomplete data Test coefficients across multiple between subjects groups. Simple Example: Is the effect of education on income the same for blacks and whites? Ability to handle difficult data: Longitudinal with auto-correlated error Multi-level data Non-normal data Incomplete data

Shared Characteristics of SEM Methods
SEM is a priori Think in terms of models and hypotheses Forces the investigator to provide lots of information which variables affect others directionality of effect

Shared Characteristics of SEM Methods
SEM allows distinctions between observed and latent variables Basic statistic in SEM in the covariance Not just for non-experimental data View many standard statistical procedures as special cases of SEM Statistical significance less important than for more standard techniques

Terms, Nomenclature, Symbols, and Vocabulary (Not Necessarily in That Order)
Variance = s2 Standard deviation = s Correlation = r Covariance = sXY = COV(X,Y) Disturbance = D X Y D Measurement error = e or E A X E But before we move on to some examples we need to get acquainted and re-acquainted with some terms and symbols, the vocabulary of SEM if you will. We have all the basic statistical notions and terms like variance, standard deviation and correlation, but of particular importance to SEM is the notion of covariance and the covariance matrix. Covariance is used in most of the statistical packages as the basic data for structural equation modeling. There is a distinction usually made in SEM regarding residual variances for variables in equations, where disturbance or D is the variance in Y unexplained by a variable X presumed to affect it. Measurement error is slightly different. First X is an observed variable that is presumed to measure A, a latent variable; E is variance in X unexplained by A.

Terms, Nomenclature, Symbols, and Vocabulary
Experimental research independent and dependent variables. Non-experimental research predictor and criterion variables In their most basic form, experimental and non-experimental studies concern the relationship between two variables. In experimental studies we usually use the terms independent and dependent, where the independent var is manipulated by the researcher and the effect on the dependent variable is observed. In non-experimental studies, many researchers tend to use the terms predictor and criterion instead of (respectively) independent and dependent although I use them all with almost no distinction between the two. Somewhat different names for SEM...we typically use the terms observed or manifest indicators indicated by a square or rectangle and latent variables or factors indicated by a circle or oval Observed (or manifest) Latent (or factors)

Terms, Nomenclature, Symbols, and Vocabulary
Exogenous Endogenous Direct effects Reciprocal effects Correlation or covariance “of external origin” Outside the model “of internal origin” Inside the model

Terms, Nomenclature, Symbols, and Vocabulary
Measurement model That part of a SEM model dealing with latent variables and indicators. Structural model Contrasted with the above Set of exogenous and endogenous variables in the model with arrows and disturbance terms The measurement model is that part (possibly all) of a SEM model which deals with the latent variables and their indicators. A pure measurement model is a confirmatory factor analysis (CFA) model in which there is unmeasured covariance (two-headed arrows) between each possible pair of latent variables, there are straight arrows from the latent variables to their respective indicators, there are straight arrows from the error and disturbance terms to their respective variables, but there are no direct effects (straight arrows) connecting the latent variables. The measurement model is evaluated like any other SEM model, using fit indices or measures. Some folks suggest there is no point in proceeding to the structural model until one is satisfied the measurement model is valid. The structural model may be contrasted with the measurement model. It is the set of exogenous and endogenous variables in the model, together with the direct effects (straight arrows) connecting them, and the disturbance and error terms for these variables (reflecting the effects of unmeasured variables not in the model)

Measurement Model: Confirmatory Factor Analysis
Observed or manifest variables GHQ Hostility Hopelessness Self-rated health Psychosocial health D1 e4 e3 e2 e1 The measurement model is that part (possibly all) of a SEM model which deals with the latent variables and their indicators. A pure measurement model is a confirmatory factor analysis (CFA) model in which there is unmeasured covariance (two-headed arrows) between each possible pair of latent variables, there are straight arrows from the latent variables to their respective indicators, there are straight arrows from the error and disturbance terms to their respective variables, but there are no direct effects (straight arrows) connecting the latent variables. The measurement model is evaluated like any other SEM model, using goodness of fit measures. There is no point in proceeding to the structural model until one is satisfied the measurement model is valid. Latent construct or factor Singh-Manoux, Clark and Marmot Multiple measures of socio-economic position and psychosocial health: proximal and distal measures.

Observed or manifest variables GHQ Hostility Hopelessness Income Occupation Education Self-rated health Psychosocial health D1 e4 e3 e2 e1 In traditional regression models we might bundle the 4 observed variables over on the right side into a composite variable “psychosocial health”…we would not have the ability to indicate how educ et al were related to one another...and we would assume perfect measurement of psychosocial health…in SEM we don’t do that. The structural model may be contrasted with the measurement model. It is the set of exogenous and endogenous variables in the model, together with the direct effects (straight arrows) connecting them, and the disturbance and error terms for these variables (reflecting the effects of unmeasured variables not in the model) Latent construct or factor Singh-Manoux, Clark and Marmot Multiple measures of socio-economic position and psychosocial health: proximal and distal measures.

Causal Modeling or Path Analysis and Confirmatory Factor Analysis
Education Hostility e1 a= direct effect b+c=indirect Hopelessness e2 Psychosocial health c Income GHQ e3 D1 D3 Here’s an example of direct and indirect effects, and how the graphical expression of multiple regression in the previous slide now gives way to a more web-like “causal model” where temporal relationships are implied, and the notion of mediation or mediating variables is put forth using a combination of path analysis and confirmatory factor analysis. Self-rated health e4 Occupation D2 Singh-Manoux, Clark and Marmot Multiple measures of socio-economic position and psychosocial health: proximal and distal measures.

What Sample Size is Enough for SEM?
The same as for regression* More is pretty much always better Some fit indexes are sensitive to small samples *Unless you do things that are fancy!

What’s a Good Model? Fit measures: Chi-square test
CFI (Comparative Fit Index) RMSE (Root Mean Square Error) TLI (Tucker Lewis Index) GFI (Goodness of Fit Index) And many, many, many more IFI, NFI, AIC, CIAC, BIC, BCC

How Many Indicators Do I Need?
That depends… How many do you have? (e.g., secondary data analysis) A prior concerns Scale development standards Subject burden More is often NOT better

Software LISREL 9.1 from SSI (Scientific Software International)
IBM’s SPSS Amos EQS (Multivariate Software) Mplus (Linda and Bengt Muthen) CALIS (module from SAS) Stata’s new sem module R (lavaan and sem modules)

SPSS Amos Screenshot

Stata Screenshot

Texts (and a reference)
Barbara M. Byrne (2012): Structural Equation Modeling with Mplus, Routledge Press She also has an earlier work using Amos Rex Kline (2010): Principles and Practice of Structural Equation Modeling, Guilford Press Niels Blunch (2012): Introduction to Structural Equation Modeling Using IBM SPSS Statistics and Amos, Sage Publications James L. Arbuckle (2012): IBM SPSS Amos 21 User’s Guide, IBM Corporation (free from the Web) Rick H. Hoyle (2012): Handbook of Structural Equation Modeling, Guilford Press Great fit index site:

Thanks So Much Again!! Questions????

Download ppt "Applied Structural Equation Modeling for Dummies, by Dummies February 22, 2013 Indiana University, Bloomington Joseph J. Sudano, Jr., PhD Center for."

Similar presentations