Representing a Research Problem As a Structural Equation Model David Kaplan, Ph.D. Department of Educational Psychology University of Wisconsin – Madison Royal Netherlands Academy of Arts and Sciences Symposium “Advising on Research Methods” March 28th–29th, 2007, Amsterdam.
Introduction How to recognize a research question that can utilize SEM as an analytic method. What are the practical concerns that might arise? How might they be addressed? How to improve the use of SEM for practical questions.
How to recognize a research question that can utilize SEM as an analytic method. What is the value added in using SEM? –Testing mediation (the main value) –Greater degrees of falsifiability (d.f.) –Handling of measurement error (although not restricted to SEM) –Incredible model flexibility
Testing mediation (the main value) The main advantage to using conventional SEM is the decomposition of effects –Direct –Indirect –Total This is the main difference when compared to standard regression or ANOVA.
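The decomposition can be illustrated with a minimal sketch on simulated data. The single-mediator model (X -> M -> Y plus a direct path X -> Y), the path values 0.5, 0.4 and 0.3, and the helper `ols` are all assumptions made for illustration; they are not taken from the presentation, and ordinary least squares stands in here for a full SEM estimator.

```python
import numpy as np

def ols(y, X):
    """Least-squares coefficients of y on X; an intercept column is added first."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return coef  # coef[0] is the intercept

# Simulated data under assumed (hypothetical) path values.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(size=n)
m = 0.5 * x + rng.normal(size=n)            # path X -> M
y = 0.3 * x + 0.4 * m + rng.normal(size=n)  # paths X -> Y and M -> Y

a = ols(m, x)[1]                                 # estimated X -> M
_, direct, b = ols(y, np.column_stack([x, m]))   # estimated X -> Y (direct), M -> Y
indirect = a * b                                 # effect of X transmitted through M
total = direct + indirect

# In a linear model the total effect equals the slope of Y on X alone.
total_check = ols(y, x)[1]
print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

A standard regression of Y on X and M would report only the direct coefficient; the indirect and total effects are what the decomposition adds.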
Greater degrees of falsifiability (d.f.) Well-specified SEMs that represent theoretical predictions have greater d.f. than standard regression models. Degrees of freedom are degrees of falsifiability – the number of ways the model can differ from the data assuming the model is true.
We prefer models with high levels of d.f. They can be more severely tested (Popper, Miller, Mayo). Should models fit, then with high d.f. comes greater confidence in the model.
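As a rough sketch of the counting behind this argument: for a covariance-structure model, d.f. is the number of distinct variances and covariances among the observed variables minus the number of freely estimated parameters. The function name and the two example models below are illustrative assumptions, and mean structures are ignored.

```python
def sem_degrees_of_freedom(n_observed, n_free_params):
    """d.f. for a covariance-structure model: distinct moments minus free parameters."""
    n_moments = n_observed * (n_observed + 1) // 2  # variances + covariances
    return n_moments - n_free_params

# Regression of one outcome on three predictors (4 observed variables):
# 3 slopes + 1 residual variance + 6 predictor variances/covariances = 10.
df_regression = sem_degrees_of_freedom(4, 10)

# One-factor model with four indicators, first loading fixed to 1:
# 3 loadings + 4 residual variances + 1 factor variance = 8.
df_one_factor = sem_degrees_of_freedom(4, 8)

print(df_regression, df_one_factor)
```

The regression is just-identified, so it cannot be refuted by the covariance matrix, whereas the factor model leaves restrictions that the data can contradict.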
Handling of measurement error A major benefit of SEM is the incorporation of factor analysis to handle measurement error. This was Jöreskog’s achievement. This is not restricted to SEM because regression is a special case of SEM.
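Why measurement error matters can be seen in a small simulation of the classical attenuation result: when a predictor is observed with random error, its regression slope shrinks by roughly the predictor's reliability. The slope of 0.6, the variances, and the sample size are assumed values chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000
beta = 0.6                                      # assumed true slope on the latent predictor

xi = rng.normal(size=n)                         # latent predictor, variance 1
y = beta * xi + rng.normal(scale=0.5, size=n)   # outcome
x_obs = xi + rng.normal(scale=1.0, size=n)      # observed predictor, error variance 1

reliability = 1.0 / (1.0 + 1.0)                 # true variance / observed variance

slope_latent = np.cov(xi, y)[0, 1] / np.var(xi, ddof=1)
slope_observed = np.cov(x_obs, y)[0, 1] / np.var(x_obs, ddof=1)

print(f"latent slope={slope_latent:.3f}, observed slope={slope_observed:.3f}, "
      f"beta*reliability={beta * reliability:.3f}")
```

A measurement model with multiple indicators is one way to separate the error variance from the latent variance so that the structural slope is not attenuated in this way.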
Incredible model flexibility New developments now open up a large range of possibilities for modeling that we should be aware of. –Multilevel questions –Longitudinal questions –Mixture distributions
What are the practical concerns that might arise? Sample size –Everyone wants to “do SEM”, but sometimes N’s are too small. Other practical concerns can now be addressed, given large N –e.g. missing data, non-normality Software: Comprehensive but expensive versus limited but inexpensive.
The biggest practical concern is faddism! Editors or reviewers request SEM when not appropriate or practical. Tension when under “publish or perish” pressure. Presentation of findings that are methodologically suspect. –e.g. reporting hundreds of goodness-of-fit indices or using correct procedures that are just “too technical” for the readership. The ethical dilemmas are real.
How might they be addressed? More methodologists on editorial boards. Better education of practitioners – more refined understanding of the role of statistical modeling. This is why we are here today.
Better training of methodologists working alongside substantive researchers. Collaborate with smart people in disciplinary fields. Mutual feedback.
How to Improve the Use of SEM for Addressing Practical Questions Modeling the data generating process more carefully. Being concerned about exogeneity. Using the model for causal inference when possible and desired. Related to model evaluation. Severely testing hypotheses.
Modeling the data generating process more carefully. The DGP is the process that generated the data. It’s the joint distribution of the data. This is typically not given enough attention in modeling. Need to examine more closely the probabilistic properties of the joint distribution of the data.
Being concerned about exogeneity. Most statistics textbooks give unclear definitions of exogenous variables. Econometric theory provides an important way to think about exogenous variables. Kaplan (2004) – an overview of the problem.
What is exogeneity? Weak exogeneity –Can marginal information in the exogenous variables be ignored when estimating the conditional information? Strong exogeneity (Granger non-causality) –Importance of past dynamics Super exogeneity (parameter invariance) –Changes in the conditional relationship.
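The three notions can be stated against a factorization of the joint distribution into a conditional and a marginal part. The notation below (y for the endogenous variables, x for the candidate exogenous variables, θ₁ and θ₂ for the conditional and marginal parameters, ψ for the parameters of interest) is assumed for illustration and follows the usual econometric formulation rather than anything shown on the slides.

```latex
f(y_t, x_t \mid \theta) \;=\; f(y_t \mid x_t;\, \theta_1)\; f(x_t;\, \theta_2)
```

Under this sketch, x is weakly exogenous for ψ when ψ depends on θ₁ alone and θ₁ and θ₂ are variation free, so the marginal part can be ignored in estimation. Strong exogeneity adds the requirement that past values of y do not help predict x (Granger non-causality). Super exogeneity adds the requirement that θ₁ stays invariant when the marginal distribution of x changes.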
Using the model for causal inference when possible and desired. Do the exogenous variables allow for well-defined counterfactual statements? –Manipulationist view of causation (Woodward) –Note the seriousness of the exogeneity assumption. If so, then counterfactual conditionals can be tested on the model. Causal inference is possible!
Examine the reduced form of the model. Use total effects. If causal inference is not possible, then the model describes associations and does not lend itself to “causal modeling”. But this is ok too!
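For a linear structural model written as y = By + Γx + ζ, the reduced form is y = (I − B)⁻¹Γx + (I − B)⁻¹ζ, and the reduced-form coefficients (I − B)⁻¹Γ are the total effects of x on y. The sketch below assumes a small hypothetical recursive model; the matrices `B` and `Gamma` and their values are illustrative only.

```python
import numpy as np

# Hypothetical recursive model: x -> y1 -> y2, plus a direct path x -> y2.
B = np.array([[0.0, 0.0],
              [0.4, 0.0]])   # effects among endogenous variables (y1 -> y2)
Gamma = np.array([[0.5],
                  [0.3]])    # direct effects of x on y1 and y2

I = np.eye(B.shape[0])

# Reduced-form coefficients (I - B)^{-1} Gamma: total effects of x on each y.
total = np.linalg.solve(I - B, Gamma)

# Whatever is not direct is transmitted through other endogenous variables.
indirect = total - Gamma

print("total effects:\n", total)
print("indirect effects:\n", indirect)
```

Here a one-unit change in x moves y2 by 0.3 directly and by 0.5 × 0.4 = 0.2 through y1, which is the kind of counterfactual statement the reduced form makes explicit.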
Model evaluation –Misspecification tests –Out-of-sample predictions –Does the model teach us something we didn’t know before?
Severely Testing Hypotheses The additional d.f. in SEMs are a good thing. They put our theories to severe tests. –A critical-rationalist perspective (Popper, Miller, Mayo) Our models are conjectures and the d.f. provide the means for refutation.
Advice to the Consultant 1. Listen for questions of mediation. 2. Listen for subtle issues regarding the data, e.g. multilevel issues, mixture distributions. 3. Listen for “causal language”. Can the researcher go beyond goodness-of-fit?
Advice (cont’d) 4. Listen for when SEM is not appropriate. Don’t talk the client into using SEM, but keep open to new modeling flexibility. 5. Be aware of the tension that researchers face, but don’t feel the pressure to “bless” analyses that are not correct. 6. Ask questions, but don’t presume.