Separating Wheat from Chaff: Interpreting Results from Contemporary Analytic Methods Rod McCloy Gary Lautenschlager
Structural Equation Modeling (SEM) Gary Lautenschlager
Indulge me for a moment… Admonishment I give to my students: All models are wrong… but some are more useful than others. So find more useful models.
Brief Conceptual Review: Path Analysis. A 3-variable path model: ANTECDNT (person characteristic; exogenous variable), MEDIATOR (intervening variable), JOBPERF (job performance). All variables are observed. cf. Schmidt, Hunter & Outerbridge (1986), JAP. [Path diagram: ANTECDNT, MEDIATOR, JOBPERF]
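As a rough illustration of a three-variable path model with observed variables only, the sketch below estimates the paths by ordinary least squares on simulated data. It assumes the chain ANTECDNT -> MEDIATOR -> JOBPERF plus a direct ANTECDNT -> JOBPERF path; the generating coefficients (0.5, 0.4, 0.2), the sample size, and the helper name `ols` are illustrative assumptions, not values from the workshop handout.

```python
import numpy as np

def ols(y, *predictors):
    """Return OLS coefficients [intercept, slope_1, ...] for y on the predictors."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Simulated (hypothetical) observed variables; not the workshop data.
rng = np.random.default_rng(0)
n = 500
antecdnt = rng.normal(size=n)
mediator = 0.5 * antecdnt + rng.normal(size=n)
jobperf = 0.4 * mediator + 0.2 * antecdnt + rng.normal(size=n)

a = ols(mediator, antecdnt)[1]                     # ANTECDNT -> MEDIATOR
direct, b = ols(jobperf, antecdnt, mediator)[1:]   # direct path, MEDIATOR -> JOBPERF
total = ols(jobperf, antecdnt)[1]                  # total effect of ANTECDNT

indirect = a * b                                   # effect carried through MEDIATOR
print(f"a={a:.3f} b={b:.3f} direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

With OLS on a single sample, the total effect equals the direct effect plus the product a*b, which is the decomposition path analysis relies on.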
Brief Conceptual Review: SEM. Observed vs. latent variables
SEM as a 2-Step: STEP 1. Measurement Model (CFA) component. Purpose: confirm that individual latent variables reflect intended constructs. Process: covariation among observed variables is used to derive latent variables, and the model-implied covariances are computed based on the configuration of user-hypothesized loadings.
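To make "model-implied covariances" concrete, here is a minimal sketch that builds the implied covariance matrix for a 9-indicator, 3-factor CFA from a hypothesized loading pattern. All numbers are hypothetical, and the sketch sets each latent variable's scale by fixing its variance to 1, which is an assumption made for convenience here (fixing one loading per factor is the other common choice).

```python
import numpy as np

# Hypothetical standardized loadings: 9 indicators, 3 latent variables,
# each indicator loading on exactly one factor (zeros are the fixed loadings).
loadings = np.zeros((9, 3))
loadings[0:3, 0] = [0.8, 0.7, 0.6]
loadings[3:6, 1] = [0.8, 0.7, 0.6]
loadings[6:9, 2] = [0.8, 0.7, 0.6]

# Hypothetical latent correlations (factor variances fixed at 1).
phi = np.array([[1.0, 0.5, 0.3],
                [0.5, 1.0, 0.4],
                [0.3, 0.4, 1.0]])

# Unique variances chosen so every indicator has unit total variance.
theta = np.diag(1.0 - np.sum(loadings**2, axis=1))

# Model-implied covariance matrix: Sigma = Lambda * Phi * Lambda' + Theta.
sigma = loadings @ phi @ loadings.T + theta
print(np.round(sigma[:4, :4], 3))
```

Fitting a CFA amounts to choosing the free entries of these matrices so that `sigma` comes as close as possible to the observed covariance matrix.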
CFA Issues. Model identification: to avoid indeterminacy issues, must fix at least one loading for each latent variable to set its measurement scale; the choice should be made carefully (“best” indicator approach). Number of factors: for a single-group CFA, usually compare the fit of “nested” models with k and k+1 factors (competing theories); for multiple-group CFA, can compare constraints across groups to determine measurement equivalence. Interpretation issues: scaling of latent variables (fixing the metric), naming of latent variables, reification of latent variables.
Issues in Defining Measurement Models. Latent variables correct for measurement error through multiple indicators. This rests on the premise that what is common (variance) among the indicators is reliable and hence good. Statistically identifying a given latent variable on its own requires a minimum of 3 observed variables, a basic tenet of factor analysis.
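The three-indicator minimum can be checked with the usual counting rule: distinct observed variances and covariances minus free parameters. The function below is a sketch under stated assumptions (each indicator loads on one factor, one loading per factor fixed to set the scale, uncorrelated unique variances, freely correlated factors); the function name is hypothetical. Non-negative degrees of freedom are a necessary condition for identification, not a sufficient one.

```python
def cfa_degrees_of_freedom(indicators_per_factor):
    """Degrees of freedom for a simple-structure CFA with correlated factors.

    Assumes each indicator loads on one factor, one loading per factor is
    fixed to set the scale, and unique variances are uncorrelated.
    """
    p = sum(indicators_per_factor)          # observed variables
    k = len(indicators_per_factor)          # latent variables
    moments = p * (p + 1) // 2              # distinct variances and covariances
    free_loadings = p - k                   # one loading per factor is fixed
    unique_variances = p
    factor_var_cov = k * (k + 1) // 2
    return moments - (free_loadings + unique_variances + factor_var_cov)

for spec in ([2], [3], [4], [3, 3, 3]):
    print(spec, cfa_degrees_of_freedom(spec))
```

A single factor with two indicators comes out negative (under-identified), three indicators give exactly zero (just-identified), and four or more leave degrees of freedom for testing fit.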
Issues in Defining Measurement Models cont’d. Sometimes multiple indicators do not exist or are too costly. Under such circumstances, some researchers suggest use of “quasi” multiple indicators by creating “parcels” of items from a sole multi-item measure. The use of parcels is controversial: Little, Cunningham, Shahar & Widaman (2002), To parcel or not to parcel: Exploring the question, weighing the merits, Structural Equation Modeling, 9(2), 151-173. Parceling may become problematic when content representation is dispersed too broadly among smaller parcels.
A Measurement Model (CFA). Let’s start with this CFA model result based on real but idealized data: 9 observed variables, 3 latent variables. The fit is reasonable, and you will be able to judge for yourself as we proceed.
SEM as a 2-Step: STEP 2. After achieving a satisfactory outcome from the first (CFA) step, test the focal path model of interest. Structural (regression) Model component. Purpose: examine the “fit” of the proposed structural model. Process: covariation among observed variables is used to define latent variables AND to compute path model coefficients that indicate causal direction as specified by the researcher.
“Fit” of SEM Models: Global Fit Measures. Various indices assess how well a given model reproduces the observed covariances, relative to a particular view of a baseline model. Though many of these indices resemble R² values in being bounded by 0 and +1, they do not reflect how good the predictions from any such SEM model are. Reduced form equations (noted in handout) do the latter.
Now consider the first LISREL analysis… HANDOUT Page 1: SEM ANALYSIS SIOP06 Workshop PARTIAL MEDIATION. Many other SEM packages are available: AMOS, EQS, Mplus, SAS PROC CALIS, etc.
Equivalent SEM Models. Some models that look very different have exactly the same fit. How to decide among them, as the data will not be informative? Theory! Or good rationale! [Diagrams: measurement model, 2nd-order factor model, mediation model]
Do you see the empirical example models we examined? The measurement model was: [diagram]. The structural model was: [diagram].
The fit of each model is equivalent! Can only tell them apart by logic, rationale or theory.
Mediation in SEM. The vast majority of SEM models involve mediation or indirect effects. James, L.R., Mulaik, S.A. & Brett, J.M. (2006), A Tale of Two Methods, Organizational Research Methods, 9, 233-244, discuss the advantages of SEM for examining mediation via multiple indicators rather than the single-indicator approach outlined in Baron & Kenny (1986).
A test for complete mediation… HANDOUT Page 9: SEM ANALYSIS SIOP06 Workshop COMPLETE MEDIATION. James et al. argue that the inclusion of the significant path for a direct effect, if also supported by rationale or theory, is proper. So for the present example, reject the more parsimonious model in favor of the earlier partial mediation model, owing to the significant path and an “improvement in fit”.
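One common way to quantify an "improvement in fit" between nested models such as complete and partial mediation is a chi-square difference test. The sketch below assumes that approach and assumes the two models differ by exactly one free path (one degree of freedom), which lets the tail probability be computed with the standard library alone. The fit statistics are hypothetical, not the handout's values.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def chi2_difference_test_1df(chi2_restricted, chi2_full):
    """Chi-square difference test for nested models that differ by one free path."""
    delta = chi2_restricted - chi2_full
    if delta < 0:
        raise ValueError("the restricted model cannot fit better than the full model")
    return delta, chi2_sf_1df(delta)

# Hypothetical fit statistics; not taken from the workshop handout.
delta, p_value = chi2_difference_test_1df(chi2_restricted=52.3, chi2_full=41.8)
print(f"delta chi-square = {delta:.2f}, p = {p_value:.4f}")
```

A small p-value says that dropping the direct path worsens fit by more than chance would suggest, which is the statistical side of the argument; the slide's point is that rationale or theory must support the path as well.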
SEM Models are Not Causal. Cause has to be justified by theory or other sound rationale. “...the method of path coefficients is not intended to accomplish the impossible task of deducing causal relations from the values of the correlation coefficients” (Sewall Wright, 1934, Annals of Mathematical Statistics, 5, 161-215).
Measurement Equivalence. Burgeoning interest in whether, across different cultures/subjects/contexts, we are measuring the “same thing” when we use a test. Some general approaches taken: CFA, IRT.
Measurement invariance (MI) can be considered the degree to which measurements conducted under different conditions yield measures of the same attributes (Horn & McArdle, 1992). CFA allows for tests of specific types of invariance.
An example: Measurement Invariance for Personality Constructs Meade, A., Lautenschlager, G. J. & Michels. (in press) Are Internet and Paper-and-Pencil Personality Tests Truly Comparable? A Measurement Invariance Study. Organizational Research Methods. Took a CFA approach to invariance tests…
Online Assessment. Many organizations use online assessment as an alternative to the traditional paper format, or as the only means of obtaining responses. Given the desirable properties of online assessment and its prevalence, more study is clearly needed regarding the comparability of personality test scores across online and P&P administration formats.
Rationale. Dimensions of personality have been identified that are stable over time, predict job task (Barrick & Mount, 1991; Mount & Barrick, 1998) and contextual performance reasonably well (Borman, Penner, Allen, & Motowidlo, 2001), and do not result in as much adverse impact against protected minority groups (Hogan, Hogan, & Roberts, 1996).
Measure used in example: six-item Conscientiousness Scale of the Occupational Personality Questionnaire (OPQ-32; SHL, 2000).
Design of Measurement Equivalence Study. A quasi-experimental study of 4 groups using 2 crossed design factors: Online vs. Paper format; Choice vs. No Choice of format. Method & procedure would be similar if, say, comparing responses across samples from 4 different cultures/countries.
Series of Models Tested across Experimental Groups
Model Number | Name | Test Purpose | Parameters Constrained Across Groups
1 | Configural Invariance | Does the same general factor model hold in each sample? | None. Models estimated freely but simultaneously.
2 | Metric | Do items relate to the latent factor in the same way? | Factor loadings (Λx)
3 | Scalar | Are there observed item mean differences across groups? | Item intercepts (τx)
4 | Unique Variance | Are the sources of specific item variance the same across groups? | Item uniqueness terms (θx)
5 | Latent Factor Variance | Is the variance of the latent variable different across groups? | Latent factor variance/covariance matrix (Φ)
6 | Latent Mean | Is the latent variable mean different across groups? | Latent mean (κ)
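For orientation, the parameters named in this series of models can be placed in the multi-group CFA model they come from. The block below uses conventional LISREL-style notation, which is an assumption here: g indexes groups, and the item uniqueness matrix is written with an x subscript to match the labels above.

```latex
\begin{aligned}
x_g &= \tau_{x,g} + \Lambda_{x,g}\,\xi_g + \delta_g \\
\mathrm{E}(x_g) &= \tau_{x,g} + \Lambda_{x,g}\,\kappa_g \\
\Sigma_g &= \Lambda_{x,g}\,\Phi_g\,\Lambda_{x,g}^{\top} + \Theta_{x,g}
\end{aligned}
```

Each successive model constrains one more of these pieces to be equal across groups: first the loadings, then the intercepts, the uniquenesses, the latent variances and covariances, and finally the latent means.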
Let’s consider some features of the output for Conscientiousness. A one-factor solution is obtained in each group for the 6 Conscientiousness items. How well does the single factor reflect the covariation among the six items separately within each of the 4 groups? My handout has an abbreviated portion of this 4-group analysis, used to establish a baseline model for subsequent measurement equivalence tests. Go to p. 17.
Global Fit Measures. Many measures have been developed to provide an index of model fit. Many are scaled between 0 and 1, analogous to R-square in regression; these began to be used as tests of model fit (conventional cut point: > .90). Some focus on residuals (conventional cut point: < .05). All measures look at global rather than specific fit.
Error of Approximation vs. Errors of Estimation. Browne, M. W. and Cudeck, R. (1993), Alternative ways of assessing model fit, in Bollen, K.A. & Long, J.S. (eds), Testing structural equation models, Newbury Park, CA: Sage. When measurement models are considered along with structural models, almost all SEM models are misspecified to some extent, and we should ask about that extent. We should distinguish between how closely the model approximates the true covariance structure and how fuzzy our estimate of the model is. They find support for RMSEA as an index.
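Two of the indices discussed here can be computed directly from reported chi-square values. The sketch below uses the commonly cited point-estimate formulas for RMSEA and CFI; the input values are hypothetical, and note that some programs divide by N rather than N - 1 in RMSEA, so results can differ slightly across packages.

```python
import math

def rmsea(chi2, df, n):
    """Root mean square error of approximation (point estimate)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

def cfi(chi2_model, df_model, chi2_baseline, df_baseline):
    """Comparative fit index relative to a baseline (independence) model."""
    d_model = max(chi2_model - df_model, 0.0)
    d_baseline = max(chi2_baseline - df_baseline, d_model)
    return 1.0 if d_baseline == 0 else 1.0 - d_model / d_baseline

# Hypothetical values for a 9-indicator, 3-factor CFA; not from the handout.
print(round(rmsea(chi2=48.0, df=24, n=201), 4))
print(round(cfi(48.0, 24, 900.0, 36), 4))
```

Both functions summarize the whole model in one number, which is exactly why a good value can coexist with a badly fitting part of the model.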
Arguing for Models. People report many fit statistics and claim that a particular model fits if values exceed .90. Be skeptical about global fit statistics: the fit can be great in one part of the model and awful in another. Most impressive is a series of models, with one fitting better than another. If a model is developed with a data set, then the fit statistics should be calculated with a replication data set.
Hierarchical Linear Modeling Gary Lautenschlager
HLM Under Different Names. Also known as… multilevel linear models, mixed effects models (SAS, SPSS), random effects models. It will seem “like” moderated multiple regression, but it is NOT OLS MMR.
HLM: Purposes. Tests for cross-level relationships. Most social science research involves hierarchical data structures: individuals are often nested in groups, groups within larger units, etc. Traditional analytic methods have tended to ignore nesting, which creates aggregation bias & other problems. HLM is more useful for analyzing nested data and also for the study of change & individual growth over time.
HLM: The Basics. To run HLM, you must have: nested (and explicitly linked) data; Level 1 outcome measures (and usually predictors too); measures of the Level 2 grouping variable (and usually Level 2 predictors too).
Model Building. Based on theory or sound rationale. In the end, “correct” model specification is important. Build a model starting with the simplest and building to the justifiably more complex. Must consider models at 2 or more levels.
Steps in HLM. Determine if there is any variance to explain. Assess relations between Level 1 predictors and Level 1 outcomes, within each unit.
Steps in Basic HLM. If there is significant variability in intercepts, assess the relationship between Level 2 predictors and intercepts. If there is significant variability in slopes, assess the relationship between Level 2 predictors and slopes. Repeat steps at higher levels of analysis, if desirable & available.
HLM Output Provides Information Regarding: the strength of the Level 1 relationships, as variance explained in the Level 1 outcome measure by the Level 1 predictors; the extent and significance of between-unit variability in the Level 1 intercepts and slopes; statistics regarding variance components (residual variability) in the multilevel model.
HLM Output Typically Provides Information Regarding: the extent to which the Level 2 predictors explain between-unit variability in the Level 1 outcome measure (beware: you may be explaining a lot of nearly nothing!); the extent to which the Level 2 predictors explain between-unit variability in the Level 1 slopes.
Hierarchical Linear Modeling. A multi-step analytic procedure that examines relationships between independent and dependent variables across multiple levels. Level 1: within-group covariation (group mean centering shown). Level 2: Level 1 intercepts and slopes are modeled using between-group variables [2a], [2b].
Putting these together forms the multilevel model. Unlike the OLS approach, “error” depends on both group- and INDIVIDUAL-level data.
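In conventional multilevel notation, the two levels and their combination can be sketched as follows. The notation is an assumption: the Greek symbols are chosen to parallel the B0, B1, G and U terms of the handout output, with i indexing individuals, j indexing groups, X a Level 1 predictor centered at its group mean, and W a Level 2 predictor.

```latex
\begin{aligned}
\text{Level 1:}\quad Y_{ij} &= \beta_{0j} + \beta_{1j}\,(X_{ij} - \bar{X}_{\cdot j}) + r_{ij} && (1)\\
\text{Level 2:}\quad \beta_{0j} &= \gamma_{00} + \gamma_{01} W_j + u_{0j} && (2a)\\
\beta_{1j} &= \gamma_{10} + \gamma_{11} W_j + u_{1j} && (2b)\\
\text{Combined:}\quad Y_{ij} &= \gamma_{00} + \gamma_{01} W_j + \gamma_{10}\,(X_{ij} - \bar{X}_{\cdot j}) + \gamma_{11} W_j\,(X_{ij} - \bar{X}_{\cdot j}) \\
&\quad + u_{0j} + u_{1j}\,(X_{ij} - \bar{X}_{\cdot j}) + r_{ij}
\end{aligned}
```

The last line of the combined equation is the composite "error": it contains group-level terms and an individual-level term, and one of the group-level terms is multiplied by the individual's X, which is why OLS assumptions do not hold.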
Public vs. Catholic Example. How does student SES influence Ach? Does the relation differ depending on school type? School type is a group-level “measure”.
Public vs. Catholic Example. Considering multiple schools, several of each type, the multilevel equation is developed. This is NOT a linear model that can be estimated by standard regression programs. Assuming W = 0 for Public and W = 1 for Private leads to the following interpretation of the coefficients in the model…
Legend: mean achievement for all Public schools (common to Public schools); mean difference between Public & Private; average SES-Ach slope in Public schools (common to Public schools); mean difference in SES-Ach slope, Public vs. Private; unique effect of school j on mean Ach, holding Wj constant; unique effect of school j on the SES-Ach slope, holding Wj constant.
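The six legend entries line up, in order, with the six terms of a two-level model with one school-level predictor. Assuming conventional symbols (γ00, γ01, γ10, γ11 for the fixed coefficients and u0j, u1j for the school-specific effects, none of which are preserved in the legend text), the W = 0 / W = 1 coding gives these expected regression lines, averaging over schools of each type:

```latex
\begin{aligned}
\text{Public } (W_j = 0):\quad \mathrm{E}(Y_{ij}) &= \gamma_{00} + \gamma_{10}\,\mathrm{SES}_{ij} \\
\text{Private } (W_j = 1):\quad \mathrm{E}(Y_{ij}) &= (\gamma_{00} + \gamma_{01}) + (\gamma_{10} + \gamma_{11})\,\mathrm{SES}_{ij}
\end{aligned}
```

Under this reading, γ00 and γ10 describe the Public schools, γ01 and γ11 are the Public vs. Private differences in mean and slope, and u0j and u1j are the departures of school j from the line for its own type.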
HLM Terminology. Random coefficients: coefficients that are assumed to vary across groups (within-unit intercepts, within-unit slopes). Random effects: variance components; we try to minimize these by attributing them to effects in the model. Fixed effects: Level 2 intercept, Level 2 slope.
HLM estimates/coefficients: Level 1 & Level 2 intercepts & slopes; variance of Level 1 & Level 2 residuals; covariance of Level 2 residuals; t-test for Level 2 parameters; chi-square test for variance components.
Centering Decisions for Predictors. Raw metric: can lead to nonsensical results where zero is “not an option” on the measurement scale. Grand mean centering: Level 1 intercept can be interpreted as an adjusted mean in the ANCOVA sense. Group mean centering (Level 2 mean): intercepts = unadjusted group means (not adjusted for Level 1 predictors). For details: Kreft, de Leeuw & Aiken (1995), MBR.
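The centering options differ only in what is subtracted from the Level 1 predictor. A minimal sketch with hypothetical data (three schools, three students each; the variable names are illustrative):

```python
import numpy as np

# Hypothetical Level 1 predictor (e.g., SES) for students in three schools.
school = np.array([0, 0, 0, 1, 1, 1, 2, 2, 2])
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 12.0])

# Grand mean centering: subtract the mean over all Level 1 units.
x_grand = x - x.mean()

# Group mean centering: subtract each unit's own Level 2 (school) mean.
group_means = np.array([x[school == j].mean() for j in np.unique(school)])
x_group = x - group_means[school]

print(group_means)   # school means of the raw predictor
print(x_grand)
print(x_group)
```

Group mean centering removes all between-school differences in the predictor, so a researcher who wants those differences in the model has to bring the school means back in as a Level 2 predictor.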
One-way ANOVA Random Effects. Assume no Level 1 (X) or Level 2 (W) predictors are involved. Level 1 model: Y = B0 + R. Level 2 model: B0 = G00 + U0. Combined model: Y = G00 + U0 + R, where G00 is the grand mean and U0 the treatment effect. This is often a useful step in HLM to provide a CI for the grand mean.
Consequences of submodel 1… Fully unconditional: no predictors at either Level 1 or Level 2. Variance of outcomes Y owes to treatment & error: Var(Y) = Var(U0) + Var(R). Intraclass correlation: Var(U0) / [Var(U0) + Var(R)], the proportion of variance that is between Level 2 units.
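As a sketch of how the between-unit and within-unit variance components yield the intraclass correlation, the code below uses the classical ANOVA (method-of-moments) estimator for a balanced design. This is an assumption made for illustration: HLM software typically uses maximum likelihood or restricted maximum likelihood instead, and the tiny data set is hypothetical.

```python
def variance_components(groups):
    """ANOVA (method-of-moments) estimates for a balanced one-way random effects design.

    `groups` is a list of equal-length lists of outcomes, one list per Level 2 unit.
    Returns (tau00, sigma2): between-unit and within-unit variance estimates.
    """
    j = len(groups)
    n = len(groups[0])
    means = [sum(g) / n for g in groups]
    grand = sum(means) / j
    ms_between = n * sum((m - grand) ** 2 for m in means) / (j - 1)
    ms_within = sum((y - m) ** 2 for g, m in zip(groups, means) for y in g) / (j * (n - 1))
    tau00 = max((ms_between - ms_within) / n, 0.0)
    return tau00, ms_within

def intraclass_correlation(tau00, sigma2):
    """Proportion of outcome variance that lies between Level 2 units."""
    return tau00 / (tau00 + sigma2)

# Tiny hypothetical data set: two units with two observations each.
tau00, sigma2 = variance_components([[1.0, 3.0], [5.0, 7.0]])
print(tau00, sigma2, round(intraclass_correlation(tau00, sigma2), 3))
```

An intraclass correlation near zero means there is little between-unit variance for Level 2 predictors to explain.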
Data Example. 7,185 students; 90 Public schools, 70 Catholic schools; ~45 students from each school. Student as Level 1: Math Ach (Y) & SES (X) measured at student level. School as Level 2: SECTOR (school type) & MEANSES.
Handout HLM Analysis p. 27: One-way Random Effects ANOVA. Level-1 Model: Y = B0 + R. Level-2 Model: B0 = G00 + U0.
Next steps… Several models are estimated following this one. Consider Level 1 or 2 predictors depending on goal (useful? residual analysis?). Add Level 2 predictors (useful? residual analysis?). Assume we did some of that and now have the following model to test:
Handout HLM Analysis p. 31: INTERCEPTS & SLOPES as OUTCOMES. Level-1 Model: Y = B0 + B1*(SES) + R. Level-2 Model: B0 = G00 + G01*(SECTOR) + G02*(MEANSES) + U0; B1 = G10 + G11*(SECTOR) + G12*(MEANSES) + U1.
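Substituting the two Level-2 equations into the Level-1 equation gives the single combined equation for this model, written here in the handout's own symbols:

```latex
\begin{aligned}
Y ={}& G_{00} + G_{01}\,\mathrm{SECTOR} + G_{02}\,\mathrm{MEANSES} \\
     & + G_{10}\,\mathrm{SES} + G_{11}\,\mathrm{SECTOR}\times\mathrm{SES} + G_{12}\,\mathrm{MEANSES}\times\mathrm{SES} \\
     & + U_0 + U_1\,\mathrm{SES} + R
\end{aligned}
```

Read this way, G11 and G12 are cross-level interactions: G11 is the amount by which the SES slope differs between sectors, holding MEANSES constant.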
Consider the equations for B0 & B1 on HLM analysis p. 35. What do they mean? SECTOR has an impact on the slope; can you characterize that?
Some Conclusions. Inclusion of SECTOR and MEANSES reduces the unexplained variance in school means: U0 = 8.61 (bottom of p. 30) vs. U0 = 2.38 (p. 35), so 73% of the variance in mean MathAch is accounted for.
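The percentage is a proportional reduction in the intercept variance component. A minimal sketch of the arithmetic, using the two U0 values shown on the slide (the function name is hypothetical):

```python
def proportion_variance_explained(unconditional, conditional):
    """Proportional reduction in a variance component after adding predictors."""
    return (unconditional - conditional) / unconditional

# Intercept (U0) variance components as shown on the slide.
u0_unconditional = 8.61   # one-way random effects ANOVA
u0_conditional = 2.38     # after adding SECTOR and MEANSES

reduction = proportion_variance_explained(u0_unconditional, u0_conditional)
print(f"{reduction:.3f}")
```

With the two-decimal values shown, the ratio comes out near 0.72, slightly under the 73% quoted; the gap presumably reflects rounding of the handout values.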
Though we did not examine the interim analysis, the last model accounts for 78% of the unconditional variance in slopes (obtained from the Random Coefficients Model). Random Coefficients Model: Level-1 Model: Y = B0 + B1*(SES) + R. Level-2 Model: B0 = G00 + U0; B1 = G10 + U1.
An important consideration is that the unconditional variance in slopes was fairly small compared to that of school means. Caveat: explaining even 99% of almost nothing may not mean much practically.
Centering Xs or Level 1 predictors: helpful, at times even critical. Location of Xs: raw/natural metric; grand mean centering; centering at the Level 2 mean (aka group mean centering); other, e.g. value of X at a particular point in time as “zero” (for studying repeated measures & growth curve models); dummy variable centering.
Centering Ws or Level 2 predictors: location of Ws is not so critical. Grand mean centering is often convenient.