General Structural Equations (LISREL) Week 3 #4: Mean Models Reviewed; Non-parallel Slopes; Non-normal Data


2 Models for Means and Intercepts (continued)
Multiple Group Models: for "zero order" latent variable mean differences:
- "Free" the individual measurement-equation intercepts, but constrain them to equality across groups
- Fix the latent variable means to 0 in group 1
- Free the latent variable means in groups 2 through k
- If the latent variables of interest are endogenous and there are exogenous latent variables in the model, constrain the construct-equation path coefficients to zero

3 Models for Means and Intercepts (continued)
For "zero order" latent variable mean differences:
- "Free" the individual measurement-equation intercepts, but constrain them to equality across groups
- Fix the latent variable means to 0 in group 1
- Free the latent variable means in groups 2 through k
- If the latent variables of interest are endogenous and there are exogenous latent variables in the model, constrain the construct-equation path coefficients to zero
Individual LV mean parameters represent a contrast with (difference from) the "reference group" (the group with its LV mean set to zero); LR tests are used for joint hypotheses (e.g., constrain means to zero in all groups vs. a model with groups 2 through k freed).
Check the modification indices on the measurement-equation intercepts to verify that the "proportional indicator differences" assumption holds (or at least holds approximately).

4 AMOS Programming Check off “means and intercepts” Means and intercepts will now appear on diagram. Where variances used to appear, there will now be two parameters (mean + variance); where the variable is dependent, one parameter (intercept) will appear. Impose appropriate parameter constraints [insert brief demonstration here!]

5 Review yesterday's slides from slide 52. Uses World Values Study 1990 data for an example. We'll use an updated version (new data, some differences in countries) today. Refer to the handout (slides not reproduced).

6 Means1a.LS8 - tau-x elements allowed to vary between countries. Must fix kappa (the mean of the ksi's) to 0, since it is otherwise not identified. Chi-square= df=42
United States: TAU-X
A006 F028 F066 F063 F118 F
(0.0263) (0.0688) (0.0563) (0.0733) (0.0941) (0.0749)
TAU-X
F120 F
(0.0883) (0.0757)
CANADA: TAU-X
A006 F028 F066 F063 F118 F
(0.0236) (0.0612) (0.0551) (0.0706) (0.0812) (0.0646)
TAU-X
F120 F
(0.0713) (0.0636)

7 Means1b.ls8: Measurement model like means1a, but now we are expressing group 1 versus group 2 differences in means by 2 parameters (1 for each latent variable), as opposed to calculating them for each indicator using, e.g., TX1[1] - TX1[2]. Chi-square= df=48
KAPPA in Group 2 (Canada) [Kappa in Group 1 is fixed to zero]
KSI 1 KSI 2
(0.0731) (0.0948)
The above provides significance tests for:
- Canada-U.S. differences in religiosity (z= , p<.001)
- Canada-U.S. differences in sex/morality attitudes (z=3.4138, p<.001)
For a joint significance test of whether both the means for Religiosity and Sex/morality are different (null hypothesis: both differences = 0), see program Means1c.ls8. Chi-square = df=50 for this model; subtract the chi-squares ( ) for the test (df=2).
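The joint (LR) test just described subtracts the chi-squares of the nested models and refers the difference to a chi-square distribution with df = 2. A minimal Python sketch follows; the chi-square values are made up for illustration (the slide's actual values were not preserved). For df = 2 the chi-square survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

def chisq_diff_test_df2(chi_restricted, chi_full):
    """Chi-square difference (LR) test with df = 2.

    For df = 2 the chi-square survival function is P(X > x) = exp(-x/2),
    so the p-value has a closed form.
    """
    delta = chi_restricted - chi_full   # restricted model fits worse
    p = math.exp(-delta / 2.0)
    return delta, p

# Hypothetical chi-squares: restricted model (both kappas fixed to 0)
# versus the model with group-2 kappas freed.
delta, p = chisq_diff_test_df2(chi_restricted=210.4, chi_full=180.1)
print(round(delta, 1), p < 0.001)
```

With a real pair of fitted models, `chi_restricted` and `chi_full` would come straight from the two LISREL outputs.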

8 Diagnostics for this model: see the modification indices for the TAU-X vectors:
USA
Modification Indices for TAU-X
A006 F028 F066 F063 F118 F
Modification Indices for TAU-X
F120 F
CANADA
Modification Indices for TAU-X
A006 F028 F066 F063 F118 F
Modification Indices for TAU-X
F120 F
Expected Change for TAU-X
A006 F028 F066 F063 F118 F
Expected Change for TAU-X
F120 F

9 Means2a: Model with exogenous single-indicator variables. Single-indicator ksi variables: gender, age, education. The specification GA=IN in group 2 implies a parallel-slopes model. Thus, the AL parameters in group 2 can be interpreted as "group 1 vs. group 2 differences, controlling for differences in sex, education and age."
TAU-X
GENDER AGE EDUC
(0.0146) (0.4750) (0.0413)
ALPHA
ETA 1 ETA 2
(0.0714) (0.0954)
KAPPA
GENDER AGE EDUC
(0.0187) (0.6297) (0.0504)

10 Diagnostics: test of the equal-slopes (GA=IN) assumption:
Modification Indices for GAMMA
GENDER AGE EDUC
ETA 1
ETA 2
A global test will require the estimation of a separate model (Means2b) with GA=PS (the parallel-slopes assumption relaxed).
Chi-square comparisons (chi-square, df, CFI):
Means2a:
Means2b:

11 Means2b
ALPHA - CANADA (FIXED TO 0 IN US)
ETA 1 ETA 2
(0.0725) (0.0968)
GAMMA - USA
GENDER AGE EDUC
ETA 1 (0.1003) (0.0031) (0.0352)
ETA 2 (0.1462) (0.0045) (0.0520)
GAMMA - CANADA
GENDER AGE EDUC
ETA 1 (0.0931) (0.0028) (0.0389)
ETA 2 (0.1200) (0.0036) (0.0521)

12 Expressing effects when the parallel-slopes assumption is relaxed: is the pattern diverging, converging, or crossover?
Equation: Eta1 = alpha1 + gamma1*Ksi1 + gamma2*Ksi2 + gamma3*Ksi3 + zeta1
Hold all Ksi variables except one constant at 0. (Not quite the overall mean: Ksi = 0 in group 1, but in group 2 it is 0 + kappa; close enough.)
In group 1, alpha1 = 0, so the equation is:
Eta1 = gamma1[1]*Ksi1   [alpha1 = 0; gamma2*Ksi2 = 0; gamma3*Ksi3 = 0; E(zeta1) = 0]
In group 2, alpha1 = alpha1[2]:
Eta1 = alpha1[2] + gamma1[2]*Ksi1   [other terms = 0]
Now, the question is: at what values do we evaluate the equation?
1. Ksi1 = 0. This is the Ksi1 mean in group 1. (We could alternatively use something like kappa1[2]/2, halfway between the group 1 and group 2 means of Ksi1... or even a weighted version.)
2. Ksi1 = 0 + k standard deviations, where k can be any reasonable number (1? 1.5? 2.0?)
3. Ksi1 = 0 - k standard deviations.

13 How do we find the standard deviation of Ksi? Look at the PHI matrix to obtain the variances, and take their square roots!
PHI - USA
GENDER AGE EDUC
GENDER (0.0102)
AGE (0.2350) ( )
EDUC (0.0204) (0.6670) (0.0818)

14 For education, if we had a pooled estimate (Canada + US) we could use it; otherwise we can approximate: variance ~ 1.72, sqrt(1.72) = 1.3. So we will want to evaluate at EDUC=0, EDUC=+1.3 (or perhaps +2.6?), and EDUC=-1.3 (or perhaps -2.6?).
At EDUC=0, the Canada-US difference is the alpha parameter (above): USA=0, Canada=1.2545
At EDUC=-2.6: USA = 0 + (-2.6 * .0817) = -.2124 [USA gamma for educ = .0817]; Canada = 1.2545 + (-2.6 * .1525) = .858 [Canadian gamma for educ = .1525]
At EDUC=+2.6: USA = 0 + (2.6 * .0817) = .2124; Canada = 1.2545 + (2.6 * .1525) = 1.651
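The endpoint calculations above can be scripted. The education slopes (.0817 USA, .1525 Canada) are from the slide output; the Canadian intercept 1.2545 is inferred from the slide's evaluated endpoints (.858 at -2.6 and 1.651 at +2.6), so treat it as approximate.

```python
# Evaluating the two group lines at EDUC = 0 and +/- 2 SD (SD ~ 1.3),
# with all other predictors held at 0. alpha_ca is inferred, not quoted.
def predicted_eta(alpha, gamma, x):
    """Predicted latent mean: alpha + gamma * x, other predictors at 0."""
    return alpha + gamma * x

alpha_us, gamma_us = 0.0, 0.0817      # USA is the reference group
alpha_ca, gamma_ca = 1.2545, 0.1525   # inferred intercept, slide slope

for educ in (-2.6, 0.0, 2.6):
    us = predicted_eta(alpha_us, gamma_us, educ)
    ca = predicted_eta(alpha_ca, gamma_ca, educ)
    print(educ, round(us, 4), round(ca, 4), round(ca - us, 4))
```

The Canada-US gap grows with education here, so the pattern is diverging rather than converging or crossover.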

15

16 For age, the approximate variance is 270; sqrt(270) = 16.43. We could thus use 0 ± 16.43 (or 0 ± (1.5 * 16.43)), or, if we knew that the mean was approximately 42 (see the tau-x parameter), we could simply do something like the mean ± 20 years (more intuitive).

17 Models for Four Groups: Canada, U.S.A., Germany, U.K.
Means3a (GA=PS): Chi-square = df=180
Means3b (GA=IN): Chi-square = df=198

18 Formulas (spreadsheet):
USA: =0.0738*B8
Canada: =1.087+(B8*0.1457)
UK: = (B8* )
Germany: = (B8*0.0957)
[B8 refers to the first education row. The formula becomes B9, B10, ... for the rows below.]
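The spreadsheet formulas translate directly into Python. Only the USA and Canada lines are complete on the slide (parts of the UK and Germany coefficients were lost), so only those two are shown:

```python
# Spreadsheet formulas from the slide as a Python function.
lines = {
    "USA":    (0.0,   0.0738),   # =0.0738*B8
    "Canada": (1.087, 0.1457),   # =1.087+(B8*0.1457)
}

def predict(country, educ):
    """Predicted value at a given education level (the B8 cell)."""
    intercept, slope = lines[country]
    return intercept + slope * educ

for educ in (0.0, 1.0, 2.0):
    print(educ, round(predict("USA", educ), 4), round(predict("Canada", educ), 4))
```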

19

20 Dealing with data that are not normally distributed within the traditional LISREL framework
Questions:
- How bad is it if our data are not normally distributed?
- What can we do about it?
- Are there easy "fixes"?

21 Non-Normal Data
How about just ignoring the problem? Early 1980s: robustness studies. Major findings:
- In almost all cases, using LV models is better than OLS even if the data are non-normal (assumes multiple indicators are available)
- Some discussion of conditions under which parameters might not be accurate (e.g., low-measurement-coefficient models)

22 Non-Normal Data
Early articles:
- A. Boomsma, On the Robustness of LISREL
- Johnson and Creech, American Sociological Review, 48(3), 1983
- Henry, ASR, 47 (related: Bollen and Barb, ASR, 46)
See a good summary of early and later simulation studies: West, Finch and Curran in Hoyle.

23 Non-Normal Data
See a good summary of early and later simulation studies: West, Finch and Curran in Hoyle.
Formal properties:
                           Consistent?  Asymp. efficient?  Acov(θ)  X²
Multinormal (no kurtosis)      √               √              √      √
Elliptical                     √               √              X      X
Arbitrary                      √               X              X      X

24 Non-Normal Data
Many of the studies have involved CFA models, e.g., Curran, West, Finch, Psych. Methods, 1(1).
General findings (non-normal data):
- ML, GLS produce X² values that are too high (overestimated by 50% in simulations)
- GLS, ML produce X² values slightly too large when sample sizes are small, even when the data are normally distributed
- Underestimation of NFI, TLI, CFI (also underestimated in small samples, esp. NFI)
- Moderate underestimation of standard errors (phi 25%, lambda 50%)

25 Non-Normality
Detection: the r-th central moment is u_r = E(x - u)^r.
Kurtosis (4th moment), standardized: u4 / u2² (equals 3 for a normal distribution).
Skewness, the standardized 3rd moment: u3 / (u2)^(3/2).
Tests of statistical significance are usually available (Bollen, p. 421): b1, b2 (skew, kurt) → N(0,1) test statistic for kurtosis (H0: B2 - 3 = 0). Different tests exist (one approximation requires N > 1000). Joint test κ²: approximately distributed as X², df = 2. Mardia's multivariate tests: skewness, kurtosis, joint.
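As a concrete check on the moment formulas above, here is a minimal Python computation of standardized skewness (u3 / u2^(3/2)) and kurtosis (u4 / u2², which equals 3 under normality):

```python
def moments_skew_kurt(xs):
    """Skewness and kurtosis from central moments:
    skew = mu3 / mu2**1.5, kurt = mu4 / mu2**2 (normal => 3)."""
    n = len(xs)
    mean = sum(xs) / n
    mu2 = sum((x - mean) ** 2 for x in xs) / n
    mu3 = sum((x - mean) ** 3 for x in xs) / n
    mu4 = sum((x - mean) ** 4 for x in xs) / n
    return mu3 / mu2 ** 1.5, mu4 / mu2 ** 2

# A symmetric sample: skewness is 0; excess kurtosis is kurt - 3.
skew, kurt = moments_skew_kurt([1, 2, 2, 3, 3, 3, 4, 4, 5])
print(round(skew, 6), round(kurt, 4))
```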

26 Non-Normality
An alternative estimator, F_wls (also called F_agls): [s - σ(θ)]' W⁻¹ [s - σ(θ)]
Browne, British Journal of Mathematical and Statistical Psychology, 41 (1988), 193ff.; also 37 (1984).
The optimal weight matrix is the asymptotic covariance matrix of the s_ij:
Acov(s_ij, s_gh) = N⁻¹(σ_ijgh - σ_ij σ_gh)
s_ijgh = (1/N) Σ (z_i)(z_j)(z_g)(z_h), where z_i is the mean-deviated value.
If multinormal: σ_ijgh = σ_ij σ_gh + σ_ig σ_jh + σ_ih σ_jg (reduces to GLS).
W is of order ½k(k+1) × ½k(k+1).

27 Non-Normality
An alternative estimator, F_wls (also called F_agls): [s - σ(θ)]' W⁻¹ [s - σ(θ)]; W is of order ½k(k+1) × ½k(k+1).
Computationally intense: with 20 variables, W has 22,155 distinct elements.
To be non-singular, N must be > p + ½p(p+1): 20 variables: minimum 230; 30 variables: minimum 495.
Older versions of LISREL imposed higher restrictions (refused to run until thresholds well above these minima were reached).
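The bookkeeping behind these counts is simple arithmetic; a small Python sketch reproduces the 22,155 and 495 figures:

```python
def wls_dimensions(p):
    """For p observed variables: number of distinct second-order moments,
    distinct elements of the symmetric weight matrix W, and the minimum
    N needed for the ADF/WLS weight matrix to be non-singular."""
    s = p * (p + 1) // 2           # distinct variances/covariances
    w_distinct = s * (s + 1) // 2  # distinct elements of the s-by-s W
    min_n = p + s                  # N must exceed p + p(p+1)/2
    return s, w_distinct, min_n

print(wls_dimensions(20))  # (210, 22155, 230)
print(wls_dimensions(30))  # (465, 108345, 495)
```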

28 Non-Normality
An alternative estimator, F_wls (also called F_agls): [s - σ(θ)]' W⁻¹ [s - σ(θ)]; W is of order ½k(k+1) × ½k(k+1).
The AGLS estimator is commonly available in SEM software: LISREL 8, AMOS, SAS CALIS, EQS.
Be careful! It is not really suitable for small-N problems. It is a good idea to have sample sizes in the thousands, not hundreds.

29 Non-Normality
An alternative estimator, F_wls (also called F_agls): [s - σ(θ)]' W⁻¹ [s - σ(θ)]; W is of order ½k(k+1) × ½k(k+1).
The AGLS estimator is commonly available in SEM software:
- LISREL 8: ME=WL on the OU statement; must also provide the asymptotic covariance matrix generated by PRELIS (the AC FI= statement follows the CM FI= statement)
- AMOS: check box under analysis options
Again, the problem is that this estimator can be unstable, given the size of the matrix (acov) that needs to be inverted (especially in moderate sample sizes).

30 Non-Normality
Sample program in LISREL with the ADF estimator:
LISREL model for religiosity and moral conservatism
Part 2: ADF estimation
DA NI=14 NO=1456
CM FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.cov
ACC FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.acc
SE
/
MO NY=11 NX=3 NE=2 NK=3 fixedx ly=fu,fi ga=fu,fr c
ps=sy,fr te=sy
va 1.0 ly 1 1 ly 8 2
fr ly 2 1 ly 3 1 ly 4 1 ly 5 1
fr ly 6 2 ly 7 2 ly 9 2 ly 10 2 ly 11 2
fr te 2 1 te te 7 6
ou me=wl se tv sc nd=3 mi

31 Non-Normality Generating asymptotic covariance matrix in PRELIS

32 Non-Normality Generating asymptotic covariance matrix in PRELIS Resultant matrix will be much larger than covariance matrix

33 Non-Normality
ADF estimation:
LISREL model for religiosity and moral conservatism
Part 2: ADF estimation
DA NI=14 NO=1456
CM FI=h:\icpsr99\nonnorm\relmor1.cov
ACC FI=h:\icpsr99\nonnorm\relmor1.acc
SE
/
MO NY=11 NX=3 NE=2 NK=3 fixedx ly=fu,fi ga=fu,fr c
ps=sy,fr te=sy
va 1.0 ly 1 1 ly 8 2
fr ly 2 1 ly 3 1 ly 4 1 ly 5 1
fr ly 6 2 ly 7 2 ly 9 2 ly 10 2 ly 11 2
fr te 2 1 te te 7 6
ou me=wl se tv sc nd=3 mi

34 Non-Normality
ML, scaled statistics:
LISREL model for religiosity and moral conservatism
Part 2: ADF estimation
DA NI=14 NO=1456
CM FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.cov
ACC FI=h:\icpsr2003\Week4Examples\nonnormaldata\relmor1.acc
SE
/
MO NY=11 NX=3 NE=2 NK=3 fixedx ly=fu,fi ga=fu,fr c
ps=sy,fr te=sy
va 1.0 ly 1 1 ly 8 2
fr ly 2 1 ly 3 1 ly 4 1 ly 5 1
fr ly 6 2 ly 7 2 ly 9 2 ly 10 2 ly 11 2
fr te 2 1 te te 7 6
ou me=ml se tv sc nd=3 mi

35 Non-Normality
Low-tech solutions: for variables that are continuous, TRANSFORMATION. See classic regression texts such as Fox.
Common transformations:
X → log(X) (usually the natural log)
X → sqrt(X)
X → X²
X → 1/X (harder to interpret, since this reverses the direction of relationships)
Transforming to remove skewness often/usually removes kurtosis too, but this is not guaranteed.
"Normalization" as an extreme option (e.g., map rank-ordered data onto an N(0,1) distribution).
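A quick illustration of how a log transform pulls in skewness; the right-skewed values are made up for the example:

```python
import math

def skewness(xs):
    """Standardized third moment: mu3 / mu2**1.5."""
    n = len(xs)
    m = sum(xs) / n
    mu2 = sum((x - m) ** 2 for x in xs) / n
    mu3 = sum((x - m) ** 3 for x in xs) / n
    return mu3 / mu2 ** 1.5

# Income-like, strongly right-skewed values (illustrative only).
raw = [1, 1, 2, 2, 3, 4, 6, 10, 25, 80]
logged = [math.log(x) for x in raw]   # all values positive, no offset needed

print(round(skewness(raw), 3), round(skewness(logged), 3))
```

Note the caveat from the next slide: if any value were zero or negative, a constant offset would have to be added before logging.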

36 Non-Normality
Generally, kurtosis between +1 and -1 is not considered too problematic (see Bollen, 1989).

37 Transformations
AMOS: transformations must be performed on the SPSS dataset. Save the new dataset and work from this (e.g., COMPUTE X1 = LOG(X1).)
LISREL: transformations can be performed in PRELIS. PRELIS already provides distribution information on variables as a "check" (PRELIS "compute" dialogue box under Transformations). Remember to SAVE the PRELIS dataset after each transformation.
Use of a stat package (SPSS, Stata, SAS) may be preferable.

38 Transformations
All the usual caveats apply:
1. If a variable only has 4-5 values, transformation will not normalize it (at the very least, it will still have tucked-in tails), though it could help bring kurtosis closer to the +1 to -1 range.
2. If a categorized variable has one value with a majority of cases, then no transformation will work.
3. If the variable has negative values, make sure to add a constant ("offset") before logging.

39 Other solutions:
1. Robust test statistics (Bentler). Implementation: EQS, LISREL
2. Muthén's more recent WLSM (mean-adjusted) and WLSMV (mean- and variance-adjusted) estimators. Implementation: MPLUS only
3. Bootstrapping. Implementation: AMOS (easy to use), LISREL (awkward)
4. Categorical variable models (CVM)

40 Bootstrapping
- Computationally intensive
- Sampling with replacement: from resampling space R, draw bootstrap sample S*_{n,j}, where j indexes the samples and n is the bootstrap n
- Typically, bootstrap N = sample N
- Repeat resampling B times to get a set of values. Issue: what if, across 200 resamples, 2 of them have ill-defined matrices? Usually, these are discarded

41 Bootstrapping
- Computationally intensive
- Sampling with replacement: from resampling space R, draw bootstrap sample S*_{n,j}, where j indexes the samples and n is the bootstrap n
- Typically, bootstrap N = sample N
- Repeat resampling B times to get a set of values. Issue: what if, across 200 resamples, 2 of them have ill-defined matrices? Usually, these are discarded
- Tests: 5%-level confidence intervals (want a large # of samples); confidence intervals do not need to be symmetric (can look at the value at the 95th percentile and at the 5th among the bootstrapped samples)
- More common to compute standard errors
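The resampling loop above can be sketched for a simple statistic (the mean, standing in for a model parameter); the data and parameter names here are illustrative only:

```python
import random
import statistics

def bootstrap_mean(data, b=2000, alpha=0.05, seed=42):
    """Percentile bootstrap for the mean: resample with replacement
    (bootstrap N = sample N), repeat B times, then read the CI off the
    empirical percentiles; the interval need not be symmetric."""
    rng = random.Random(seed)
    n = len(data)
    boots = sorted(
        statistics.fmean(rng.choices(data, k=n)) for _ in range(b)
    )
    lo = boots[int((alpha / 2) * b)]
    hi = boots[int((1 - alpha / 2) * b) - 1]
    se = statistics.stdev(boots)   # bootstrap standard error
    return se, (lo, hi)

data = [2.1, 3.4, 2.8, 5.9, 4.4, 3.1, 2.2, 6.7, 3.8, 4.0]
se, ci = bootstrap_mean(data)
print(round(se, 3), ci)
```

In an SEM setting, the statistic recomputed in each resample would be the fitted model's parameter estimate rather than a mean, and resamples with ill-defined matrices would be discarded.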

42 Bootstrapping
- Overall model X² correction (available in AMOS): Bollen and Stine.
- Yang and Bentler (chapter in Marcoulides & Schumacker): "faith" in the bootstrap is based on its appropriateness in other applications. Simulation study, 1995, of exploratory factor analysis: rotated solutions were close, but not so the unrotated solutions. "It seems that in the present stage of development, the use of the bootstrap estimator in covariance structure analysis is still limited. It is not clear whether one can trust the bias estimates."

43 Bootstrapping
- Ichikawa and Konishi, 1995: when data are multinormal, bootstrap standard errors are not as good as ML. The bootstrap doesn't seem to work when N < 150: consistent overestimation (at N = 300, not a problem, though).

44 The Categorical Variable Model
Conceptual background: we observe y but are interested in a latent y*; the observed y has C discrete values:
y_i = C_i - 1 if v_{i,C-1} < y_i*
y_i = C_i - 2 if v_{i,C-2} < y_i* ≤ v_{i,C-1}
y_i = C_i - 3 if v_{i,C-3} < y_i* ≤ v_{i,C-2}
...
y_i = 1 if v_{i,1} < y_i* ≤ v_{i,2}
y_i = 0 if y_i* ≤ v_{i,1}
The v's are threshold parameters to be estimated.
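The threshold mapping above can be sketched in a few lines; the threshold values here are hypothetical:

```python
import bisect

def categorize(y_star, thresholds):
    """Ordinal category (0 .. C-1) for latent value y_star: the count of
    thresholds at or below y_star (so y* <= v_1 maps to category 0)."""
    return bisect.bisect_left(thresholds, y_star)

thresholds = [-1.0, 0.0, 1.2]   # hypothetical estimates, C = 4 categories
print([categorize(y, thresholds) for y in (-2.0, -0.5, 0.3, 2.5)])  # [0, 1, 2, 3]
```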

45 The Categorical Variable Model
Observed and Latent Correlations
x-variable scale   y-variable scale   Observed correlation   Latent correlation
continuous         continuous         Pearson                Pearson
continuous         categorical        Pearson                polyserial
continuous         dichotomous        point-biserial         biserial
categorical        categorical        Pearson                polychoric
dichotomous        dichotomous        phi                    tetrachoric
If it is reasonable to assume that continuous and normally distributed y* variables underlie the categorical y variables, a variety of latent correlations can be specified.

46 The Categorical Variable Model
If it is reasonable to assume that continuous and normally distributed y* variables underlie the categorical y variables, a variety of latent correlations can be specified.
First step: estimate the thresholds using ML.
Second step: estimate the latent correlations.
Third step: obtain a consistent estimator of the asymptotic covariance matrix of the latent correlations (for use in a weighted least squares estimator in the SEM model).
Extreme case: the ability to recover the y* model when variables are split into 25%/75% dichotomies is promising (though X² is underestimated).
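For a single ordinal variable, the first step has a closed form: under a standard-normal y*, each threshold is the normal quantile of the cumulative category proportion. A sketch using only the standard library:

```python
from statistics import NormalDist

def estimate_thresholds(counts):
    """First-step threshold estimates for one ordinal variable with the
    given category counts: the threshold below category c is the N(0,1)
    quantile of the cumulative proportion up to c."""
    n = sum(counts)
    cum = 0
    thresholds = []
    for c in counts[:-1]:          # C categories -> C-1 thresholds
        cum += c
        thresholds.append(NormalDist().inv_cdf(cum / n))
    return thresholds

# A 25%/75% dichotomy (the "extreme case" above) puts its single
# threshold at the 25th percentile of N(0, 1).
print([round(t, 3) for t in estimate_thresholds([25, 75])])  # [-0.674]
```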