Structural Equation Modeling

Structural Equation Modeling

[Title-slide path diagram: three latent factors (Verbal, Math, Analytic), each measured by four indicators with error terms d1-d12.]

What is SEM?
- Combines the measurement models of CFA with the goals of multiple regression analysis to allow the prediction of latent variables from other latent variables
- Simultaneous regression equations
- Modeling latent variables from observed variables
- Estimate parameters of the measurement model & structural model
- Comparison between the implied covariance matrix & the observed covariance matrix

Advantages of SEM
- Testing multiple relationships at a time
- Multiple independent and dependent variables can be accommodated (DVs can even be related to one another)
- Examining latent variables (but you must link them to manifest variables)
- Specifying measurement error in the model, which allows enhanced model fit
- No assumption of uncorrelated errors (although errors are uncorrelated by default → you need to change the model to allow correlated errors)

Model Testing in SEM
Measurement model
- Also known as confirmatory factor analysis
- Tests the relationship between the indicators and the latent variables they are supposed to measure
Path model
- Tests the relationships between the exogenous and endogenous variables without the measurement model specified
Structural model
- Tests the relationships between the exogenous and endogenous variables with the measurement model specified

Measurement Model 12  1 X1 X2 X3 X4 1 2 3 4 41 31 21 11  2 5 6 7 8 82 72 62 52 12

Path Model

[Path diagram among the observed variables X1-X6 only.]

Structural Model

[Path diagram combining the measurement and structural parts: exogenous indicators X1-X4 loading on ξ, endogenous indicators Y1-Y8 loading on the η constructs, with γ and β structural paths and error terms δ, ε, ζ.]

SEM Lingo
- Exogenous variable – construct that acts only as a predictor or cause in a model, an IV; not predicted by anything in the model (A)
- Endogenous variable – construct that is an outcome variable in at least one causal relationship, a DV; also mediators (B, C, D)

[Example diagram with constructs A, B, C, and D.]

SEM Lingo: Constructs and Indicators
- Exogenous constructs/latent variables are called ksis – represented by ξ
- Endogenous constructs/latent variables are called etas – represented by η
- Exogenous indicator/manifest variable – X
- Endogenous indicator/manifest variable – Y

SEM Lingo: Constructs and Indicators

[Diagram: ksis (ξ) measured by X indicators and etas (η) measured by Y indicators.]

SEM Lingo
- Nonrecursive – a relationship is reciprocal in the path diagram
- Recursive – relationships are not reciprocal
- Nested models – models that have the same constructs but differ in the number and type of causal relationships represented (i.e., parameters estimated)
- In nested models, one model is a subset of the other

Measurement Model Matrices
Eight matrices are used in SEM: four for the measurement model and four for the structural model.
- Lambda-X (Λx) – loadings of the exogenous indicators; tells how you get from the manifest Xs to the latent ξs
- Lambda-Y (Λy) – loadings of the endogenous indicators; tells how you get from the manifest Ys to the latent ηs
- Theta-delta (Θδ) – errors of the exogenous indicators, the manifest X variables
- Theta-epsilon (Θε) – errors of the endogenous indicators, the manifest Y variables
(A small sketch of how these matrices combine is given below.)
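As a concrete illustration, here is a minimal Python/numpy sketch, with invented loading and error values (not the lecture's data), of how Lambda, Phi, and Theta-delta combine into the covariance matrix the measurement model implies for the Xs: Σ = ΛxΦΛx' + Θδ.

import numpy as np

# Hypothetical two-factor measurement model: X1-X4 load on ksi1, X5-X8 on ksi2.
# All numbers below are illustrative, not taken from the lecture example.
Lambda_x = np.array([
    [0.8, 0.0],
    [0.7, 0.0],
    [0.6, 0.0],
    [0.7, 0.0],
    [0.0, 0.9],
    [0.0, 0.8],
    [0.0, 0.7],
    [0.0, 0.6],
])                                     # Lambda-X: loadings of the 8 exogenous indicators
Phi = np.array([[1.0, 0.4],
                [0.4, 1.0]])           # Phi: covariance matrix of the latent ksis
Theta_delta = np.diag(1 - np.diag(Lambda_x @ Phi @ Lambda_x.T))  # Theta-delta: indicator error variances

# Model-implied covariance matrix of the Xs: Sigma = Lambda Phi Lambda' + Theta
Sigma = Lambda_x @ Phi @ Lambda_x.T + Theta_delta
print(np.round(Sigma, 3))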

Example

[Path diagram: Intelligence (indicators IQ Test and GMAT) predicting School Performance (indicators GPA and # Pubs), labeled with the LX, LY, TD, and TE matrices and error terms.]

Structural Model Matrices
- Beta (B) – relationships of endogenous constructs to endogenous constructs; how DVs cause each other
- Gamma (Γ) – relationships of exogenous constructs to endogenous constructs; how IVs cause DVs
- *Phi (Φ) – correlations among latent exogenous constructs; correlations among the IVs
- Psi (Ψ) – residuals from prediction of latent endogenous constructs; tells whether the residuals of prediction are correlated
*The Phi matrix is relevant in CFA because it allows us to determine whether we are looking at an oblique or orthogonal rotation (are the latent variables correlated?). By default, the values of phi will be free (correlation allowed – oblique). Some of the newer output will provide a correlation matrix of all etas and ksis rather than the phi matrix (this will have the phi matrix subsumed under it). (A sketch of how these matrices imply a covariance matrix follows.)
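For the full model, the eight matrices jointly determine the implied covariance matrix of the Ys and Xs. Below is a minimal numpy sketch with one ξ and one η and invented values (the standard LISREL expressions; the numbers are not output from the lecture example).

import numpy as np

# Invented one-eta / one-ksi example (e.g. an "Intelligence" -> "School Performance" model):
Lambda_y = np.array([[1.0], [0.9]])      # loadings of 2 endogenous indicators on eta
Lambda_x = np.array([[1.0], [0.8]])      # loadings of 2 exogenous indicators on ksi
Beta  = np.array([[0.0]])                # eta -> eta paths (none here)
Gamma = np.array([[0.5]])                # ksi -> eta path
Phi   = np.array([[1.0]])                # variance of ksi
Psi   = np.array([[0.75]])               # residual variance of eta
Theta_eps   = np.diag([0.3, 0.4])        # errors of the Y indicators
Theta_delta = np.diag([0.2, 0.5])        # errors of the X indicators

A = np.linalg.inv(np.eye(Beta.shape[0]) - Beta)      # (I - B)^-1
Cov_eta = A @ (Gamma @ Phi @ Gamma.T + Psi) @ A.T    # covariance of the etas

Sigma_yy = Lambda_y @ Cov_eta @ Lambda_y.T + Theta_eps
Sigma_yx = Lambda_y @ A @ Gamma @ Phi @ Lambda_x.T
Sigma_xx = Lambda_x @ Phi @ Lambda_x.T + Theta_delta

# Full model-implied covariance matrix of (Y, X)
Sigma = np.block([[Sigma_yy, Sigma_yx],
                  [Sigma_yx.T, Sigma_xx]])
print(np.round(Sigma, 3))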

Example

[Path diagram: two exogenous constructs, Intelligence (GMAT, IQ Test) and Motivation (Sup Report, Self), with γ paths to two endogenous constructs, School Performance (GPA, # Pubs) and Work Performance (Sup Rating, Peer), a β path between the endogenous constructs, Φ between the exogenous constructs, and Ψ residuals.]

Assumptions in SEM
- Observations are independent
- Respondents are randomly sampled
- In maximum likelihood estimation, multivariate normality is assumed
- Continuous variables (using correlations), except when you use:
  - Polychoric correlation matrix – two ordinal variables with 3+ categories
  - Tetrachoric correlation matrix – two binary variables
  - Polyserial correlation matrix – one metric and one ordinal measure with 3+ categories
  - Biserial correlation matrix – one binary measure and one metric measure

SEM Sample Size Requirements
- Absolute minimum = the number of covariances or correlations in the matrix
- Typical minimum = 5 respondents per parameter estimated; 10 per parameter preferred
- When data are not multivariate normal – 15 respondents per parameter
- ML estimation: can use as few as 50, but 100-150 is recommended; ideal n = 200
- Estimation techniques other than ML tend to require larger sample sizes

Estimation Procedures
In SEM, the goal of estimation is to minimize the error between the observed and reproduced values in the VCV matrix → choose parameter estimates that increase the likelihood of reproducing the VCV matrix.
- Regression: minimize Σ(y - y')²
- SEM: minimize the discrepancy between the observed VCV matrix and the reproduced VCV matrix
Note this is analogous to least squares estimation in regression. (A sketch of the ML discrepancy function follows.)
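A minimal Python sketch of the ML discrepancy (fit) function that estimation minimizes, F_ML = ln|Σ| + tr(SΣ⁻¹) - ln|S| - p, using a toy 2 x 2 covariance matrix with illustrative values only.

import numpy as np

def f_ml(S, Sigma):
    """Maximum-likelihood discrepancy between the observed (S) and
    model-implied (Sigma) covariance matrices; 0 means perfect reproduction."""
    p = S.shape[0]
    return (np.log(np.linalg.det(Sigma)) + np.trace(S @ np.linalg.inv(Sigma))
            - np.log(np.linalg.det(S)) - p)

# Toy 2x2 example: estimation searches for the Sigma(theta) that drives f_ml toward 0.
S = np.array([[1.0, 0.5], [0.5, 1.0]])
Sigma_hat = np.array([[1.0, 0.45], [0.45, 1.0]])
print(f_ml(S, Sigma_hat))        # small positive value
print(f_ml(S, S))                # exactly 0 when Sigma reproduces S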

Estimation Procedures
Ordinary (unweighted) least squares (OLS)
- Used in regression, but not in SEM
- OLS is scale invariant only if the errors of measurement are uncorrelated → this is an assumption in regression but not in SEM
- Assumes multivariate normality

SEM Estimation Procedures
Generalized Least Squares (GLS)
- Used in SEM
- You take the least squares discrepancy and weight it with a VCV matrix → yields a scale-free estimation procedure
- Assumes multivariate normality

SEM Estimation Procedures
Maximum Likelihood Estimation (ML)
- Weights the least squares estimates with a VCV matrix; updates the VCV matrix each iteration
- Assumes multivariate normality
- As sample size increases, GLS and ML estimates converge
- Finds the parameter estimates that maximize the probability of the data
- Most commonly used and the default estimation procedure in LISREL

SEM Estimation Procedures
Weighted Least Squares (WLS)
- Makes no assumptions about the distribution
- Needs a huge sample (n = 500+)
- No assumption of multivariate normality required
In practice, WLS is not really used. ML and GLS are robust against violations of multivariate normality.

Steps in Conducting SEM
1. Draw a picture of your model, including both your latent and manifest variables
2. Test the fit of your measurement model; adjust as needed to enhance the fit of the measurement model
3. Once the measurement model fits, test the fit of the structural model
4. Modifying the structural model involves doing exploratory SEM → not recommended

Anderson & Gerbing Two-Step Approach
Step 1. Adequacy of the Measurement Model
- Test the measurement model; allow all latent variables to correlate
- Adjust the measurement model as needed to enhance fit
- If the fit of the measurement model is poor, don't test the structural model
- Fit of the measurement model is a necessary but not sufficient condition for the fit of the structural model
Step 2. Adequacy of the Structural Model

Hair - Stages in SEM
1. Develop a theoretically based model
2. Construct a path diagram of causal relationships
3. Convert the path diagram into a set of structural and measurement models
4. Choose the input matrix type and estimate the proposed model
5. Assess the identification of the model
6. Evaluate goodness-of-fit criteria
7. Interpret and modify the model

Develop Model & Path Diagram

[Example diagram relating the constructs: product factors, price-based relationship, usage, and satisfaction with company.]

Convert Path Diagram to Structural and Measurement Models

[The same diagram with indicators X1-X6 and Y1-Y4 attached to the constructs product factors, price-based relationship, usage, and satisfaction with company.]

Choose Input Matrix & Estimate Model
- Be careful with missing data! Do not input a correlation matrix computed in the presence of missing data
- Choice of a correlation matrix or a VCV matrix
- A correlation matrix yields standardized weights from +1 to -1
- A VCV matrix is better for validating causal relationships

Analyzing Correlation versus Covariance Matrices SEM models are based on the decomposition of covariance matrices, not correlation matrices. The solutions hold, strictly speaking, for the analysis of covariance matrices. To the extent that the solution depends on the scale of the variables, analyses based on covariance matrices and correlation matrices can differ.
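A small numpy illustration of this relationship: a covariance matrix is just the correlation matrix rescaled by the standard deviations, Σ = D R D (the values below are hypothetical).

import numpy as np

# Illustrative: rescaling a correlation matrix R by standard deviations
# gives the covariance matrix the same data would produce.
R = np.array([[1.0, 0.5, 0.3],
              [0.5, 1.0, 0.4],
              [0.3, 0.4, 1.0]])
sd = np.array([2.0, 0.5, 10.0])          # hypothetical standard deviations
D = np.diag(sd)
Cov = D @ R @ D                          # Sigma = D R D
print(Cov)
# Analyzing R is equivalent to analyzing Cov only when the model is
# scale invariant; otherwise the two solutions can differ.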

Model Identification
- Degrees of freedom (df) are related to the number of parameters estimated
- Model df must be ≥ 0
- Just-identified (saturated) model: df = 0; perfect model fit
- *Over-identified model: df > 0 because there is more information in the matrix than the number of parameters estimated
- Under-identified model: df < 0 because the model has more parameters estimated than information available; it can't be run because there are infinite solutions
Compare the number of data points to the number of parameters, where the number of data points equals the number of variances and covariances (see the sketch below). Over-identification occurs when there are more data points than parameters to be estimated – this is required to run SEM.
If too many parameters are estimated, to reduce the number of parameters we can use:
- Fixing – setting a parameter to a specific value
- Constraining – setting a parameter equal to another parameter
- Deleting
To establish a scale for a factor:
- Fix the variance of the factor to 1, or
- Fix the regression coefficient from the factor to one of the measured variables to 1
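A small Python sketch of the df bookkeeping described above; the parameter counts in the example are hypothetical.

def sem_df(n_observed_vars: int, n_free_parameters: int) -> int:
    """Degrees of freedom = data points (variances + covariances) minus
    free parameters: p(p+1)/2 - q."""
    data_points = n_observed_vars * (n_observed_vars + 1) // 2
    return data_points - n_free_parameters

# Hypothetical CFA with 8 indicators and two correlated factors:
# 8 loadings (2 fixed to 1 for scaling) + 8 error variances +
# 2 factor variances + 1 factor covariance -> 17 free parameters.
print(sem_df(8, 17))   # 36 - 17 = 19  -> over-identified (df > 0)
print(sem_df(3, 6))    # 6 - 6 = 0     -> just-identified (saturated)
print(sem_df(3, 7))    # 6 - 7 = -1    -> under-identified (cannot be run)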

Identification Problems
- Often result from a large number of parameters estimated compared to the number of correlations provided → too few degrees of freedom
- Solution: estimate fewer parameters

Model Fit
Fit of the model is denoted by two things:
- Small residuals
- A nonsignificant difference between the original VCV matrix and the reconstructed VCV matrix
To assess model fit, SEM provides numerous goodness-of-fit indices. Different indices assess fit in different ways.

Types of Goodness of Fit Indices
- Absolute fit measures – overall model fit, no adjustment for overfitting
- Incremental fit measures – compare the proposed model's fit to another model specified by the researcher
- Parsimonious fit measures – "adjust" the model to provide a comparison between models with differing numbers of estimated coefficients

Specific Goodness of Fit Indices
Absolute fit measures
- Chi-square (χ2)
- Goodness-of-fit index (GFI)
- Root mean square error of approximation (RMSEA)
- Root mean square residual (RMR)
Incremental fit measures
- Adjusted goodness-of-fit index (AGFI)
- Normed fit index (NFI)
Parsimonious fit measures
- Parsimony normed fit index (PNFI)
- Parsimony goodness-of-fit index (PGFI)

Chi-square (χ2)
- A "badness of fit" measure
- Represents the extent to which the observed and reproduced correlation matrices differ
- High power (large n) inflates χ2 so that it is significant → χ2 is penalized for large n; only interpret χ2 when it is non-significant (see the sketch below)
- χ2 difference test – compares nested models; the most practical use of χ2
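A short Python illustration (with a hypothetical fit-function value and df) of the sample-size penalty: the model χ2 is roughly (N - 1) times the minimized fit function, so the same small discrepancy becomes significant as N grows. scipy is assumed to be available.

from scipy.stats import chi2

# Why the model chi-square is penalized by sample size:
# chi2 = (N - 1) * F_ML, so a small discrepancy (F_ML) that is
# non-significant at N = 100 becomes highly significant at N = 1000.
F_ml, df = 0.10, 20          # hypothetical minimized discrepancy and model df
for N in (100, 500, 1000):
    stat = (N - 1) * F_ml
    p = chi2.sf(stat, df)
    print(f"N = {N:4d}: chi2 = {stat:6.1f}, p = {p:.4f}")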

SEM - Degrees of Freedom
- df = number of known pieces of information – number of unknowns to be estimated
- Important in model fit: if you estimate more paths, fit will improve just by chance (almost no path is exactly 0, due to chance alone)
- Best possible model fit: the saturated model, in which all the links are estimated by default
- Number of data points versus number of parameters to be estimated: the number of data points in SEM is the number of variances and covariances in the observed matrix

Goodness of Fit Index (GFI)
- The quality of the original model and its ability to reproduce the actual variance-covariance matrix is more easily gauged by the GFI
- This index is similar to R² in multiple regression
- This index tells us how much better our model does compared to the null model
- Want higher values, ≥ .90

SEM - Degrees of Freedom

[Two diagrams: the same model with three latent variables (Latent A, B, C), each measured by two manifest indicators, shown without and with one additional estimated link; the first yields a larger χ2 (worse fit), the second a smaller χ2 (better fit).]

The second model will fit better (smaller χ2) merely because an additional parameter is estimated. The question of interest is: is the fit significantly better than it would be without this additional link?

Root Mean Square Error of Approximation (RMSEA)
- A normed index with rules of thumb
- Prefer RMSEA ≤ .05 (see the sketch below)
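A minimal sketch of the usual RMSEA formula, RMSEA = sqrt(max(χ2 - df, 0) / (df * (N - 1))). The χ2 and df below come from the structural example later in the deck; the sample size of 200 is an assumption used only for illustration.

import math

def rmsea(chi_sq: float, df: int, n: float) -> float:
    """RMSEA = sqrt(max(chi2 - df, 0) / (df * (N - 1)));
    values at or below about .05 are usually taken as good fit."""
    return math.sqrt(max(chi_sq - df, 0.0) / (df * (n - 1)))

# chi2 = 57.17 with df = 21 (from the later structural example);
# N = 200 is assumed here, which reproduces the 0.093 reported there.
print(round(rmsea(57.17, 21, 200), 3))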

Root Mean Squared Residual (RMR)
- Want this value to be small: ≤ .05 is ideal, ≤ .1 is probably good
- Not a normed fit index; the size of the residuals is influenced by the variance of the variables involved
- This is just the square root of the mean squared residual; smaller residuals indicate better fit

Adjusted Goodness of Fit Index (AGFI)
- The AGFI was created to account for increases in fit due to chance
- Similar to the adjusted R² in multiple regression
- Can be negative
- Less sensitive to changes in df than are the PNFI/PGFI

Normed Fit Index (NFI)
- The NFI compares the fit of the null model to the fit of the theoretical model (see the sketch below)
- Large values of NFI are best (ideally ≥ .9)
- Criticism of the NFI: it compares the model fit to a model of nothing. Is this meaningful?
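A one-function sketch of the NFI computation, checked against the model and independence-model χ2 values reported later for the lecture's CFA example.

def nfi(chi_sq_model: float, chi_sq_null: float) -> float:
    """Normed Fit Index: proportional improvement of the theoretical model
    over the null (independence) model; values >= .90 are usually wanted."""
    return (chi_sq_null - chi_sq_model) / chi_sq_null

# Chi-square values from the lecture's CFA example (model vs. independence model).
print(round(nfi(55.50, 1962.12), 2))   # about 0.97, matching the reported NFI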

Parsimony Normed Fit Index (PNFI)
- Problem: most fit indices increase just by estimating more parameters (freeing more links to be estimated); a just-identified model → perfect fit
- The PNFI penalizes you for lack of parsimony
- No clear benchmark for what is "good" – best to compare between models

Parsimony Goodness of Fit Index (PGFI)
- Gets smaller as you increase the number of paths in the model
- No clear benchmark for what is "good" – best to compare between models

Some Common Rules of Thumb for Model Fit

Test or Index                  Good Fit       Acceptable Fit
Chi-square goodness of fit     p > .20        p > .05
GFI                            .95            .90
AGFI                           (not given)    .80
RMR                            Depends on scale; closer to 0 is better
RMSEA                          < .05          < .08

Model Testing Strategies with SEM
- Model confirmation – a single model is tested to fit or not fit; problem with "confirmation bias" – many possible models fit
- *Competing models strategy – compares competing models for best fit; nested models should be used
- Exploratory SEM / model development – capitalizes on chance

Testing Competing Nested Models
- Comparing the fit of hypothesized and alternative models that have the same constructs but differ in the number of parameters estimated
- Models must be nested within one another to compare them → use the χ2 difference test to compare (see the sketch below)
- Typical nested models involve deleting or adding a single path
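A minimal Python sketch of the χ2 difference test, with hypothetical model values; scipy is assumed to be available.

from scipy.stats import chi2

def chi_sq_difference(chi_more_constrained: float, df_more_constrained: int,
                      chi_less_constrained: float, df_less_constrained: int):
    """Chi-square difference test for two nested models: the difference in
    chi-square is itself chi-square distributed, with df equal to the
    difference in degrees of freedom."""
    d_chi = chi_more_constrained - chi_less_constrained
    d_df = df_more_constrained - df_less_constrained
    return d_chi, d_df, chi2.sf(d_chi, d_df)

# Hypothetical Model 1 (no A -> C path) vs. Model 2 (adds that single path).
print(chi_sq_difference(30.0, 9, 22.5, 8))
# (7.5, 1, ~0.006): the added path improves fit significantly.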

Testing Nested Models

[Diagrams: Model 1 with constructs A, B, and C; Model 2 with the same constructs plus one additional link.]

These two models are nested within one another because they differ only in the addition of a single link in the second model.

Likelihood Ratio Test
- Problem: sensitive to sample size
- Solution: use the CFI – changes in CFI of less than .01 (Cheung & Rensvold, 2002, Structural Equation Modeling, 9, 233-255)
- The CFI is the comparative fit index. It is a modified version of the NFI (an incremental fit index) that is less sensitive to sample size – it should be interpreted the same as the NFI but is better for model comparison because sample size won't affect it (see the sketch below)
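A minimal sketch of the noncentrality-based CFI formula, checked against the χ2 values reported later for the lecture's CFA example.

def cfi(chi_model: float, df_model: int, chi_null: float, df_null: int) -> float:
    """Comparative Fit Index based on noncentrality (chi2 - df); less sensitive
    to sample size than the NFI, which is why differences in CFI are often used
    when comparing nested models."""
    d_model = max(chi_model - df_model, 0.0)
    d_null = max(chi_null - df_null, 0.0)
    denom = max(d_model, d_null)
    return 1.0 if denom == 0 else 1.0 - d_model / denom

# Chi-square values from the lecture's CFA example (model vs. independence model).
print(round(cfi(55.50, 51, 1962.12, 66), 2))   # about 1.00, as in the output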

Model Modification: Exploratory SEM
- Common practice is not to do this → don't edit structural models → SEM is a confirmatory technique (but editing the measurement model is OK)
- t values tell us where deleting a path would enhance model fit (paths with n.s. t values)

Model Modification: Exploratory SEM
Modification Indices
- Suggest where paths could be added to increase fit
- Represent the degree to which χ2 would decrease if you added a path
- To justify a change based on an MI, it should be fairly large, justifiable, impact model fit, and not central to the theory
When using modification indices to enhance fit, you should only consider MIs that are > 10. Start with the largest MI and add that link – then consider the effects on the model as a whole and see if any more additions would help. A high MI should be associated with a high residual, and the residuals should decrease once you free the parameter to be estimated there. Using MIs to modify the model is exploratory SEM and involves capitalization on chance, thus some folks think you should not do this. If you choose to, add all the links to the model and then delete the links you don't want based on t test values (< 2) that reflect n.s. parameter estimates.

Modification Indices

In LISREL, the modification indices are the changes in the goodness-of-fit χ2 that would result from setting that parameter free.

SEM with LISREL
Traditional LISREL language
- Uses matrix language
- Specify whether aspects of each matrix are free (FR) or fixed (FI)
SIMPLIS language*
- More recent LISREL language
- More user-friendly syntax
PRELIS
- Prepares raw data for use in LISREL; generates the correlation matrix or VCV matrix to input

Running LISREL Program
- Title line
- Input specification
- Model specification
- Path diagram
- Output specification

Exercise: CFA Model – Draw the diagram

[Blank diagram: two latent variables ξ1 and ξ2 and eight indicators X1-X8.]

Exercise: CFA Model – Draw the diagram

[Completed path diagram: ξ1 measured by X1-X4, ξ2 measured by X5-X8, with loadings λ, errors δ, and factor covariance φ12.]

Input Specification: SIMPLIS

Title line: Exercise
Observed variables: X1 X2 X3 X4 X5 X6 X7 X8
Covariance matrix:
1.65
0.45 1.14
0.35 0.30 1.01
0.51 0.49 0.43 1.58
0.07 0.20 0.20 0.23 0.75
0.17 0.14 0.17 0.27 0.23 0.65
0.41 0.20 0.07 0.21 0.11 0.25 0.85
0.22 0.23 0.26 0.36 0.24 0.25 0.16 0.75
Sample size: 200
Latent variables: LAT1 LAT2

You could input a correlation matrix rather than a VCV matrix, or have it read raw data.
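For readers who want to inspect the same input outside LISREL, here is a small numpy sketch that expands the lower-triangular covariance matrix above into a full symmetric matrix (purely illustrative; SIMPLIS handles this for you).

import numpy as np

# The same lower-triangular covariance matrix that SIMPLIS reads, expanded
# to a full symmetric 8 x 8 matrix for checking it by hand in Python.
lower_rows = [
    [1.65],
    [0.45, 1.14],
    [0.35, 0.30, 1.01],
    [0.51, 0.49, 0.43, 1.58],
    [0.07, 0.20, 0.20, 0.23, 0.75],
    [0.17, 0.14, 0.17, 0.27, 0.23, 0.65],
    [0.41, 0.20, 0.07, 0.21, 0.11, 0.25, 0.85],
    [0.22, 0.23, 0.26, 0.36, 0.24, 0.25, 0.16, 0.75],
]
p = len(lower_rows)
S = np.zeros((p, p))
for i, row in enumerate(lower_rows):
    S[i, :len(row)] = row
S = S + S.T - np.diag(np.diag(S))     # mirror the lower triangle to the upper
print(np.allclose(S, S.T))            # True: a valid symmetric covariance matrix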

LISREL/SIMPLIS Code (continued)

RELATIONSHIPS
X1 = 1*LAT1
X2 = LAT1
X3 = LAT1
X4 = LAT1
X5 = 1*LAT2
X6 = LAT2
X7 = LAT2
X8 = LAT2
LAT1 = LAT2
LISREL OUTPUT: SS SC EF AD = OFF
PRINT RESIDUALS
PATH DIAGRAM
END OF PROBLEM

AD = Admissibility check. Stops after a default number of iterations (20) if LISREL can't converge on a solution. Usually we shut this off because LISREL can almost never converge with it turned on. Once it is shut off, you can specify another number of iterations to allow if you'd like.

LISREL/SIMPLIS Code: Estimation Techniques
- Default estimation technique is ML
- Other techniques available: generalized least squares (GLS), unweighted least squares (ULS), weighted least squares (WLS)
- Request other techniques with this syntax:
Method of estimation: GLS

LISREL/SIMPLIS Code: LISREL Output
SS – Print standardized solution
SC – Print completely standardized solution
EF – Print total & indirect effects, their standard errors & t values
VA – Print variances and covariances
FS – Print factor scores regression
PC – Print correlations of parameter estimates
PT – Print technical information

Example: Confirmatory Factor Analysis

[Path diagram: three correlated factors (Verbal, Math, Analytic), each with four indicators and error terms d1-d12.]

Tests of significance for parameter estimates: t values.

Parameter estimates

CHI-SQUARE WITH 51 DEGREES OF FREEDOM = 55.50 (P = 0.31)
ESTIMATED NON-CENTRALITY PARAMETER (NCP) = 4.50
90 PERCENT CONFIDENCE INTERVAL FOR NCP = (0.0 ; 26.77)
MINIMUM FIT FUNCTION VALUE = 0.11
POPULATION DISCREPANCY FUNCTION VALUE (F0) = 0.0090
90 PERCENT CONFIDENCE INTERVAL FOR F0 = (0.0 ; 0.054)
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.013
90 PERCENT CONFIDENCE INTERVAL FOR RMSEA = (0.0 ; 0.032)
P-VALUE FOR TEST OF CLOSE FIT (RMSEA < 0.05) = 1.00
EXPECTED CROSS-VALIDATION INDEX (ECVI) = 0.22
90 PERCENT CONFIDENCE INTERVAL FOR ECVI = (0.21 ; 0.26)
ECVI FOR SATURATED MODEL = 0.31
ECVI FOR INDEPENDENCE MODEL = 3.98
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 66 DEGREES OF FREEDOM = 1962.12
INDEPENDENCE AIC = 1986.12
MODEL AIC = 109.50
SATURATED AIC = 156.00
INDEPENDENCE CAIC = 2048.69
MODEL CAIC = 250.29
SATURATED CAIC = 562.74
ROOT MEAN SQUARE RESIDUAL (RMR) = 0.028
STANDARDIZED RMR = 0.028
GOODNESS OF FIT INDEX (GFI) = 0.98
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.97
PARSIMONY GOODNESS OF FIT INDEX (PGFI) = 0.64
NORMED FIT INDEX (NFI) = 0.97
NON-NORMED FIT INDEX (NNFI) = 1.00
PARSIMONY NORMED FIT INDEX (PNFI) = 0.75
COMPARATIVE FIT INDEX (CFI) = 1.00
INCREMENTAL FIT INDEX (IFI) = 1.00
RELATIVE FIT INDEX (RFI) = 0.96
CRITICAL N (CN) = 696.82

Hypothesized Model Fit Statistics
CHI-SQUARE WITH 51 DEGREES OF FREEDOM = 55.50 (P = 0.31)
(This test models the variances and covariances as implied by the parameter expectations)
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 66 DEGREES OF FREEDOM = 1962.12
(This test only models the variances of the variables and assumes all covariances are 0)
GOODNESS OF FIT INDEX (GFI) = 0.98
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.97

CORRELATION MATRIX TO BE ANALYZED

       V1     V2     V3     V4     M1     M2
V1   1.00
V2   0.52   1.00
V3   0.52   0.48   1.00
V4   0.54   0.54   0.49   1.00
M1   0.16   0.22   0.19   0.23   1.00
M2   0.22   0.28   0.23   0.23   0.48   1.00
M3   0.19   0.21   0.13   0.17   0.47   0.46
M4   0.22   0.23   0.23   0.17   0.48   0.49
R1   0.23   0.25   0.29   0.23   0.14   0.22
R2   0.22   0.17   0.21   0.17   0.17   0.23
R3   0.28   0.22   0.26   0.22   0.18   0.23
R4   0.27   0.25   0.24   0.26   0.21   0.23

       M3     M4     R1     R2     R3     R4
M3   1.00
M4   0.50   1.00
R1   0.15   0.28   1.00
R2   0.11   0.19   0.47   1.00
R3   0.17   0.25   0.50   0.51   1.00
R4   0.15   0.23   0.51   0.52   0.52   1.00

FITTED COVARIANCE MATRIX

       V1     V2     V3     V4     M1     M2
V1   1.00
V2   0.53   1.00
V3   0.51   0.49   1.00
V4   0.54   0.52   0.50   1.00
M1   0.21   0.20   0.20   0.21   1.00
M2   0.21   0.21   0.20   0.21   0.48   1.00
M3   0.21   0.20   0.19   0.20   0.46   0.47
M4   0.22   0.21   0.21   0.22   0.49   0.50
R1   0.24   0.23   0.22   0.23   0.19   0.20
R2   0.23   0.23   0.22   0.23   0.19   0.19
R3   0.25   0.24   0.23   0.24   0.20   0.20
R4   0.25   0.24   0.23   0.25   0.20   0.21

       M3     M4     R1     R2     R3     R4
M3   1.00
M4   0.48   1.00
R1   0.19   0.20   1.00
R2   0.19   0.20   0.48   1.00
R3   0.20   0.21   0.50   0.50   1.00
R4   0.20   0.21   0.51   0.51   0.53   1.00

The fitted covariance matrix takes into account the scale of measurement.

FITTED RESIDUALS

        V1      V2      V3      V4      M1      M2
V1    0.00
V2   -0.01    0.00
V3    0.01   -0.01    0.00
V4    0.00    0.02   -0.01    0.00
M1   -0.05    0.01   -0.01    0.02    0.00
M2    0.01    0.07    0.03    0.02    0.00    0.00
M3   -0.02    0.01   -0.06   -0.04    0.01   -0.01
M4    0.00    0.01    0.02   -0.04   -0.01   -0.01
R1   -0.01    0.02    0.07   -0.01   -0.05    0.02
R2   -0.01   -0.05   -0.01   -0.06   -0.03    0.04
R3    0.03   -0.02    0.03   -0.03   -0.02    0.03
R4    0.02    0.00    0.01    0.02    0.01    0.03

        M3      M4      R1      R2      R3      R4
M3    0.00
M4    0.01    0.00
R1   -0.04    0.08    0.00
R2   -0.08   -0.01   -0.01    0.00
R3   -0.02    0.04    0.00    0.01    0.00
R4   -0.05    0.01    0.00    0.01   -0.01    0.00

Residuals should be small if fit is good. Fitted residuals depend on the unit of measurement.

STANDARDIZED RESIDUALS

        V1      V2      V3      V4      M1      M2
V1    0.00
V2   -0.87    0.00
V3    0.77   -0.64    0.00
V4    0.17    1.20   -0.66    0.00
M1   -1.43    0.45   -0.21    0.63    0.00
M2    0.16    2.27    0.79    0.66    0.19    0.00
M3   -0.51    0.37   -1.84   -1.12    0.81   -0.40
M4    0.05    0.38    0.63   -1.40   -0.49   -1.08
R1   -0.30    0.59    2.23   -0.24   -1.44    0.68
R2   -0.35   -1.68   -0.27   -1.83   -0.76    1.11
R3    1.02   -0.62    0.92   -0.92   -0.59    0.81
R4    0.60    0.12    0.21    0.57    0.36    0.83

        M3      M4      R1      R2      R3      R4
M3    0.00
M4    1.00    0.00
R1   -1.10    2.45    0.00
R2   -2.38   -0.34   -0.32    0.00
R3   -0.70    1.15   -0.03    0.74    0.00
R4   -1.44    0.45   -0.21    0.74   -0.91    0.00

A standardized residual is a residual divided by its estimated standard error – adjusted for the scale of measurement.

The chosen model fits the data quite well. How would other models do? A complete confirmatory analysis would not only test the preferred model but also examine alternative models to assess how easily they could account for the data. To the extent that reasonable alternatives exist, the preferred model must be considered with more caution.

Alternative Measurement Model 1

[Path diagram: the Verbal, Math, and Analytic factors with their twelve indicators and error terms d1-d12.]

Parameter Estimates

Alternative Model 1 Fit Indices
CHI-SQUARE WITH 54 DEGREES OF FREEDOM = 214.02 (P = 0.0)
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 66 DEGREES OF FREEDOM = 1962.12
GOODNESS OF FIT INDEX (GFI) = 0.93
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.90

Alternative Measurement Model 2

[Path diagram: a single factor F1 with all twelve indicators and error terms d1-d12.]

Parameter Estimates

Alternative Model 2 Fit Indices
CHI-SQUARE WITH 54 DEGREES OF FREEDOM = 770.57 (P = 0.0)
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 66 DEGREES OF FREEDOM = 1962.12
GOODNESS OF FIT INDEX (GFI) = 0.73
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.61

Summary Fit Statistics

Model    Chi2      df    GFI    AGFI
Hyp      55.50     51    .98    .97
Alt 1    214.02    54    .93    .90
Alt 2    770.57    54    .73    .61

Note: Our hypothesized measurement model is the best fit!

Example Structural Model Testing

Hypothesized Model

[Path diagram: latent variables Home (Family Income, Father Education, Mother Education), Ability (Verbal Ability, Quant. Ability), Aspire (Ed. Aspirations, Occ. Aspirations), and Achieve (Verbal Achieve, Quant. Achieve), with error terms e on the indicators.]

Syntax to Specify Hypothesized Structural Model

RELATIONSHIPS
faminc = 1*home
faed = home
moed = home
verbab = 1*ability
quantab = ability
edasp = 1*aspire
ocasp = aspire
verach = 1*achieve
quantach = achieve
home = ability
aspire = home ability
achieve = home ability

This is just the second part of the syntax; it does not show the earlier lines that read in the data. Note that 1* denotes that the factor loading has been fixed to one – this sets the latent variable to the scale of measurement used by that manifest variable.

COVARIANCE MATRIX TO BE ANALYZED

           edasp   ocasp   verach  quantach faminc  faed
edasp      1.02
ocasp      0.79    1.08
verach     1.03    0.92    1.84
quantach   0.76    0.70    1.24    1.29
faminc     0.57    0.54    0.88    0.63    0.85
faed       0.44    0.42    0.68    0.53    0.52    0.67
moed       0.43    0.39    0.64    0.50    0.48    0.55
verbab     0.58    0.56    0.89    0.72    0.55    0.42
quantab    0.49    0.50    0.89    0.65    0.51    0.39

           moed    verbab  quantab
moed       0.72
verbab     0.37    0.85
quantab    0.34    0.63    0.87

GOODNESS OF FIT STATISTICS
CHI-SQUARE WITH 21 DEGREES OF FREEDOM = 57.17 (P = 0.000034)
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.093
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 36 DEGREES OF FREEDOM = 1407.10
ROOT MEAN SQUARE RESIDUAL (RMR) = 0.047
STANDARDIZED RMR = 0.048
GOODNESS OF FIT INDEX (GFI) = 0.94
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.87

Parameter estimates

T tests

Parameter estimates

STANDARDIZED SOLUTION

LAMBDA-Y
           aspire   achieve
edasp      0.93     - -
ocasp      0.85     - -
verach     - -      1.28
quantach   - -      0.97

LAMBDA-X
           home     ability
faminc     0.73     - -
faed       0.73     - -
moed       0.70     - -
verbab     - -      0.81
quantab    - -      0.77

Lambda-Y: loadings of the endogenous indicators; tells how you get from the manifest Ys to the latent Ys.
Lambda-X: loadings of the exogenous indicators; tells how you get from the manifest Xs to the latent Xs.

STANDARDIZED SOLUTION

BETA
           aspire   achieve
aspire     - -      - -
achieve    0.40     - -

GAMMA
           home     ability
aspire     0.32     0.52
achieve    0.14     0.48

CORRELATION MATRIX OF ETA AND KSI
           aspire   achieve  home    ability
aspire     1.00
achieve    0.85     1.00
home       0.70     0.76     1.00
ability    0.75     0.88     0.73    1.00

PSI
0.39   0.14

Beta (B): relationships of endogenous constructs to endogenous constructs; how DVs cause each other.
Gamma (Γ): relationships of exogenous constructs to endogenous constructs; how IVs cause DVs.
Phi (Φ): correlations among latent exogenous constructs (ksis); correlations among the IVs. Here the correlation matrix of the etas and ksis replaces the phi matrix.
Psi (Ψ): residuals from prediction of latent endogenous constructs; tells whether the residuals of prediction are correlated.
Ksi: exogenous construct/latent variable. Eta: endogenous construct/latent variable.

FITTED COVARIANCE MATRIX

           edasp   ocasp   verach  quantach faminc  faed
edasp      1.02
ocasp      0.79    1.08
verach     1.01    0.93    1.84
quantach   0.77    0.71    1.24    1.29
faminc     0.47    0.43    0.71    0.54    0.85
faed       0.48    0.44    0.72    0.54    0.54    0.67
moed       0.46    0.42    0.69    0.52    0.51    0.52
verbab     0.57    0.52    0.91    0.69    0.43    0.43
quantab    0.54    0.49    0.87    0.66    0.41    0.41

           moed    verbab  quantab
moed       0.72
verbab     0.42    0.85
quantab    0.39    0.63    0.87

Covariance matrix on the indicator variables.

FITTED RESIDUALS

           edasp   ocasp   verach  quantach faminc  faed
edasp      0.00
ocasp      0.00    0.00
verach     0.01   -0.01    0.00
quantach  -0.01   -0.01    0.00    0.00
faminc     0.09    0.10    0.16    0.09    0.00
faed      -0.03   -0.01   -0.04   -0.02   -0.02    0.00
moed      -0.02   -0.03   -0.05   -0.02   -0.04    0.03
verbab     0.01    0.04   -0.02    0.02    0.11   -0.01
quantab   -0.05    0.00    0.02   -0.01    0.10   -0.02

           moed    verbab  quantab
moed       0.00
verbab    -0.04    0.00
quantab   -0.06    0.00    0.00

SUMMARY STATISTICS FOR FITTED RESIDUALS
SMALLEST FITTED RESIDUAL = -0.06
MEDIAN FITTED RESIDUAL = 0.00
LARGEST FITTED RESIDUAL = 0.16

STANDARDIZED RESIDUALS

           edasp   ocasp   verach  quantach faminc  faed
edasp      0.00
ocasp      0.00    0.00
verach     1.42   -0.80    0.00
quantach  -0.78   -0.36    0.00    0.00
faminc     3.54    3.11    5.35    2.80    0.00
faed      -2.25   -0.58   -2.63   -0.86   -2.81    0.00
moed      -1.03   -1.03   -2.15   -0.84   -3.24    6.34
verbab     0.88    1.96   -2.28    1.31    4.59   -0.90
quantab   -2.56    0.18    1.82   -0.57    3.47   -1.29

           moed    verbab  quantab
moed       0.00
verbab    -2.14    0.00
quantab   -2.37    0.00    0.00

SUMMARY STATISTICS FOR STANDARDIZED RESIDUALS
SMALLEST STANDARDIZED RESIDUAL = -3.24
MEDIAN STANDARDIZED RESIDUAL = 0.00
LARGEST STANDARDIZED RESIDUAL = 6.34

Note the highest residual, between faed and moed – if you looked at the modification indices you would also see a large MI there.

QPLOT OF STANDARDIZED RESIDUALS

[Q-plot of the standardized residuals against normal quantiles.]

Alternative Model 1

[Path diagram: the same Home, Ability, Aspire, and Achieve constructs and indicators (Family Income, Father Education, Mother Education, Verbal Ability, Quant. Ability, Ed. Aspirations, Occ. Aspirations, Verbal Achieve, Quant. Achieve) with error terms e.]

Syntax to Specify Alternative Structural Model

RELATIONSHIPS
faminc = 1*home
faed = home
moed = home
verbab = 1*ability
quantab = ability
edasp = 1*aspire
ocasp = aspire
verach = 1*achieve
quantach = achieve
home = ability
aspire = home ability
achieve = home ability
Let the errors for faed and moed correlate

CHI-SQUARE WITH 20 DEGREES OF FREEDOM = 19.17 (P = 0.51)
ROOT MEAN SQUARE ERROR OF APPROXIMATION (RMSEA) = 0.0
90 PERCENT CONFIDENCE INTERVAL FOR RMSEA = (0.0 ; 0.058)
CHI-SQUARE FOR INDEPENDENCE MODEL WITH 36 DEGREES OF FREEDOM = 1407.10
ROOT MEAN SQUARE RESIDUAL (RMR) = 0.015
STANDARDIZED RMR = 0.015
GOODNESS OF FIT INDEX (GFI) = 0.98
ADJUSTED GOODNESS OF FIT INDEX (AGFI) = 0.95

STANDARDIZED SOLUTION

LAMBDA-Y
           aspire   achieve
edasp      0.93     - -
ocasp      0.85     - -
verach     - -      1.29
quantach   - -      0.97

LAMBDA-X
           home     ability
faminc     0.81     - -
faed       0.64     - -
moed       0.59     - -
verbab     - -      0.81
quantab    - -      0.77

BETA
           aspire   achieve
aspire     - -      - -
achieve    0.38     - -

GAMMA
           home     ability
aspire     0.44     0.39
achieve    0.19     0.43

CORRELATION MATRIX OF ETA AND KSI
           aspire   achieve  home    ability
aspire     1.00
achieve    0.85     1.00
home       0.76     0.83     1.00
ability    0.75     0.87     0.81    1.00

PSI
0.37   0.14

COVARIANCE MATRIX TO BE ANALYZED

           edasp   ocasp   verach  quantach faminc  faed
edasp      1.02
ocasp      0.79    1.08
verach     1.03    0.92    1.84
quantach   0.76    0.70    1.24    1.29
faminc     0.57    0.54    0.88    0.63    0.85
faed       0.44    0.42    0.68    0.53    0.52    0.67
moed       0.43    0.39    0.64    0.50    0.48    0.55
verbab     0.58    0.56    0.89    0.72    0.55    0.42
quantab    0.49    0.50    0.89    0.65    0.51    0.39

           moed    verbab  quantab
moed       0.72
verbab     0.37    0.85
quantab    0.34    0.63    0.87

FITTED COVARIANCE MATRIX

           edasp   ocasp   verach  quantach faminc  faed
edasp      1.02
ocasp      0.79    1.08
verach     1.02    0.93    1.84
quantach   0.77    0.70    1.24    1.29
faminc     0.57    0.53    0.87    0.66    0.85
faed       0.45    0.41    0.68    0.51    0.52    0.67
moed       0.41    0.38    0.63    0.47    0.48    0.54
verbab     0.57    0.52    0.91    0.69    0.54    0.42
quantab    0.54    0.49    0.87    0.65    0.51    0.40

           moed    verbab  quantab
moed       0.72
verbab     0.39    0.85
quantab    0.37    0.63    0.87

FITTED RESIDUALS

           edasp   ocasp   verach  quantach faminc  faed
edasp      0.00
ocasp      0.00    0.00
verach     0.01   -0.01    0.00
quantach  -0.01   -0.01    0.00    0.00
faminc    -0.01    0.01    0.01   -0.02    0.00
faed       0.00    0.01    0.00    0.01    0.00    0.00
moed       0.02    0.01    0.01    0.03    0.00    0.00
verbab     0.01    0.04   -0.02    0.03    0.01    0.00
quantab   -0.05    0.00    0.02   -0.01    0.00   -0.01

           moed    verbab  quantab
moed       0.00
verbab    -0.01    0.00
quantab   -0.03    0.00    0.00

STANDARDIZED RESIDUALS

           edasp   ocasp   verach  quantach faminc  faed
edasp      0.00
ocasp      0.00    0.00
verach     1.26   -1.01    0.00
quantach  -0.52   -0.23    0.00    0.00
faminc    -0.64    0.45    0.55   -1.17    0.00
faed      -0.25    0.45   -0.23    0.58    0.15    0.00
moed       0.82    0.30    0.36    0.91   -0.15    0.00
verbab     0.88    1.93   -2.34    1.50    0.72    0.10
quantab   -2.53    0.16    1.59   -0.38   -0.13   -0.50

           moed    verbab  quantab
moed       0.00
verbab    -0.63    0.00
quantab   -1.10    0.00    0.00

SUMMARY STATISTICS FOR STANDARDIZED RESIDUALS
SMALLEST STANDARDIZED RESIDUAL = -2.53
MEDIAN STANDARDIZED RESIDUAL = 0.00
LARGEST STANDARDIZED RESIDUAL = 1.93

QPLOT OF STANDARDIZED RESIDUALS

[Q-plot of the standardized residuals against normal quantiles for the alternative model.]

Original Model vs. Modified Model

Original Model
BETA
           aspire   achieve
aspire     - -      - -
achieve    0.40     - -
GAMMA
           home     ability
aspire     0.32     0.52
achieve    0.14     0.48

Modified Model
BETA
           aspire   achieve
aspire     - -      - -
achieve    0.38     - -
GAMMA
           home     ability
aspire     0.44     0.39
achieve    0.19     0.43

           Original Model      Modified Model
Chi2       57.17 (df = 21)     19.17 (df = 20)
RMSEA      0.093               0.0
RMR        0.047               0.015
SRMR       0.048               0.015
GFI        0.94                0.98
AGFI       0.87                0.95

Chi2 difference (1) = 38.00
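A quick Python check of the χ2 difference reported above (scipy assumed available).

from scipy.stats import chi2

# Chi-square difference from the summary table:
# original model (57.17, df = 21) vs. modified model (19.17, df = 20).
d_chi = 57.17 - 19.17      # 38.00
d_df = 21 - 20             # 1
print(d_chi, d_df, chi2.sf(d_chi, d_df))
# p is far below .05, so freeing the faed/moed error covariance
# significantly improves fit.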