Structural Equation Modeling (SEM) – Some Basic Concepts Kathryn Sharpe & Wei Zhu.

Presentation transcript:


2 SEM Basics
SEM is a statistical technique for testing and estimating causal relationships. It was first proposed in 1921 by the American geneticist Dr. Sewall Green Wright.

3 SEM Basics
- SEM without latent variables is called path analysis.
- SEM is a confirmatory analysis procedure, although it can sometimes also be used as an exploratory analysis tool.
- An SEM is a set of linear regression equations that are usually interrelated.

4 A simple example: Path Diagrams & Equations for Eating Disorder
Every variable with an incoming arrow leads to a regression equation. Directional arrows indicate cause and effect.
[Path diagram and regression equation system: figure not reproduced in the transcript.]
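The equation system itself did not survive in the transcript, but it can be read off the LINEQS statement in the SAS code on slide 7. Written out, and assuming the variable abbreviations listed on slide 8 (AM, BI, SW, DT, RD), it would look like this:

```latex
\begin{aligned}
\mathrm{BI} &= b_1\,\mathrm{AM} + b_2\,\mathrm{SW} + E_1 \\
\mathrm{SW} &= b_3\,\mathrm{AM} + b_4\,\mathrm{BI} + E_2 \\
\mathrm{DT} &= b_5\,\mathrm{BI} + b_6\,\mathrm{SW} + E_3 \\
\mathrm{RD} &= b_7\,\mathrm{DT} + E_4
\end{aligned}
```

Each of the four variables with an incoming arrow (BI, SW, DT, RD) gets one equation; AM has no incoming arrow and therefore appears only on the right-hand side.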

5 SEM Programs
The most popular software packages for SEM are:
- PROC CALIS (and PROC TCALIS) in SAS
- LISREL (Karl Gustav Jöreskog & Dag Sörbom)
- EQS (Peter Bentler, UCLA)
- AMOS
For this example, we will use PROC CALIS. It takes our linear equations (previous slide) and estimates the parameters of the model. Then it evaluates the goodness of fit of the model.

6 Dr. Karl Gustav Jöreskog (Sweden) & Dr. Peter M. Bentler (USA)
Widely viewed as leaders in SEM development in our time.

7 SAS Code

    /* SAS correlation procedure. COV requests the covariance matrix, NOCORR
       suppresses Pearson correlations, and OUTP= specifies the output dataset
       (typed as a covariance matrix). */
    proc corr cov nocorr data=eddata outp=edcova(type=cov);
    run;

    /* PROC CALIS: the COV option analyzes the covariance matrix rather than
       the correlation matrix. */
    proc calis cov mod data=edcova;
      /* Give the linear equations describing the system. */
      Lineqs
        bi = b1 am + b2 sw + E1,
        sw = b3 am + b4 bi + E2,
        dt = b5 bi + b6 sw + E3,
        rd = b7 dt + E4;
      /* Give the variances of the exogenous variables (here, the error variances). */
      Std E1-E4 = The1-The4;
      /* Covariance between the error terms of BI and SW. */
      Cov E1 E2 = Ps1;
    run;

We must set each error variance equal to a parameter; otherwise it is assumed to be 0. BI and SW are correlated, so we must also estimate the covariance of their error terms.

8 Results of SAS Analysis
SAS gives us parameter estimates, error estimates, and t-values for each path included in the model. We use a t-test to determine which paths are significant. In addition, we can calculate the confidence intervals.
Variables: Age of Menstruation (AM), Body Image (BI), Adolescent Self Worth (SW), Drive for Thinness (DT), Risk for Disorder (RD).
[Path diagram with estimates and confidence intervals. Values recoverable from the figure, path labels not preserved: (-4.02, .04); (-.77, .52); (-1.33, .79); .8292 (.72, .94); (-.27, .06); (-2.02, 1.92); .3341 (.04, .63).]
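As a rough illustration of how a t-value and a confidence interval follow from an estimate and its standard error, here is a minimal sketch. The function name `wald_summary`, the normal critical value 1.96, and the example numbers are assumptions made for illustration; they are not taken from the SAS output on this slide.

```python
def wald_summary(estimate, std_error, critical=1.96):
    """Return the t-value and an approximate 95% interval for one path.

    Assumes a large-sample normal approximation (critical value 1.96).
    """
    t_value = estimate / std_error
    half_width = critical * std_error
    return {
        "t": t_value,
        "lower": estimate - half_width,
        "upper": estimate + half_width,
        # A path is flagged significant when |t| exceeds the critical value,
        # which is the same as the interval excluding zero.
        "significant": abs(t_value) > critical,
    }


# Hypothetical path estimate and standard error, for illustration only.
summary = wald_summary(0.5, 0.1)
print(summary)
```

Under this approximation, a path whose interval contains zero is exactly a path whose t-value falls below the critical value, which matches the pattern of intervals listed on the slide.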

9 Goodness of Fit
After reporting the parameter estimates, SAS reports many different measures of fit, so we can evaluate the model in whichever way we choose. The more measures we use to evaluate our model, the better. A good fit does not necessarily mean a perfect model: we can still have unnecessary variables or be missing important ones.
By convention, a model is "good" if:
- GFI > .90 (or .95 under a stricter cutoff)
- the chi-square value is small and its p-value is large
- the RMSEA estimate is close to zero
[SAS output: not reproduced in the transcript.]
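To make the RMSEA criterion concrete, the sketch below computes it from the model chi-square, its degrees of freedom, and the sample size, using one common textbook definition. This is an assumption: some software divides by N rather than N - 1, and the slides do not say which convention PROC CALIS uses. The function name and the example numbers are hypothetical.

```python
import math


def rmsea(chi_square, df, n):
    """RMSEA under the common definition sqrt(max(chi2 - df, 0) / (df * (n - 1)))."""
    excess = max(chi_square - df, 0.0)  # misfit beyond what df alone would predict
    return math.sqrt(excess / (df * (n - 1)))


# Hypothetical fit statistics, for illustration only.
print(rmsea(chi_square=30.0, df=10, n=201))
print(rmsea(chi_square=8.0, df=10, n=201))
```

When the chi-square value does not exceed its degrees of freedom, the estimate is zero, which is why a small chi-square and an RMSEA close to zero tend to go together.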

10 Useful Websites
Google and Wikipedia do a good job of finding and summarizing many topics, including SEM. Type "structural equation modeling" into Google and you will see the SEM wiki page listed as the first item. The recommended sites towards the end of the SEM wiki page lead to further useful links, such as:
1. A good website for SEM lecture notes
2. LISREL
3. EQS
4. MPLUS
5. GLLAMM
6. SEM AFNI (brain functional pathway analysis)
7. SAS Proc TCALIS: HTML/default/viewer.htm#statug_tcalis_sect087.htm
8. The UCLA SAS Web

Part II: PCNA and Bootstrap Resampling 1. Partial Correlation Network Analysis

12 PCNA: Generating a Path Diagram
When there is no hypothesized diagram for an SEM analysis, we can generate a path diagram using partial correlation network analysis. In 2006, Marrelec discussed the concept of detecting an underlying connectivity network in data, and the methods for analysis. He noted the importance of detecting such a network without the hypothesized relationships that SEM requires. In 2007, Marrelec et al. published a work praising the use of Partial Correlation Network Analysis (PCNA) in conjunction with SEM.
Partial correlation analysis is a technique that allows us to investigate the relationship between two variables free of influence from other variables. Consider two variables, X and Y. We want to know the correlation of X and Y while controlling for Z. The most intuitive way to understand partial correlation is to consider two regressions: regress X on Z and Y on Z, then correlate the two sets of residuals.
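The two-regression view can be sketched in a few lines of plain Python. The helper names and the small data set are hypothetical, chosen only for illustration. The sketch also checks the result against the standard closed-form expression for a first-order partial correlation, (r_XY - r_XZ r_YZ) / sqrt((1 - r_XZ^2)(1 - r_YZ^2)).

```python
import math


def mean(values):
    return sum(values) / len(values)


def correlation(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def residuals(y, z):
    """Residuals from the simple least-squares regression of y on z."""
    my, mz = mean(y), mean(z)
    slope = (sum((a - mz) * (b - my) for a, b in zip(z, y))
             / sum((a - mz) ** 2 for a in z))
    intercept = my - slope * mz
    return [b - (intercept + slope * a) for a, b in zip(z, y)]


def partial_correlation(x, y, z):
    """Correlation of X and Y after removing the linear influence of Z."""
    return correlation(residuals(x, z), residuals(y, z))


def partial_correlation_formula(x, y, z):
    """Closed-form first-order partial correlation, used here as a cross-check."""
    rxy, rxz, ryz = correlation(x, y), correlation(x, z), correlation(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical data, for illustration only.
z = [1, 2, 3, 4, 5, 6, 7, 8]
x = [2, 1, 4, 3, 7, 5, 8, 6]
y = [1, 3, 2, 5, 4, 6, 8, 7]
print(partial_correlation(x, y, z), partial_correlation_formula(x, y, z))
```

The two functions agree up to rounding, which is the sense in which the residual-based definition and the formula describe the same quantity.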

13 Our PCNA Bootstrap Methodology
We have N variables, and we want to know which pairs have significant relationships when controlling for all other variables in the system. Additionally, we want to know which pairs' relationships are changed by, for example, the disease state of the measured tissue.
For each pair of variables, i and j, we regress the two variables individually on all other variables in the system and calculate the corresponding residuals. This creates two residual variables, one for i and one for j, representing the original variables free of the influence of all other variables in the system. Their correlation is the partial correlation of the variables. However, this is just one number, so we cannot incorporate the influence of covariates into the significance test of this value. This is why we use a bootstrapping procedure.
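In symbols, the residual construction can be written as follows. The notation is an assumption introduced here, since the symbols on the original slide were lost: x_i and x_j are the two variables of interest, e_i and e_j are their least-squares residuals, and Sigma is the covariance matrix of all N variables. The last line is the standard equivalent expression in terms of the inverse covariance matrix.

```latex
\begin{aligned}
x_i &= \alpha_i + \sum_{k \neq i,j} \beta_{ik}\, x_k + e_i, \\
x_j &= \alpha_j + \sum_{k \neq i,j} \beta_{jk}\, x_k + e_j, \\
\rho_{ij \cdot \mathrm{rest}} &= \operatorname{corr}(e_i, e_j)
  = -\frac{P_{ij}}{\sqrt{P_{ii}\, P_{jj}}},
  \qquad P = \Sigma^{-1}.
\end{aligned}
```

The inverse-covariance form yields all pairwise partial correlations from a single matrix inversion, which may be convenient when the calculation is repeated for every bootstrap resample.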

Part II: PCNA and Bootstrap Resampling 2. Bootstrap Resampling

15 Bootstrap Resampling
The idea behind bootstrapping is resampling with replacement. We have our original sample of m subjects (1, 2, 3, …, m). Select one of them at random, and then replace it before randomly selecting the next. Repeat this m times. Now we have a sample of m subjects drawn from the original sample. However, some subjects may be repeated, and some subjects from the original sample may not be present in our resample.
Use each resample to calculate the partial correlation. After n resamples we have a population of n measurements for each pair of variables. If we perform this analysis on our two datasets individually with n = 1000, we will have 1000 estimates of partial correlation for the normal tissue and 1000 estimates for the diseased tissue.
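The resampling step can be sketched as below. The function names, the seed, and the placeholder statistic (a simple mean) are assumptions for illustration; in the PCNA setting the statistic would be the partial correlation for one pair of variables.

```python
import random


def bootstrap_resample(subjects, rng):
    """Draw m subjects with replacement from a sample of m subjects."""
    m = len(subjects)
    return [subjects[rng.randrange(m)] for _ in range(m)]


def bootstrap_estimates(subjects, statistic, n_resamples=1000, seed=0):
    """Apply `statistic` to each of `n_resamples` bootstrap resamples."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [statistic(bootstrap_resample(subjects, rng))
            for _ in range(n_resamples)]


# Hypothetical sample of 10 subjects; the mean stands in for the
# partial correlation that would be computed in the actual analysis.
sample = list(range(1, 11))
estimates = bootstrap_estimates(sample, lambda s: sum(s) / len(s))
print(len(estimates), min(estimates), max(estimates))
```

Each resample has the same size as the original sample, and because subjects are replaced after each draw, a given subject can appear several times or not at all.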

16 Bootstrap Resampling
The results we must evaluate are two lists of partial correlations (those for the normal tissue, and those for the diseased tissue). We can create a difference variable to estimate the difference of the partial correlation between the normal tissue and diseased tissue.

    Normal   Diseased   Difference
    V1       W1         V1 - W1
    V2       W2         V2 - W2
    V3       W3         V3 - W3
    …        …          …

We will let the significance of the relationships in the normal dataset represent the general significance of partial correlation among variables in the system. The significance of the differences represents the influence of disease on the partial correlation between variables.
Sort the normal and difference variables. If 0 is contained in the middle 95% of the observations, then we would say the relationship or the influence of disease is not significant for this pair of variables. (This is called the percentile method.)
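A minimal sketch of the percentile method for one pair of variables follows. The function names are hypothetical, and the rule for picking the cut points (dropping the lowest and highest 2.5% of the sorted values) is one simple convention among several.

```python
def percentile_interval(estimates, alpha=0.05):
    """Return the bounds of the middle (1 - alpha) share of the sorted estimates."""
    ordered = sorted(estimates)
    n = len(ordered)
    k = int(n * alpha / 2)  # number of values trimmed from each tail
    return ordered[k], ordered[n - 1 - k]


def is_significant(estimates, alpha=0.05):
    """Significant when 0 lies outside the middle (1 - alpha) of the estimates."""
    low, high = percentile_interval(estimates, alpha)
    return not (low <= 0.0 <= high)


def disease_effect_is_significant(normal, diseased, alpha=0.05):
    """Apply the percentile method to the Normal - Diseased differences."""
    differences = [v - w for v, w in zip(normal, diseased)]
    return is_significant(differences, alpha)
```

With 1000 bootstrap estimates and alpha = 0.05, the interval runs from the 26th to the 975th sorted value. `is_significant` applied to the normal-tissue estimates addresses the relationship itself, while `disease_effect_is_significant` addresses the influence of disease.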

17 Results
The results of the PCNA bootstrap in the brain data example (four datasets; covariates: drug, group) are shown at the left. [Figure not reproduced in the transcript.]
No arrows! At this point, we would ask the collaborating researcher for input on the directionality of each path. For paths whose direction is not easily determined, we can implement one path in each direction. The result would be a hypothesized path diagram that can be verified using structural equation modeling with an independent data set.