Causal Relationships with measurement error in the data

Presentation transcript:

Causal Relationships with Measurement Error in the Data. A brief introduction by Willem E. Saris

Basic concepts: direct effect, indirect effect, spurious relation, joint effect. [Path diagrams over the variables x, y, z and w illustrating each concept; the diagrams are not preserved in the transcript.]
30/6/19 lecture title and number

An example of a model. How can these effects be estimated?

Decomposition rule. The correlation between two variables is equal to the sum of
- the direct effect,
- the indirect effects,
- the spurious relationships and
- the joint effects
between these variables.

Expression for the different components. The indirect effects, spurious relations and joint effects are each equal to the product of the coefficients along the path going from one variable to the other, where a path cannot pass through the same variable twice and cannot go against the direction of the arrows.
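As a minimal sketch of the decomposition rule, consider a hypothetical three-variable model that is not taken from the slides: x affects y directly, x affects z, and z affects y. The coefficient names `a`, `b`, `c` and their values are assumptions chosen only for illustration, and all variables are assumed to be standardized.

```python
# Hypothetical standardized path coefficients (illustrative values only).
a = 0.3  # direct effect x -> y
b = 0.5  # effect x -> z
c = 0.4  # effect z -> y

# Correlation between x and z: only a direct effect.
r_xz = b

# Correlation between x and y: direct effect plus indirect effect via z.
direct_xy = a
indirect_xy = b * c          # product of coefficients along x -> z -> y
r_xy = direct_xy + indirect_xy

# Correlation between z and y: direct effect plus a spurious relation,
# because x is a common cause of both z and y.
direct_zy = c
spurious_zy = b * a          # product of coefficients along x -> z and x -> y
r_zy = direct_zy + spurious_zy

print("r(x,z) =", round(r_xz, 3))
print("r(x,y) =", round(r_xy, 3))
print("r(z,y) =", round(r_zy, 3))
```

Each correlation is a sum of path products, which is why the correlations can later be turned around and used as equations for the unknown coefficients.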

Derivations. These derivations can also be used to estimate the parameters of this model. How?

A second example

A Structural Equations Model

Derivations

The Proof

The correlations between the variables. The effects are equal to the correlations with $x_1$.

What if $x_1$ is not observed? Can we still estimate the effects?

What happens if we have 4 observed variables? With extra information.

Identification. Of these three equations we need only one to determine the value of $b_{41}$, once we have solved $b_{11}$ and the other coefficients from the first three correlation coefficients. This model is called overidentified, with degrees of freedom df = 2, where df = # correlations - # parameters to be estimated. With four observed variables there are 6 correlations and 4 coefficients, so df = 6 - 4 = 2.

A test is possible. If we know that $b_{11} = .7$ and that $r(y_1 y_4) = b_{11} b_{41} = .35$, it follows that $b_{41} = .5$. Now we know all coefficients, and two correlations have not been used yet and can be used to test the model. Their model-implied values are
$\hat{r}(y_2 y_4) = b_{21} b_{41}$
$\hat{r}(y_3 y_4) = b_{31} b_{41}$
and the differences between the observed and the implied correlations are
$r(y_2 y_4) - \hat{r}(y_2 y_4) = r(y_2 y_4) - b_{21} b_{41} = .3 - .6 \times .5 = .0$
$r(y_3 y_4) - \hat{r}(y_3 y_4) = r(y_3 y_4) - b_{31} b_{41} = .5 - .8 \times .5 = .1$
These differences are called residuals. If these residuals are large, the model must be wrong.
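The calculation above can be sketched in a few lines. The coefficients .7, .6, .8 and the correlations with y4 come from the slide; the first three correlations (.42, .56, .48) are assumed here to be the products implied by those coefficients. The variable names are illustrative.

```python
import math

# Correlations among y1, y2, y3 (assumed: products of the slide's coefficients).
r12, r13, r23 = 0.42, 0.56, 0.48
# Correlations with y4 as used on the slide.
r14, r24, r34 = 0.35, 0.30, 0.50

# One-factor model: r_ij = b_i1 * b_j1, so r12 * r13 / r23 = b11 squared.
b11 = math.sqrt(r12 * r13 / r23)
b21 = r12 / b11
b31 = r13 / b11
b41 = r14 / b11   # uses only one of the three equations involving y4

# The two unused correlations give residuals: observed minus model-implied.
res24 = r24 - b21 * b41
res34 = r34 - b31 * b41

print("b11, b21, b31, b41 =", [round(v, 3) for v in (b11, b21, b31, b41)])
print("residual r(y2y4) =", round(res24, 3))
print("residual r(y3y4) =", round(res34, 3))
```

The nonzero residual for r(y3y4) is the signal that the one-factor model does not reproduce these correlations exactly.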

Identification again.
- With 3 observed variables df = 0 and no test is possible.
- With 2 observed variables df = -1; no test is possible and even the effects cannot be estimated.
If df < 0 the model is not identified.
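The counting rule df = # correlations - # parameters can be sketched for the one-factor model used here, assuming one coefficient per observed variable. The function name is hypothetical.

```python
def df_one_factor(n_observed):
    """Degrees of freedom of a one-factor model fitted to correlations.

    Assumes one coefficient (loading) per observed variable, as in the
    example on the slides.
    """
    n_correlations = n_observed * (n_observed - 1) // 2  # unique pairs
    n_parameters = n_observed
    return n_correlations - n_parameters

for k in (4, 3, 2):
    print(k, "observed variables: df =", df_one_factor(k))
```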

Estimation. The decomposition rules hold only for the population correlations, not for the sample correlations. Normally, however, we know only the sample correlations. It is easily shown that the solution then differs depending on which equations are used, so an efficient estimation procedure is needed.
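A small sketch of why the solution depends on the equations used: treating the correlations .35, .3 and .5 with y4 from the test example as sample correlations, and taking the coefficients .7, .6 and .8 as already solved, each of the three equations yields its own value for b41. The variable names are illustrative.

```python
# Coefficients assumed to be solved already from the first three correlations.
b11, b21, b31 = 0.7, 0.6, 0.8
# Correlations with y4, treated here as sample correlations.
r14, r24, r34 = 0.35, 0.30, 0.50

# Each equation r(y_i y4) = b_i1 * b41 gives its own estimate of b41.
b41_from_r14 = r14 / b11
b41_from_r24 = r24 / b21
b41_from_r34 = r34 / b31

print("b41 from r(y1y4):", round(b41_from_r14, 3))
print("b41 from r(y2y4):", round(b41_from_r24, 3))
print("b41 from r(y3y4):", round(b41_from_r34, 3))
```

When the estimates disagree, no single equation is obviously the right one to use, which motivates procedures that use all correlations at once.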

Estimation. There are several general principles. We will discuss:
- the Unweighted Least Squares (ULS) procedure
- the Weighted Least Squares (WLS) procedure.
Both procedures are based on the residuals between the sample correlations and the expected values of the correlations.

Estimation. The expected correlations are a function of the parameters, $f_{ij}(p)$, where $p$ represents the set of parameters of the model and $f_{ij}$ is the specific function that links the population correlation between the variables $i$ and $j$ to the parameters.

ULS estimators. The ULS procedure looks for the parameter values that minimize the unweighted sum of squared residuals:
$F_{ULS} = \sum (r_{ij} - f_{ij}(p))^2$
where the summation is over all unique elements of the correlation matrix.

Estimation in this specific case. The program looks for the values of all the parameters that minimize the function $F_{ULS}$.
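To make the minimization concrete, here is a minimal sketch that fits the one-factor model by plain gradient descent on the unweighted sum of squared residuals. This is not the algorithm LISREL uses; the function names, the starting value, the step size and the number of iterations are all assumptions for illustration. The correlations are those implied by the coefficients .7, .6, .8 and .5, and only the off-diagonal correlations are fitted.

```python
# Unique off-diagonal correlations, keyed by (i, j) with i < j (0-based).
R = {(0, 1): 0.42, (0, 2): 0.56, (1, 2): 0.48,
     (0, 3): 0.35, (1, 3): 0.30, (2, 3): 0.40}

def f_uls(b, R):
    """Unweighted sum of squared residuals; the model implies r_ij = b_i * b_j."""
    return sum((r - b[i] * b[j]) ** 2 for (i, j), r in R.items())

def gradient(b, R):
    """Partial derivatives of f_uls with respect to each coefficient."""
    g = [0.0] * len(b)
    for (i, j), r in R.items():
        resid = r - b[i] * b[j]
        g[i] -= 2.0 * resid * b[j]
        g[j] -= 2.0 * resid * b[i]
    return g

def fit_uls(R, n_vars, start=0.5, step=0.05, iters=20000):
    """Gradient descent from a common starting value (illustrative settings)."""
    b = [start] * n_vars
    for _ in range(iters):
        g = gradient(b, R)
        b = [bk - step * gk for bk, gk in zip(b, g)]
    return b

loadings = fit_uls(R, 4)
print("estimates:", [round(v, 3) for v in loadings])
print("F_ULS at the minimum:", round(f_uls(loadings, R), 6))
```

Because these correlations are exactly reproducible by a one-factor model, the fitting function reaches zero at the minimum; with sample correlations it would generally stay positive.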

WLS estimators. The WLS procedure looks for the parameter values that minimize the weighted sum of squared residuals:
$F_{WLS} = \sum w_{ij} (r_{ij} - f_{ij}(p))^2$
where the summation is also over all unique elements of the correlation matrix. The weights $w_{ij}$ can be chosen in different ways.
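The role of the weights can be sketched by evaluating the weighted fitting function at fixed coefficients. The function name and the weight values are hypothetical; the correlations and coefficients are those of the test example, where only r(y3y4) leaves a residual of .1. With all weights equal to 1 the function reduces to the ULS function.

```python
# Correlations from the test example, keyed by (i, j) with i < j (0-based).
R_test = {(0, 1): 0.42, (0, 2): 0.56, (1, 2): 0.48,
          (0, 3): 0.35, (1, 3): 0.30, (2, 3): 0.50}
b = [0.7, 0.6, 0.8, 0.5]

def f_wls(b, R, W):
    """Weighted sum of squared residuals for the one-factor model."""
    return sum(W[pair] * (r - b[pair[0]] * b[pair[1]]) ** 2
               for pair, r in R.items())

# Unit weights: identical to the unweighted (ULS) function.
unit_weights = {pair: 1.0 for pair in R_test}
f_unit = f_wls(b, R_test, unit_weights)

# Hypothetical weights that count the residual of r(y3y4) four times as heavily.
heavy_weights = dict(unit_weights)
heavy_weights[(2, 3)] = 4.0
f_heavy = f_wls(b, R_test, heavy_weights)

print("F with unit weights:", round(f_unit, 4))
print("F with a heavier weight on r(y3y4):", round(f_heavy, 4))
```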

ADF estimator. Using weights derived from the variances and covariances of the sample covariances, the Asymptotic Distribution Free (ADF) estimator is specified. For any distribution of the observed variables this estimator is consistent and provides standard errors and a test statistic. The problem is that it requires very large samples.

ML estimator. The most commonly used procedure, the Maximum Likelihood (ML) estimator, can be specified as a special case of the WLS estimator. The ML estimator provides standard errors for the parameters and a test statistic for the fit of the model with much smaller samples, but it was developed under the assumption that the observed variables have a multivariate normal distribution.

Standard procedure for testing SE models. Testing is essential for SE models. The test statistic $t$ used is the value of the fitting function at its minimum. If the model is correct, $t$ is $\chi^2(df)$ distributed. Normally the model is rejected if $t > C_\alpha$, where $C_\alpha$ is the value of the $\chi^2$ for which $\Pr(\chi^2_{df} > C_\alpha) = \alpha$. We come back to this issue later.

LISREL input
estimation and testing a factor model
data ni=4 no=400 ma=km
km
1.0
.42 1.0
.56 .48 1.0
.35 .30 .40 1.0
model ny=4 ne=1 ly=fu,fi te=di,fi ps=di,fi
free ly 1 1 ly 2 1 ly 3 1 ly 4 1
free te 1 1 te 2 2 te 3 3 te 4 4
value 1 ps 1 1
out ULS

LISREL estimates of the effects of the latent factor

LISREL estimates of the error variances

Goodness of fit test

LISREL input for a different correlation matrix
estimation and testing a factor model
data ni=4 no=400 ma=km
km
1.0
.42 1.0
.56 .48 1.0
.35 .50 .50 1.0
model ny=4 ne=1 ly=fu,fi te=di,fi ps=di,fi
free ly 1 1 ly 2 1 ly 3 1 ly 4 1
free te 1 1 te 2 2 te 3 3 te 4 4
value 1 ps 1 1
out ULS

Estimates of the effects of the latent variable
estimation and testing a factor model
Number of Iterations = 9
LISREL Estimates (Unweighted Least Squares)
LAMBDA-Y
         ETA 1
         --------
VAR 1    0.64  (0.05)  14.18
VAR 2    0.67  (0.04)  15.43
VAR 3    0.79          15.75
VAR 4    0.64          14.28

Goodness of fit test of the model on the new correlation matrix
Goodness of Fit Statistics
W_A_R_N_I_N_G: Chi-square, standard errors, t-values and standardized residuals are calculated under the assumption of multi-variate normality.
Degrees of Freedom = 2
Normal Theory Weighted Least Squares Chi-Square = 19.62 (P = 0.00)
Estimated Non-centrality Parameter (NCP) = 17.62
90 Percent Confidence Interval for NCP = (6.96 ; 35.72)
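The rejection rule can be checked by hand for this output. For 2 degrees of freedom the chi-square distribution has the closed-form tail probability exp(-x/2), so no statistics library is needed. The sketch below applies the rule to the reported value 19.62; the significance level .05 and the function names are assumptions for illustration.

```python
import math

def chi2_tail_df2(x):
    """P(chi-square with 2 df > x); for df = 2 this equals exp(-x / 2)."""
    return math.exp(-x / 2.0)

def chi2_critical_df2(alpha):
    """Critical value C_alpha for df = 2, obtained by solving exp(-C / 2) = alpha."""
    return -2.0 * math.log(alpha)

t = 19.62      # chi-square value reported in the output
alpha = 0.05   # assumed significance level

critical = chi2_critical_df2(alpha)
p_value = chi2_tail_df2(t)
reject = t > critical

print("critical value:", round(critical, 2))
print("p-value:", p_value)
print("model rejected:", reject)
```

The tail probability is far below .01, which is consistent with the P = 0.00 shown in the output after rounding.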

General approach.
- A model is specified with observed and latent variables.
- The correlations (covariances) between the observed variables can be expressed in the parameters of the model (decomposition rules).
- If the model is identified, the parameters can be estimated.
- A test of the model can be performed if df > 0.
- Possible misspecifications can be detected.
- Corrections to the model can be introduced.

Important result. The distinction between observed and latent variables makes the estimation of error variances possible. The errors in social science survey data can be quite large, and these errors will bias the estimates if they are not taken into account. So the SEM approach has important advantages.