Causal Relationships with measurement error in the data

Presentation on theme: "Causal Relationships with measurement error in the data"— Presentation transcript:

1 Causal Relationships with measurement error in the data
A brief introduction by Willem E. Saris

2 Basic concepts
Direct effect: x → y
Indirect effect: x → z → y
Spurious relation: x ← z → y
Joint effect: [diagram involving w and y not preserved]
30/6/19 lecture title and number

3 An example of a model
How can these effects be estimated?

4 Decomposition rule
The correlation between two variables is equal to the sum of
- the direct effect,
- indirect effects,
- spurious relationships and
- joint effects
between these variables.

5 Expression for the different components
The indirect effects, spurious relations and joint effects are each equal to the product of the coefficients along the path going from one variable to the other, where a path cannot pass through the same variable twice and cannot go against the direction of the arrows.
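The decomposition rule can be illustrated with a small numeric sketch. The three-variable model and the coefficient values below are hypothetical (they are not the model shown on the slides): x affects z with coefficient a, z affects y with coefficient b, and x affects y directly with coefficient c, all variables standardized.

```python
# Hypothetical standardized path model (illustration only, not the slide's model):
#   x -> z (coefficient a), z -> y (coefficient b), x -> y (coefficient c)
a, b, c = 0.5, 0.4, 0.3  # assumed values

# Correlation between x and y: direct effect plus indirect effect via z.
direct_xy = c
indirect_xy = a * b
r_xy = direct_xy + indirect_xy

# Correlation between z and y: direct effect plus a spurious component,
# because x is a common cause of z (via a) and of y (via c).
direct_zy = b
spurious_zy = a * c
r_zy = direct_zy + spurious_zy

# Correlation between x and z: only a direct effect.
r_xz = a

print(f"r(x,y) = {direct_xy} + {indirect_xy} = {r_xy}")
print(f"r(z,y) = {direct_zy} + {spurious_zy} = {r_zy}")
print(f"r(x,z) = {r_xz}")
```

Each component is a product of coefficients along one path, and the correlation is the sum of the components.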

6 Derivations
These derivations can also be used to estimate the parameters of this model. How?

7 A second example

8 A Structural Equations Model

9 Derivations

10 The Proof

11 The correlations between the variables
The effects are equal to the correlations with x1.

12 What if x1 is not observed ? Can we still estimate the effects ?

13 What happens if we have 4 observed variables ?
With extra info

14 Identification
Of these three equations, only one is needed to determine the value of b41 once b11 and the other coefficients have been solved from the first three correlation coefficients.
This model is called overidentified: the degrees of freedom are df = 2.
df = # correlations - # parameters to be estimated

15 A test is possible
If we know that b11 = .7 and that r(y1y4) = b11b41 = .35, it follows that b41 = .5.
Now we know all coefficients, and two correlations have not been used yet and can be used to test the model:
r(y2y4) = b21b41
r(y3y4) = b31b41
Observed minus expected correlations:
r(y2y4) - b21b41 = .3 - .6 x .5 = .0
r(y3y4) - b31b41 = [values not preserved] x .5 = .1
These differences are called residuals. If these residuals are big, the model must be wrong.
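The arithmetic for b41 and the first residual can be checked with a few lines. Only the values that survive in the transcript are used (b11 = .7, r(y1y4) = .35, b21 = .6, r(y2y4) = .3); the variable and function names are illustrative.

```python
def residual(observed_r, loading_i, loading_j):
    """Observed correlation minus the correlation expected under the model."""
    return observed_r - loading_i * loading_j

b11 = 0.7
r_y1y4 = 0.35
b41 = r_y1y4 / b11          # from r(y1y4) = b11 * b41

b21 = 0.6
r_y2y4 = 0.3
res_y2y4 = residual(r_y2y4, b21, b41)

print(f"b41 = {b41:.2f}")
print(f"residual for r(y2y4) = {res_y2y4:.2f}")
```

A residual close to zero means this correlation is reproduced by the model; a large residual would count as evidence against it.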

16 Identification again
With 3 observed variables df = 0 and no test is possible.
With 2 observed variables df = -1; no test is possible, and even the effects cannot be estimated.
If df < 0 the model is not identified.
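The df values quoted for 4, 3 and 2 observed variables follow from counting. The sketch below assumes a one-factor model with one loading per observed variable as the only parameters to be estimated; the function name is hypothetical.

```python
def degrees_of_freedom(n_observed: int) -> int:
    """df = number of correlations - number of parameters to be estimated.

    Assumes one latent factor and one loading per observed variable.
    """
    n_correlations = n_observed * (n_observed - 1) // 2  # unique off-diagonal elements
    n_parameters = n_observed                            # one loading each
    return n_correlations - n_parameters

for k in (4, 3, 2):
    df = degrees_of_freedom(k)
    if df > 0:
        status = "overidentified, test possible"
    elif df == 0:
        status = "just identified, no test possible"
    else:
        status = "not identified"
    print(f"{k} observed variables: df = {df} ({status})")
```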

17 Estimation
The decomposition rules only hold for the population correlations and not for the sample correlations.
But, normally, we know only the sample correlations.
It is easily shown that the solution differs depending on the equations used.
So an efficient estimation procedure is needed.
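To see why the solution depends on the equations used, consider a one-factor model with four indicators. The loading b11 can be solved from any of three triads of correlations. The sample correlations below are invented for illustration and the helper name is hypothetical.

```python
import math

# Hypothetical sample correlations between y1..y4 (not from the slides).
sample_r = {
    (1, 2): 0.40, (1, 3): 0.58, (1, 4): 0.33,
    (2, 3): 0.50, (2, 4): 0.31, (3, 4): 0.38,
}

def b1_from_triad(r, j, k):
    """Solve b11 from r(1,j) = b1*bj, r(1,k) = b1*bk, r(j,k) = bj*bk."""
    return math.sqrt(r[1, j] * r[1, k] / r[j, k])

# Three different sets of equations, three different answers.
estimates = [b1_from_triad(sample_r, j, k) for j, k in [(2, 3), (2, 4), (3, 4)]]
for (j, k), est in zip([(2, 3), (2, 4), (3, 4)], estimates):
    print(f"b11 from y1, y{j}, y{k}: {est:.3f}")
```

With population correlations the three values would coincide; with sample correlations they do not, so a procedure is needed that uses all correlations at once.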

18 Estimation
There are several general principles. We will discuss:
- the Unweighted Least Squares (ULS) procedure
- the Weighted Least Squares (WLS) procedure.
Both procedures are based on the residuals between the sample correlations and the expected values of the correlations.

19 Estimation
The expected correlations are a function of the parameters, $f_{ij}(p)$, where p represents the set of parameters of the model and $f_{ij}$ the specific function which gives the link between the population correlations and the parameters for the variables i and j.

20 ULS estimators
The ULS procedure looks for the parameter values that minimize the unweighted sum of squared residuals:
$F_{ULS} = \sum (r_{ij} - f_{ij}(p))^2$
where the summation is over all unique elements of the correlation matrix.

21 Estimation in this specific case
The program looks for the values of all the parameters that minimize the function $F_{ULS}$.
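A minimal sketch of such a search, assuming a one-factor model in which the expected correlation between variables i and j is the product of their loadings. Plain gradient descent is used here only as a simple stand-in; it is not the algorithm LISREL uses. The loadings that generate the correlations, the starting value, the step size and the function names are all assumptions.

```python
from itertools import combinations

def f_uls(r, b):
    """Unweighted sum of squared residuals over the unique correlations."""
    return sum((r_ij - b[i] * b[j]) ** 2 for (i, j), r_ij in r.items())

def fit_uls(r, n_vars, start=0.5, step=0.1, n_iter=5000):
    """Minimize F_ULS by gradient descent (illustrative, not LISREL's method)."""
    b = [start] * n_vars
    for _ in range(n_iter):
        grad = [0.0] * n_vars
        for (i, j), r_ij in r.items():
            resid = r_ij - b[i] * b[j]
            # Each residual contributes to the derivative of both loadings.
            grad[i] -= 2.0 * resid * b[j]
            grad[j] -= 2.0 * resid * b[i]
        b = [b_k - step * g_k for b_k, g_k in zip(b, grad)]
    return b

# Correlations generated from assumed loadings, so the answer is known.
true_b = [0.7, 0.6, 0.8, 0.5]
r = {(i, j): true_b[i] * true_b[j] for i, j in combinations(range(4), 2)}

b_hat = fit_uls(r, n_vars=4)
print([round(x, 3) for x in b_hat], f_uls(r, b_hat))
```

Because the input correlations fit the model exactly, the minimum of the function is zero; with sample correlations the minimum would be positive and the residuals would no longer all vanish.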

22 WLS estimators
The WLS procedure looks for the parameter values that minimize the weighted sum of squared residuals:
$F_{WLS} = \sum w_{ij} (r_{ij} - f_{ij}(p))^2$
where the summation is also over all unique elements of the correlation matrix.
These weights can be chosen in different ways.
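The weighted fitting function can be sketched as follows. The correlations and weights are made-up numbers chosen only to show the mechanics; setting every weight to 1 gives the unweighted (ULS) value.

```python
def f_wls(sample_r, expected_r, weights):
    """Weighted sum of squared residuals over the unique correlations."""
    return sum(
        weights[key] * (sample_r[key] - expected_r[key]) ** 2
        for key in sample_r
    )

# Hypothetical sample correlations, model-implied correlations and weights.
sample_r = {(1, 2): 0.40, (1, 3): 0.58, (2, 3): 0.50}
expected_r = {(1, 2): 0.42, (1, 3): 0.56, (2, 3): 0.48}
weights = {(1, 2): 2.0, (1, 3): 1.0, (2, 3): 0.5}
unit_weights = {key: 1.0 for key in sample_r}

value_wls = f_wls(sample_r, expected_r, weights)
value_uls = f_wls(sample_r, expected_r, unit_weights)  # ULS as a special case
print(value_wls, value_uls)
```

Larger weights make the corresponding residuals count more heavily in the fit, which is how different choices of weights lead to different estimators.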

23 ADF estimator
Using weights derived from the variances and covariances of the sample covariances, the Asymptotic Distribution Free (ADF) estimator is specified.
For any distribution of the observed variables this estimator is consistent and provides standard errors and a test statistic.
The problem is that it requires very large samples.

24 ML estimator
The most commonly used procedure, the Maximum Likelihood (ML) estimator, can be specified as a special case of the WLS estimator.
The ML estimator provides standard errors for the parameters and a test statistic for the fit of the model for much smaller samples, but it is developed under the assumption that the observed variables have a multivariate normal distribution.

25 Standard Procedure for testing S E Models
Testing is essential for S E Models.
The test statistic t used is the value of the fitting function at its minimum.
If the model is correct, t is $\chi^2(df)$ distributed.
Normally the model is rejected if $t > C_\alpha$, where $C_\alpha$ is the value of the $\chi^2$ for which $\Pr(\chi^2_{df} > C_\alpha) = \alpha$.
We come back to this issue later.
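For the special case df = 2, which is the df of the four-indicator factor model discussed earlier, the chi-square tail probability has a simple closed form, so the decision rule can be sketched without a statistics library. The test statistic value used below is hypothetical. For other df values a library routine such as `scipy.stats.chi2.sf` would be needed.

```python
import math

def chi2_tail_df2(t):
    """Pr(chi-square with 2 df > t); closed form valid only for df = 2."""
    return math.exp(-t / 2.0)

def chi2_critical_df2(alpha):
    """Critical value C_alpha with Pr(chi-square with 2 df > C_alpha) = alpha."""
    return -2.0 * math.log(alpha)

def reject_model(t, alpha=0.05):
    """Reject the model if the test statistic exceeds the critical value."""
    return t > chi2_critical_df2(alpha)

t_example = 7.5  # hypothetical test statistic
print(f"critical value at alpha = .05: {chi2_critical_df2(0.05):.3f}")
print(f"p-value for t = {t_example}: {chi2_tail_df2(t_example):.4f}")
print(f"reject: {reject_model(t_example)}")
```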

26 LISREL input
estimation and testing a factor model
data ni=4 no=400 ma=km
km
1.0
[remaining entries of the correlation matrix are not preserved in the transcript]
model ny=4 ne=1 ly=fu,fi te=di,fi ps=di,fi
free ly 1 1 ly 2 1 ly 3 1 ly 4 1
free te 1 1 te 2 2 te 3 3 te 4 4
value 1 ps 1 1
out ULS

27 LISREL estimates of the effects of the latent factor

28 LISREL estimates of the error variances

29 Goodness of fit test

30 LISREL input for different correlation matrix
estimation and testing a factor model
data ni=4 no=400 ma=km
km
1.0
[remaining entries of the correlation matrix are not preserved in the transcript]
model ny=4 ne=1 ly=fu,fi te=di,fi ps=di,fi
free ly 1 1 ly 2 1 ly 3 1 ly 4 1
free te 1 1 te 2 2 te 3 3 te 4 4
value 1 ps 1 1
out ULS

31 Estimates of the effects of the latent variable
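estimation and testing a factor model
Number of Iterations = 9
LISREL Estimates (Unweighted Least Squares)
LAMBDA-Y
ETA 1
VAR (0.05) 14.18
VAR (0.04) 15.43
VAR 15.75
VAR 14.28
[The parameter estimates, the variable numbers and two of the standard errors are not preserved in the transcript; the surviving values are standard errors in parentheses followed by t-values.]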

32 Goodness of fit test of the model on the new correlation matrix
Goodness of Fit Statistics
W_A_R_N_I_N_G: Chi-square, standard errors, t-values and standardized residuals are calculated under the assumption of multi-variate normality.
Degrees of Freedom = 2
Normal Theory Weighted Least Squares Chi-Square = (P = 0.00)
Estimated Non-centrality Parameter (NCP) = 17.62
90 Percent Confidence Interval for NCP = (6.96 ; 35.72)

33 General Approach
- A model is specified with observed and latent variables.
- Correlations (covariances) between the observed variables can be expressed in the parameters of the model (decomposition rules).
- If the model is identified, the parameters can be estimated.
- A test of the model can be performed if df > 0.
- Possible misspecifications can be detected.
- Corrections in the model can be introduced.

34 Important Result
The distinction between observed and latent variables makes the estimation of error variances possible.
The errors in social science survey data can be quite large.
These errors will bias the estimates if not taken into account.
So the SEM approach has important advantages.
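A small sketch of how measurement error can bias an estimate. It assumes two standardized observed variables, each measuring its own latent variable with loading b1 and b2 and with random errors uncorrelated with everything else; under those assumptions the observed correlation is the latent correlation multiplied by both loadings, and each error variance is one minus the squared loading. All numbers and names are hypothetical.

```python
def error_variance(loading):
    """Error variance of a standardized indicator with the given loading."""
    return 1.0 - loading ** 2

def observed_correlation(latent_r, loading_1, loading_2):
    """Correlation between two indicators, attenuated by measurement error."""
    return latent_r * loading_1 * loading_2

def corrected_correlation(observed_r, loading_1, loading_2):
    """Recover the latent correlation once the loadings are known."""
    return observed_r / (loading_1 * loading_2)

latent_r = 0.6          # assumed correlation between the latent variables
b1, b2 = 0.7, 0.6       # assumed loadings of the two indicators

obs_r = observed_correlation(latent_r, b1, b2)
print(f"error variances: {error_variance(b1):.2f}, {error_variance(b2):.2f}")
print(f"observed correlation: {obs_r:.3f} (latent correlation: {latent_r})")
print(f"corrected correlation: {corrected_correlation(obs_r, b1, b2):.3f}")
```

In this illustration, ignoring the errors would understate the relationship by more than half, which is the kind of bias the slide refers to.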

