
Lecture 6 (chapter 5) Revised on 2/22/2008. Parametric Models for Covariance Structure


1 Lecture 6 (chapter 5) Revised on 2/22/2008

2 Parametric Models for Covariance Structure We consider the General Linear Model for correlated data, but assume that the covariance structure of the sequence of measurements on each unit is to be specified by the values of unknown parameters.

3 Parametric Models for Covariance Structure

4 Parametric Models for Covariance Matrices Model for the mean Model for the covariance matrix

5 Parametric Models for Covariance Matrices We now consider more general models for the covariance matrix which can be specified by looking at the empirical variogram.
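
A minimal sketch of how an empirical variogram can be computed from residuals, assuming the usual definition: for each pair of measurements on the same unit, take half the squared difference of the residuals and average these values by time lag. The function name and the list-of-lists input layout are illustrative choices, not part of the lecture.

```python
from collections import defaultdict


def empirical_variogram(times, residuals):
    """Average half-squared residual differences by time lag.

    times, residuals: lists of per-unit lists, aligned element by element.
    Returns {lag: mean of 0.5 * (r_j - r_k)**2 over within-unit pairs}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for t_i, r_i in zip(times, residuals):
        # Only pairs within the same unit contribute to the variogram.
        for j in range(len(t_i)):
            for k in range(j + 1, len(t_i)):
                lag = abs(t_i[k] - t_i[j])
                sums[lag] += 0.5 * (r_i[j] - r_i[k]) ** 2
                counts[lag] += 1
    return {lag: sums[lag] / counts[lag] for lag in sorted(sums)}
```

Plotting the returned averages against lag gives the curve whose starting height and plateau are interpreted in the slides that follow.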

6 Interpretation of the Variogram

7 [Figure: variogram annotated with its components: variance of the random effects, serial correlation, and measurement error]

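8 Interpretation of the Variogram [equations not recovered] The variogram is written in terms of a serial process, with its variance and correlation function, and a measurement error term, with its variance. Total variance: the sum of the component variances.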

9 Variogram for a model with random intercept plus serial correlation plus measurement error [Figure labels: variance of the random effect; measurement error variance; variance of the serial process]
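
The three components can be tied to the shape of the variogram with a short sketch. The symbols are assumptions, since the slide equations were not recovered: nu2 is the random intercept variance, sigma2 the variance of the serial process, tau2 the measurement error variance, and the serial correlation is taken to be exponential with a hypothetical rate phi.

```python
import math


def variogram(u, sigma2, tau2, phi):
    """Theoretical variogram at lag u: tau2 + sigma2 * (1 - rho(u)).

    Assumes an exponential serial correlation rho(u) = exp(-phi * u).
    """
    return tau2 + sigma2 * (1.0 - math.exp(-phi * u))


def total_variance(nu2, sigma2, tau2):
    """Variance of a single observation: sum of the three components."""
    return nu2 + sigma2 + tau2
```

Under these assumptions the curve starts at tau2 rather than zero, and it levels off at tau2 + sigma2, which falls short of the total variance by nu2.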

10 Cow Data 19 weeks of measurement, 3 diets: barley, mixed, lupins (barley data shown above)

11 Example: Protein Content of Milk The “roof” of the variogram is placed at 0.087

12 Example: Protein Content of Milk (cont’d) Two features stand out: the variogram does not reach the total variance, and the variogram does not start at zero.

13 Example: Protein Content of Milk (cont’d) [Figure: variogram of protrs (17 percent of v_ij’s excluded), annotated with the variance of the random intercept and the measurement error]

14 Example: Protein Content of Milk (cont’d)

15 Parametric Models for Covariance Structure We will cover this in the multi-level course

16 Models To develop a model, we need to understand the sources of variation: 1. Random Effects (Intercept): random variation between units. 2. Serial Correlation: time-varying random process within a unit. 3. Measurement Error: the measurement process introduces a component of random variation.

17 How to incorporate these qualitative features into specific models

18 Pure Serial Correlation Exponential correlation model for equally spaced measurements

19 Autoregressive Model of Order 1 Another way to build a serial correlation model is to assume an explicit dependence of the current value on its predecessors. The simplest example is an AR(1) model, in which each value depends only on the one immediately before it. AR(1) models are appealing for equally spaced data, less so for unequally spaced data. For example, the autoregressive coefficient would be hard to interpret if the measurements were not equally spaced in time. In Stata: xtgee with corr(AR1).
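
A small sketch of the AR(1) idea, assuming the standard stationary form W_j = alpha * W_{j-1} + Z_j with independent Gaussian innovations. The names alpha, sigma and seed are illustrative, not taken from the slides. Under this form the correlation between values k steps apart is alpha to the power k.

```python
import random


def ar1_correlation(n, alpha):
    """Correlation matrix implied by a stationary AR(1): alpha ** |j - k|."""
    return [[alpha ** abs(j - k) for k in range(n)] for j in range(n)]


def simulate_ar1(n, alpha, sigma=1.0, seed=0):
    """Simulate one stationary AR(1) series of length n.

    The innovation standard deviation is chosen so that every W_j has
    standard deviation sigma.
    """
    rng = random.Random(seed)
    innovation_sd = sigma * (1.0 - alpha ** 2) ** 0.5
    w = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        w.append(alpha * w[-1] + rng.gauss(0.0, innovation_sd))
    return w
```

The correlation depends only on the number of steps between measurements, which is why the coefficient is easy to interpret only when steps correspond to equal time intervals.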

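20 Exponential Correlation Model for Unequally Spaced Data The exponential correlation model can handle unequally spaced data. We can assume that the correlation between two measurements on the same unit decays exponentially with the time separating them. The prais command in Stata allows a different coefficient for each time difference.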
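
As an illustration of why this model copes with unequal spacing, the sketch below builds the correlation matrix directly from the measurement times. It assumes the common parameterisation corr = exp(-phi * |t_j - t_k|); the rate name phi is an assumption, since the slide's formula was not recovered.

```python
import math


def exponential_correlation(times, phi):
    """Correlation matrix with entries exp(-phi * |t_j - t_k|).

    times: measurement times for one unit, in any units and any spacing.
    phi: non-negative decay rate (hypothetical name).
    """
    return [[math.exp(-phi * abs(s - t)) for t in times] for s in times]
```

For example, `exponential_correlation([0.0, 1.0, 3.0], phi)` gives a smaller correlation for the pair three time units apart than for the pair one unit apart. With equally spaced times one unit apart, the matrix has the AR(1) pattern with coefficient exp(-phi).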

21 Model-Fitting 1. Formulation: choosing the general form of the model (mean and association). 2. Estimation: fitting the model: (a) weighted least squares for the regression coefficients; (b) ML for the covariance parameters, or a subset of them; (c) iterate (a) and (b) to convergence.

22 Model-Fitting (cont’d) 3. Diagnostics: checking that the model fits the data by examining residuals for lack of fit and for remaining correlation. 4. Inference: calculating confidence intervals or testing hypotheses about parameters of interest.

23 Step 1. Formulation Formulation of the model is a continuation of exploratory data analysis. Focus on the mean and covariance structures: 1. Look at the residuals. 2. Create time plots, scatterplot matrices and empirical variograms. 3. Do you have stationarity in the residuals? If not, you need to transform the data or use inherently non-stationary models, such as a model with a random intercept and random slope. 4. Once stationarity has been achieved, use the empirical variogram to estimate the underlying covariance structure.

24 Step 2. Estimation

25 Step 3. Diagnostics The aim is to compare the data with the fitted model. How? 1. Superimpose the fitted mean response profiles on a time plot of the average observed responses within each combination of treatment and time. 2. Superimpose the fitted variogram on a plot of the empirical variogram.

26 Examples and Summary: Nepal Dataset This dataset contains anthropometric measurements on Nepalese children. The study design called for collecting measurements on 2258 children at 5 time points, spaced approximately 4 months apart. Q: Estimate the association between arm circumference and child’s weight, adjusted for age and gender, and accounting for the correlation in the repeated measures within the same child.

27 Examples and Summary: Nepal Dataset Time-varying confounder. Goal: estimate the average change in arm circumference for a one-unit change in weight, accounting for the potential confounding effect of the time-varying covariate age and for sex. The standard errors of the estimated association between arm circumference and weight are estimated accounting for the correlation of the repeated measures within a child.

28 Examples and Summary: Nepal Dataset Models for the covariance matrix: (measurement error only)

29 Examples and Summary: Nepal Dataset (More) Models for the covariance matrix: + measurement error

30 Independence Model (measurement error only): the correlation between different observations on the same child is 0

31 Independence Model (cont’d)

32 Uniform Model (also, “Exchangeable” or “Compound Symmetry”) A better model than the Independence Model assumes the same correlation for all pairs of observations on a child. This is called the uniform, exchangeable, or compound symmetry correlation model.

33 Uniform (Random Intercept) plus measurement error
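
A short sketch of why a random intercept plus measurement error produces the uniform correlation pattern. The symbols are assumptions: nu2 is the random intercept variance and tau2 the measurement error variance. Every pair of observations on a child shares the intercept, so every off-diagonal covariance equals nu2.

```python
def random_intercept_covariance(n, nu2, tau2):
    """n x n covariance matrix: nu2 everywhere, plus tau2 on the diagonal."""
    return [[nu2 + (tau2 if j == k else 0.0) for k in range(n)]
            for j in range(n)]


def exchangeable_correlation(nu2, tau2):
    """Common correlation between any two observations on the same unit."""
    return nu2 / (nu2 + tau2)
```

With illustrative values nu2 = 3 and tau2 = 1, each variance is 4, each covariance is 3, and the common correlation is 0.75 regardless of how far apart in time the two observations are.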

34 Uniform Model (also, “Exchangeable” or “Compound Symmetry”) (cont’d)

35

36 Uniform Model

37 Uniform Model (also, “Exchangeable” or “Compound Symmetry”) (cont’d)

38 Exponential Correlation Model A different model is to assume that the correlation of observations closer together in time is larger than that of observations farther apart in time. One model for this is the exponential model:

39 Exponential Correlation Model Within subject correlation at lag 1 (4 months separation approx)

40 Exponential Correlation Model (cont’d)

41 Exchangeable + Exponential Model

42 Random Intercept + Serial Correlation
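
One way to picture the combined model is to add the components entry by entry. This is a sketch under assumed notation, not the fitted model from the lecture: nu2 for the random intercept variance, sigma2 and phi for an exponential serial process, and an optional tau2 for measurement error.

```python
import math


def combined_covariance(times, nu2, sigma2, phi, tau2=0.0):
    """Covariance matrix for random intercept + exponential serial correlation.

    Entry (j, k) = nu2 + sigma2 * exp(-phi * |t_j - t_k|) + tau2 * [j == k].
    """
    n = len(times)
    cov = []
    for j in range(n):
        row = []
        for k in range(n):
            serial = sigma2 * math.exp(-phi * abs(times[j] - times[k]))
            noise = tau2 if j == k else 0.0
            row.append(nu2 + serial + noise)
        cov.append(row)
    return cov
```

In this form the covariance between distant observations approaches nu2 rather than zero, so the exchangeable part sets a floor and the exponential part adds extra correlation at short lags.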

43 Summary of Example We see that the exchangeable correlation (0.691) is similar to its value in the model without the exponential correlation (0.748), and the exponential correlation (0.185) is now much smaller. The regression parameter estimates are also similar to those in the exchangeable case. This suggests that the exchangeable correlation model may be capturing the main correlation pattern.

44 Summary Modelling the correlation in longitudinal data is important for obtaining correct inferences on regression coefficients. This leads to: 1. Statistical efficiency. 2. Correct standard errors. These are marginal models because the interpretation of the regression coefficients is the same as that in cross-sectional data. 1. Exchangeable correlation model: subject-specific formulation. 2. Exponential correlation model: transition model formulation.

45 Summary (cont’d) Three basic elements of correlation structure: 1. Random effects. 2. Auto-correlation or serial dependence. 3. Observation-level noise or measurement error.

46 Evaluating Covariance Models Once you have chosen a (set of) covariance model(s), how do you evaluate whether it fits the data well? Or, how do you compare several of them? Several tools (each works with either ML or REML): 1. Likelihood Ratio Tests (LRTs) for comparing nested models. 2. Akaike’s Information Criterion (AIC). 3. QIC to compare models fitted with GEE. 4. Examining fitted model variograms.

47 Comparing Covariance Models with Akaike’s Information Criterion (AIC)
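
For reference, a minimal sketch of the two likelihood-based comparisons, assuming the usual definitions: AIC is minus twice the maximised log-likelihood plus twice the number of parameters, and the likelihood ratio statistic is twice the gain in log-likelihood from the reduced to the full model. The function names are illustrative.

```python
def aic(log_likelihood, n_params):
    """Akaike's Information Criterion; smaller values are preferred."""
    return -2.0 * log_likelihood + 2.0 * n_params


def likelihood_ratio_statistic(loglik_reduced, loglik_full):
    """LRT statistic for nested models: 2 * (full - reduced)."""
    return 2.0 * (loglik_full - loglik_reduced)
```

With made-up log-likelihoods, `aic(-100.0, 3)` gives 206.0 and `likelihood_ratio_statistic(-105.0, -100.0)` gives 10.0. The latter would typically be referred to a chi-square distribution with degrees of freedom equal to the number of extra parameters in the full model.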

48 Comparing Covariance Models with Akaike’s Information Criterion (AIC) (cont’d)

49

