Longitudinal data analysis in HLM
Longitudinal vs cross-sectional HLM. Similarities: fixed effects, random effects. Difference in levels: cross-sectional HLM: individual, school, …; longitudinal HLM: observations over time, individual, …
Characteristics of longitudinal data. Sources of variation: within-subject variation (intra-individual variation) and between-subject variation (inter-individual variation). The data are often incomplete or unbalanced. OLS regression is not suitable for longitudinal data because repeated observations from the same person are correlated, which violates its assumption of independent errors.
Limitations of traditional approaches for modeling longitudinal data. Univariate repeated measures ANOVA: person effects are random, while time effects and other factor effects are fixed; accounting for person effects reduces the residual variance. Time points are fixed (evenly or unevenly spaced). It assumes a single residual variance-covariance structure (compound symmetry): equal variances over time for observations from the same person, and a constant covariance between any two time points.
Limitations of traditional approaches for modeling longitudinal data. Univariate repeated measures ANOVA: an alternative assumption is sphericity, which requires the variance of the difference between any two time points to be equal.
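To make the two assumptions concrete, the sketch below builds a compound-symmetry covariance matrix and checks that the variance of the difference between any two time points is the same, which is what sphericity requires. The function names and the numeric values are hypothetical and chosen only for illustration.

```python
import numpy as np

def compound_symmetry(n_times, variance, covariance):
    """Build a compound-symmetry matrix: one variance on the diagonal,
    one constant covariance everywhere else."""
    cov = np.full((n_times, n_times), float(covariance))
    np.fill_diagonal(cov, float(variance))
    return cov

def difference_variances(cov):
    """Var(Y_s - Y_t) = Var(Y_s) + Var(Y_t) - 2 Cov(Y_s, Y_t) for all s < t."""
    n = cov.shape[0]
    return {
        (s, t): cov[s, s] + cov[t, t] - 2.0 * cov[s, t]
        for s in range(n)
        for t in range(s + 1, n)
    }

# Hypothetical values, not taken from the slides.
cs = compound_symmetry(n_times=4, variance=10.0, covariance=6.0)
diffs = difference_variances(cs)
print(cs)
print(diffs)  # every pair has the same difference variance
```

Compound symmetry always implies sphericity, but sphericity can also hold for matrices that are not compound symmetric, which is why it is the weaker of the two assumptions.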
Limitations of traditional approaches for modeling longitudinal data. Univariate repeated measures ANOVA: these assumptions often do not hold for longitudinal data. People change at different rates, so variances often change over time. Covariances between observations close in time are usually greater than covariances between observations distant in time. A test of the variance-covariance structure is necessary to validate the significance tests.
Limitations of traditional approaches for modeling longitudinal data. Multivariate repeated measures ANOVA: it uses a general method with no specific assumptions about variances and covariances (unstructured). It does not allow any other structure, so as the number of repeated measures increases, the number of variance and covariance parameters grows quickly (over-parameterization). Subjects with missing data at any time point are deleted from the analysis.
Limitations of traditional approaches for modeling longitudinal data. In addition, neither approach allows time-varying predictors.
Advantages of longitudinal data analysis in HLM. Ability to deal with missing data (assuming the data are missing at random, MAR). No compound symmetry assumption. More flexibility: unequal numbers of measurements or unequal measurement intervals are allowed. Time-varying covariates can be included.
Research questions. Is there an effect of time on average (is the fixed effect of time significant)? Does the effect of time vary across persons (is the random effect of time significant)?
A Linear Growth Model. Level 1 (within-subject model): Y_ti is the measurement of the i-th subject at the t-th time point. Level 2 (between-subject model).
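Written out, a linear growth model of this kind is usually specified as follows. This is a sketch that uses the parameter names appearing in the slides (β00, β10, r0i, r1i, e_ti); the symbols π0i, π1i and Time_ti are assumed notation.

```latex
\begin{aligned}
\text{Level 1:}\quad & Y_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{Time}_{ti} + e_{ti},
  \qquad e_{ti} \sim N(0, \sigma_e^2) \\
\text{Level 2:}\quad & \pi_{0i} = \beta_{00} + r_{0i} \\
                     & \pi_{1i} = \beta_{10} + r_{1i}
\end{aligned}
```

Substituting level 2 into level 1 gives the combined form Y_ti = β00 + β10·Time_ti + r0i + r1i·Time_ti + e_ti, in which the first two terms are the fixed part and the last three are the random part.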
An example. Labeled components of the model: sample intercept (grand mean); individual intercept deviation; sample slope (grand mean); individual slope deviation; residual covariance.
In the model there are six parameters. Fixed effects: β00 and β10 (level 2). Random effects: variances of r0i and r1i (τ00², τ11²; level 2), covariance of r0i and r1i (τ01; level 2), and residual variance of e_ti (σe²; level 1).
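One way to see what the four variance parameters do is to compute the covariance matrix of one person's repeated measures that they imply. The sketch below assumes the usual random intercept and slope structure, where Var(Y_ti) = τ00 + 2·t·τ01 + t²·τ11 + σ²; here `tau00` and `tau11` stand for the variances of r0i and r1i, and all numeric values are hypothetical.

```python
import numpy as np

def implied_covariance(times, tau00, tau11, tau01, sigma2):
    """Marginal covariance of one person's observations: Z T Z' + sigma2 * I,
    where Z has a column of ones (intercept) and a column of time values."""
    times = np.asarray(times, dtype=float)
    Z = np.column_stack([np.ones_like(times), times])
    T = np.array([[tau00, tau01],
                  [tau01, tau11]], dtype=float)
    return Z @ T @ Z.T + sigma2 * np.eye(len(times))

# Hypothetical parameter values, not estimates from the slides.
V = implied_covariance(times=[0, 1, 2, 3],
                       tau00=4.0, tau11=1.0, tau01=0.5, sigma2=2.0)
print(np.diag(V))  # variances at each time point
print(V)
```

With a nonzero slope variance the diagonal is not constant, so the model does not impose compound symmetry on the data.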
Average growth trend. Final estimation of fixed effects (columns: Fixed Effect, Coefficient, Standard Error, T-ratio, Approx. d.f., P-value; rows: for INTRCPT1, P0: INTRCPT2, B; for TIME slope, P1: INTRCPT2, B). Initial status: the intercept is the average English score at time 0. Growth rate: the average increase in English score per one-unit increment of time is 1.50.
Random intercept and slope. Final estimation of variance components (columns: Random Effect, Standard Deviation, Variance Component, df, Chi-square, P-value; rows: INTRCPT1, R; TIME slope, R; level-1, E). Initial status: students vary significantly in English score at time 0. Growth rates differ among students, so the slopes vary. A student whose growth rate is 1 SD above average is expected to grow at a rate of 1.50 + 1 SD = 2.43 per time unit.
Reliability. The ratio of the "true" parameter variance to the "total" observed variance. A value close to zero means that most of the observed variance is due to error. Without knowing the reliability of the estimated growth parameters, we might wrongly conclude that no relation exists when the estimates are simply too unreliable to detect one. Output table: Random level-1 coefficient, Reliability estimate (rows: INTRCPT1, B; TIME, B).
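The ratio described above can be sketched for a single person with a known measurement schedule. The example assumes the common definition in which the error variance of each person-level OLS coefficient is the corresponding diagonal element of σ²(X'X)⁻¹; HLM reports an average of such ratios over persons. The function name and all numeric values are hypothetical.

```python
import numpy as np

def coefficient_reliability(times, tau_diag, sigma2):
    """Reliability of the OLS intercept and slope for one person:
    true variance / (true variance + sampling variance)."""
    times = np.asarray(times, dtype=float)
    X = np.column_stack([np.ones_like(times), times])
    # Sampling (error) variance of the OLS intercept and slope.
    sampling_var = sigma2 * np.diag(np.linalg.inv(X.T @ X))
    tau_diag = np.asarray(tau_diag, dtype=float)
    return tau_diag / (tau_diag + sampling_var)

# Hypothetical values: four equally spaced occasions,
# intercept variance 4.0, slope variance 1.0, residual variance 2.0.
rel = coefficient_reliability([0, 1, 2, 3], tau_diag=[4.0, 1.0], sigma2=2.0)
print(rel)  # [intercept reliability, slope reliability]
```

Adding measurement occasions or spreading them over a wider time span shrinks the sampling variance and so raises the reliability, particularly for the slope.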
Correlation of change with initial status. Choose "print variance-covariance matrices" under output settings. Students who have a higher English score at the initial time point tend to have a faster growth rate. Output table: Tau (as correlations) (rows: INTRCPT1, B; TIME, B).
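The "Tau (as correlations)" output rescales the covariance between intercept and slope by their standard deviations. A minimal sketch of that conversion, with `tau00` and `tau11` taken to be the variances of r0i and r1i and all numbers hypothetical:

```python
import math

def tau_correlation(tau00, tau11, tau01):
    """Correlation between initial status and growth rate:
    covariance divided by the product of the two standard deviations."""
    return tau01 / math.sqrt(tau00 * tau11)

# Hypothetical values, not estimates from the slides.
r = tau_correlation(tau00=4.0, tau11=1.0, tau01=0.5)
print(r)  # 0.25
```

A positive value corresponds to the pattern described in the slide, where students who start higher also tend to grow faster.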
We could make it more complicated: an intercepts- and slopes-as-outcomes model. Level 1 (within-subject model): Y_ti is the measurement of the i-th subject at the t-th time point. Level 2 (between-subject model).
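A sketch of how such a model is typically written, extending the linear growth model with one person-level predictor. The predictor X_i and its coefficients β01 and β11 are hypothetical placeholders, since the slides do not name a specific level-2 variable; π0i, π1i and Time_ti are assumed notation.

```latex
\begin{aligned}
\text{Level 1:}\quad & Y_{ti} = \pi_{0i} + \pi_{1i}\,\mathrm{Time}_{ti} + e_{ti} \\
\text{Level 2:}\quad & \pi_{0i} = \beta_{00} + \beta_{01} X_i + r_{0i} \\
                     & \pi_{1i} = \beta_{10} + \beta_{11} X_i + r_{1i}
\end{aligned}
```

Under this form, β01 describes how initial status differs with X_i and β11 describes how the growth rate differs with X_i.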