

Lecture 3 Linear random intercept models

Example: Weight of Guinea Pigs
Body weights of 48 pigs in 9 successive weeks of follow-up (Table 3.1, DLZ). The response is measured at n different times, or under n different conditions. In the guinea pigs example the time of measurement is referred to as a "within-units" factor. For the pigs, n = 9. Although the pigs example considers a single treatment factor, it is straightforward to extend the situation to one where the groups are formed as the result of a factorial design (for example, if the pigs were separated into males and females and then allocated to the diet groups).

[Figure: Pig data. Weight (kg) against week for the 48 pigs.]

Pigs data model 1 – OLS fit
. regress weight time
[Output: ANOVA table (Model, Residual, Total), F(1, 430), R-squared, Adj R-squared, Root MSE, and coefficient rows for time and _cons; numeric values not preserved]
OLS results

Example: Weight of Pigs
For this type of repeated measures study we recognize two sources of random variation:
1. Between: there is heterogeneity between pigs, due for example to natural biological (genetic?) variation.
2. Within: there is random variation in the measurement process for a particular unit at any given time. For example, on any given day a particular guinea pig may yield different weight measurements due to differences in scale (equipment) and/or small fluctuations in weight during the day.

A) Linear model with random intercept
– Variance between
– Variance within
– Intraclass correlation coefficient! Why?

Intraclass correlation coefficient, i.e. the correlation between any two measurements from the same pig i
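One way to see why the variance ratio is a correlation is sketched below. It assumes the random intercept model $Y_{ij} = \beta_0 + \beta_1 t_j + U_i + e_{ij}$ with $U_i$ and $e_{ij}$ independent, mean zero, and with variances $\sigma_u^2$ and $\sigma_e^2$. The symbols are borrowed from the Stata labels sigma_u and sigma_e and are an assumption, since the slide's own notation is not in the transcript.

```latex
\begin{align*}
\operatorname{Var}(Y_{ij}) &= \operatorname{Var}(U_i) + \operatorname{Var}(e_{ij}) = \sigma_u^2 + \sigma_e^2,\\
\operatorname{Cov}(Y_{ij}, Y_{ik}) &= \operatorname{Var}(U_i) = \sigma_u^2 \qquad (j \neq k),\\
\operatorname{Corr}(Y_{ij}, Y_{ik}) &= \frac{\sigma_u^2}{\sigma_u^2 + \sigma_e^2}.
\end{align*}
```

Two measurements on the same pig share only the intercept $U_i$, so their covariance is the between-pig variance, and dividing by the total variance gives the intraclass correlation.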

Pigs – RE model
xtreg weight time, re i(Id) mle
Random-effects ML regression; group variable (i): Id; random effects u_i ~ Gaussian
Number of obs = 432; number of groups = 48; obs per group: min = 9, avg = 9.0, max = 9
[Values for LR chi2(1), log likelihood, and the rows time, _cons, /sigma_u, /sigma_e, rho are not preserved]
Linear model with a random intercept - “conditional model”

Interpretation of results:
– Time effect: among pigs with the same random effect (for example, similar genetic make-up), weight increases by 6.2 kg per week (95% CI: 6.1 to 6.3).
– Estimate of heterogeneity across pigs: sigma_u^2 = 3.8^2 = 14.4
– Estimate of variation in weights within a pig over time: sigma_e^2 = 2.1^2 = 4.4
– Fraction of total variability attributable to heterogeneity across pigs: 14.4 / (14.4 + 4.4) = 0.77
– This fraction is also the intraclass correlation, i.e. the within-pig correlation.
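The arithmetic behind the 0.77 can be checked with a few lines of Python. This is only a sketch: the function name `intraclass_correlation` is made up for illustration, and the inputs are the rounded standard deviations quoted on the slide (3.8 and 2.1), not the full-precision Stata estimates.

```python
def intraclass_correlation(sigma_u: float, sigma_e: float) -> float:
    """Share of total variance due to the between-unit random intercept."""
    between = sigma_u ** 2  # variance of the random intercept
    within = sigma_e ** 2   # residual (within-unit) variance
    return between / (between + within)


# Rounded standard deviations quoted on the slide.
SIGMA_U = 3.8
SIGMA_E = 2.1

rho = intraclass_correlation(SIGMA_U, SIGMA_E)
print(f"between-pig variance   = {SIGMA_U ** 2:.2f}")
print(f"within-pig variance    = {SIGMA_E ** 2:.2f}")
print(f"intraclass correlation = {rho:.2f}")
```

Because the inputs are rounded, the result agrees with the reported rho only to about two decimal places.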

Random Effects Model
$E[Y_i \mid U_i] = \beta_0 + \beta_1\,\text{time} + U_i$
$E[Y_i] = \beta_0 + \beta_1\,\text{time}$

B) Marginal model with a uniform (exchangeable) correlation structure
– Model for the mean
– Model for the covariance matrix

Pigs – Marginal model
xtreg weight time, pa i(Id) corr(exch)
Iteration 1: tolerance = 5.585e-15
GEE population-averaged model; group variable: Id; link: identity; family: Gaussian; correlation: exchangeable
Number of obs = 432; number of groups = 48; obs per group: min = 9, avg = 9.0, max = 9
[Values for Wald chi2(1), the scale parameter, and the coefficient rows time and _cons are not preserved]
“Population average” marginal model with exchangeable correlation structure: results

Pigs data model 1 – GEE fit
. xtgee weight time, i(Id) corr(exch)
. xtcorr
[Estimated within-Id correlation matrix R, 9 × 9 (columns c1 to c9, rows r1 to r9); values not preserved]
GEE fit – marginal model with exchangeable correlation structure: results

Marginal Model
$E[Y_i] = \beta_0 + \beta_1\,\text{time}$

Models A and B are equivalent
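A short derivation suggests why the two models agree. It assumes a random intercept $U_i$ with variance $\sigma_u^2$, independent errors with variance $\sigma_e^2$, $n$ measurements per pig, and $E[U_i] = 0$. The notation is an assumption taken from the Stata labels rather than from the slides.

```latex
\begin{align*}
E[Y_i] &= E\big[\,E[Y_i \mid U_i]\,\big] = \beta_0 + \beta_1\,\text{time},\\
\operatorname{Var}(Y_i) &= \sigma_u^2\,\mathbf{1}\mathbf{1}^\top + \sigma_e^2\, I_n
 = (\sigma_u^2 + \sigma_e^2)\left[(1-\rho)\, I_n + \rho\,\mathbf{1}\mathbf{1}^\top\right],
\qquad \rho = \frac{\sigma_u^2}{\sigma_u^2+\sigma_e^2}.
\end{align*}
```

The random intercept model therefore implies the same mean and an exchangeable covariance matrix with a common correlation $\rho \ge 0$. A marginal model could in principle allow a negative exchangeable correlation, which a random intercept cannot produce.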

One group polynomial growth curve model
Similarly, if you want to fit a quadratic curve:
$E[Y_{ij} \mid U_i] = U_i + \beta_0 + \beta_1 t_j + \beta_2 t_j^2$

Pigs – Marg. model, quadratic trend
. xtgee weight time timesq, i(Id) corr(exch)
GEE population-averaged model; group variable: Id; link: identity; family: Gaussian; correlation: exchangeable
Number of obs = 432; number of groups = 48; obs per group: min = 9, avg = 9.0, max = 9
[Values for Wald chi2(2), the scale parameter, and the coefficient rows time, timesq and _cons are not preserved]
Exchangeable correlation structure: results

Pigs data model 1 – GEE fit
. xtcorr
[Estimated within-Id correlation matrix R, 9 × 9 (columns c1 to c9, rows r1 to r9); values not preserved]
GEE fit – marginal model with exchangeable correlation structure: results

Pigs – RE model, quadratic trend
. gen timesq = time*time
. xtreg weight time timesq, re i(Id) mle
Random-effects ML regression; group variable (i): Id; random effects u_i ~ Gaussian
Number of obs = 432; number of groups = 48; obs per group: min = 9, avg = 9.0, max = 9
[Values for LR chi2(2), log likelihood, and the rows time, timesq, _cons, /sigma_u, /sigma_e, rho are not preserved]
Exchangeable correlation structure: results

Random Effects Model
$E[Y_i \mid U_i] = \beta_0 + \beta_1\,\text{time} + \beta_2\,\text{time}^2 + U_i$
$E[Y_i] = \,?$
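A possible answer to the question on this slide, assuming as in the linear random intercept model that $E[U_i] = 0$, follows from averaging the conditional mean over the random effect:

```latex
\begin{align*}
E[Y_i] &= E\big[\,E[Y_i \mid U_i]\,\big]
        = \beta_0 + \beta_1\,\text{time} + \beta_2\,\text{time}^2 + E[U_i]\\
       &= \beta_0 + \beta_1\,\text{time} + \beta_2\,\text{time}^2 .
\end{align*}
```

Under that assumption the marginal mean has the same coefficients as the conditional mean, which is specific to linear models with an identity link.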

Pigs – Marginal model: AR(1)
xtgee weight time, i(Id) corr(AR1) t(time)
GEE population-averaged model; group and time vars: Id time; link: identity; family: Gaussian; correlation: AR(1)
Number of obs = 432; number of groups = 48; obs per group: min = 9, avg = 9.0, max = 9
[Values for Wald chi2(1), the scale parameter, and the coefficient rows time and _cons are not preserved]
GEE-fit marginal model with AR1 correlation structure
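To make the difference between the two working correlation structures concrete, the sketch below builds both matrices in plain Python. The function names and the value rho = 0.5 are illustrative assumptions; the estimated correlations from the Stata fits are not available in this transcript.

```python
def exchangeable_corr(n: int, rho: float) -> list[list[float]]:
    """Exchangeable (uniform) structure: every off-diagonal entry equals rho."""
    return [[1.0 if j == k else rho for k in range(n)] for j in range(n)]


def ar1_corr(n: int, rho: float) -> list[list[float]]:
    """AR(1) structure: correlation decays as rho ** |j - k| with the time lag."""
    return [[rho ** abs(j - k) for k in range(n)] for j in range(n)]


# Hypothetical correlation parameter, chosen only for illustration.
RHO = 0.5
N_TIMES = 4

print("exchangeable:")
for row in exchangeable_corr(N_TIMES, RHO):
    print(["%.3f" % value for value in row])

print("AR(1):")
for row in ar1_corr(N_TIMES, RHO):
    print(["%.3f" % value for value in row])
```

Under the exchangeable structure, measurements one week apart and eight weeks apart are equally correlated, while under AR(1) the correlation shrinks geometrically with the lag.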

Pigs – RE model: AR(1)
xtregar weight time
RE GLS regression with AR(1) disturbances; group variable (i): Id; corr(u_i, Xb) = 0 (assumed)
Number of obs = 432; number of groups = 48; obs per group: min = 9, avg = 9.0, max = 9
[Values for R-sq (within, between, overall), Wald chi2(2), and the rows time, _cons, rho_ar (estimated autocorrelation coefficient), sigma_u, sigma_e, rho_fov (fraction of variance due to u_i) and theta are not preserved]
Random effects model with AR1 correlation structure

Important Points
– Modeling the correlation in longitudinal data is important for obtaining correct inferences on the regression coefficients.
– In the linear case, random effects and marginal models correspond to each other, because the regression coefficients have the same interpretation as in standard linear regression.

Example using Stata:

Random intercept model
– $U_{1j} \sim \text{Normal}(0, \tau^2)$
– $e_{ij} \sim \text{Normal}(0, \sigma^2)$
What do these random effects mean?

What is the estimate of the within-child correlation?
0.92^2 / (0.73^2 + 0.92^2) ≈ 0.61
Interpretation of the age coefficients is hard.

Here we have two random effects:
– Random intercept: $U_{1j}$
– Random slope: $U_{2j}$
– We assume these two effects follow a multivariate normal distribution with variances $\tau_1^2$ and $\tau_2^2$ and covariance $\tau_{12}$
– $e_{ij} \sim \text{Normal}(0, \sigma^2)$
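One consequence of adding a random slope can be worked out directly. The sketch assumes the model has the form $Y_{ij} = \beta_0 + \beta_1 x_{ij} + U_{1j} + U_{2j}\,x_{ij} + e_{ij}$, where $x_{ij}$ is a hypothetical name for the covariate carrying the random slope (the transcript does not show the model equation), and that $e_{ij}$ is independent of the random effects.

```latex
\begin{align*}
\operatorname{Var}(Y_{ij} \mid x_{ij})
 &= \operatorname{Var}(U_{1j}) + 2\,x_{ij}\operatorname{Cov}(U_{1j}, U_{2j})
    + x_{ij}^2\operatorname{Var}(U_{2j}) + \operatorname{Var}(e_{ij})\\
 &= \tau_1^2 + 2\,x_{ij}\,\tau_{12} + x_{ij}^2\,\tau_2^2 + \sigma^2 .
\end{align*}
```

Unlike the random intercept model, the variance here changes with the covariate, so the within-unit correlation is no longer a single constant.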

Significant gender effect!