Multilevel modeling in R Tom Dunn and Thom Baguley, Psychology, Nottingham Trent University

2 1. Models for repeated measures or clustered data

3 Repeated measures ANOVA
Usual practice in psychology is to analyze repeated measures data using ANOVA:
- One-way independent measures
- One-way repeated measures
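A minimal sketch of this usual approach, assuming a hypothetical long-format data frame with one rating per subject per condition (the data here are simulated purely for illustration):

library(stats)

# hypothetical long-format data: 20 subjects rated under 3 conditions
set.seed(1)
d <- expand.grid(subject = factor(1:20), condition = factor(c("A", "B", "C")))
d$rating <- rnorm(nrow(d), mean = as.numeric(d$condition))

# one-way repeated measures ANOVA: subject is the error stratum
rm.aov <- aov(rating ~ condition + Error(subject/condition), data = d)
summary(rm.aov)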

4 Limitations of standard approaches (e.g., repeated measures ANOVA)
- sphericity or multi-sample sphericity assumptions
- dealing with non-orthogonal predictors (e.g., time-varying covariates in RM ANOVA)
- dealing with missing values
- treating items as fixed effects (e.g., Clark, 1973)
- the problem of categorization
- the problem of aggregation/disaggregation

5 Avoid repeated measures regression!
Repeated measures regression is one attempt to deal with limitations of ANOVA such as non-orthogonal predictors, e.g., using manual dummy coding (Lorch & Myers, 1990; Pedhazur, 1982). However, it is:
- data hungry (each indicator requires 1 df)
- reliant on sphericity and fixed effects
- less flexible and less powerful than multilevel models (Misangyi et al., 2006)

6 Multilevel models with random intercepts
A random intercept model has predictors with fixed effects only; for observation i on participant j:

y_ij = β_0j + β_1 x_ij + e_ij        (level 1: fixed effects + residual)
β_0j = γ_00 + u_0j                   (level 2: random intercept)

… or combined in a single equation:

y_ij = γ_00 + β_1 x_ij + u_0j + e_ij
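The mapping from the combined equation to R code can be illustrated with simulated data. This is a sketch under invented assumptions (30 participants, one predictor x, γ_00 = 10, β_1 = 0.5), not part of the original example:

library(nlme)

# simulate data from the combined random-intercept equation
set.seed(42)
n.part <- 30; n.obs <- 20
sim <- data.frame(Participant = rep(1:n.part, each = n.obs),
                  x = rnorm(n.part * n.obs))
u0 <- rnorm(n.part, sd = 2)                       # u_0j ~ N(0, 4)
sim$y <- 10 + 0.5 * sim$x +                       # gamma_00 + beta_1 * x_ij
         u0[sim$Participant] + rnorm(nrow(sim))   # + u_0j + e_ij
sim$Participant <- factor(sim$Participant)

# random intercept model: fixed effect of x, intercept varies by participant
ri.fit <- lme(y ~ x, random = ~ 1 | Participant, data = sim)
summary(ri.fit)   # StdDev for (Intercept) should be near 2; Residual near 1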

8 Random effects in multilevel models
In the random intercept model the individual differences at level 2 are (like the random error at level 1) assumed to have a normal distribution:

e_ij ~ N(0, σ²_e)        u_0j ~ N(0, σ²_u0)

Individual differences in the effect of a predictor (a random slope u_1j) can also be modeled this way:

u_1j ~ N(0, σ²_u1)
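A small follow-up sketch, assuming the ri.fit object from the previous sketch: the estimated (BLUP) random effects and the level-1 residuals can be inspected informally against this normality assumption.

library(nlme)

# ranef() returns the predicted (BLUP) random intercepts, one row per participant
u0.hat <- ranef(ri.fit)[, "(Intercept)"]

# informal check of the normality assumption for u_0j
qqnorm(u0.hat); qqline(u0.hat)

# the level-1 residuals e_ij can be checked in the same way
qqnorm(resid(ri.fit)); qqline(resid(ri.fit))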

10 Example - voice pitch (1)
How is male voice pitch related to the subjective attractiveness of a female face? †
- 30 male participants
- 32 female faces
- ratings of attractiveness (1-9) in 2 contexts
- potential time-varying covariates (e.g., baseline measure)

Classical ANOVA:
i) treats participants (but not faces) as a random sample
ii) can't incorporate time-varying covariates
iii) aggregates data (effective n = 30 per context)

† Data from Dunn, Wells and Baguley (in prep) and in Baguley (2012)

11 Example - voice pitch (1) contd.

library(nlme)
pitch.ri <- lme(pitch ~ base + attract, random = ~ 1 | Participant, data = pitch)
summary(pitch.ri)

Linear mixed-effects model fit by REML
  Data: pitch
  Log-restricted-likelihood:
  Fixed: pitch ~ base + attract
  (Intercept)        base     attract

Random effects:
 Formula: ~1 | Participant
        (Intercept)  Residual
StdDev:

Number of Observations: 1920
Number of Groups: 30

12 Modeling covariance matrices 1
In a two level random-intercept model the covariance structure implied for two different observations i and i' on the same participant j is:

Var(y_ij) = σ²_u0 + σ²_e        Cov(y_ij, y_i'j) = σ²_u0

This, in effect, assumes a form of compound symmetry of the repeated measures (equal variances, and all within-participant covariances equal, because the level-1 errors have constant variance and zero covariances).
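Assuming the pitch.ri fit from slide 11, the implied structure can be inspected directly; the following is a sketch, not part of the slides:

library(nlme)

# covariance matrix of the random effects: here just sigma^2_u0 (1 x 1)
getVarCov(pitch.ri, type = "random.effects")

# intraclass correlation: the single correlation implied between any two
# ratings from the same participant (i.e. compound symmetry)
s.u0 <- getVarCov(pitch.ri)[1, 1]
s.e  <- pitch.ri$sigma^2          # residual variance sigma^2_e
s.u0 / (s.u0 + s.e)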

13 Modeling covariance matrices 2
In a two level random-slope model with one predictor random at level 2 the covariance matrix of the random effects is:

| σ²_u0    σ_u01 |
| σ_u01    σ²_u1 |

In a repeated measures design this models the individual differences in the effect of the predictor (as well as its covariance with the intercept).

14 Unstructured covariance matrices
In repeated measures it is possible to have an unconstrained covariance matrix at the participant level (usually level 2). This is an example for four measurement occasions:

| σ²_1   σ_12   σ_13   σ_14 |
| σ_12   σ²_2   σ_23   σ_24 |
| σ_13   σ_23   σ²_3   σ_34 |
| σ_14   σ_24   σ_34   σ²_4 |

In this kind of unstructured matrix there are no assumptions about the form of the matrix (e.g., sphericity, compound symmetry or multisample sphericity).
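One way (not shown on the slides) to fit an unstructured covariance matrix in nlme is gls() with a general correlation structure plus occasion-specific variances. This is a sketch using simulated data; the data frame long and its columns are invented for illustration, and rows are assumed to be ordered by occasion within each id:

library(nlme)

# hypothetical long-format data: 50 people measured on 4 occasions
set.seed(1)
long <- expand.grid(occasion = factor(1:4), id = factor(1:50))
long$y <- rnorm(nrow(long)) + as.numeric(long$occasion)

# unstructured covariance: free correlations between occasions (corSymm)
# plus a separate variance for each occasion (varIdent)
un.fit <- gls(y ~ occasion,
              correlation = corSymm(form = ~ 1 | id),
              weights = varIdent(form = ~ 1 | occasion),
              data = long)
summary(un.fit)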

15 Example - random effect of voice pitch (2)
Attractiveness might not have a fixed effect (in fact it is more likely that it varies between people):

lme(pitch ~ base + attract, random = ~ attract | Participant, data = pitch)

      AIC      BIC   logLik

Random effects:
 Formula: ~attract | Participant
             StdDev   Corr
(Intercept)
attract
Residual

Fixed effects: pitch ~ base + attract
             Value  Std.Error  DF  t-value  p-value
(Intercept)
base
attract
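As a follow-up sketch (assuming the pitch.ri object from slide 11), the random-intercept and random-slope fits can be compared directly because they share the same fixed effects:

library(nlme)

# random-slope fit, keeping the same fixed effects as pitch.ri
pitch.rs <- update(pitch.ri, random = ~ attract | Participant)

# likelihood ratio (deviance) test plus AIC/BIC for the extra random terms;
# both are REML fits with identical fixed effects, so the comparison is valid
# (the p-value is conservative because variances are tested on their boundary -
#  see the later slide on accurate inference)
anova(pitch.ri, pitch.rs)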

18 2. Estimation and inference

19 Estimation in multilevel models
Estimation is iterative and usually uses maximum likelihood (as with logistic regression):
- IGLS or FML (iterative generalized least squares)
- RIGLS or RML (restricted maximum likelihood estimation)
- Parametric bootstrapping
- Non-parametric bootstrapping
- MCMC (Markov chain Monte Carlo methods)

20 Comparing models
Confidence intervals and tests
- deviance (likelihood ratio) tests (-2LL, or the change in -2 log likelihood, has an approximate χ² distribution)
- Wald tests and CIs (estimate/SE has an approximate z distribution)

Information criteria
- AIC, BIC or (MCMC derived) DIC (-2LL with a penalty for the number of parameters)
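A rough sketch of these comparisons, assuming the pitch.ri fit from slide 11; note that a deviance test of fixed effects should use full ML rather than REML fits:

library(nlme)

# deviance (likelihood ratio) test of a fixed effect: refit with ML
ri.ml   <- update(pitch.ri, method = "ML")
null.ml <- update(ri.ml, . ~ . - attract)    # drop the effect of interest
anova(null.ml, ri.ml)                        # change in -2LL ~ chi-squared, plus AIC/BIC

# Wald tests and CIs from the REML fit
summary(pitch.ri)$tTable                     # estimate / SE tests
intervals(pitch.ri, which = "fixed")         # Wald-type confidence intervals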

21 Accurate inference …
- for standard repeated measures ANOVA models it is possible to use t and F statistics
- if a complex covariance structure (anything other than compound symmetry) or an unbalanced model is used then inference is problematic owing to:
  a) difficulty estimating the error df
  b) boundary effects (for variances)

22 Possible solutions
- asymptotic approximations (in large samples)*
- corrections such as the Kenward-Roger approximation (e.g., using pbkrtest)
- bootstrapping*
- MCMC estimation (e.g., using lme4 or MCMCglmm)*

* e.g., see Baguley (2012) for examples
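A minimal sketch of the Kenward-Roger and bootstrap options, assuming the pitch data and the crossed random-effects model from slide 26 (the model names m0 and m1 are invented here):

library(lme4)
library(pbkrtest)

# crossed random intercepts for participants and faces (as in slide 26)
m1 <- lmer(pitch ~ base + attract + (1 | Participant) + (1 | Face), data = pitch)
m0 <- update(m1, . ~ . - attract)           # model without the effect of interest

# Kenward-Roger corrected F test for attract
KRmodcomp(m1, m0)

# parametric bootstrap confidence intervals for the fixed effects (can be slow)
confint(m1, method = "boot", nsim = 500)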

23 Requirements for accurate estimation
Centering predictors
- essential to use an appropriate centering strategy in random slope models (see Enders & Tofighi, 2007)

Nested versus fully-crossed structures
- many experimental designs in psychology are fully crossed (see Baayen et al., 2008)

Estimation, sample size and bias
- sample size at the highest level of the model is crucial (see Hox, 2002; Maas & Hox, 2005)
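As an aside (not from the slides), the two common centering strategies for a level-1 predictor such as attract look like this in base R; the new column names are invented:

# grand-mean centering (CGM): subtract the overall mean
pitch$attract.cgm <- pitch$attract - mean(pitch$attract)

# centering within cluster (CWC): subtract each participant's own mean
part.means        <- ave(pitch$attract, pitch$Participant)
pitch$attract.cwc <- pitch$attract - part.means

# the participant means themselves can be entered as a level-2 predictor
pitch$attract.pmean <- part.means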

24 Nested versus fully crossed structures
In nested structures lower level units occur in only one higher level unit, e.g., children in schools.
In fully crossed structures ‘lower’ level units are observed within all ‘higher’ level units, e.g., the same 32 faces used for all 30 participants.
Baayen et al. (2008) argue that many researchers incorrectly model fully crossed structures as nested.
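A sketch of how the two structures are declared in lme4: the nested example uses an invented schools data frame (columns score, ses, school, class are placeholders), while the crossed example reuses the pitch data from the running example.

library(lme4)

# nested: pupils within classes within schools
# (class labels reused across schools, so nest explicitly with /)
nested.fit  <- lmer(score ~ ses + (1 | school/class), data = schools)

# fully crossed: every face is seen by every participant,
# so participants and faces get separate, crossed random intercepts
crossed.fit <- lmer(pitch ~ base + attract + (1 | Participant) + (1 | Face),
                    data = pitch)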

26 Example - fully crossed model (3)

detach(package:nlme)
library(lme4)
lmer(pitch ~ base + attract + (1 | Participant) + (1 | Face), data = pitch) †

Formula: pitch ~ base + attract + (1 | Participant) + (1 | Face)
   AIC   BIC   logLik   deviance   REMLdev

Random effects:
 Groups      Name        Variance  Std.Dev.
 Face        (Intercept)
 Participant (Intercept)
 Residual

Number of obs: 1920, groups: Face, 32; Participant, 30

Fixed effects:
            Estimate  Std. Error  t value
(Intercept)
base
attract

lmer(pitch ~ base + attract + (attract | Participant) + (1 | Face), data = pitch)

† Assumes that the fully crossed structure is correctly coded in the data set

27 Advantages of multilevel approaches
- often greater statistical power (e.g., relative to RM ANOVA)
- multiple random factors (nested or crossed)
- copes with non-orthogonal predictors
- copes with time-varying covariates
- requires only that missing outcomes are MAR (not MCAR)
- explicitly models variances and covariances