Module 2: Bayesian Hierarchical Models

Presentation transcript:

Module 2: Bayesian Hierarchical Models
Francesca Dominici, Michael Griswold
The Johns Hopkins University Bloomberg School of Public Health
2005 Hopkins Epi-Biostat Summer Institute

Key Points from Yesterday
- "Multi-level" models have covariates from many levels and their interactions, and acknowledge correlation among observations from within a level (cluster).
- Random-effect MLMs condition on unobserved "latent variables" to describe those correlations.
- Random effects models fit naturally into a Bayesian paradigm.
- Bayesian methods combine prior beliefs with the likelihood of the observed data to obtain posterior inferences.

Bayesian Hierarchical Models (Module 2)
- Example 1: School Test Scores
  - The simplest two-stage model
  - WinBUGS
- Example 2: Aww, Rats…
  - A normal hierarchical model for repeated measures

Example 1: School Test Scores

Testing in Schools (Goldstein et al., 1993)
- Goal: differentiate between 'good' and 'bad' schools
- Outcome: standardized test scores
- Sample: 1978 students from 38 schools
- MLM: students (observations) within schools (clusters)
- Possible analyses:
  1. Calculate each school's observed average score
  2. Calculate an overall average for all schools
  3. Borrow strength across schools to improve individual school estimates

Testing in Schools: Why Borrow Information Across Schools?
- Median number of students per school: 48; range: 1-198
- Suppose a small school (N = 3) has scores 90, 90, 10 (avg = 63)
- Suppose a large school (N = 100) has avg = 65
- Suppose a school with N = 1 has the score 69 (avg = 69)
- Which school is 'better'? Difficult to say: small N → highly variable estimates
- For larger schools we have good estimates; for smaller schools we may be able to borrow information from other schools to obtain more accurate estimates
- How? Bayes

Testing in Schools: "Direct Estimates"
[Figure: mean scores and C.I.s for the individual schools]
Model: E(Yij) = μj = μ + b*j

Fixed and Random Effects
Standard normal regression models, with εij ~ N(0, σ²):
1. Yij = μ + εij
2. Yij = μj + εij = μ + b*j + εij
Fixed-effects estimates:
- Model 1: μj estimated by X̄ (overall avg)
- Model 2: μj estimated by X̄j (school avg) = X̄ + b*j = X̄ + (X̄j − X̄)

Fixed and Random Effects
Standard normal regression models, with εij ~ N(0, σ²):
1. Yij = μ + εij
2. Yij = μj + εij = μ + b*j + εij
A random effects model:
3. Yij | bj = μ + bj + εij, with bj ~ N(0, τ²) (random effects)
Fixed-effects estimates:
- Model 1: μj estimated by X̄ (overall avg)
- Model 2: μj estimated by X̄j (school avg) = X̄ + b*j = X̄ + (X̄j − X̄)
The distribution on bj represents prior beliefs about similarities between schools!

Fixed and Random Effects
Standard normal regression models, with εij ~ N(0, σ²):
1. Yij = μ + εij
2. Yij = μj + εij = μ + b*j + εij
A random effects model:
3. Yij | bj = μ + bj + εij, with bj ~ N(0, τ²) (random effects)
Fixed-effects estimates:
- Model 1: μj estimated by X̄ (overall avg)
- Model 2: μj estimated by X̄j (school avg) = X̄ + b*j = X̄ + (X̄j − X̄)
Random-effects (shrinkage) estimate:
- μj estimated by X̄ + bjBLUP = X̄ + λj·b*j = X̄ + λj·(X̄j − X̄)
- The estimate is part-way between the model and the data; the amount of shrinkage depends on the variability (σ²) and the underlying truth (τ²).
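A worked illustration of the shrinkage weight may help here. For the normal-normal model the standard BLUP weight is λj = τ² / (τ² + σ²/nj); the variance components below are invented purely for the arithmetic and are not estimates from the Goldstein data. Suppose τ² = 100, σ² = 900, and overall average X̄ = 65. For the small school with nj = 3 and X̄j = 63:
  λj = τ² / (τ² + σ²/nj) = 100 / (100 + 900/3) = 100 / 400 = 0.25
  estimate of μj = X̄ + λj·(X̄j − X̄) = 65 + 0.25·(63 − 65) = 64.5
For a school with nj = 100 and the same observed average, λj = 100 / (100 + 9) ≈ 0.92, so its estimate stays close to its own average. Small schools are shrunk strongly toward the overall mean; large schools are barely shrunk at all.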

Testing in Schools: Shrinkage Plot
[Figure: each school's direct estimate b*j alongside its shrunken estimate bj]

Testing in Schools: WinBUGS
Data: i = 1, …, 1978 (students); s = 1, …, 38 (schools)
Model:
  Yis ~ Normal(μs, σ²y)
  μs ~ Normal(μ, τ²) (priors on the school avgs)
Note: WinBUGS uses the precision instead of the variance to specify a normal distribution!
WinBUGS parameterization:
  Yis ~ Normal(μs, τy), with σ²y = 1/τy
  μs ~ Normal(μ, τμ), with τ² = 1/τμ

Testing in Schools: WinBUGS
WinBUGS model:
  Yis ~ Normal(μs, τy), with σ²y = 1/τy
  μs ~ Normal(μ, τμ), with τ² = 1/τμ
  τy ~ Gamma(0.001, 0.001) (prior on the precision)
Hyperpriors:
  Prior on the mean of the school means: μ ~ Normal(0, 1/1000000) (precision scale)
  Prior on the precision (inverse variance) of the school means: τμ ~ Gamma(0.001, 0.001)
Using "vague" / "noninformative" priors

Testing in Schools: WinBUGS
Full WinBUGS model:
  Yis ~ Normal(μs, τy), with σ²y = 1/τy
  μs ~ Normal(μ, τμ), with τ² = 1/τμ
  τy ~ Gamma(0.001, 0.001)
  μ ~ Normal(0, 1/1000000)
  τμ ~ Gamma(0.001, 0.001)

Testing in Schools: WinBUGS
WinBUGS code:
  model {
    for (i in 1 : N) {                       # loop over the 1978 students
      Y[i] ~ dnorm(mu[i], y.tau)             # test score, with precision y.tau
      mu[i] <- alpha[school[i]]              # each student's mean is his/her school's mean
    }
    for (s in 1 : M) {                       # loop over the 38 schools
      alpha[s] ~ dnorm(alpha.c, alpha.tau)   # school means drawn from a common normal
    }
    y.tau ~ dgamma(0.001, 0.001)             # vague prior on the within-school precision
    sigma <- 1 / sqrt(y.tau)                 # within-school SD
    alpha.c ~ dnorm(0.0, 1.0E-6)             # vague prior on the overall mean
    alpha.tau ~ dgamma(0.001, 0.001)         # vague prior on the between-school precision
  }
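Because the stated goal is to separate 'good' from 'bad' schools, one natural extension (not in the original slides; the node names below are hypothetical additions) is to monitor the between-school SD and each school's rank by adding lines such as these inside the model block above:
  sigma.alpha <- 1 / sqrt(alpha.tau)       # between-school SD
  for (s in 1 : M) {
    school.rank[s] <- rank(alpha[], s)     # rank of school s among the M school means
  }
Monitoring school.rank then gives a posterior distribution for each school's rank, which conveys how confidently the schools can actually be ordered.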

Example 2: Aww, Rats… A normal hierarchical model for repeated measures

Improving Individual-Level Estimates (Gelfand et al., 1990)
- 30 young rats, weights measured weekly for five weeks
- Dependent variable (Yij) is the weight of rat i at week j
- Data are multilevel: weights (observations) within rats (clusters)

Individual & Population Growth
- Rat i has its own expected growth line: E(Yij) = b0i + b1i·Xj
- There is also an overall, average population growth line: E(Yij) = β0 + β1·Xj
[Figure: individual growth lines and the population line (average growth); weight vs. study day (centered)]

Improving Individual-Level Estimates: Possible Analyses
1. Each rat (cluster) has its own line: intercept = bi0, slope = bi1
2. All rats follow the same line: bi0 = β0, bi1 = β1
3. A compromise between these two: each rat has its own line, BUT… the lines come from an assumed distribution ("random effects"):
   E(Yij | bi0, bi1) = bi0 + bi1·Xj
   bi0 ~ N(β0, τ0²)
   bi1 ~ N(β1, τ1²)

Bayes-Shrunk Individual Growth Lines
A compromise: each rat has its own line, but information is borrowed across rats to tell us about individual rat growth.
[Figure: Bayes-shrunk individual growth lines and the population line (average growth); weight vs. study day (centered)]

Rats: WinBUGS (see Help > Examples Vol I)
WinBUGS model:
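The model itself appears as a figure in the original slide. Written out in the precision parameterization that WinBUGS uses, following the random-effects specification above and the standard Rats example (the symbol names are this reconstruction's, not necessarily the slide's):
  Yij ~ Normal(αi + βi·(xj − x̄), τc), with σ²c = 1/τc
  αi ~ Normal(αc, τα) (random intercepts)
  βi ~ Normal(βc, τβ) (random slopes)
  vague priors on αc, βc and on the precisions τc, τα, τβ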

Rats: WinBUGS (see Help > Examples Vol I)
WinBUGS code:
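The code appears as a screenshot in the original slide. The sketch below follows the Rats example shipped with WinBUGS (Examples Vol I); check the help files for the authoritative version, since this is reproduced from the standard example rather than from the slide image:
  model {
    for (i in 1 : N) {                        # N = 30 rats
      for (j in 1 : T) {                      # T = 5 weekly weighings
        Y[i, j] ~ dnorm(mu[i, j], tau.c)
        mu[i, j] <- alpha[i] + beta[i] * (x[j] - xbar)   # rat-specific line, centered study day
      }
      alpha[i] ~ dnorm(alpha.c, alpha.tau)    # random intercepts
      beta[i] ~ dnorm(beta.c, beta.tau)       # random slopes
    }
    tau.c ~ dgamma(0.001, 0.001)              # within-rat precision
    sigma <- 1 / sqrt(tau.c)                  # within-rat SD
    alpha.c ~ dnorm(0.0, 1.0E-6)              # population intercept (at xbar)
    beta.c ~ dnorm(0.0, 1.0E-6)               # population slope
    alpha.tau ~ dgamma(0.001, 0.001)
    beta.tau ~ dgamma(0.001, 0.001)
    alpha0 <- alpha.c - xbar * beta.c         # intercept at day 0
  }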

Rats: WinBUGS (see Help > Examples Vol I)
WinBUGS results: 10000 updates
[Figure: WinBUGS output after 10000 updates]

WinBUGS Diagnostics
- MC error tells you to what extent simulation error contributes to the uncertainty in the estimation of the mean. This can be reduced by generating additional samples.
- Always examine the trace of the samples. To do this, select the "history" button on the Sample Monitor Tool. Look for trends and correlations.

Rats: WinBUGS (see Help > Examples Vol I)
WinBUGS diagnostics: history
[Figure: history (trace) plots]

WinBUGS Diagnostics
- Examine sample autocorrelation directly by selecting the 'auto cor' button.
- If autocorrelation exists, generate additional samples and thin more.

Rats: WinBUGS (see Help > Examples Vol I)
WinBUGS diagnostics: autocorrelation
[Figure: autocorrelation plots]

Individual vs. Bayes-Shrunk Growth Lines
WinBUGS provides the machinery for the Bayesian paradigm: "shrinkage estimates" in MLMs.
[Figure: two panels of weight vs. study day (centered), each with the population line (average growth); one shows the individual growth lines, the other the Bayes-shrunk growth lines]

School Test Scores Revisited

Testing in Schools Revisited
Suppose we wanted to include covariate information in the school test scores example.
- Student-level covariates:
  - Gender
  - London Reading Test (LRT) score
  - Verbal reasoning (VR) test category (1, 2, or 3)
- School-level covariates:
  - Gender intake (all girls, all boys, or mixed)
  - Religious denomination (Church of England, Roman Catholic, state school, or other)

Testing in Schools Revisited: Model
Wow! Can YOU fit this model? Yes you can! See WinBUGS > Help > Examples Vol II for the data, code, results, etc.
More importantly: do you understand this model?
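The model equations appear as a figure in the original slide and are not reproduced in this transcript. Purely as a sketch of how the covariates listed above could enter (the variable names below are invented for illustration, and this is not necessarily the exact Examples Vol II specification):
  model {
    for (i in 1 : N) {                                    # students
      Y[i] ~ dnorm(mu[i], y.tau)
      mu[i] <- alpha[school[i]] + b.lrt * LRT[i] + b.girl * girl[i] + b.vr[VR[i]]
    }
    for (s in 1 : M) {                                    # schools
      alpha[s] ~ dnorm(alpha.mean[s], alpha.tau)          # school random effects
      alpha.mean[s] <- g0 + g.intake[intake[s]] + g.denom[denom[s]]
    }
    b.vr[1] <- 0                                          # corner constraints for
    g.intake[1] <- 0                                      # the categorical covariates
    g.denom[1] <- 0
    for (k in 2 : 3) { b.vr[k] ~ dnorm(0.0, 1.0E-4) }     # VR categories 2, 3
    for (k in 2 : 3) { g.intake[k] ~ dnorm(0.0, 1.0E-4) } # boys-only, mixed intake
    for (k in 2 : 4) { g.denom[k] ~ dnorm(0.0, 1.0E-4) }  # RC, state, other denomination
    b.lrt ~ dnorm(0.0, 1.0E-4)
    b.girl ~ dnorm(0.0, 1.0E-4)
    g0 ~ dnorm(0.0, 1.0E-4)
    y.tau ~ dgamma(0.001, 0.001)
    alpha.tau ~ dgamma(0.001, 0.001)
  }
Here girl[i], LRT[i], and VR[i] (coded 1-3) are student-level data, while intake[s] (coded 1-3) and denom[s] (coded 1-4) are school-level data.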

Bayesian Concepts
- Frequentist: parameters are "the truth"
- Bayesian: parameters have a distribution
- "Borrow strength" from other observations
- "Shrink estimates" towards overall averages: a compromise between model & data
- Incorporate prior/other information in the estimates
- Account for other sources of uncertainty
- Posterior ∝ Likelihood × Prior
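As a concrete instance of "Posterior ∝ Likelihood × Prior" (a standard normal-normal result, stated here as an addition to the slides): if a school mean has prior μj ~ N(μ, τ²) and its observed average satisfies X̄j | μj ~ N(μj, σ²/nj), then
  μj | X̄j ~ N( λj·X̄j + (1 − λj)·μ , λj·σ²/nj ), with λj = τ² / (τ² + σ²/nj)
so the posterior mean is exactly the shrinkage estimate used throughout this module: a precision-weighted compromise between the data (X̄j) and the prior, i.e., the other schools (μ).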