Module 3: An Example of a Two-stage Model; NMMAPS Study

Slides:



Advertisements
Similar presentations
Bayesian Distributed Lag Models: Estimating Effects of Particulate Matter Air Pollution on Daily Mortality Leah J. Welty, PhD Department of Preventive.
Advertisements

The Simple Linear Regression Model Specification and Estimation Hill et al Chs 3 and 4.
Modeling of Data. Basic Bayes theorem Bayes theorem relates the conditional probabilities of two events A, and B: A might be a hypothesis and B might.
Chapter 9: Simple Regression Continued
Properties of Least Squares Regression Coefficients
Introduction to Smoothing and Spatial Regression
INTRODUCTION TO MACHINE LEARNING Bayesian Estimation.
1 Parametric Empirical Bayes Methods for Microarrays 3/7/2011 Copyright © 2011 Dan Nettleton.
Uncertainty and confidence intervals Statistical estimation methods, Finse Friday , 12.45–14.05 Andreas Lindén.
2005 Hopkins Epi-Biostat Summer Institute1 Module 2: Bayesian Hierarchical Models Francesca Dominici Michael Griswold The Johns Hopkins University Bloomberg.
2005 Hopkins Epi-Biostat Summer Institute1 Module I: Statistical Background on Multi-level Models Francesca Dominici Michael Griswold The Johns Hopkins.
Psychology 202b Advanced Psychological Statistics, II February 15, 2011.
Model Choice in Time Series Studies of Air Pollution and Health Roger D. Peng, PhD Department of Biostatistics Johns Hopkins Blomberg School of Public.
Using ranking and DCE data to value health states on the QALY scale using conventional and Bayesian methods Theresa Cain.
Health Risks of Exposure to Chemical Composition of Fine Particulate Air Pollution Francesca Dominici Yeonseung Chung Michelle Bell Roger Peng Department.
Mapping Rates and Proportions. Incidence rates Mortality rates Birth rates Prevalence Proportions Percentages.
Are the Mortality Effects of PM 10 the Result of Inadequate Modeling of Temperature and Seasonality? Leah Welty EBEG February 2, 2004 Joint work with Scott.
1 Bayesian methods for parameter estimation and data assimilation with crop models Part 2: Likelihood function and prior distribution David Makowski and.
Modeling Menstrual Cycle Length in Pre- and Peri-Menopausal Women Michael Elliott Xiaobi Huang Sioban Harlow University of Michigan School of Public Health.
Lecture 2 Basic Bayes and two stage normal normal model…
2006 Hopkins Epi-Biostat Summer Institute1 Module 2: Bayesian Hierarchical Models Instructor: Elizabeth Johnson Course Developed: Francesca Dominici and.
Guide to Handling Missing Information Contacting researchers Algebraic recalculations, conversions and approximations Imputation method (substituting missing.
Statistical Decision Theory
Reinhard Mechler, Markus Amann, Wolfgang Schöpp International Institute for Applied Systems Analysis A methodology to estimate changes in statistical life.
Term 4, 2005BIO656 Multilevel Models1 Hierarchical Models for Pooling: A Case Study in Air Pollution Epidemiology Francesca Dominici.
01/20151 EPI 5344: Survival Analysis in Epidemiology Maximum Likelihood Estimation: An Introduction March 10, 2015 Dr. N. Birkett, School of Epidemiology,
Statistical Applications for Meta-Analysis Robert M. Bernard Centre for the Study of Learning and Performance and CanKnow Concordia University December.
2006 Summer Epi/Bio Institute1 Module IV: Applications of Multi-level Models to Spatial Epidemiology Instructor: Elizabeth Johnson Lecture Developed: Francesca.
Lecture Topic 5 Pre-processing AFFY data. Probe Level Analysis The Purpose –Calculate an expression value for each probe set (gene) from the PM.
Statistical Decision Theory Bayes’ theorem: For discrete events For probability density functions.
2005 Hopkins Epi-Biostat Summer Institute1 Module 3: An Example of a Two-stage Model; NMMAPS Study Francesca Dominici Michael Griswold The Johns Hopkins.
Introduction Outdoor air pollution has a negative effect on health. On days of high air pollution, rates of cardiovascular and respiratory events increase.
Multi-level Models Summer Institute 2005 Francesca Dominici Michael Griswold The Johns Hopkins University Bloomberg School of Public Health.
Bayesian Travel Time Reliability
Machine Learning 5. Parametric Methods.
Montserrat Fuentes Statistics Department NCSU Research directions in climate change SAMSI workshop, September 14, 2009.
Tutorial I: Missing Value Analysis
July Hopkins Epi-Biostat Summer Institute Module II: An Example of a Two- stage Model; NMMAPS Study Francesca Dominici and Scott L. Zeger.
Stats Term Test 4 Solutions. c) d) An alternative solution is to use the probability mass function and.
1 Part09: Applications of Multi- level Models to Spatial Epidemiology Francesca Dominici & Scott L Zeger.
1 Spatial assessment of deprivation and mortality risk in Nova Scotia: Comparison between Bayesian and non-Bayesian approaches Prepared for 2008 CPHA Conference,
1 Module IV: Applications of Multi-level Models to Spatial Epidemiology Francesca Dominici & Scott L Zeger.
Bootstrapping James G. Anderson, Ph.D. Purdue University.
Density Estimation in R Ha Le and Nikolaos Sarafianos COSC 7362 – Advanced Machine Learning Professor: Dr. Christoph F. Eick 1.
Assignable variation Deviations with a specific cause or source. Click here for Hint assignable variation or SPC or LTPD?
Acceptable quality level (AQL) Proportion of defects a consumer considers acceptable. Click here for Hint AQL or producer’s risk or assignable variation?
Ch 1. Introduction Pattern Recognition and Machine Learning, C. M. Bishop, Updated by J.-H. Eom (2 nd round revision) Summarized by K.-I.
1 Life Cycle Assessment A product-oriented method for sustainability analysis UNEP LCA Training Kit Module k – Uncertainty in LCA.
ETHEM ALPAYDIN © The MIT Press, Lecture Slides for.
Canadian Bioinformatics Workshops
Regression Models for Linkage: Merlin Regress
Types of risk Market risk
Introduction to Inference
Missing data: Why you should care about it and what to do about it
usually unimportant in social surveys:
Module 2: Bayesian Hierarchical Models
AP Biology Intro to Statistics
acceptable quality level (AQL) Proportion of defects
Potential predictability, ensemble forecasts and tercile probabilities
CJT 765: Structural Equation Modeling
Chapter 3: TWO-VARIABLE REGRESSION MODEL: The problem of Estimation
Combining Random Variables
Gene mapping in mice Karl W Broman Department of Biostatistics
Figures for Jerrett et al. EDE –
Induction and latency (J-F Boivin, March 2006)
Integration of sensory modalities
5.2 Least-Squares Fit to a Straight Line
ECON734: Spatial Econometrics – Lab 3
Computing and Statistical Data Analysis / Stat 7
STA 291 Summer 2008 Lecture 12 Dustin Lueker.
Presentation transcript:

Module 3: An Example of a Two-stage Model; NMMAPS Study Instructor: Elizabeth Johnson Course Developed: Francesca Dominici and Michael Griswold The Johns Hopkins University Bloomberg School of Public Health 2006 Hopkins Epi-Biostat Summer Institute

NMMAPS Example of Two-Stage Hierarchical Model National Morbidity and Mortality Air Pollution Study (NMMAPS) Daily data on cardiovascular/respiratory mortality in 10 largest cities in U.S. Daily particulate matter (PM10) data Log-linear regression estimate relative risk of mortality per 10 unit increase in PM10 for each city Estimate and statistical standard error for each city 2006 Hopkins Epi-Biostat Summer Institute

Semi-Parametric Poisson Regression weather Log-Relative Rate Log-relative rate Season Season Weather Splines splines 2006 Hopkins Epi-Biostat Summer Institute

Relative Risks* for Six Largest Cities City RR Estimate (% per 10 micrograms/ml Statistical Standard Error Statistical Variance Los Angeles 0.25 0.13 .0169 New York 1.4 .0625 Chicago 0.60 Dallas/Ft Worth 0.55 .3025 Houston 0.45 0.40 .1600 San Diego 1.0 .2025 Approximate values read from graph in Daniels, et al. 2000. AJE 2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute Notation 2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute City

2006 Hopkins Epi-Biostat Summer Institute City

2006 Hopkins Epi-Biostat Summer Institute City

2006 Hopkins Epi-Biostat Summer Institute City

2006 Hopkins Epi-Biostat Summer Institute Notation 2006 Hopkins Epi-Biostat Summer Institute

Estimating Overall Mean Idea: give more weight to more precise values Specifically, weight estimates inversely proportional to their variances 2006 Hopkins Epi-Biostat Summer Institute

Estimating the Overall Mean 2006 Hopkins Epi-Biostat Summer Institute

Calculations for Empirical Bayes Estimates City Log RR (bc) Stat Var (vc) Total Var (TVc) 1/TVc wc LA 0.25 .0169 .0994 10.1 .27 NYC 1.4 .0625 .145 6.9 .18 Chi 0.60 Dal .3025 .385 2.6 .07 Hou 0.45 .160 ,243 4.1 .11 SD 1.0 .2025 .285 3.5 .09 Over-all 0.65 37.3 1.00 a = .27* 0.25 + .18*1.4 + .27*0.60 + .07*0.25 + .11*0.45 + 0.9*1.0 = 0.65 Var(a) = 1/Sum(1/TVc) = 0.164^2 2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute Software in R beta.hat <-c(0.25,1.4,0.50,0.25,0.45,1.0) se <- c(0.13,0.25,0.13,0.55,0.40,0.45) NV <- var(beta.hat) - mean(se^2) TV <- se^2 + NV tmp<- 1/TV ww <- tmp/sum(tmp) v.alphahat <- sum(ww)^{-1} alpha.hat <- v.alphahat*sum(beta.hat*ww) 2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute Two Extremes Natural variance >> Statistical variances Weights wc approximately constant = 1/n Use ordinary mean of estimates regardless of their relative precision Statistical variances >> Natural variance Weight each estimator inversely proportional to its statistical variance 2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute

Estimating Relative Risk for Each City Disease screening analogy Test result from imperfect test Positive predictive value combines prevalence with test result using Bayes theorem Empirical Bayes estimator of the true value for a city is the conditional expectation of the true value given the data 2006 Hopkins Epi-Biostat Summer Institute

Empirical Bayes Estimation 2006 Hopkins Epi-Biostat Summer Institute

Calculations for Empirical Bayes Estimates City Log RR Stat Var (vc) Total Var (TVc) 1/TVc wc RR.EB se LA 0.25 .0169 .0994 10.1 .27 .83 0.32 0.17 NYC 1.4 .0625 .145 6.9 .18 .57 1.1 0.14 Chi 0.60 0.61 0.11 Dal .3025 .385 2.6 .07 .21 0.56 0.12 Hou 0.45 .160 ,243 4.1 .11 .34 0.58 SD 1.0 .2025 .285 3.5 .09 .29 0.75 0.13 Over-all 0.65 1/37.3= 0.027 37.3 1.00 0.16 2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute Maximum likelihood estimates Empirical Bayes estimates 2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute Key Ideas Better to use data for all cities to estimate the relative risk for a particular city Reduce variance by adding some bias Smooth compromise between city specific estimates and overall mean Empirical-Bayes estimates depend on measure of natural variation Assess sensitivity to estimate of NV 2006 Hopkins Epi-Biostat Summer Institute

2006 Hopkins Epi-Biostat Summer Institute Caveats Used simplistic methods to illustrate the key ideas: Treated natural variance and overall estimate as known when calculating uncertainty in EB estimates Assumed normal distribution or true relative risks Can do better using Markov Chain Monte Carlo methods – more to come 2006 Hopkins Epi-Biostat Summer Institute