Analysis of matched data


Pair Matching: Why match? Pairing can control for extraneous sources of variability and increase the power of a statistical test. Match 1 control to 1 case based on potential confounders, such as age, gender, and smoking.

Example: Johnson and Johnson (NEJM 287: , 1972) selected 85 Hodgkin’s patients who had a sibling of the same sex who was free of the disease and whose age was within 5 years of the patient’s. They presented the data as a 2x2 table of group (Hodgkin’s, sib control) by exposure (tonsillectomy, none). OR=1.47; chi-square=1.53 (NS). From John A. Rice, “Mathematical Statistics and Data Analysis.”

Example: But several letters to the editor pointed out that those investigators had made an error by ignoring the pairings. These are not independent samples because the sibs are paired. It is better to cross-classify the pairs by the exposure of the case (tonsillectomy, none) and the exposure of the control (tonsillectomy, none). OR=2.14; chi-square=2.91 (p=.09). From John A. Rice, “Mathematical Statistics and Data Analysis.”

Pair Matching Match each MI case to an MI control based on age and gender. Ask about history of diabetes to find out if diabetes increases your risk for MI.

Pair Matching: pairs cross-classified by the diabetes status of the case and of the control

                          MI controls
                          Diabetes    No diabetes
MI cases   Diabetes           9           37
           No diabetes       16           82

Each pair is its own “age-gender” stratum. Example: a pair concordant for exposure (cell “a” from before), in which both the case (MI) and the control have diabetes.

The pairs fall into four types of stratum: case (MI) and control both with diabetes, x 9; case with diabetes and control without, x 37; case without diabetes and control with, x 16; case and control both without diabetes, x 82.

Mantel-Haenszel for pair-matched data: We want to know the relationship between diabetes and MI controlling for age and gender. Mantel-Haenszel methods apply.

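RECALL: The Mantel-Haenszel Summary Odds Ratio. In stratum $k$, the 2x2 table has rows Exposed and Not Exposed, columns Case and Control, cells $a_k$, $b_k$ (exposed) and $c_k$, $d_k$ (not exposed), and total $T_k$: $OR_{MH} = \dfrac{\sum_k a_k d_k / T_k}{\sum_k b_k c_k / T_k}$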

Contribution of each type of pair (each stratum has T=2): case and control both with diabetes, ad/T=0 and bc/T=0; case with diabetes and control without, ad/T=1/2 and bc/T=0; case without diabetes and control with, ad/T=0 and bc/T=1/2; case and control both without diabetes, ad/T=0 and bc/T=0.

Mantel-Haenszel Summary OR: $OR_{MH} = \dfrac{37 \times \frac{1}{2}}{16 \times \frac{1}{2}} = \dfrac{37}{16} = 2.31$

The OR estimate comes only from discordant pairs! OR = 37/16 = 2.31. Makes sense!
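The reduction of the Mantel-Haenszel summary OR to the ratio of discordant pairs can be checked numerically. The sketch below, in plain Python, treats each matched pair as its own stratum with T = 2 and uses the pair counts from the slides (9, 37, 16, 82). The function and variable names are illustrative assumptions, not part of the original slides.

```python
# Pair counts from the slides, keyed by (case exposed?, control exposed?)
pair_counts = {
    (1, 1): 9,    # case and control both have diabetes
    (1, 0): 37,   # only the case has diabetes
    (0, 1): 16,   # only the control has diabetes
    (0, 0): 82,   # neither has diabetes
}


def mantel_haenszel_or(counts):
    """Mantel-Haenszel summary OR with every matched pair as one stratum."""
    numerator = 0.0    # sum over strata of a*d/T
    denominator = 0.0  # sum over strata of b*c/T
    for (case_exposed, control_exposed), n_pairs in counts.items():
        # 2x2 table for a single pair: rows exposed / not exposed,
        # columns case / control
        a, b = case_exposed, control_exposed
        c, d = 1 - case_exposed, 1 - control_exposed
        total = a + b + c + d  # always 2 for a matched pair
        numerator += n_pairs * a * d / total
        denominator += n_pairs * b * c / total
    return numerator / denominator


or_mh = mantel_haenszel_or(pair_counts)
print(round(or_mh, 2))
```

Only the two discordant pair types contribute non-zero terms, so the concordant counts (9 and 82) can be changed without affecting the result.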

McNemar’s Test: The OR estimate comes only from discordant pairs! The question is: among the discordant pairs, what proportion are discordant in the direction of the case vs. the direction of the control? If more discordant pairs “favor” the case, this indicates OR>1.

P(“favors” case | discordant pair) = $\dfrac{37}{37+16} = \dfrac{37}{53} = 0.70$

odds(“favors” case | discordant pair) = $\dfrac{37}{16} = 2.31$

McNemar’s Test. Null hypothesis: P(“favors” case | discordant pair) = .5 (note: equivalent to OR=1.0 or cell b=cell c). By normal approximation to binomial: $Z = \dfrac{37 - 53(.5)}{\sqrt{53(.5)(.5)}} = \dfrac{10.5}{3.64} = 2.88$

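McNemar’s Test: generally. Cross-classify the pairs by exposure of the case and of the control: a = both exposed, b = only the case exposed, c = only the control exposed, d = neither exposed. By normal approximation to binomial: $Z = \dfrac{b - \frac{b+c}{2}}{\sqrt{(b+c)(.5)(.5)}} = \dfrac{b-c}{\sqrt{b+c}}$. Equivalently: $\chi^2_1 = \dfrac{(b-c)^2}{b+c}$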
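As a rough check of the general formula, the sketch below applies McNemar's test to the discordant counts from the diabetes and MI example (b = 37, c = 16). It uses only the Python standard library. The function name and the two-sided normal p-value are assumptions added for illustration.

```python
import math


def mcnemar(b, c):
    """McNemar's test from the two discordant cell counts.

    b: pairs where only the case is exposed
    c: pairs where only the control is exposed
    Returns (z, chi_square, two_sided_p) using the normal approximation.
    """
    z = (b - c) / math.sqrt(b + c)
    chi_square = (b - c) ** 2 / (b + c)            # equals z squared, 1 df
    two_sided_p = math.erfc(abs(z) / math.sqrt(2))  # 2 * (1 - Phi(|z|))
    return z, chi_square, two_sided_p


z, chi_square, p_value = mcnemar(37, 16)
print(round(z, 2), round(chi_square, 2), round(p_value, 4))

# Proportion and odds of "favoring" the case among discordant pairs
proportion_case = 37 / (37 + 16)
odds_case = 37 / 16
print(round(proportion_case, 2), round(odds_case, 2))
```

The normal approximation is reasonable here because there are 53 discordant pairs; with very few discordant pairs an exact binomial test would be the safer choice.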

Example: Salmonella Outbreak in France, 1996. From: “Large outbreak of Salmonella enterica serotype paratyphi B infection caused by a goats' milk cheese, France, 1993: a case finding and epidemiological study” BMJ 312: ; Jan

Epidemic Curve

Matched Case Control Study: Case = Salmonella gastroenteritis. Community controls (1:1) matched for: age group ( = 65 years); gender; city of residence.

Results

In 2x2 table form, any goat’s cheese: Cases (Goat’s cheese / None) by Controls (Goat’s cheese / None): 29, 30

In 2x2 table form, Brand B goat’s cheese: Cases (Goat’s cheese B / None) by Controls (Goat’s cheese B / None): 10, 49

Introduction to Logistic Regression: binary outcome!

Example: The Bernoulli (binomial) distribution. [Plot: lung cancer (yes/no) against smoking (cigarettes/day)]

Could model probability of lung cancer… $\pi = \alpha + \beta_1 X$ [Plot: the probability of lung cancer ($\pi$), from 0 to 1, against smoking (cigarettes/day)] But why might this not be best modeled as linear?

Alternatively… $\log\left(\dfrac{\pi}{1-\pi}\right) = \alpha + \beta_1 X$ (the logit function)
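One way to see why the linear form is awkward for a probability: a straight line can leave the interval from 0 to 1, while the inverse of the logit cannot. The sketch below compares the two. All coefficients are hypothetical values chosen only for illustration; they are not estimates from any data in these slides.

```python
import math

# HYPOTHETICAL coefficients, for illustration only
LINEAR_ALPHA, LINEAR_BETA = -0.1, 0.03   # linear probability model
LOGIT_ALPHA, LOGIT_BETA = -4.0, 0.1      # logit model


def linear_probability(x):
    """'Probability' from a straight line: not constrained to [0, 1]."""
    return LINEAR_ALPHA + LINEAR_BETA * x


def logistic_probability(x):
    """Probability from the logit model: invert log(p / (1 - p)) = alpha + beta * x."""
    log_odds = LOGIT_ALPHA + LOGIT_BETA * x
    return 1.0 / (1.0 + math.exp(-log_odds))


for cigarettes_per_day in (0, 20, 40, 60):
    print(
        cigarettes_per_day,
        round(linear_probability(cigarettes_per_day), 3),
        round(logistic_probability(cigarettes_per_day), 3),
    )
```

With these made-up coefficients the line gives a negative value at 0 cigarettes/day and a value above 1 at 60, whereas the logistic curve stays strictly between 0 and 1 at every exposure level.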

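The Logit Model: $\log\left(\dfrac{\pi_i}{1-\pi_i}\right) = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \dots$ The left-hand side is the logit function (log odds); $\alpha$ is the baseline log odds; the remaining terms are a linear function of risk factors for individual i.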

To get back to OR’s…
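The step from the logit model to an odds ratio can be written out as a short derivation. This is a sketch for a single binary exposure $x$ (1 = exposed, 0 = unexposed); the symbols $\pi$, $\alpha$, and $\beta_1$ are assumed notation for the probability of disease, the intercept, and the exposure coefficient.

```latex
\begin{align*}
\log\left(\frac{\pi(x)}{1-\pi(x)}\right) &= \alpha + \beta_1 x \\
\text{odds}(x) = \frac{\pi(x)}{1-\pi(x)} &= e^{\alpha + \beta_1 x} \\
OR = \frac{\text{odds}(x=1)}{\text{odds}(x=0)} &= \frac{e^{\alpha + \beta_1}}{e^{\alpha}} = e^{\beta_1}
\end{align*}
```

The intercept cancels in the ratio, which is why the odds ratio depends only on the coefficient of the exposure.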

“Adjusted” Odds Ratio Interpretation

Adjusted odds ratio, continuous predictor

Practical Interpretation: The odds of disease increase multiplicatively by $e^{\beta}$ for every one-unit increase in the exposure, controlling for other variables in the model.
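A small numeric check of this interpretation: under the logit model, the ratio of odds at x + 1 to odds at x equals e raised to beta wherever x starts, and a k-unit increase multiplies the odds by e raised to k times beta. The intercept and coefficient below are hypothetical values for illustration only.

```python
import math

ALPHA = -4.0   # HYPOTHETICAL intercept (baseline log odds)
BETA = 0.05    # HYPOTHETICAL log odds ratio per one-unit increase in exposure


def odds(x, alpha=ALPHA, beta=BETA):
    """Odds of disease at exposure level x under the logit model."""
    return math.exp(alpha + beta * x)


def odds_ratio(units, beta=BETA):
    """Multiplicative change in odds for an increase of `units` in exposure."""
    return math.exp(beta * units)


# The one-unit odds ratio is the same at every starting exposure level
for start in (0, 10, 30):
    print(start, round(odds(start + 1) / odds(start), 4))

# A ten-unit increase multiplies the odds by exp(10 * beta)
print(round(odds_ratio(10), 4))
```

This is also why a coefficient's odds ratio should always be reported together with the units of the exposure.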

Example: >2 exposure levels *(dummy coding). CHD status by race: columns White, Black, Hispanic, Other; rows Present, Absent (20, 10)

SAS CODE

data race;
  input chd race_2 race_3 race_4 number;
  datalines;
end;
run;

proc logistic data=race descending;
  weight number;
  model chd = race_2 race_3 race_4;
run;

Note the use of “dummy variables.” “Baseline” category is white here.

SAS OUTPUT – model fit. Model Fit Statistics: criteria AIC, SC, and -2 Log L, each for Intercept Only and for Intercept and Covariates. Testing Global Null Hypothesis: BETA=0: Likelihood Ratio, Score, and Wald tests, each with Chi-Square, DF, and Pr > ChiSq.

SAS OUTPUT – regression coefficients. Analysis of Maximum Likelihood Estimates: columns Parameter, DF, Estimate, Standard Error, Wald Chi-Square, Pr > ChiSq; rows Intercept, race_2, race_3, race_4.

SAS output – OR estimates. The LOGISTIC Procedure, Odds Ratio Estimates: columns Effect, Point Estimate, 95% Wald Confidence Limits; rows race_2, race_3, race_4. Interpretation: 8x increase in odds of CHD for black vs. white; 6x increase in odds of CHD for Hispanic vs. white; 4x increase in odds of CHD for other vs. white.
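The link between dummy coding and these odds ratios can be sketched in plain Python. When the only predictors are the dummy variables for one categorical exposure, the maximum likelihood estimate of each coefficient is the log of the sample odds ratio against the baseline category, so no model fitting is needed to illustrate the idea. The counts below are HYPOTHETICAL, chosen only so that the odds ratios come out as 8, 6, and 4 as in the interpretation above; they are not taken from the slides. The names `dummy_code`, `counts`, and `BASELINE` are likewise illustrative.

```python
import math

# HYPOTHETICAL (CHD present, CHD absent) counts per group, for illustration only
counts = {
    "white": (5, 20),
    "black": (20, 10),
    "hispanic": (15, 10),
    "other": (10, 10),
}
BASELINE = "white"
NON_BASELINE_LEVELS = ("black", "hispanic", "other")  # race_2, race_3, race_4


def dummy_code(race):
    """Return (race_2, race_3, race_4); the baseline category is all zeros."""
    return tuple(int(race == level) for level in NON_BASELINE_LEVELS)


def odds(present, absent):
    """Odds of CHD within one group."""
    return present / absent


baseline_odds = odds(*counts[BASELINE])
odds_ratios = {
    race: odds(*counts[race]) / baseline_odds for race in NON_BASELINE_LEVELS
}
# Coefficients a logistic model with only these dummies would estimate
intercept = math.log(baseline_odds)
betas = {race: math.log(value) for race, value in odds_ratios.items()}

for race in counts:
    print(race, dummy_code(race))
for race, value in odds_ratios.items():
    print(race, round(value, 2), round(betas[race], 3))
```

Changing the baseline category changes every coefficient and odds ratio, so the reference group should always be stated alongside the estimates.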