Analysis of matched data


1

2 Analysis of matched data

3 Pair Matching: Why match? Pairing can control for extraneous sources of variability and increase the power of a statistical test. Match 1 control to 1 case based on potential confounders, such as age, gender, and smoking.

4 Example Johnson and Johnson (NEJM 287: 1122-1125, 1972) selected 85 Hodgkin’s patients who had a sibling of the same sex who was free of the disease and whose age was within 5 years of the patient’s. They presented the data as:

                   Tonsillectomy   None
    Hodgkin’s           41          44
    Sib control         33          52

OR=1.47; chi-square=1.53 (NS). From John A. Rice, “Mathematical Statistics and Data Analysis.”
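The unmatched figures quoted on this slide can be checked with a short Python sketch. The function name `unmatched_or_chi2` is an illustrative assumption, and the chi-square used is the uncorrected Pearson statistic, which is the version that reproduces 1.53 here.

```python
def unmatched_or_chi2(a, b, c, d):
    """Odds ratio and uncorrected Pearson chi-square for a 2x2 table.

    Layout: a = exposed cases,    b = unexposed cases,
            c = exposed controls, d = unexposed controls.
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return odds_ratio, chi2

# Hodgkin's patients: 41 tonsillectomy, 44 none; sibling controls: 33 and 52.
or_hat, chi2 = unmatched_or_chi2(41, 44, 33, 52)
print(round(or_hat, 2), round(chi2, 2))  # prints 1.47 1.53
```

This treats the 85 patients and 85 siblings as independent samples, which is exactly the assumption the matched analysis drops.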

5 Example But several letters to the editor pointed out that those investigators had made an error by ignoring the pairings. These are not independent samples because the sibs are paired. It is better to analyze the data like this:

                             Control: Tonsillectomy   Control: None
    Case: Tonsillectomy               26                   15
    Case: None                         7                   37

(Cells are arranged so that the margins reproduce the totals on the previous slide: 41 patients and 33 siblings with tonsillectomy.)

OR=2.14; chi-square=2.91 (p=.09). From John A. Rice, “Mathematical Statistics and Data Analysis.”

6 Pair Matching Match each MI case to a control without MI based on age and gender. Ask about history of diabetes to find out whether diabetes increases the risk of MI.

7 Pair Matching (rows: MI cases; columns: MI controls)

                         Control: Diabetes   Control: No diabetes   Total
    Case: Diabetes               9                   37               46
    Case: No diabetes           16                   82               98
    Total                       25                  119              144

8 Each pair is its own “age-gender” stratum. Example: concordant for exposure (cell “a” from before):

                   Case (MI)   Control
    Diabetes           1          1
    No diabetes        0          0

9 The four possible pair strata, with the number of pairs of each type:

                   Case (MI)   Control
    Diabetes           1          1
    No diabetes        0          0        x 9

    Diabetes           1          0
    No diabetes        0          1        x 37

    Diabetes           0          1
    No diabetes        1          0        x 16

    Diabetes           0          0
    No diabetes        1          1        x 82

10 Mantel-Haenszel for pair-matched data We want to know the relationship between diabetes and MI controlling for age and gender. Mantel-Haenszel methods apply.

11 RECALL: The Mantel-Haenszel Summary Odds Ratio

                   Case   Control
    Exposed          a       b
    Not Exposed      c       d
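For reference, the standard Mantel-Haenszel summary odds ratio can be written with the cell labels above. The stratum subscript $i$ and the stratum total $T_i$ are notation assumed here for illustration:

```latex
\[
\widehat{OR}_{MH} = \frac{\sum_{i} a_i d_i / T_i}{\sum_{i} b_i c_i / T_i},
\qquad T_i = a_i + b_i + c_i + d_i
\]
```

With pair matching, every stratum is a single case-control pair, so $T_i = 2$ in each term.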

12 Contribution of each type of pair stratum (T = 2 in every stratum):

                   Case (MI)   Control
    Diabetes           1          1
    No diabetes        0          0        ad/T=0     bc/T=0

    Diabetes           1          0
    No diabetes        0          1        ad/T=1/2   bc/T=0

    Diabetes           0          1
    No diabetes        1          0        ad/T=0     bc/T=1/2

    Diabetes           0          0
    No diabetes        1          1        ad/T=0     bc/T=0

13 Mantel-Haenszel Summary OR
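A minimal Python sketch of this calculation treats each of the 144 pairs as its own stratum and sums ad/T and bc/T over them. The function name `mantel_haenszel_or` and the dictionary layout are illustrative assumptions; the pair counts (9, 37, 16, 82) are the ones from the diabetes/MI table.

```python
from fractions import Fraction

# (a, b, c, d) for one pair, rows exposed / not exposed, columns case / control,
# followed by how many pairs of that type were observed.
pair_types = {
    "both exposed":         ((1, 1, 0, 0), 9),
    "case exposed only":    ((1, 0, 0, 1), 37),
    "control exposed only": ((0, 1, 1, 0), 16),
    "neither exposed":      ((0, 0, 1, 1), 82),
}

def mantel_haenszel_or(strata):
    """Sum ad/T and bc/T over strata and return their ratio."""
    num = Fraction(0)
    den = Fraction(0)
    for (a, b, c, d), count in strata:
        t = a + b + c + d
        num += count * Fraction(a * d, t)
        den += count * Fraction(b * c, t)
    return num / den

or_mh = mantel_haenszel_or(pair_types.values())
print(or_mh, round(float(or_mh), 2))  # prints 37/16 2.31
```

The concordant pairs add zero to both sums, so only the 37 and 16 discordant pairs survive, each weighted by 1/2.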

14 (rows: MI cases; columns: MI controls)

                         Control: Diabetes   Control: No diabetes   Total
    Case: Diabetes               9                   37               46
    Case: No diabetes           16                   82               98
    Total                       25                  119              144

OR estimate comes only from discordant pairs! OR = 37/16 = 2.31. Makes sense!

15 McNemar’s Test (rows: MI cases; columns: MI controls)

                         Control: Diabetes   Control: No diabetes   Total
    Case: Diabetes               9                   37               46
    Case: No diabetes           16                   82               98
    Total                       25                  119              144

OR estimate comes only from discordant pairs! The question is: among the discordant pairs, what proportion are discordant in the direction of the case vs. the direction of the control? If more discordant pairs “favor” the case, this indicates OR>1.

16 (rows: MI cases; columns: MI controls)

                         Control: Diabetes   Control: No diabetes   Total
    Case: Diabetes               9                   37               46
    Case: No diabetes           16                   82               98
    Total                       25                  119              144

P(“favors” case | discordant pair) = 37/(37+16) = 37/53 ≈ 0.70

17 (rows: MI cases; columns: MI controls)

                         Control: Diabetes   Control: No diabetes   Total
    Case: Diabetes               9                   37               46
    Case: No diabetes           16                   82               98
    Total                       25                  119              144

odds(“favors” case | discordant pair) = (37/53)/(16/53) = 37/16 ≈ 2.31

18 McNemar’s Test (rows: MI cases; columns: MI controls)

                         Control: Diabetes   Control: No diabetes
    Case: Diabetes               9                   37
    Case: No diabetes           16                   82

Null hypothesis: P(“favors” case | discordant pair) = .5 (note: equivalent to OR=1.0 or cell b = cell c). By normal approximation to binomial:
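As a worked version of this step (a sketch, with the symbol $Z$ assumed for the test statistic): under the null hypothesis the number of discordant pairs favoring the case is Binomial(53, 0.5), where 53 = 37 + 16, so

```latex
\[
Z = \frac{37 - 53(0.5)}{\sqrt{53(0.5)(0.5)}} = \frac{10.5}{3.64} \approx 2.88
\]
```

A value this far above 1.96 rejects the null hypothesis at the two-sided 0.05 level.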

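19 McNemar’s Test: generally (rows: cases; columns: controls)

                      Control: exp   Control: no exp
    Case: exp              a               b
    Case: no exp           c               d

By normal approximation to binomial: $Z = \frac{b - (b+c)(0.5)}{\sqrt{(b+c)(0.5)(0.5)}}$. Equivalently: $\chi^2_1 = \frac{(b-c)^2}{b+c}$.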
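The general test can be sketched in a few lines of Python using only the discordant counts b and c. The function name `mcnemar` is an illustrative assumption, no continuity correction is applied, and the two-sided p-value comes from the normal tail via `math.erfc`.

```python
import math

def mcnemar(b, c):
    """McNemar's test from the two discordant-pair counts.

    b = pairs where only the case is exposed,
    c = pairs where only the control is exposed.
    Returns (z, chi2, two-sided p), without continuity correction.
    """
    n = b + c
    z = (b - n * 0.5) / math.sqrt(n * 0.5 * 0.5)  # same as (b - c) / sqrt(b + c)
    chi2 = z ** 2                                  # same as (b - c)**2 / (b + c)
    p = math.erfc(abs(z) / math.sqrt(2))           # two-sided normal tail area
    return z, chi2, p

z, chi2, p = mcnemar(37, 16)       # diabetes / MI pairs
z2, chi2_2, p2 = mcnemar(15, 7)    # tonsillectomy / Hodgkin's pairs
print(round(z, 2), round(chi2, 2))
print(round(chi2_2, 2), round(p2, 2))  # second line prints 2.91 0.09
```

The second call reproduces the chi-square of 2.91 (p=.09) quoted for the paired tonsillectomy data.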

20 Example: Salmonella Outbreak in France, 1993. From: “Large outbreak of Salmonella enterica serotype paratyphi B infection caused by a goats' milk cheese, France, 1993: a case finding and epidemiological study,” BMJ 312: 91-94; Jan 1996.

21

22 Epidemic Curve

23 Matched Case Control Study Case = Salmonella gastroenteritis. Community controls (1:1) matched for: age group ( = 65 years); gender; city of residence.

24 Results

25 In 2x2 table form: any goat’s cheese (rows: cases; columns: controls)

                           Control: Goat’s cheese   Control: None   Total
    Case: Goat’s cheese             23                   23           46
    Case: None                       6                    7           13
    Total                           29                   30           59

26 In 2x2 table form: Brand B goat’s cheese (rows: cases; columns: controls)

                             Control: Goat’s cheese B   Control: None   Total
    Case: Goat’s cheese B              8                     24           32
    Case: None                         2                     25           27
    Total                             10                     49           59
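When a discordant cell is as small as the 2 in the Brand B table, an exact binomial version of McNemar's test is a natural companion to the normal approximation. The Python sketch below is an illustration added for that purpose, not an analysis taken from the BMJ paper; the function name `exact_mcnemar_p` is hypothetical, and the discordant counts are read off the two goat's cheese tables (23 and 6 for any goat's cheese, where 23 = 46 - 23 from the row total; 24 and 2 for Brand B).

```python
from math import comb

def exact_mcnemar_p(b, c):
    """Two-sided exact binomial p-value for discordant counts b and c.

    Under H0 the number of pairs favoring the case is Binomial(b + c, 0.5).
    """
    n = b + c
    k = min(b, c)
    # Probability of a split at least as lopsided as the one observed, one tail.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

for label, b, c in [("any goat's cheese", 23, 6), ("brand B", 24, 2)]:
    print(label, "matched OR =", round(b / c, 2), "exact p =", exact_mcnemar_p(b, c))
```

As with the diabetes example, the matched odds ratio is simply the ratio of the two discordant counts.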

27 Introduction to Logistic Regression: binary outcome!

28 Example: The Bernoulli (binomial) distribution [Figure: lung cancer (yes/no) plotted against smoking (cigarettes/day)]

29 Could model probability of lung cancer… $\pi = \alpha + \beta_1 X$ [Figure: the probability of lung cancer ($\pi$), from 0 to 1, against smoking (cigarettes/day)] But why might this not be best modeled as linear?

30 Alternatively… $\log\left(\frac{\pi}{1-\pi}\right) = \alpha + \beta_1 X$ (the logit function)
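To see what the logit buys, a minimal Python sketch compares a straight-line probability model with the logistic one. All coefficient values below (`alpha`, `beta1`, and the line `0.05 + 0.02 * cigs`) are made up for illustration and are not estimates from any data.

```python
import math

def logit(p):
    """Log odds of a probability p in (0, 1)."""
    return math.log(p / (1 - p))

def inv_logit(x):
    """Map any real-valued linear predictor back to a probability."""
    return 1 / (1 + math.exp(-x))

alpha, beta1 = -4.0, 0.1  # hypothetical logit-scale intercept and slope

for cigs in (0, 20, 40, 80):
    linear_p = 0.05 + 0.02 * cigs                 # hypothetical straight line
    logistic_p = inv_logit(alpha + beta1 * cigs)  # always strictly between 0 and 1
    print(cigs, round(linear_p, 2), round(logistic_p, 3))
```

At 80 cigarettes/day the straight line returns a "probability" above 1, while the logistic curve stays inside (0, 1).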

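31 The Logit Model $\ln\left(\frac{\pi_i}{1-\pi_i}\right) = \alpha + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 \ldots$ The left side is the logit function (log odds) for individual i; $\alpha$ sets the baseline odds (it is the log odds when every x is 0); the remaining terms are a linear function of the risk factors for individual i.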

32 To get back to OR’s…

33 “Adjusted” Odds Ratio Interpretation

34 Adjusted odds ratio, continuous predictor

35 Practical Interpretation The odds of disease increase multiplicatively by $e^{\beta}$ for every one-unit increase in the exposure, controlling for other variables in the model.
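Where the multiplicative factor comes from, as a brief derivation in the notation of the logit model (a single predictor $x$ with coefficient $\beta$ is shown; other covariates are assumed held fixed and absorbed into $\alpha$):

```latex
\[
\frac{\text{odds}(x+1)}{\text{odds}(x)}
  = \frac{e^{\alpha + \beta (x+1)}}{e^{\alpha + \beta x}}
  = e^{\beta}
\]
```

For a binary exposure coded 0/1, the same ratio is the adjusted odds ratio for exposed vs. unexposed.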

36 Example: >2 exposure levels *(dummy coding)

    CHD status   White   Black   Hispanic   Other
    Present        5      20       15        10
    Absent        20      10       10        10

37 SAS CODE

    /* One row per (chd, race) cell; number = count of people in that cell */
    data race;
      input chd race_2 race_3 race_4 number;
      datalines;
    0 0 0 0 20
    1 0 0 0 5
    0 1 0 0 10
    1 1 0 0 20
    0 0 1 0 10
    1 0 1 0 15
    0 0 0 1 10
    1 0 0 1 10
    ;
    run;

    /* descending: model the probability that chd = 1 */
    proc logistic data=race descending;
      weight number;
      model chd = race_2 race_3 race_4;
    run;

Note the use of “dummy variables.” “Baseline” category is white here.

38 SAS OUTPUT – model fit

                     Intercept    Intercept and
    Criterion        Only         Covariates
    AIC              140.629      132.587
    SC               140.709      132.905
    -2 Log L         138.629      124.587

    Testing Global Null Hypothesis: BETA=0

    Test                Chi-Square   DF   Pr > ChiSq
    Likelihood Ratio      14.0420     3     0.0028
    Score                 13.3333     3     0.0040
    Wald                  11.7715     3     0.0082

39 SAS OUTPUT – regression coefficients

    Analysis of Maximum Likelihood Estimates

                                 Standard     Wald
    Parameter   DF   Estimate    Error        Chi-Square   Pr > ChiSq
    Intercept    1   -1.3863     0.5000         7.6871       0.0056
    race_2       1    2.0794     0.6325        10.8100       0.0010
    race_3       1    1.7917     0.6455         7.7048       0.0055
    race_4       1    1.3863     0.6708         4.2706       0.0388

40 SAS output – OR estimates

    The LOGISTIC Procedure
    Odds Ratio Estimates

              Point       95% Wald
    Effect    Estimate    Confidence Limits
    race_2     8.000       2.316    27.633
    race_3     6.000       1.693    21.261
    race_4     4.000       1.074    14.895

Interpretation:
8x increase in odds of CHD for Black vs. White
6x increase in odds of CHD for Hispanic vs. White
4x increase in odds of CHD for other vs. White
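Because this model has one dummy variable per non-baseline group and nothing else, its estimates can be checked by hand from the cell counts. The Python sketch below is such a check, not a replacement for PROC LOGISTIC; it assumes the closed-form log odds ratio and its usual standard error (square root of the sum of reciprocal cell counts), and the dictionary names are illustrative.

```python
import math
from statistics import NormalDist

# CHD counts by group: (present, absent), taken from the example table.
counts = {"white": (5, 20), "black": (20, 10), "hispanic": (15, 10), "other": (10, 10)}

ref_present, ref_absent = counts["white"]
intercept = math.log(ref_present / ref_absent)   # log odds in the baseline group
z975 = NormalDist().inv_cdf(0.975)               # about 1.96

results = {}
for group in ("black", "hispanic", "other"):
    present, absent = counts[group]
    beta = math.log(present / absent) - intercept          # log odds ratio vs. white
    se = math.sqrt(1 / present + 1 / absent + 1 / ref_present + 1 / ref_absent)
    lower = math.exp(beta - z975 * se)                     # Wald 95% limits
    upper = math.exp(beta + z975 * se)
    results[group] = (beta, se, math.exp(beta), lower, upper)
    print(group, round(beta, 4), round(se, 4), round(math.exp(beta), 3))

print("intercept", round(intercept, 4))
```

The printed coefficients and standard errors should agree with the SAS estimates (for example 2.0794 and 0.6325 for race_2), and the odds ratios come out as 8, 6 and 4.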

