More Contingency Tables & Paired Categorical Data Lecture 8.


1 More Contingency Tables & Paired Categorical Data Lecture 8

2 A Larger Contingency Table
A 4-by-2 contingency table. (Made-up data filled into empty cells from last class.)
Exercise Level      Cold/Flu   No Cold/Flu   Total
No Exercise*        79         138           217
Light Exercise      96         126           222
Mod. Exercise       30         43            73
Heavy Exercise*     115        123           238
Totals (Marginal)   320        430           750

3 Estimated Distributions
The Conditional Distributions are the distributions of the response within each level of the predictor. For example:
  No Exercise: 79/217 = .364 experienced cold/flu, 138/217 = .636 didn’t
  Light Exercise: 96/222 = .432, 126/222 = .568
  Etc.
The Marginal Distribution is the distribution of the responses if we ignore information about the predictor.
  Cold/flu: 320/750 = .427
  No cold/flu: 430/750 = .573
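
The proportions on this slide can be reproduced with a few lines of Python. This is a minimal sketch, not part of the lecture: the function names and the dictionary layout are chosen here for illustration, and the counts are the made-up data from the 4-by-2 table.

```python
# Counts from the exercise table: (cold/flu, no cold/flu) for each exercise level.
counts = {
    "No Exercise": (79, 138),
    "Light Exercise": (96, 126),
    "Mod. Exercise": (30, 43),
    "Heavy Exercise": (115, 123),
}

def conditional_distributions(table):
    """Proportions within each row: the response distribution at each predictor level."""
    return {level: tuple(c / sum(row) for c in row) for level, row in table.items()}

def marginal_distribution(table):
    """Response proportions ignoring the predictor: column totals over the grand total."""
    col_totals = [sum(col) for col in zip(*table.values())]
    grand_total = sum(col_totals)
    return tuple(t / grand_total for t in col_totals)

conditional = conditional_distributions(counts)
marginal = marginal_distribution(counts)

for level, dist in conditional.items():
    print(level, [round(p, 3) for p in dist])
print("Marginal", [round(p, 3) for p in marginal])
```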

4 To Summarize Distributions in a Table
Exercise Level      Cold/Flu          No Cold/Flu
No Exercise*        79/217 = .364     138/217 = .636
Light Exercise      96/222 = .432     126/222 = .568
Mod. Exercise       30/73 = .411      43/73 = .589
Heavy Exercise*     115/238 = .483    123/238 = .517
Totals (Marginal)   320/750 = .427    430/750 = .573

5 Expected Values Under the Null
Exercise Level      Cold/Flu              No Cold/Flu           Total
No Exercise*        217*.427 ≈ 92.59      217*.573 ≈ 124.41     217
Light Exercise      222*.427 ≈ 94.72      222*.573 ≈ 127.28     222
Mod. Exercise       73*.427 ≈ 31.15       73*.573 ≈ 41.85       73
Heavy Exercise*     238*.427 ≈ 101.55     238*.573 ≈ 136.45     238
Totals (Marginal)   320                   430                   750
The values are approximate because the estimated probabilities .427 and .573 are themselves rounded. The round-off error is avoided by calculating each expected value directly from the totals, for example 92.59 = 217*320/750.
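
As a sketch of the "directly from the totals" calculation, the expected count in each cell is (row total)*(column total)/(grand total). The helper name `expected_counts` is an illustrative choice, not something from the lecture.

```python
# Observed counts, rows in the order No / Light / Mod. / Heavy Exercise,
# columns in the order (cold/flu, no cold/flu).
observed = [(79, 138), (96, 126), (30, 43), (115, 123)]

def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

expected = expected_counts(observed)
for row in expected:
    print([round(e, 2) for e in row])
```

Working from the totals keeps full precision, so the rounded results match the table without the small drift that comes from multiplying by .427 and .573.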

6 Test Statistic and Sampling Distribution
A test of independence of the two variables (Exercise Level and Cold/Flu) will be carried out using a chi-square test statistic with (r-1)(c-1) = (4-1)(2-1) = 3 degrees of freedom. The test statistic is calculated as
  chi-square = sum over all cells of (Observed - Expected)^2 / Expected

7 Hypothesis Test
Assumptions
  Random Independent Sample
  Groups collected independently
  “Large Sample”
Hypotheses
  H0: conditional distributions equal
  HA: conditional distributions not all equal
Test Statistic
  Chi-square = 6.69, compared to chi-square dist’n with 3 d.f.
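
The value 6.69 can be checked by summing (Observed - Expected)^2 / Expected over the eight cells of the exercise table. The following is a minimal sketch with an illustrative function name; it computes expected counts from the totals rather than from the rounded probabilities.

```python
# Observed counts, rows in the order No / Light / Mod. / Heavy Exercise,
# columns in the order (cold/flu, no cold/flu).
observed = [(79, 138), (96, 126), (30, 43), (115, 123)]

def chi_square_statistic(table):
    """Return (statistic, degrees of freedom) for a test of independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            exp = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - exp) ** 2 / exp
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

statistic, df = chi_square_statistic(observed)
print(round(statistic, 2), df)
```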

8 Hypothesis Test, cont.
P-value/Rejection Region
  Critical values are 7.815 for .05 significance, 7.407 for .06, 7.060 for .07, and 6.251 for .10.
  Since 6.69 < 7.815, we fail to reject at the 0.05 level. The p-value is between .07 and .10.
Conclusion
  At the type 1 error rate of .05, we fail to reject the null hypothesis. There is not enough evidence to say that the probability of getting a cold/flu depends on the exercise level.
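
Instead of bracketing the p-value with table entries, it can be computed. The sketch below relies on an assumption that goes beyond the slides: for exactly 3 degrees of freedom the chi-square upper-tail probability has the closed form erfc(sqrt(x/2)) + sqrt(2x/pi)*exp(-x/2), which needs only the standard library. For other degrees of freedom a table or a statistics library would be used instead.

```python
import math

def chi2_upper_tail_df3(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

p_value = chi2_upper_tail_df3(6.69)
print(round(p_value, 3))

# Sanity check against the tabled critical values for .05 and .10.
print(round(chi2_upper_tail_df3(7.815), 3), round(chi2_upper_tail_df3(6.251), 3))
```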

9 Matched Categorical Data
Data may be matched/paired with respect to the risk factor or the response.
Matching on risk factor (not directly discussed in text)
  Differences of proportions, relative risks, and odds ratios are all appropriate.
  The formulas and the set-up of the contingency table will be different.
  We will focus on odds ratios, which will be calculated in the same way as for the matched case-control study.
Matching on response (Matched Case-Control Study)
  Only the odds ratio is an appropriate measure of the association between the risk factor and the response.
In both cases, inference focuses on the pair.

10 A Matched Case-Control Study on CAD
Each of 59 adults with Coronary Artery Disease (CAD) was matched with an adult who did not develop CAD but was of the same gender, age, ethnicity, and socioeconomic status. Of interest was whether drinking 2 or more glasses of red wine (on average) per week was associated with development of CAD.

11 Table for Matched Case-Control Data
Do NOT use the standard contingency table that summarizes information about the individual subjects. Instead, use the following table to summarize information about the pairs.
                  Cases (CAD)
                  >= 2    < 2
Controls  >= 2    15      14
          < 2     10      20

12 Physician Adherence Study - Matching on Predictor
Suppose that investigators were interested in whether a particular educational intervention had an effect on whether physicians prescribe a particular treatment plan for their asthma patients. 75 physicians are rated on whether they prescribe the treatment plan both before and after the educational intervention.
               Before
               Yes    No
After   Yes    22     25
        No     12     16

13 Estimation and Inference for Matched Categorical Data
CANNOT use formulas for CI of odds ratio given before because the two groups of subjects (whether “exposure” groups or case/control groups) are not chosen independently.
Inferences will be based on the discordant pairs, that is, the pairs in which the members “disagree”
  on the predictor variable, for case-control studies
  on the response variable, when subjects are matched with respect to the predictor

14 Labeling Cell (Pair) Counts & Estimation of Odds Ratio
Odds ratio is estimated as R/S
Interpretation: The odds that a person in group 2 is “exposed” are R/S times the odds that a group 1 member is “exposed.”
Or: The odds that an “exposed” person is in group 2 are R/S times the odds that an “unexposed” person is in group 2.
                 Group 1
                 Yes    No
Group 2   Yes           R
          No     S
R is the number of pairs with Yes for group 2 and No for group 1; S is the number of pairs with No for group 2 and Yes for group 1.
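
To make the labeling concrete, the sketch below pulls R and S out of a 2-by-2 pair table and forms R/S. The function names and the nested-list layout (rows for group 2, columns for group 1, each ordered Yes then No) are assumptions made for this illustration; the counts are those of the CAD and physician pair tables.

```python
def discordant_counts(pair_table):
    """Return (R, S) from a pair table laid out as rows = group 2, columns = group 1."""
    r = pair_table[0][1]  # group 2 Yes, group 1 No
    s = pair_table[1][0]  # group 2 No, group 1 Yes
    return r, s

def matched_odds_ratio(pair_table):
    """Estimate the odds ratio for matched data as R/S, using only discordant pairs."""
    r, s = discordant_counts(pair_table)
    return r / s

# Rows: controls (>= 2, < 2); columns: cases (>= 2, < 2).
cad_pairs = [[15, 14], [10, 20]]
# Rows: after (Yes, No); columns: before (Yes, No).
physician_pairs = [[22, 25], [12, 16]]

print(matched_odds_ratio(cad_pairs))
print(round(matched_odds_ratio(physician_pairs), 3))
```

The concordant cells (15 and 20, or 22 and 16) never enter the estimate, which is why the pair table rather than the usual subject-level table is needed.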

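15 CI for Odds Ratio
The 95% confidence interval for the (natural) log of the odds ratio is
  ln(R/S) +/- 1.96*sqrt(1/R + 1/S)
Exponentiating both endpoints gives the 95% confidence interval for the odds ratio itself.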

16 CAD Example – Odds Ratio
There are more pairs in which the case has fewer than 2 drinks and the control has at least 2 (14 pairs) than pairs in which the case has at least 2 and the control has fewer than 2 (10 pairs). Thus, >= 2 has a “protective effect.”
The odds ratio is 14/10 = 1.4
The odds of not developing CAD for someone who has at least two drinks per week are 1.4 times the odds for someone who has fewer than two.
Equivalently, the odds of developing CAD for those who have fewer than two drinks per week are 1.4 times the odds for those who have at least two.
                  Cases (CAD)
                  >= 2    < 2
Controls  >= 2    15      14
          < 2     10      20

17 CAD Eg. – CI for Odds Ratio
The 95% CI for the log of the OR is log(1.4) +/- 1.96*sqrt(1/14 + 1/10) = (-.475, 1.148)
95% CI for OR is (.622, 3.152)
With 95% confidence, the odds of developing CAD for those who have fewer than two drinks per week are between .622 and 3.152 times the odds for those who have at least two drinks per week.
This interval includes 1; therefore, the effect of drinking at least two drinks per week is not significant! However, the interval is very wide, so…

18 Physician Intervention: Odds Ratio
Note that there are more pairs in which the physician prescribes the treatment plan after the intervention but not before than in which the physician prescribes the treatment plan before but not after.
The odds ratio is calculated as 25/12 = 2.083
The odds that a physician will prescribe the treatment plan after the intervention are 2.083 times the odds that a physician will prescribe it before the intervention.
               Before
               Yes    No
After   Yes    22     25
        No     12     16

19 Physicians – CI for Odds Ratio The 95% CI for log of the odds ratio is ln(2.083) +/- 1.96*sqrt(1/25 + 1/12) = (.046, 1.422) The 95% CI for the odds ratio is (1.047, 4.145) There is a significant effect of the intervention since 1 is not included in the interval.
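
Both intervals (CAD and physicians) follow the same recipe: take the log of R/S, add and subtract 1.96*sqrt(1/R + 1/S), then exponentiate. The sketch below uses an illustrative function name and keeps full precision throughout, so an endpoint can differ from the slide in the third decimal place when the slide exponentiates already-rounded log endpoints.

```python
import math

def matched_odds_ratio_ci(r, s, z=1.96):
    """Return ((log_low, log_high), (or_low, or_high)) from discordant counts R and S."""
    log_or = math.log(r / s)
    std_err = math.sqrt(1 / r + 1 / s)
    log_low, log_high = log_or - z * std_err, log_or + z * std_err
    return (log_low, log_high), (math.exp(log_low), math.exp(log_high))

cad_log_ci, cad_or_ci = matched_odds_ratio_ci(14, 10)
phys_log_ci, phys_or_ci = matched_odds_ratio_ci(25, 12)

print([round(v, 3) for v in cad_log_ci], [round(v, 3) for v in cad_or_ci])
print([round(v, 3) for v in phys_log_ci], [round(v, 3) for v in phys_or_ci])

# An interval for the odds ratio that excludes 1 indicates a significant association.
print(cad_or_ci[0] <= 1 <= cad_or_ci[1], phys_or_ci[0] <= 1 <= phys_or_ci[1])
```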

20 Hypothesis Testing in Matched Designs Again, the test involves comparing the discordant pairs. In particular, if the predictor and response are independent, one would expect the population proportion of each type of discordant pairs to be equal. If there is inequality in the sample, is it possible that the inequality is just due to chance?

21 Hypothesis Test – The Steps
Assumptions
  Random, independent selection of pairs
  Large Sample (R+S > 10)
Hypotheses
  H0: Predictor and Response are independent variables
  HA: Predictor and Response are associated
Test Statistic
  chi-square = (R-S)^2/(R+S)
  With Yates’ continuity correction, chi-square = (|R-S|-1)^2/(R+S)
P-value: Compare to chi-square dist’n with 1 d.f.
Conclusion: per usual

22 CAD – Hypothesis Test
Assumptions
  Random, independent selection of pairs
  Large Sample (R+S = 24 > 10)
Hypotheses
  H0: Drinking and CAD are independent variables
  HA: Drinking and CAD are associated
Test Statistic
  (14-10)^2/(14+10) = 16/24 = .667
P-value: Table A5.7: p-value is between .4028 and .4386.
Conclusion: Insufficient evidence to reject the null hypothesis that Drinking is not associated with CAD.

23 Physician – Hypothesis Test
Assumptions
  Random, independent selection of pairs
  Large Sample (R+S = 37 > 10)
Hypotheses
  H0: Participation in intervention and prescription of treatment plan are independent variables
  HA: Participation in intervention and prescription of treatment plan are associated
Test Statistic
  (25-12)^2/(25+12) = 4.568
P-value: between .0320 and .0339.
Conclusion: At the 0.05 significance level, reject the null in favor of the alternative that the intervention does have an effect on whether physicians prescribe the treatment plan.
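
The two tests can be reproduced with the statistic (R-S)^2/(R+S) used in the worked examples. This is a sketch under two stated assumptions: the continuity-corrected variant is taken to be the usual form (|R-S|-1)^2/(R+S), and the p-value uses the identity that a chi-square upper tail with 1 d.f. equals erfc(sqrt(x/2)). The function name is illustrative.

```python
import math

def matched_pairs_test(r, s, yates=False):
    """Return (chi-square statistic, p-value with 1 d.f.) from discordant counts R and S."""
    diff = abs(r - s)
    if yates:
        diff = max(diff - 1, 0)  # assumed continuity-corrected form
    statistic = diff ** 2 / (r + s)
    p_value = math.erfc(math.sqrt(statistic / 2))  # upper tail of chi-square, 1 d.f.
    return statistic, p_value

cad_stat, cad_p = matched_pairs_test(14, 10)
phys_stat, phys_p = matched_pairs_test(25, 12)

print(round(cad_stat, 3), round(cad_p, 4))
print(round(phys_stat, 3), round(phys_p, 4))
```

Passing `yates=True` gives the corrected statistic, which is always a little smaller and therefore more conservative.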

24 Homework
Textbook Reading
  Chapter 29, first two sections
  Repeat Chapter 9 (has info about OR for paired case-control studies)
  (Last time: Chapter 8, Chapter 26)
When doing calculations for this class, you may ignore the Yates’ continuity correction.
Homework Problems

