Inference (CI / Tests) for Comparing 2 Proportions
To compare proportions from distinct populations: $p_1$ vs. $p_2$
Conditions
1. Appropriate randomization. Without this, there is rarely any way to reach sensible conclusions; and when there are ways, they are very complex (and often still produce results whose general conclusions are easily disputed).
Conditions, continued
2. For the particular method we will use:
Large populations (at least 20 times larger than the sample size).
Samples large enough that all cells of a data summary table are at least 5 (i.e., 5 units).
When either of these fails, there are other methods that still give conclusions similar in scope to our method.
There are numerous ways to compare two quantities. Here are two (we will use the first):
Assess the difference: What is the value of $p_1 - p_2$? (CI) Is $p_1 - p_2 = 0$ or not? (Test)
Assess the ratio: What is the value of $p_1 / p_2$? (CI) Is $p_1 / p_2 = 1$ or not? (Test)
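Confidence interval for the value of $p_1 - p_2$
Needed: sample proportions $\hat{p}_1$, $\hat{p}_2$ (possibly with their error margins $E_1$, $E_2$).
Point estimate of difference: $\hat{p}_1 - \hat{p}_2$
Error margin, if you know the error margin for each: use the Pythagorean Theorem, $E = \sqrt{E_1^2 + E_2^2}$
Error margin, if you don't know the error margin for each: $E = z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$, where $n_1$, $n_2$ are the sample sizes and $z^*$ is the normal multiplier for the confidence level (about 1.96 for 95%).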
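The two routes to the error margin can be sketched in Python. This is only an illustration under stated assumptions: the function names `combine_margins` and `diff_ci`, the example counts, and the default multiplier of 1.96 are not from the slides.

```python
import math

Z_STAR_95 = 1.96  # assumed normal multiplier for 95% confidence


def combine_margins(e1, e2):
    """Pythagorean combination of two known error margins."""
    return math.hypot(e1, e2)  # sqrt(e1**2 + e2**2)


def diff_ci(x1, n1, x2, n2, z_star=Z_STAR_95):
    """Return (estimate, margin, lower, upper) for p1 - p2.

    x1, x2 are success counts and n1, n2 are sample sizes.
    """
    p1, p2 = x1 / n1, x2 / n2
    estimate = p1 - p2
    # Unpooled standard error: each sample contributes its own variance term.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_star * se
    return estimate, margin, estimate - margin, estimate + margin


# Hypothetical counts, chosen only to exercise the functions.
estimate, margin, lower, upper = diff_ci(30, 100, 20, 100)
print(f"estimate={estimate:.3f}, margin={margin:.3f}, CI=({lower:.3f}, {upper:.3f})")
```

Both routes agree: combining the two single-sample margins with `combine_margins` gives the same value as the direct formula, because the multiplier factors out of the square root.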
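Hypothesis test about the difference $p_1 - p_2$
$H_0: p_1 = p_2$ (equivalent to) $H_0: p_1 - p_2 = 0$
Test statistic: $Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{p}_1 = x_1/n_1$ and $\hat{p}_2 = x_2/n_2$ are the sample proportions ($x$ = count, $n$ = sample size) and $\hat{p} = \dfrac{x_1 + x_2}{n_1 + n_2}$ is the “pooled proportion” (the estimate of the common proportion when $H_0$ is true).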
The P-value depends on the direction(s) specified in the alternative hypothesis.
$H_A: p_1 > p_2$ (equivalent to) $H_A: p_1 - p_2 > 0$; P-value = probability right of $Z$
$H_A: p_1 < p_2$ (equivalent to) $H_A: p_1 - p_2 < 0$; P-value = probability left of $Z$
$H_A: p_1 \neq p_2$ (equivalent to) $H_A: p_1 - p_2 \neq 0$; P-value = probability outside of $\pm Z$ = twice the probability beyond $|Z|$ in one tail
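The test statistic and the three P-value rules can be sketched as follows. The function names, the `alternative` labels, and the example counts are assumptions made for illustration; the normal CDF is computed from the standard library's `math.erfc`.

```python
import math


def normal_cdf(z):
    """Standard normal probability to the left of z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def two_prop_ztest(x1, n1, x2, n2, alternative="two-sided"):
    """Pooled two-proportion z-test of H0: p1 = p2. Returns (z, p_value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion assuming H0 is true
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("pooled proportion is 0 or 1; Z is undefined")
    z = (p1 - p2) / se
    if alternative == "greater":      # HA: p1 > p2, probability right of Z
        p_value = normal_cdf(-z)
    elif alternative == "less":       # HA: p1 < p2, probability left of Z
        p_value = normal_cdf(z)
    elif alternative == "two-sided":  # HA: p1 != p2, twice one tail beyond |Z|
        p_value = 2 * normal_cdf(-abs(z))
    else:
        raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")
    return z, p_value


# Hypothetical counts, chosen only to exercise the function.
z, p = two_prop_ztest(30, 100, 20, 100, alternative="two-sided")
print(f"Z = {z:.3f}, P-value = {p:.4f}")
```

The right-tail probability is computed as `normal_cdf(-z)` rather than `1 - normal_cdf(z)` so that very small P-values are not lost to rounding.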
Pregnant women with AIDS were involved in a study. Each woman was randomly allocated to either treatment with AZT or a Placebo. Researchers want to know if there is any difference in the proportions of babies born HIV+ for (all) women who might be treated with AZT and (all) women not treated with AZT.
$p_{AZT}$ = proportion of all babies born to mothers with AIDS using AZT that are HIV+.
$p_{PBO}$ = proportion of all babies born to mothers with AIDS using a placebo that are HIV+.
$H_0: p_{AZT} = p_{PBO}$
$H_A: p_{AZT} \neq p_{PBO}$
Of 164 babies of women getting AZT, 13 were HIV+. Of 160 babies of women getting the Placebo, 40 were HIV+.

                 Treatment
Outcome      AZT    Placebo    Total
HIV+          13         40       53
HIV-         151        120      271
Total        164        160      324
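Condition Check:
Appropriate randomization: each woman was randomly allocated to AZT or Placebo.
Large populations: the populations of all such women are taken to be more than 20 times the sample sizes (164 and 160).
All cells at least 5: the smallest cell count is 13 (HIV+ babies in the AZT group).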
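The two numerical conditions can be checked mechanically. The sketch below is illustrative only: the helper names `cells_ok` and `population_ok` are hypothetical, and the HIV- counts (151 and 120) are obtained by subtracting the HIV+ counts from the group totals.

```python
def cells_ok(table, minimum=5):
    """True if every cell of the summary table is at least `minimum`."""
    return all(count >= minimum for row in table for count in row)


def population_ok(population_size, sample_size, factor=20):
    """True if the population is at least `factor` times the sample size."""
    return population_size >= factor * sample_size


# Rows: HIV+, HIV-. Columns: AZT, Placebo.
azt_study = [
    [13, 40],
    [164 - 13, 160 - 40],
]
print(cells_ok(azt_study))  # every cell is at least 5 for these counts
```

`population_ok` needs a population size, which the slides do not state, so it is left uncalled here.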
Pooled over both conditions: Of 324 babies, 53 were HIV+, so the pooled proportion is $\hat{p} = 53/324 \approx 0.1636$.
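As a worked check, the pooled test statistic can be evaluated from the counts above. The arithmetic below assumes the difference is taken as AZT minus Placebo, so a negative $Z$ favors AZT; intermediate values are rounded to four decimals.

```latex
\begin{align*}
\hat{p}_{AZT} &= \frac{13}{164} \approx 0.0793, \qquad
\hat{p}_{PBO} = \frac{40}{160} = 0.2500, \qquad
\hat{p} = \frac{13 + 40}{164 + 160} = \frac{53}{324} \approx 0.1636 \\
Z &= \frac{\hat{p}_{AZT} - \hat{p}_{PBO}}
          {\sqrt{\hat{p}\,(1 - \hat{p})\left(\frac{1}{164} + \frac{1}{160}\right)}}
   \approx \frac{0.0793 - 0.2500}{\sqrt{0.1636 \times 0.8364 \times 0.01235}}
   \approx \frac{-0.1707}{0.0411}
   \approx -4.15
\end{align*}
```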
Result of hypothesis test: $Z \approx -4.15$ (AZT minus Placebo), P-value $\approx$ 1/30,000
Reject $H_0$. The (population) proportions differ. In fact, the mothers treated with AZT had babies with better outcomes. There is sufficient evidence in the sample data to conclude that the proportion of HIV+ babies born to mothers treated with AZT is less than that for mothers treated with a Placebo.
Suppose AZT and Placebo had the same efficacy (effectiveness):
In only about 1 in 30,000 studies would the difference in HIV rates between the groups be as large as observed here.
In only about 1 in 30,000 randomizations would the difference in HIV rates between the groups be as large as observed here.
95% CI for the difference in proportions
Point Estimate: $0.0793 - 0.2500 = -0.1707$
Error Margin: $0.0795$
Bounds: $-0.1707 \pm 0.0795$
$-0.2502 < p_{AZT} - p_{PBO} < -0.0912$ OR (better?) $0.0912 < p_{PBO} - p_{AZT} < 0.2502$
95% CI: $-0.2502 < p_{AZT} - p_{PBO} < -0.0912$
Add $p_{PBO}$ to all three sides:
$p_{PBO} - 0.2502 < p_{AZT} < p_{PBO} - 0.0912$
We are 95% confident that the proportion of babies born HIV+ is between 0.0912 and 0.2502 lower for mothers treated with AZT.
$0.0912 < p_{PBO} - p_{AZT} < 0.2502$
We are 95% confident that the percentage of babies born HIV+ is between 9.12 and 25.02 percentage points lower for mothers treated with AZT.
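The interval can be approximately reproduced from the study counts. This sketch assumes the unpooled standard error and a multiplier of 1.96; the variable names are illustrative. Under those assumptions it gives an interval close to, but slightly narrower than, the 9.12% to 25.02% quoted above, which suggests the slides used a slightly different multiplier or rounding that cannot be confirmed from the text.

```python
import math

# Study counts: HIV+ babies out of all babies in each group.
x_azt, n_azt = 13, 164
x_pbo, n_pbo = 40, 160
z_star = 1.96  # assumed multiplier for 95% confidence

p_azt = x_azt / n_azt
p_pbo = x_pbo / n_pbo

# Difference taken as Placebo minus AZT so that the estimate is positive.
estimate = p_pbo - p_azt
se = math.sqrt(p_azt * (1 - p_azt) / n_azt + p_pbo * (1 - p_pbo) / n_pbo)
margin = z_star * se
lower, upper = estimate - margin, estimate + margin

print(f"estimate = {estimate:.4f}, margin = {margin:.4f}")
print(f"95% CI: {lower:.4f} < p_PBO - p_AZT < {upper:.4f}")
```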
We can be quite sure that it is the AZT that caused the observed reduction in HIV+ rates. The randomization alone is very unlikely to have been responsible for such a difference. (The P-value quantifies this likelihood.) The randomization also ensures that whatever other variables might be important are similarly distributed among the women in the AZT and Placebo groups. This is the advantage of an experimental study over an observational one: it allows us to infer causation.
Number Needed to Treat
95% CI for the difference in proportions: Point Estimate: 0.1707. Error Margin: 0.0795. $0.0912 < p_{PBO} - p_{AZT} < 0.2502$
Estimated NNT = 1 / (point estimate of $p_{PBO} - p_{AZT}$) = 1/0.1707 ≈ 5.86
Consider 58,582 pregnant women with AIDS.
If not treated with AZT: 14,657 HIV+ babies
If treated with AZT: 4,657 HIV+ babies
Treating these women with AZT results in a reduction of 10,000 HIV+ babies.
Treating 5.8582 women results in 1 fewer HIV+ baby.
NNT: The number of units that must be switched from one level of the explanatory variable (one “group” or “treatment”) to the other in order to result in a change of 1 unit from one level of the response variable to the other. How many units must be “treated” in order to produce 1 additional favorable outcome.
Estimated NNT = 5.86
We estimate that for about every 6 women treated with AZT (instead of nothing/placebo) there is one fewer HIV+ baby.
A CI for NNT can be formed by taking reciprocals of the CI for the difference: 1/0.2502 ≈ 4 and 1/0.0912 ≈ 11, so 4 < NNT (population) < 11
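The reciprocal rule can be sketched in a few lines. The function names are hypothetical, and the guard reflects an assumption worth stating: taking reciprocals of the bounds only gives a sensible interval when the CI for the difference does not contain 0.

```python
def nnt_estimate(difference):
    """Number needed to treat: reciprocal of the difference in proportions."""
    if difference == 0:
        raise ValueError("no difference in proportions; NNT is undefined")
    return 1 / abs(difference)


def nnt_ci(diff_lower, diff_upper):
    """CI for NNT from a CI for the difference, by taking reciprocals."""
    if diff_lower <= 0 <= diff_upper:
        raise ValueError("CI for the difference contains 0; reciprocals are not usable")
    # Taking reciprocals reverses the order of the bounds, so sort the result.
    low, high = sorted((1 / abs(diff_upper), 1 / abs(diff_lower)))
    return low, high


# Values quoted on the slides for p_PBO - p_AZT.
print(round(nnt_estimate(0.1707), 2))
low, high = nnt_ci(0.0912, 0.2502)
print(f"{low:.1f} < NNT < {high:.1f}")
```

With the slide values this gives an interval of roughly 4 to 11, matching the bounds stated above.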
Motorcycle Helmet Color
Percent of accidents resulting in harm…
…among those wearing dark helmets: 190/681 = 27.9%
…among those wearing light helmets: 116/493 = 23.5%
Point Estimate of Difference: 4.4% = 0.044 (The P-value also happens to be 0.044!)
Estimated NNT = 1/0.044 = 22.7