2 Upcoming work: Part 3 of Data Project due today; Quiz #5 in class this Wednesday; HW #10 due Sunday.
3 Chapter 22: Comparing Two Proportions
4 Comparing Two Proportions: Comparisons between two percentages are much more common than questions about isolated percentages, and they are more interesting. We often want to know how two groups differ, whether a treatment is better than a placebo control, whether this year’s results are better than last year’s, or whether Team A is better than Team B.
5 Examples: (1) Compare the cancer survival rate for patients who use traditional medical treatments with the survival rate for patients who use alternative treatment methods. (2) Compare the percentage of white adults who smoke with the percentage of black adults who smoke. (3) Compare the proportion needing mental health treatment among women who have had an abortion with the proportion among women who have a small infant.
6 The hypothesis: The typical hypothesis test for the difference in two proportions is the one of no difference. In symbols, H0: p1 – p2 = 0. The alternatives: Ha: p1 – p2 > 0; Ha: p1 – p2 < 0; Ha: p1 – p2 ≠ 0.
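7 The Sampling Distribution - Theory: Provided that the sampled values are independent, the samples are independent, and the sample sizes are large enough, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is modeled by a Normal model with mean $\mu = p_1 - p_2$ and standard deviation $SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$, where $q = 1 - p$.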
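A quick simulation can make this model concrete. The sketch below is illustrative only: the true proportions, sample sizes, seed, and the helper name `simulate_differences` are assumptions, not values from the slides. It draws many pairs of independent samples and compares the mean and spread of the simulated differences with the Normal model's mean and standard deviation.

```python
import math
import random
import statistics

def simulate_differences(p1, n1, p2, n2, reps=2000, seed=1):
    """Simulate `reps` differences of sample proportions (hypothetical helper)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Count successes in each of the two independent samples.
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n2))
        diffs.append(x1 / n1 - x2 / n2)
    return diffs

# Assumed illustrative values, not taken from the slides.
p1, n1, p2, n2 = 0.45, 100, 0.40, 120
diffs = simulate_differences(p1, n1, p2, n2)

theory_mean = p1 - p2
theory_sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

print(f"simulated mean {statistics.mean(diffs):.4f} vs theory {theory_mean:.4f}")
print(f"simulated SD   {statistics.stdev(diffs):.4f} vs theory {theory_sd:.4f}")
```

With samples this large the simulated values should land close to the theoretical ones; with very small samples the Normal model would fit less well.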
8 Estimates for the SD: For confidence intervals, use each group’s individual success rate to calculate the SE. For P-value and z* testing, use the pooled proportion to calculate the SE.
9 SE for Confidence Intervals: When the conditions are met, we are ready to find the confidence interval for the difference of two proportions. The confidence interval is $(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$, where $SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$. The critical value z* depends on the particular confidence level, C, that you specify. Recall that standard deviations don’t add, but variances do. In fact, the variance of the sum or difference of two independent random variables is the sum of their individual variances.
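The variance rule is what produces the standard error formula. As a short worked derivation, writing $q = 1 - p$ and assuming the two samples are independent:

```latex
\begin{aligned}
\mathrm{Var}(\hat{p}_1 - \hat{p}_2)
  &= \mathrm{Var}(\hat{p}_1) + \mathrm{Var}(\hat{p}_2)
   = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2} \\
SD(\hat{p}_1 - \hat{p}_2)
  &= \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} \\
SE(\hat{p}_1 - \hat{p}_2)
  &= \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}
\end{aligned}
```

The last line replaces the unknown true proportions with the sample proportions, which is why it is called a standard error rather than a standard deviation.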
10 Problem: Company X invents a new drug called NoZits that it believes cures acne. To see whether NoZits works, the company runs a test: 100 people (Group A) wash their faces with NoZits, and 120 people (Group B) wash their faces with the leading brand. Group A: 45 people’s skin clears up the next day. Group B: 52 people’s skin clears up the next day. Create a 95% confidence interval for the difference between the two groups.
11 What does pA represent? (a) The proportion of people with better skin. (b) The proportion of people with better skin who used NoZits. (c) The proportion of people with better skin who used the leading brand.
12 Company X hopes to show that NoZits is a better drug than the leading brand. What hypothesis test should it run? (a) H0: pA = pB, Ha: pA ≠ pB. (b) H0: pA = pB, Ha: pA < pB. (c) H0: pA = pB, Ha: pA > pB.
13 Critical z* and significance level (two-sided): α = 0.20, CI = 80%, z* = 1.282; α = 0.10, CI = 90%, z* = 1.645; α = 0.05, CI = 95%, z* = 1.96; α = 0.02, CI = 98%, z* = 2.326; α = 0.01, CI = 99%, z* = 2.576.
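These critical values come from the standard Normal model: a two-sided interval at confidence level C leaves α/2 in each tail. A minimal sketch using Python's standard library (the function name `critical_z` is an assumption made for illustration):

```python
from statistics import NormalDist

def critical_z(confidence):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    alpha = 1 - confidence
    # Leave alpha/2 in each tail of the standard Normal model.
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99):
    print(f"CI = {level:.0%}  alpha = {1 - level:.2f}  z* = {critical_z(level):.3f}")
```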
14 Interpretation of a confidence interval: We are 95% confident that the true difference between the two groups’ success rates falls within (-11.51%, 14.85%). Because this interval contains 0, our data fail to reject the null hypothesis. We do NOT have enough evidence to suggest a difference between success rates, so we cannot conclude that NoZits is better at getting rid of acne.
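The interval for the NoZits problem can be checked with a few lines of code. This is a sketch, not the only way to do it; the function name `two_proportion_ci` is an assumption. With unrounded proportions the limits come out near -11.5% and 14.8%; rounding pB to 0.43 before subtracting would shift both limits up by about a third of a percentage point.

```python
import math

def two_proportion_ci(x1, n1, x2, n2, z_star=1.96):
    """Confidence interval for p1 - p2 using each group's own success rate."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Group A (NoZits): 45 of 100 cleared up; Group B (leading brand): 52 of 120.
low, high = two_proportion_ci(45, 100, 52, 120)
contains_zero = low < 0 < high

print(f"95% CI for pA - pB: ({low:.2%}, {high:.2%})")
print("Interval contains 0:", contains_zero)
```

Because 0 lies inside the interval, a difference of zero between the two success rates remains plausible.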
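15 SE for P-value and z* hypothesis testing: Because the null hypothesis says the two proportions are equal, we pool the two samples to estimate a single common proportion. We then put this pooled value into the formula, substituting it for both sample proportions in the standard error formula: $SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_2}}$.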
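16 Calculating the Pooled Proportion: The pooled proportion is $\hat{p}_{pooled} = \frac{\text{Success}_1 + \text{Success}_2}{n_1 + n_2}$, where $\text{Success}_1 = n_1 \hat{p}_1$ and $\text{Success}_2 = n_2 \hat{p}_2$. If the numbers of successes are not whole numbers, round them first. (This is the only time you should round values in the middle of a calculation.)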
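17 SE for P-value and z* hypothesis testing: We use the pooled value to estimate the standard error: $SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_2}}$. Now we find the test statistic: $z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$. When the conditions are met and the null hypothesis is true, this statistic follows the standard Normal model, so we can use that model to obtain a P-value.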
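The pooled test can be sketched in code using the NoZits counts from the earlier problem (45 of 100 versus 52 of 120). The function name `pooled_z_test` is an assumption, and the one-sided alternative Ha: pA – pB > 0 is chosen here because Company X hopes to show NoZits is better.

```python
import math
from statistics import NormalDist

def pooled_z_test(x1, n1, x2, n2):
    """Return (z, pooled proportion, pooled SE) for H0: p1 - p2 = 0."""
    p_pooled = (x1 + x2) / (n1 + n2)
    q_pooled = 1 - p_pooled
    # The pooled value replaces both sample proportions in the SE formula.
    se_pooled = math.sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)
    z = ((x1 / n1 - x2 / n2) - 0) / se_pooled
    return z, p_pooled, se_pooled

z, p_pooled, se_pooled = pooled_z_test(45, 100, 52, 120)

# Upper-tail P-value for the assumed one-sided alternative Ha: pA - pB > 0.
p_value = 1 - NormalDist().cdf(z)

print(f"pooled p = {p_pooled:.4f}, SE = {se_pooled:.4f}, z = {z:.3f}, P = {p_value:.3f}")
```

A z statistic this close to 0 gives a large P-value, which agrees with the confidence interval containing 0.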
18 Homework Problem: A clinic reported the following statistics. For women under the age of 38: 47 live births to 165 women. For women over the age of 38: 4 live births to 78 women. Is there a difference in the effectiveness of the clinic’s methods for older women? Use both a z* test and a 95% confidence interval to test your hypothesis.
19 What does p1 represent? (a) The proportion of live births. (b) The proportion of women over 38 with live births. (c) The proportion of women under 38 with live births.
20 Is there evidence of a difference between the two groups? (a) H0: p1 = p2, Ha: p1 ≠ p2. (b) H0: p1 = p2, Ha: p1 < p2. (c) H0: p1 = p2, Ha: p1 > p2.
21 What is our test ‘rule’? (a) If z > critical z*, then reject the null hypothesis. (b) If z > critical z*, then accept the null hypothesis. (c) If z < critical z*, then reject the null hypothesis.
23 What is the result of this hypothesis test at a significance level of 0.05? (a) Do not reject the null hypothesis because there IS NOT sufficient evidence to support the claim of a difference. (b) Accept the alternative hypothesis because there IS sufficient evidence to support the claim of a difference. (c) Reject the null hypothesis because there IS sufficient evidence to support the claim of a difference.
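The clinic test can be worked through numerically. This sketch reads the counts as 47 live births among 165 women under 38 (group 1) and 4 live births among 78 women over 38 (group 2), which is the reading consistent with the interval reported for this problem. For a two-sided alternative, the comparison uses the absolute value of z.

```python
import math

x1, n1 = 47, 165   # live births, women under 38 (group 1)
x2, n2 = 4, 78     # live births, women over 38 (group 2)

# Pool the two samples, since H0 says the proportions are equal.
p_pooled = (x1 + x2) / (n1 + n2)
q_pooled = 1 - p_pooled
se_pooled = math.sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)
z = (x1 / n1 - x2 / n2) / se_pooled

z_star = 1.96  # two-sided critical value at significance level 0.05
reject_null = abs(z) > z_star

print(f"z = {z:.2f}, critical z* = {z_star}, reject H0: {reject_null}")
```

The statistic is far beyond the critical value, so the data give strong evidence of a difference between the two age groups.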
24 Would we get a different result with a CI? Let’s calculate the CI at 95%.
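As a worked calculation, reading the clinic counts as 47 live births among 165 women under 38 and 4 among 78 women over 38, and using each group's own success rate for the SE (values rounded for display only):

```latex
\begin{aligned}
\hat{p}_1 &= \frac{47}{165} \approx 0.2848, \qquad
\hat{p}_2 = \frac{4}{78} \approx 0.0513 \\
SE(\hat{p}_1 - \hat{p}_2)
  &= \sqrt{\frac{(0.2848)(0.7152)}{165} + \frac{(0.0513)(0.9487)}{78}} \approx 0.0431 \\
(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE
  &= 0.2336 \pm 1.96\,(0.0431) \approx 0.2336 \pm 0.0845 \\
  &\approx (0.149,\ 0.318)
\end{aligned}
```

The whole interval lies above 0, so the confidence interval points to the same conclusion as the z* test. Small differences in the last digit can arise from rounding intermediate values.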
25 Our CI is (14.9%, 31.8%). How would we interpret this CI? (a) We are 95% confident that the proportion of live births for clients of this clinic is greater for women under 38. (b) We are 95% confident that the proportion of live births is greater for women under 38. (c) We are 95% confident that the proportion of live births for a sample of clients of this clinic is greater for women under 38.
26 Homework Problem: A survey of older Americans reveals that 420 out of 1002 men and 543 out of 1070 women suffer from arthritis. Create a 95% CI to test the hypothesis that the proportion of adults suffering from arthritis is greater for women than for men.
27 Does this suggest that arthritis is more likely to afflict women than men? (a) H0: pw – pm = 0, Ha: pw – pm > 0. (b) H0: pw – pm = 0, Ha: pw – pm < 0. (c) H0: pw – pm = 0, Ha: pw – pm ≠ 0.
28 Does this suggest that arthritis is more likely to afflict women than men? (a) No. No conclusion can be made based on the confidence interval. (b) No. The interval is too close to 0. (c) Yes. The entire interval lies above 0. (d) Yes. We are 95% confident, based on these samples, that about 13.1% of senior women suffer from arthritis, while only 4.6% of senior men suffer from arthritis.
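The arthritis interval can be reproduced from the survey counts. A minimal sketch, with variable names chosen for illustration; the interval is for the difference pw – pm, not for either proportion on its own.

```python
import math
from statistics import NormalDist

women_x, women_n = 543, 1070   # women with arthritis, women surveyed
men_x, men_n = 420, 1002       # men with arthritis, men surveyed

p_w, p_m = women_x / women_n, men_x / men_n
# Unpooled SE: each group's own success rate, as used for confidence intervals.
se = math.sqrt(p_w * (1 - p_w) / women_n + p_m * (1 - p_m) / men_n)
z_star = NormalDist().inv_cdf(0.975)  # 95% confidence, two-sided

diff = p_w - p_m
low, high = diff - z_star * se, diff + z_star * se

print(f"95% CI for pw - pm: ({low:.1%}, {high:.1%})")
print("Entire interval above 0:", low > 0)
```

The endpoints describe how much higher the women's proportion is than the men's, which is why reading them as the two groups' individual arthritis rates is a misinterpretation.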
29 HW - Problem 8: One country reported that 84 out of 3157 white women and 20 out of 625 black women had multiple births. Does this indicate any racial difference in the likelihood of multiple births?
30 Defining the proportions affects the hypothesis: p1 = proportion of multiple births among white women; p2 = proportion of multiple births among black women.
32 Do white women typically have more multiple births? (a) H0: p1 – p2 = 0, Ha: p1 – p2 > 0. (b) H0: p1 – p2 = 0, Ha: p1 – p2 < 0. (c) H0: p1 – p2 = 0, Ha: p1 – p2 ≠ 0.
33 Do black women typically have more multiple births? (a) H0: p1 – p2 = 0, Ha: p1 – p2 > 0. (b) H0: p1 – p2 = 0, Ha: p1 – p2 < 0. (c) H0: p1 – p2 = 0, Ha: p1 – p2 ≠ 0.
34 Suppose our question is ‘Is there a difference between black and white women?’ What is our test ‘rule’? (a) If z > critical z*, then reject the null hypothesis. (b) If z > critical z*, then accept the null hypothesis. (c) If z < critical z*, then reject the null hypothesis.
36 What is the result of the hypothesis test? (a) Reject the null hypothesis because there is sufficient evidence to support the claim of a difference. (b) Do not reject the null hypothesis because there is NOT sufficient evidence to support the claim of a difference. (c) Accept the null hypothesis because there is NOT sufficient evidence to support the claim of a difference.
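The multiple-births test can be checked with a P-value instead of a critical value. This sketch uses the reported counts with p1 for white women and p2 for black women; the significance level of 0.05 is an assumption, since the slide does not state one.

```python
import math
from statistics import NormalDist

x1, n1 = 84, 3157   # multiple births, white women (p1)
x2, n2 = 20, 625    # multiple births, black women (p2)

# Pooled proportion and pooled standard error under H0: p1 - p2 = 0.
p_pooled = (x1 + x2) / (n1 + n2)
q_pooled = 1 - p_pooled
se_pooled = math.sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)
z = (x1 / n1 - x2 / n2) / se_pooled

# Two-sided P-value for Ha: p1 - p2 != 0.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
alpha = 0.05  # assumed significance level, not given on the slide
reject_null = p_value < alpha

print(f"z = {z:.3f}, P-value = {p_value:.3f}, reject H0: {reject_null}")
```

A P-value well above the assumed significance level means the data do not provide sufficient evidence of a difference, which is not the same as showing that the two proportions are equal.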
37 Errors: Type 1 Error: rejecting the null hypothesis when it is true. Type 2 Error: failing to reject the null hypothesis when it is false.
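A simulation can illustrate how often each type of error occurs. Everything numeric below (true proportions, sample sizes, repetitions, seed) and both helper names are assumptions chosen for illustration. When the null hypothesis is true, a two-sided test at the 0.05 level should reject roughly 5% of the time; when it is false, the Type 2 error rate depends on how large the true difference is.

```python
import math
import random

def rejects_null(x1, n1, x2, n2, z_star=1.96):
    """Two-sided pooled z-test decision: True means reject H0."""
    p_pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return False  # no variation in either sample, so no evidence of a difference
    return abs((x1 / n1 - x2 / n2) / se) > z_star

def rejection_rate(p1, p2, n1, n2, reps=2000, seed=7):
    """Fraction of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n2))
        rejections += rejects_null(x1, n1, x2, n2)
    return rejections / reps

# Null true (p1 = p2): every rejection is a Type 1 error.
type1_rate = rejection_rate(0.40, 0.40, 150, 150)
# Null false (p1 != p2): every failure to reject is a Type 2 error.
type2_rate = 1 - rejection_rate(0.40, 0.50, 150, 150)

print(f"Type 1 error rate: {type1_rate:.3f}")
print(f"Type 2 error rate: {type2_rate:.3f}")
```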
38 Suppose our last test was incorrect. What type of error did we make? This cannot be determined.