Presentation on theme: "Comparing Two Proportions"— Presentation transcript:
1 Section 13.2: Comparing Two Proportions
2 In this scenario, we want to compare two populations, or the responses to two treatments, based on two independent samples. We compare the populations by doing inference about the difference p1 - p2. The statistic that estimates this difference is the difference between the two sample proportions, $\hat{p}_1 - \hat{p}_2$.
3 The sampling distribution of $\hat{p}_1 - \hat{p}_2$: The variance of the difference is the sum of the variances of $\hat{p}_1$ and $\hat{p}_2$, which is $\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}$. Note that the variances add; the standard deviations do not. When the samples are large, the distribution is approximately Normal. The mean of this distribution is p1 - p2.
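To illustrate the point that variances add while standard deviations do not, here is a minimal Python sketch. The values p1 = p2 = 0.5 and n1 = n2 = 100 are hypothetical, chosen only to make the arithmetic easy, and the function name is not from the slides.

```python
import math

def sd_of_difference(p1, n1, p2, n2):
    """Standard deviation of (p-hat1 - p-hat2) for two independent samples."""
    var1 = p1 * (1 - p1) / n1   # variance of p-hat1
    var2 = p2 * (1 - p2) / n2   # variance of p-hat2
    return math.sqrt(var1 + var2)  # the variances add, then take the root

# Hypothetical population proportions and sample sizes (illustration only).
p1, n1 = 0.5, 100
p2, n2 = 0.5, 100

sd1 = math.sqrt(p1 * (1 - p1) / n1)
sd2 = math.sqrt(p2 * (1 - p2) / n2)
sd_diff = sd_of_difference(p1, n1, p2, n2)

print(f"sd1 + sd2        = {sd1 + sd2:.4f}")
print(f"sd of difference = {sd_diff:.4f}")  # smaller than sd1 + sd2
```

The standard deviation of the difference comes out smaller than the sum of the two individual standard deviations, which is why the formula works with variances.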
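4 Assumptions: The data are from two independent SRSs from the populations. The populations are at least ten times as large as the samples. A. For a significance test: $n_1\hat{p}_c$, $n_1(1-\hat{p}_c)$, $n_2\hat{p}_c$, and $n_2(1-\hat{p}_c)$ are all at least 5, where $\hat{p}_c$ is the combined sample proportion. B. For a confidence interval: $n_1\hat{p}_1$, $n_1(1-\hat{p}_1)$, $n_2\hat{p}_2$, and $n_2(1-\hat{p}_2)$ are all at least 5.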
5 Confidence Intervals for p1 - p2: Draw an SRS of size n1 from a population having proportion p1 of successes, and draw an independent SRS of size n2 from another population having proportion p2 of successes. When n1 and n2 are large, an approximate level C confidence interval for p1 - p2 is $(\hat{p}_1 - \hat{p}_2) \pm z^* \cdot SE$. In this formula, the standard error SE of $\hat{p}_1 - \hat{p}_2$ is $SE = \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$, and z* is the upper (1 - C)/2 standard Normal critical value. Follow the same assumptions as for single-proportion confidence intervals.
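The critical value z* described above can be looked up in a table, or computed. A minimal sketch using Python's standard library follows; the function name `critical_value` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return z*, the upper (1 - C)/2 critical value of the standard Normal."""
    upper_tail = (1 - confidence) / 2
    # z* cuts off `upper_tail` of the area to its right.
    return NormalDist().inv_cdf(1 - upper_tail)

for c in (0.90, 0.95, 0.99):
    print(f"C = {c:.2f}: z* = {critical_value(c):.3f}")
```

For C = 0.95 this gives the familiar z* of about 1.96.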
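6 Significance Tests for p1 - p2: Our z test statistic is $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{p}_c = \frac{x_1 + x_2}{n_1 + n_2}$ (total successes divided by total observations) is the combined sample proportion.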
7 The Steps for a Two-Proportion z-test: 1. State the hypotheses and name the test. Ho: p1 = p2; Ha: p1 <, >, or ≠ p2. 2. State and verify your assumptions. 3. Calculate the P-value and other important values, either in the calculator or using the formulas and tables. 4. State conclusions, both statistically and in context. The smaller the P-value, the stronger the evidence against Ho.
8 CALCULATOR FUNCTIONS: You may be able to find these on your own by now, but just in case, you will be looking for 6: 2-PropZTest and B: 2-PropZInt. Note: x is your number of successes, while n is your total number of trials.
9 Plus Four Confidence Interval for 2 Proportions: Just like before, this helps us overcome the lack of Normality when the sample sizes are too small for the large-sample procedures. These methods cannot save us from the fact that small samples produce wide confidence intervals. The plus four interval may be conservative for very small samples and population proportions close to 0 or 1. It is generally much more accurate than the large-sample interval when the samples are small or a population proportion is close to 0 or 1. Add 4 imaginary observations: one success and one failure in each of the two samples. Then use the large-sample procedures with the new sample sizes and counts of successes. Use this when the sample size is at least 5 in each group, with any counts of successes and failures.
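The plus four recipe above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name is made up, and the counts in the usage lines (7 of 10 and 3 of 12) are hypothetical, not data from the slides.

```python
import math
from statistics import NormalDist

def plus_four_interval(x1, n1, x2, n2, confidence=0.95):
    """Plus four confidence interval for p1 - p2.

    Adds one imaginary success and one imaginary failure to each sample,
    then applies the large-sample formula to the adjusted counts.
    """
    n1_adj, n2_adj = n1 + 2, n2 + 2   # two extra observations per sample
    p1 = (x1 + 1) / n1_adj            # one of them is a success
    p2 = (x2 + 1) / n2_adj
    se = math.sqrt(p1 * (1 - p1) / n1_adj + p2 * (1 - p2) / n2_adj)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1 - p2
    return diff - z_star * se, diff + z_star * se

# Hypothetical small samples, for illustration only.
low, high = plus_four_interval(7, 10, 3, 12)
print(f"plus four 95% interval: ({low:.3f}, {high:.3f})")
```

Even with the adjustment, small samples like these give a wide interval, as the slide warns.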
10 Example of Two-Proportion Confidence Interval A surprising number of young adults (ages 19-25) still live at home with their parents. A random sample by the National Institutes of Health included 2253 men and 2629 women in this age group. The survey found that 986 of the men and 923 of the women lived at home. Is this good evidence that different proportions of young men and young women live at home? How large is the difference between the proportions of young men and young women who live at home?
11 Step 1: Parameters. Population 1: young men. Population 2: young women. p1 = proportion of young men who live at home; p2 = proportion of young women who live at home. We will construct a 95% confidence interval for the difference between men and women, p1 - p2.
12 Step 2: Conditions. SRSs: The data were obtained from a random sample, so we should be safe generalizing to the respective populations of interest. Normality: To check that the large-sample confidence interval is safe, look at the counts of successes and failures for both samples: 986 and 1267 for the men, 923 and 1706 for the women. All of these are much larger than 5, so the large-sample method will be accurate. Independence: The sample survey in this example selected a single random sample of young adults, not two separate random samples of men and women. We divide the one sample by gender. The two-sample z procedures for comparing proportions are valid in such situations. This is an important fact about these methods.
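The Normality check on this slide amounts to counting successes and failures in each sample. A minimal sketch, using the counts from the example and the cutoff of 5 that the slide mentions (the names `THRESHOLD` and `success_failure_counts` are illustrative):

```python
# (number living at home, sample size) for each group, from the example.
samples = {"men": (986, 2253), "women": (923, 2629)}

THRESHOLD = 5  # cutoff mentioned on the slide

def success_failure_counts(successes, n):
    """Return the observed counts of successes and failures."""
    return successes, n - successes

for group, (x, n) in samples.items():
    s, f = success_failure_counts(x, n)
    ok = s >= THRESHOLD and f >= THRESHOLD
    print(f"{group}: successes={s}, failures={f}, condition met: {ok}")
```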
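13 Step 3: Calculations. Here are the needed calculations: $\hat{p}_1 = 986/2253 \approx 0.4376$, $\hat{p}_2 = 923/2629 \approx 0.3511$, and $SE = \sqrt{\frac{(0.4376)(0.5624)}{2253} + \frac{(0.3511)(0.6489)}{2629}} \approx 0.0140$. With z* = 1.96, the interval is $(0.4376 - 0.3511) \pm 1.96(0.0140)$, so our interval is (0.059, 0.114). Calculator: ( , )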
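The interval on this slide can be reproduced with a short Python sketch of the large-sample formula. The function name is an illustrative choice; the counts are the ones given in the example.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Standard error uses each sample's own proportion (no pooling).
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

# Men: 986 of 2253 live at home; women: 923 of 2629.
low, high = two_prop_z_interval(986, 2253, 923, 2629)
print(f"95% interval for p1 - p2: ({low:.3f}, {high:.3f})")  # (0.059, 0.114)
```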
14 Step 4: Interpretation. We are 95% confident that the percent of young men living at home is between 5.9 and 11.4 percentage points higher than the percent of young women who live at home. Because the interval does not contain 0, this is good evidence that different proportions of young men and young women live at home. We have this level of confidence because, if we repeated our procedure over and over with new samples, about 95% of our intervals would capture the true difference.
15 Testing a Claim: Considering the previous example, someone makes the claim that young men are more likely to live at home. Do our data support this claim? Ho: p1 = p2; Ha: p1 > p2. We need to check the Normal assumption again, this time using the combined sample proportion.
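The check with the combined sample proportion, and the one-sided test itself, can be sketched in Python as below. This is an illustration under stated assumptions rather than the slide's own working: the function name is made up, the cutoff of 5 is the one used for the earlier Normality check, and the test statistic is the usual pooled two-proportion z statistic.

```python
import math
from statistics import NormalDist

def two_prop_z_test_greater(x1, n1, x2, n2):
    """Two-proportion z test of Ho: p1 = p2 against Ha: p1 > p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # combined (pooled) sample proportion
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    # Upper-tail area; cdf(-z) equals 1 - cdf(z) but is more precise for large z.
    p_value = NormalDist().cdf(-z)
    return p_c, z, p_value

n1, n2 = 2253, 2629
p_c, z, p_value = two_prop_z_test_greater(986, n1, 923, n2)

# Normality check: expected successes and failures under the combined proportion.
expected = [n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)]
print("expected counts all at least 5:", all(c >= 5 for c in expected))
print(f"combined proportion = {p_c:.4f}, z = {z:.2f}, P-value = {p_value:.2e}")
```

A z statistic this far into the upper tail corresponds to a P-value that is essentially zero.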
17 Interpretation: Based on our extremely low P-value, we reject the null hypothesis. A difference in sample proportions this large would rarely ever occur by chance if there were truly no difference between the proportions of young men and young women who live at home. We are comfortable agreeing with the claim that young men are more likely than young women to live at home.