1 Two-Sample Proportions Inference

3 Sampling Distributions for the difference in proportions. When tossing pennies, the probability of the coin landing on heads is 0.5. However, when spinning the coin, the probability of the coin landing on heads is 0.4. Let’s investigate. Pairs of students will be given pennies and assigned to either flip or spin the penny.

4 Looking at the sampling distribution of the difference in sample proportions: What is the mean of the difference in sample proportions (flip - spin)? What is the standard deviation of the difference in sample proportions (flip - spin)? Can the sampling distribution of the difference in sample proportions (flip - spin) be approximated by a normal distribution? Yes, since $n_1 p_1 = 12.5$, $n_1(1-p_1) = 12.5$, $n_2 p_2 = 10$, and $n_2(1-p_2) = 15$ are all at least 5. What is the probability that the difference in proportions (flip - spin) is at least 0.25?
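
The questions on this slide can be worked through numerically. The sketch below assumes 25 flips and 25 spins, which is what $n_1 p_1 = 12.5$ and $n_2 p_2 = 10$ imply; the variable names are illustrative and not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

# Assumed setup: 25 flips (p = 0.5) and 25 spins (p = 0.4),
# inferred from n1*p1 = 12.5 and n2*p2 = 10 on the slide.
p_flip, n_flip = 0.5, 25
p_spin, n_spin = 0.4, 25

# Mean and standard deviation of (p-hat_flip - p-hat_spin)
mean_diff = p_flip - p_spin
sd_diff = sqrt(p_flip * (1 - p_flip) / n_flip + p_spin * (1 - p_spin) / n_spin)

# Normal approximation to P(difference >= 0.25)
prob = 1 - NormalDist(mean_diff, sd_diff).cdf(0.25)

print(f"mean = {mean_diff:.2f}, sd = {sd_diff:.2f}, P(diff >= 0.25) = {prob:.3f}")
```

Under these assumptions the mean is 0.10, the standard deviation is 0.14, and the probability comes out at roughly 0.14.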

5 Assumptions: Two independent SRSs from the populations (or randomly assigned treatments). Each population is at least 10n. The normal approximation holds for both samples.

6 The National Sleep Foundation asked a random sample of U.S. adults questions about their sleep habits. One of the questions asked about snoring. Of the 995 respondents, 37% of adults reported that they snored at least a few nights a week during the past year. Would you expect that percentage to be the same for all age groups? Split into two age categories, 26% of the 184 people under 30 snored, compared with 39% of the 811 in the older group. Is this difference of 13% real, or due only to natural fluctuations in the sample we’ve chosen?
              18-29   30 and over   Total
Snore            48           318     366
Don’t snore     136           493     629
Total           184           811     995
HYPOTHESIS TEST! For the true DIFFERENCE between snoring rates!

7 Steps: 1) Assumptions 2) Hypothesis statements & define parameters 3) Calculations 4) Conclusion, in context

8 Assumptions: Two independent SRSs from the populations (or randomly assigned treatments). Each population is at least 10n. The normal approximation holds for both samples.

9
              18-29      30 and over   Total
Snore          48 (26%)    318 (39%)     366
Don’t snore   136          493           629
Total         184          811           995
Assumptions: Ages are independent of each other in the random sample. 995 is less than 10% of all adults. The normal approximation holds for both groups.

10 Hypothesis statements: $H_0: p_1 - p_2 = 0$, with $H_a: p_1 - p_2 > 0$, $H_a: p_1 - p_2 < 0$, or $H_a: p_1 - p_2 \neq 0$. Equivalently, $H_0: p_1 = p_2$, with $H_a: p_1 > p_2$, $H_a: p_1 < p_2$, or $H_a: p_1 \neq p_2$. Be sure to define both $p_1$ & $p_2$!

11 Hypothesis statements: $H_0$: There is no difference in snoring rates in the two age groups ($p_{old} - p_{young} = 0$, i.e. $p_1 = p_2$). $H_a$: There is a difference in snoring rates in the two age groups ($p_{old} - p_{young} \neq 0$, i.e. $p_1 \neq p_2$).

12 Since we assume that the population proportions are equal in the null hypothesis, the variances are equal. Therefore, we combine the variances!

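13 Formula for Hypothesis test: Test statistic: $$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad \hat{p}_c = \frac{x_1 + x_2}{n_1 + n_2}$$ where $x_1$ and $x_2$ are the numbers of successes in the two samples and $\hat{p}_c$ is the combined (pooled) sample proportion.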

14 P-value = 2(area to right of z = 3.33) = 0.0008
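
The z = 3.33 above can be reproduced from the snoring table. This is a minimal sketch of a pooled two-proportion z-test using only the Python standard library; the function name `two_proportion_z_test` is illustrative, not something from the slides.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sample z-test of H0: p1 - p2 = 0 (two-sided).

    x1, x2 are success counts; n1, n2 are sample sizes.
    Returns (z, two-sided P-value).
    """
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Under H0 the proportions are equal, so combine the samples
    p_pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Snoring data: 318 of 811 older adults vs. 48 of 184 younger adults
z, p_value = two_proportion_z_test(318, 811, 48, 184)
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The computed P-value lands just under 0.001, in line with the slide's 0.0008; small differences come from rounding z before using a table.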

15 “Since the p-value < (>) α, I reject (fail to reject) the $H_0$. There is (is not) sufficient evidence to suggest that $H_a$.” Conclusion: Since the p-value is less than my significance level, I reject the null hypothesis of no difference. There is sufficient evidence to suggest that there is a difference in the rate of snoring between older adults and younger adults.

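16 Formula for Hypothesis test: $$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$ where $\hat{p}_c$ is the combined (pooled) sample proportion. Usually $p_1 - p_2 = 0$.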

17 Example - Student Retention. A group of college students were asked what they thought was the “issue of the day”. Without a pause the class almost to a person said “student retention”. The class then went out and obtained a random sample (questionable) and asked the question, “Do you plan on returning next year?” The responses along with the gender of the person responding are summarized in the following table. Test to see if the proportion of students planning on returning is the same for both genders at the 0.05 level of significance.
                 Male   Female   Total
Returning         211      141     352
Not returning      64       41     105
Total             275      182     457

18 Example - Student Retention. Assumptions: The two samples are independently chosen random samples. The sample is less than 10% of the college population. The sample sizes are large enough since $n_1 p_1 = 211 \ge 10$, $n_1(1-p_1) = 64 \ge 10$, $n_2 p_2 = 141 \ge 10$, and $n_2(1-p_2) = 41 \ge 10$, so an approximately normal model can be used. Significance level: α = 0.05

19 Example - Student Retention. $p_m$ = true proportion of males who plan on returning. $p_f$ = true proportion of females who plan on returning. $n_m = 275$ (number of males surveyed). $n_f = 182$ (number of females surveyed). $\hat{p}_m = 211/275 \approx 0.767$ (sample proportion of males who plan on returning). $\hat{p}_f = 141/182 \approx 0.775$ (sample proportion of females who plan on returning). Null hypothesis: $H_0: p_m - p_f = 0$. Alternate hypothesis: $H_a: p_m - p_f \neq 0$.

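20 Calculations: The combined sample proportion is $\hat{p}_c = \frac{211 + 141}{275 + 182} \approx 0.770$. Test statistic: $$z = \frac{\hat{p}_m - \hat{p}_f}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\frac{1}{n_m} + \frac{1}{n_f}\right)}} = \frac{211/275 - 141/182}{\sqrt{0.770(1 - 0.770)\left(\frac{1}{275} + \frac{1}{182}\right)}} \approx \frac{-0.00745}{0.0402} \approx -0.19$$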

21 P-value: The P-value for this test is 2 times the area under the z curve to the left of the computed z = -0.19. P-value = 2(0.4247) = 0.8494. Conclusion: Since P-value = 0.849 > 0.05 = α, the hypothesis $H_0$ is not rejected at significance level 0.05. There is not sufficient evidence that the return rate is different for males and females.

22 Example 4: A forest in Oregon has an infestation of spruce moths. In an effort to control the moth, one area has been regularly sprayed from airplanes. In this area, a random sample of 495 spruce trees showed that 81 had been killed by moths. A second nearby area receives no treatment. In this area, a random sample of 518 spruce trees showed that 92 had been killed by the moth. Do these data indicate that the proportion of spruce trees killed by the moth is different for these areas?

23 Assumptions: Have 2 independent SRSs of spruce trees. Both sampling distributions are approximately normal since $n_1 p_1 = 81$, $n_1(1-p_1) = 414$, $n_2 p_2 = 92$, $n_2(1-p_2) = 426$, and all > 10. Population of spruce trees is at least 10,130. $H_0: p_1 = p_2$ and $H_a: p_1 \neq p_2$, where $p_1$ is the true proportion of trees killed by moths in the treated area and $p_2$ is the true proportion of trees killed by moths in the untreated area. P-value = 0.5547, α = 0.05. Since p-value > α, I fail to reject $H_0$. There is not sufficient evidence to suggest that the proportion of spruce trees killed by the moth is different for these areas.
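
The condition checks and the P-value for the spruce moth example can be verified together. The following is a rough sketch using the counts from the problem statement; the dictionary layout and variable names are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

# (trees killed, trees sampled) for each area, from the example
killed = {"treated": (81, 495), "untreated": (92, 518)}

# Large-sample check: successes and failures in each sample exceed 10
counts_ok = all(x > 10 and n - x > 10 for x, n in killed.values())

# 10% condition: population should be at least 10 times the combined sample
min_population = 10 * sum(n for _, n in killed.values())

# Pooled two-sided z-test of H0: p1 = p2
(x1, n1), (x2, n2) = killed["treated"], killed["untreated"]
p_pooled = (x1 + x2) / (n1 + n2)
se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (x1 / n1 - x2 / n2) / se
p_value = 2 * NormalDist().cdf(-abs(z))

alpha = 0.05
decision = "reject H0" if p_value < alpha else "fail to reject H0"
print(counts_ok, min_population, round(z, 2), round(p_value, 3), decision)
```

The P-value comes out at about 0.555, agreeing with the slide's 0.5547 up to rounding, so the decision is to fail to reject $H_0$.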

24 Back to snoring….
              18-29   30 and over   Total
Snore            48           318     366
Don’t snore     136           493     629
Total           184           811     995
What if I wanted to know the true difference in the population proportion of young adults who snore and the proportion of older adults who snore? CONFIDENCE INTERVAL! For the true DIFFERENCE between snoring rates!

25 What are the steps for performing a confidence interval? 1.) Identify the interval by name or formula (CI for two-sample proportion) 2.) Assumptions: two independent SRSs from the populations (or randomly assigned treatments); populations at least 10n; normal approximation for both 3.) Calculations 4.) Conclusion (in context of problem)

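26 Formula for confidence interval: $$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$ Note: use p-hat when p is not known. The square root is the standard error! $z^*$ times the standard error is the margin of error!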

27 Two-sample Confidence Interval for Proportions. Conditions for inference have previously been met. We are 95% confident that the proportion of people who snore is between 5.92% and 20.28% higher for older adults than for younger adults.
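
The interval above can be checked with a short calculation. This sketch uses the unpooled standard error, since a confidence interval does not assume the two proportions are equal; the function name `two_proportion_ci` is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Critical value z* for the requested confidence level
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * se
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin

# Snoring data: older adults (318 of 811) minus younger adults (48 of 184)
low, high = two_proportion_ci(318, 811, 48, 184)
print(f"95% CI for p_old - p_young: ({low:.3f}, {high:.3f})")
```

With unrounded sample proportions the endpoints differ from the slide's 5.92% and 20.28% only in the fourth decimal place; the slide's values appear to come from proportions rounded to three decimals.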

28 Example 1: At Community Hospital, the burn center is experimenting with a new plasma compress treatment. A random sample of 316 patients with minor burns received the plasma compress treatment. Of these patients, it was found that 259 had no visible scars after treatment. Another random sample of 419 patients with minor burns received no plasma compress treatment. For this group, it was found that 94 had no visible scars after treatment. What is the shape & standard error of the sampling distribution of the difference in the proportions of people with no visible scars between the two groups? Since $n_1 p_1 = 259$, $n_1(1-p_1) = 57$, $n_2 p_2 = 94$, $n_2(1-p_2) = 325$, and all > 5, the distribution of the difference in proportions is approximately normal.

29 Example 1: At Community Hospital, the burn center is experimenting with a new plasma compress treatment. A random sample of 316 patients with minor burns received the plasma compress treatment. Of these patients, it was found that 259 had no visible scars after treatment. Another random sample of 419 patients with minor burns received no plasma compress treatment. For this group, it was found that 94 had no visible scars after treatment. What is a 95% confidence interval of the difference in proportion of people who had no visible scars between the plasma compress treatment & control group?

30 Assumptions: Have 2 independent randomly assigned treatment groups. Both sampling distributions are approximately normal since $n_1 p_1 = 259$, $n_1(1-p_1) = 57$, $n_2 p_2 = 94$, $n_2(1-p_2) = 325$, and all > 5. Population of burn patients is at least 7350. Since these are all burn patients, we can add 316 + 419 = 735. If the populations are not the same, you MUST list them separately. We are 95% confident that the true proportion of people who had no visible scars with the plasma compress treatment is between 53.7% and 65.4% higher than for the control group.
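
The arithmetic behind this interval is not shown in the transcript. One way to reach it, assuming the usual unpooled standard error and $z^* = 1.96$ for 95% confidence, is:

```latex
\begin{align*}
\hat{p}_1 &= \frac{259}{316} \approx 0.8196, \qquad
\hat{p}_2 = \frac{94}{419} \approx 0.2243 \\
SE &= \sqrt{\frac{0.8196(0.1804)}{316} + \frac{0.2243(0.7757)}{419}} \approx 0.0297 \\
(\hat{p}_1 - \hat{p}_2) \pm z^* \cdot SE &\approx 0.5953 \pm 1.96(0.0297)
  \approx 0.5953 \pm 0.0582 \approx (0.537,\ 0.654)
\end{align*}
```

The standard error of about 0.0297 also answers the question posed on slide 28.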

31 Example 2: Suppose that researchers want to estimate the difference in proportions of people who are against the death penalty in Texas & in California. If the two sample sizes are the same, what size sample is needed to be within 2% of the true difference at 90% confidence? Since both n’s are the same size, the two terms under the square root have a common denominator, so add them! n = 3383
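
The n = 3383 can be reproduced by solving the margin-of-error formula for n. This sketch assumes the conservative guess p = 0.5 for both states and the table value $z^* = 1.645$ for 90% confidence; neither is stated in the transcript, but together they give the slide's answer.

```python
from math import ceil

z_star = 1.645   # table critical value for 90% confidence (assumed)
margin = 0.02    # want to be within 2% of the true difference
p_guess = 0.5    # conservative guess for both proportions (assumed)

# margin = z* * sqrt(p(1-p)/n + p(1-p)/n) = z* * sqrt(2 p(1-p) / n)
# Solving for n: n = 2 p(1-p) (z* / margin)^2
n_exact = 2 * p_guess * (1 - p_guess) * (z_star / margin) ** 2

# Always round up so the margin of error is not exceeded
n_needed = ceil(n_exact)
print(n_needed)
```

Each state would need a sample of this size. Using a more precise critical value (about 1.6449) gives a slightly smaller n, so the result depends on how $z^*$ is rounded.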

32 Do you think that the proportion of defective PEANUT M&Ms is higher than the proportion of defective PLAIN M&Ms?

