
1 Comparing Two Proportions
The Practice of Statistics, 5th Edition
Learning Objectives. After this section, you should be able to:
- DESCRIBE the shape, center, and spread of the sampling distribution of the difference of two sample proportions.
- DETERMINE whether the conditions are met for doing inference about p1 − p2.
- CONSTRUCT and INTERPRET a confidence interval to compare two proportions.
- PERFORM a significance test to compare two proportions.

2 Introduction
Suppose we want to compare the proportions of individuals with a certain characteristic in Population 1 and Population 2. Let’s call these parameters of interest p1 and p2. The ideal strategy is to take a separate random sample from each population and to compare the sample proportions with that characteristic.
What if we want to compare the effectiveness of Treatment 1 and Treatment 2 in a completely randomized experiment? This time, the parameters p1 and p2 that we want to compare are the true proportions of successful outcomes for each treatment. We use the proportions of successes in the two treatment groups to make the comparison.

3 The Sampling Distribution of a Difference Between Two Proportions
To explore the sampling distribution of the difference between two proportions, let’s start with two populations having a known proportion of successes.
- At School 1, 70% of students did their homework last night.
- At School 2, 50% of students did their homework last night.
Suppose the counselor at School 1 takes an SRS of 100 students and records the sample proportion that did their homework. School 2’s counselor takes an SRS of 200 students and records the sample proportion that did their homework.

4 The Sampling Distribution of a Difference Between Two Proportions
Using Fathom software, we generated an SRS of 100 students from School 1 and a separate SRS of 200 students from School 2. The difference in sample proportions was then calculated and plotted. We repeated this process 1000 times.
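The simulation described on this slide can be approximated without Fathom. The sketch below is a minimal stand-in, not the original software run: it assumes each school is large enough that students can be drawn as independent successes with probability 0.70 or 0.50, and the function name `simulate_diff_props` and the seed are illustrative choices.

```python
import random
import statistics

def simulate_diff_props(p1, n1, p2, n2, reps=1000, seed=1):
    """Simulate the difference in sample proportions (sample 1 minus sample 2)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        # Assumption: each sampled student is an independent success
        # with the school's true proportion (large-population approximation).
        phat1 = sum(rng.random() < p1 for _ in range(n1)) / n1
        phat2 = sum(rng.random() < p2 for _ in range(n2)) / n2
        diffs.append(phat1 - phat2)
    return diffs

# School 1: p = 0.70, n = 100; School 2: p = 0.50, n = 200
diffs = simulate_diff_props(0.70, 100, 0.50, 200)
print("mean of simulated differences:", round(statistics.mean(diffs), 3))
print("SD of simulated differences:  ", round(statistics.stdev(diffs), 3))
```

The simulated differences should pile up around 0.70 − 0.50 = 0.20 in a roughly symmetric, single-peaked shape.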

5 The Sampling Distribution of a Difference Between Two Proportions
The Sampling Distribution of the Difference Between Sample Proportions: Choose an SRS of size n1 from Population 1 with proportion of successes p1 and an independent SRS of size n2 from Population 2 with proportion of successes p2.
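The shape, center, and spread statements that belong to this box did not survive in the transcript. The block below gives the standard results for this setting as a reference sketch; the hat notation and the threshold of 10 are the usual textbook conventions and are assumed here rather than copied from the slide. The spread formula also assumes both samples satisfy the 10% condition.

```latex
\begin{aligned}
\text{Shape: } & \hat{p}_1 - \hat{p}_2 \text{ is approximately Normal when }
  n_1 p_1,\ n_1(1-p_1),\ n_2 p_2,\ n_2(1-p_2) \text{ are all at least } 10 \\
\text{Center: } & \mu_{\hat{p}_1 - \hat{p}_2} = p_1 - p_2 \\
\text{Spread: } & \sigma_{\hat{p}_1 - \hat{p}_2}
  = \sqrt{\frac{p_1(1-p_1)}{n_1} + \frac{p_2(1-p_2)}{n_2}}
\end{aligned}
```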

6 The Sampling Distribution of a Difference Between Two Proportions

7 The Sampling Distribution of a Difference Between Two Proportions
Suppose that there are two large high schools, each with more than 2000 students, in a certain town. At School 1, 70% of students did their homework last night. Only 50% of the students at School 2 did their homework last night. The counselor at School 1 takes an SRS of 100 students and records the proportion that did homework. School 2’s counselor takes an SRS of 200 students and records the proportion that did homework.
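For the two schools in this example, the shape, center, and spread of the sampling distribution can be worked out directly. This is a sketch under the usual assumptions: the "all expected counts at least 10" check is the standard Large Counts rule (not quoted from the slide), and the variable names are illustrative.

```python
import math

p1, n1 = 0.70, 100   # School 1: true proportion and sample size
p2, n2 = 0.50, 200   # School 2: true proportion and sample size

# Shape: assumed Large Counts check, all four expected counts at least 10.
expected_counts = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
approximately_normal = all(count >= 10 for count in expected_counts)

# Center: the mean of the sampling distribution is p1 - p2.
mean_diff = p1 - p2

# Spread: valid because each school has more than 2000 students,
# so both samples are at most 10% of their populations.
sd_diff = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

print("approximately Normal:", approximately_normal)
print("mean:", round(mean_diff, 2))
print("standard deviation:", round(sd_diff, 4))
```

The expected counts are 70, 30, 100, and 100, so the Normal approximation is reasonable; the mean is 0.20 and the standard deviation is about 0.058.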


9 Confidence Intervals for p1 − p2
If the Normal condition is met, we find the critical value z* for the given confidence level from the standard Normal curve.

10 Confidence Intervals for p1 − p2
Conditions for Constructing a Confidence Interval About a Difference in Proportions
- Random: The data come from two independent random samples or from two groups in a randomized experiment.
  - 10%: When sampling without replacement, check that n1 ≤ (1/10)N1 and n2 ≤ (1/10)N2.

11 Confidence Intervals for p1 − p2
Two-Sample z Interval for a Difference Between Two Proportions
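The interval formula on this slide was an image and is missing from the transcript. As a hedged sketch, the standard two-sample z interval is (p̂1 − p̂2) ± z* · sqrt(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2). The function name `two_prop_z_interval` and the counts in the usage line are hypothetical, and the conditions listed earlier are assumed to hold.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Two-sample z interval for p1 - p2 from success counts x1, x2."""
    phat1, phat2 = x1 / n1, x2 / n2
    # Critical value z* with area `confidence` between -z* and z*.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error uses the two sample proportions separately (no pooling).
    se = math.sqrt(phat1 * (1 - phat1) / n1 + phat2 * (1 - phat2) / n2)
    diff = phat1 - phat2
    return diff - z_star * se, diff + z_star * se

# Hypothetical data: 72 of 100 successes in sample 1, 98 of 200 in sample 2.
low, high = two_prop_z_interval(x1=72, n1=100, x2=98, n2=200)
print(f"95% CI for p1 - p2: ({low:.3f}, {high:.3f})")
```

An interval that lies entirely above 0 would suggest that p1 is larger than p2.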


13 Significance Tests for p1 − p2
An observed difference between two sample proportions can reflect an actual difference in the parameters, or it may just be due to chance variation in random sampling or random assignment. Significance tests help us decide which explanation makes more sense.
The null hypothesis has the general form
H0: p1 − p2 = hypothesized value
We’ll restrict ourselves to situations in which the hypothesized difference is 0. Then the null hypothesis says that there is no difference between the two parameters:
H0: p1 − p2 = 0 or, alternatively, H0: p1 = p2
The alternative hypothesis says what kind of difference we expect:
Ha: p1 − p2 > 0, Ha: p1 − p2 < 0, or Ha: p1 − p2 ≠ 0

14 Significance Tests for p1 − p2
Conditions for Performing a Significance Test About a Difference in Proportions
- Random: The data come from two independent random samples or from two groups in a randomized experiment.
  - 10%: When sampling without replacement, check that n1 ≤ (1/10)N1 and n2 ≤ (1/10)N2.

15 Significance Tests for p1 − p2
If H0: p1 = p2 is true, the two parameters are the same. We call their common value p. We need a way to estimate p, so it makes sense to combine the data from the two samples. This pooled (or combined) sample proportion is:
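The formula itself is missing from the transcript. Combining the data from the two samples, as the slide describes, gives the following; the symbols \hat{p}_C, X_1, and X_2 (the counts of successes in each sample) are assumed notation.

```latex
\hat{p}_C
  = \frac{\text{count of successes in both samples combined}}
         {\text{count of individuals in both samples combined}}
  = \frac{X_1 + X_2}{n_1 + n_2}
```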

16 Significance Tests for p1 − p2
Two-Sample z Test for the Difference Between Two Proportions
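The test procedure on this slide was not captured in the transcript. The sketch below shows one common way to carry out a two-sample z test of H0: p1 − p2 = 0 using the pooled proportion in the standard error. The function name `two_prop_z_test`, the `alternative` labels, and the counts in the usage line are hypothetical.

```python
import math
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2, alternative="two-sided"):
    """Two-sample z test of H0: p1 - p2 = 0 from success counts x1, x2."""
    phat1, phat2 = x1 / n1, x2 / n2
    # Pooled estimate of the common proportion p assumed under H0.
    p_pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (phat1 - phat2) / se
    cdf = NormalDist().cdf
    if alternative == "greater":      # Ha: p1 - p2 > 0
        p_value = 1 - cdf(z)
    elif alternative == "less":       # Ha: p1 - p2 < 0
        p_value = cdf(z)
    elif alternative == "two-sided":  # Ha: p1 - p2 != 0
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value

# Hypothetical data: 72 of 100 successes in sample 1, 98 of 200 in sample 2.
z, p_value = two_prop_z_test(72, 100, 98, 200)
print(f"z = {z:.2f}, P-value = {p_value:.5f}")
```

A small P-value is evidence against H0 in favor of the chosen alternative.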

17 Inference for Experiments
Many important statistical results come from randomized comparative experiments. Defining the parameters in experimental settings is more challenging.
- Most experiments on people use recruited volunteers as subjects.
- When subjects are not randomly selected, researchers cannot generalize the results of an experiment to some larger population of interest.
- Researchers can draw cause-and-effect conclusions that apply to people like those who took part in the experiment.
- Unless the experimental units are randomly selected, we don’t need to check the 10% condition when performing inference about an experiment.

18 Comparing Two Means
Learning Objectives. After this section, you should be able to:
- DESCRIBE the shape, center, and spread of the sampling distribution of the difference of two sample means.
- DETERMINE whether the conditions are met for doing inference about µ1 − µ2.
- CONSTRUCT and INTERPRET a confidence interval to compare two means.
- PERFORM a significance test to compare two means.
- DETERMINE when it is appropriate to use two-sample t procedures versus paired t procedures.

19 Introduction
What if we want to compare the mean of some quantitative variable for the individuals in Population 1 and Population 2? Our parameters of interest are the population means µ1 and µ2. The best approach is to take separate random samples from each population and to compare the sample means.
Suppose we want to compare the average effectiveness of two treatments in a completely randomized experiment. We use the mean response in the two groups to make the comparison.

20 The Sampling Distribution of a Difference Between Two Means
To explore the sampling distribution of the difference between two means, let’s start with two Normally distributed populations having known means and standard deviations. Based on information from the U.S. National Health and Nutrition Examination Survey (NHANES), the heights (in inches) of ten-year-old girls follow a Normal distribution N(56.4, 2.7). The heights (in inches) of ten-year-old boys follow a Normal distribution N(55.7, 3.8). Suppose we take independent SRSs of 12 girls and 8 boys of this age and measure their heights.

21 The Sampling Distribution of a Difference Between Two Means
Using Fathom software, we generated an SRS of 12 girls and a separate SRS of 8 boys and calculated the sample mean heights. The difference in sample means was then calculated and plotted. We repeated this process 1000 times.
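The plot of results is not part of the transcript, but the simulation is easy to approximate. The sketch below is a stand-in for the Fathom run, not a reproduction of it: it draws heights from the two Normal distributions given earlier, and the function name `simulate_diff_means` and the seed are illustrative choices.

```python
import random
import statistics

def simulate_diff_means(mu1, sigma1, n1, mu2, sigma2, n2, reps=1000, seed=1):
    """Simulate the difference in sample means (sample 1 minus sample 2)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        xbar1 = statistics.mean(rng.gauss(mu1, sigma1) for _ in range(n1))
        xbar2 = statistics.mean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(xbar1 - xbar2)
    return diffs

# Girls: N(56.4, 2.7), n = 12; boys: N(55.7, 3.8), n = 8 (heights in inches)
diffs = simulate_diff_means(56.4, 2.7, 12, 55.7, 3.8, 8)
print("mean of simulated differences:", round(statistics.mean(diffs), 2))
print("SD of simulated differences:  ", round(statistics.stdev(diffs), 2))
```

The simulated differences should center near 56.4 − 55.7 = 0.7 inches with a standard deviation of roughly 1.55 inches.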

22 The Sampling Distribution of a Difference Between Two Means
The Sampling Distribution of the Difference Between Sample Means: Choose an SRS of size n1 from Population 1 with mean µ1 and standard deviation σ1 and an independent SRS of size n2 from Population 2 with mean µ2 and standard deviation σ2.
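The shape, center, and spread statements for this box are missing from the transcript. The standard results are sketched below; the bar notation is assumed, and the spread formula assumes both samples satisfy the 10% condition.

```latex
\begin{aligned}
\text{Shape: } & \bar{x}_1 - \bar{x}_2 \text{ is Normal when both populations are Normal, and approximately Normal when }
  n_1 \ge 30 \text{ and } n_2 \ge 30 \\
\text{Center: } & \mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 \\
\text{Spread: } & \sigma_{\bar{x}_1 - \bar{x}_2}
  = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}
\end{aligned}
```

For the heights example, this gives a mean of 56.4 − 55.7 = 0.7 and a standard deviation of sqrt(2.7²/12 + 3.8²/8) ≈ 1.55 inches.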

23 The Sampling Distribution of a Difference Between Two Means

24 The Two-Sample t Statistic
If the Normal condition is met, we standardize the observed difference to obtain a t statistic that tells us how far the observed difference is from its mean in standard deviation units.
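The formula for the statistic is missing from the transcript. In its usual form, with the sample standard deviations s_1 and s_2 standing in for the unknown σ_1 and σ_2 (assumed notation), it reads:

```latex
t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}
         {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}
```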

25 The Two-Sample t Statistic
The two-sample t statistic has approximately a t distribution. We can use technology to determine degrees of freedom OR we can use a conservative approach, using the smaller of n1 − 1 and n2 − 1 for the degrees of freedom.
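The two options for degrees of freedom can be compared numerically. The sketch below assumes that "technology" refers to the Welch-Satterthwaite approximation, which is what most statistical software uses for unpooled two-sample t procedures; the slide does not name the formula. The function names and the sample values are hypothetical.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (assumed 'technology' df)."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2  # estimated variance of each sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

def conservative_df(n1, n2):
    """Conservative approach: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)

# Hypothetical sample standard deviations for 12 girls and 8 boys.
df_tech = welch_df(s1=2.7, n1=12, s2=3.8, n2=8)
df_cons = conservative_df(12, 8)
print("technology df:", round(df_tech, 2))
print("conservative df:", df_cons)
```

The conservative value is never larger than the Welch-Satterthwaite value, so it gives wider intervals and larger P-values.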

26 The Two-Sample t Statistic
Conditions for Performing Inference About µ1 − µ2
- Random: The data come from two independent random samples or from two groups in a randomized experiment.
  - 10%: When sampling without replacement, check that n1 ≤ (1/10)N1 and n2 ≤ (1/10)N2.
- Normal/Large Sample: Both population distributions (or the true distributions of responses to the two treatments) are Normal or both sample sizes are large (n1 ≥ 30 and n2 ≥ 30). If either population (treatment) distribution has unknown shape and the corresponding sample size is less than 30, use a graph of the sample data to assess the Normality of the population (treatment) distribution. Do not use two-sample t procedures if the graph shows strong skewness or outliers.

27 Confidence Intervals for µ1 − µ2
Two-Sample t Interval for a Difference Between Two Means
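The interval formula on this slide is missing from the transcript. As a hedged sketch, the standard two-sample t interval is (x̄1 − x̄2) ± t* · sqrt(s1²/n1 + s2²/n2). The function below takes the critical value t* as an input, since it would normally come from a t table or technology. The function name and the summary statistics are hypothetical.

```python
import math

def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Two-sample t interval for mu1 - mu2, given a critical value t_star."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)  # standard error of xbar1 - xbar2
    diff = xbar1 - xbar2
    return diff - t_star * se, diff + t_star * se

# Hypothetical summary statistics for 12 girls and 8 boys (heights in inches).
# t* = 2.365 is the 95% critical value for the conservative df = min(11, 7) = 7.
low, high = two_sample_t_interval(56.9, 2.5, 12, 55.2, 3.6, 8, t_star=2.365)
print(f"95% CI for mu1 - mu2: ({low:.2f}, {high:.2f})")
```

With these made-up numbers the interval contains 0, so the data would not rule out equal mean heights.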

28 Significance Tests for µ1 − µ2
An observed difference between two sample means can reflect an actual difference in the parameters, or it may just be due to chance variation in random sampling or random assignment. Significance tests help us decide which explanation makes more sense.
The null hypothesis has the general form
H0: µ1 − µ2 = hypothesized value
We’re often interested in situations in which the hypothesized difference is 0. Then the null hypothesis says that there is no difference between the two parameters:
H0: µ1 − µ2 = 0 or, alternatively, H0: µ1 = µ2
The alternative hypothesis says what kind of difference we expect:
Ha: µ1 − µ2 > 0, Ha: µ1 − µ2 < 0, or Ha: µ1 − µ2 ≠ 0

29 Significance Tests for µ1 − µ2
To find the P-value, use the t distribution with degrees of freedom given by technology or by the conservative approach (df = smaller of n1 − 1 and n2 − 1).

30 Significance Tests for µ1 − µ2
Two-Sample t Test for the Difference Between Two Means
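The test procedure on this slide was not captured in the transcript. The sketch below carries out a two-sample t test of H0: µ1 − µ2 = 0 with the conservative degrees of freedom. Because the Python standard library has no t distribution, the tail area is approximated by integrating the t density with Simpson's rule; that numerical shortcut, the function names, and the summary statistics are all assumptions made for illustration.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def two_sample_t_test(xbar1, s1, n1, xbar2, s2, n2, alternative="two-sided"):
    """Unpooled two-sample t test of H0: mu1 - mu2 = 0, conservative df."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    t = (xbar1 - xbar2) / se
    df = min(n1 - 1, n2 - 1)
    tail = t_upper_tail(abs(t), df)  # area beyond |t| in one tail
    if alternative == "two-sided":
        p_value = 2 * tail
    elif alternative == "greater":   # Ha: mu1 - mu2 > 0
        p_value = tail if t >= 0 else 1 - tail
    elif alternative == "less":      # Ha: mu1 - mu2 < 0
        p_value = tail if t <= 0 else 1 - tail
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return t, df, p_value

# Hypothetical summary statistics for 12 girls and 8 boys (heights in inches).
t, df, p_value = two_sample_t_test(56.9, 2.5, 12, 55.2, 3.6, 8)
print(f"t = {t:.2f}, df = {df}, P-value = {p_value:.3f}")
```

In practice, technology would report a somewhat smaller P-value because it uses a larger, non-integer df.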

31 Using Two-Sample t Procedures Wisely
- In planning a two-sample study, choose equal sample sizes if you can.
- Do not use “pooled” two-sample t procedures!
- We are safe using two-sample t procedures for comparing two means in a randomized experiment.
- Do not use two-sample t procedures on paired data!
- Beware of making inferences in the absence of randomization. The results may not be generalized to the larger population of interest.

