The Practice of Statistics Third Edition Daniel S. Yates Chapter 13: Comparing Two Population Parameters Copyright © 2008 by W. H. Freeman & Company.

Chapter Objectives:
- Identify the conditions needed to do inference for comparing two population means or proportions.
- Perform a significance test for the difference of two population means or proportions.
- Construct a confidence interval for the difference between two population means or proportions.

Two-Sample Problems. Comparing two populations or two treatments is one of the most common situations encountered in statistical practice. Unlike matched pairs designs, two-sample designs involve no matching of the units in the two samples, and the two samples can be different sizes.

13.1 – Comparing Two Means

Notation.

                        Parameters               Statistics
Population  Variable    Mean      Std. dev.      Sample size  Mean          Std. dev.
1           $x_1$       $\mu_1$   $\sigma_1$     $n_1$        $\bar{x}_1$   $s_1$
2           $x_2$       $\mu_2$   $\sigma_2$     $n_2$        $\bar{x}_2$   $s_2$

There are 4 unknown parameters: the two means and the two standard deviations. We want to compare the two population means, either by giving a confidence interval for their difference $\mu_1 - \mu_2$ or by testing the hypothesis of no difference, $H_0: \mu_1 = \mu_2$ (equivalently, $H_0: \mu_1 - \mu_2 = 0$).

The Two-Sample z Statistic. The mean of $\bar{x}_1 - \bar{x}_2$ is $\mu_1 - \mu_2$, so the difference of sample means is an unbiased estimator of the difference of population means. The variance of the difference is the sum of the variances of $\bar{x}_1$ and $\bar{x}_2$, which is $\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$. Note: variances add, not standard deviations. If the population distributions are both Normal, then the distribution of $\bar{x}_1 - \bar{x}_2$ is also Normal.
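The claim that variances add can be checked with a small simulation. The sketch below is only an illustration: the population means, standard deviations, and sample sizes are hypothetical values chosen for the demonstration, not numbers from this chapter, and the function name is made up.

```python
import random
import statistics


def simulate_difference(mu1, sigma1, n1, mu2, sigma2, n2, reps=20000, seed=0):
    """Simulate many values of (xbar1 - xbar2) from two Normal populations.

    Returns the mean and the variance of the simulated differences.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    diffs = []
    for _ in range(reps):
        xbar1 = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        xbar2 = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(xbar1 - xbar2)
    return statistics.fmean(diffs), statistics.pvariance(diffs)


# Hypothetical populations (assumed for illustration only).
mean_diff, var_diff = simulate_difference(50, 8, 10, 45, 6, 15)

# Theory: the variance is sigma1^2/n1 + sigma2^2/n2, not (sigma1/sqrt(n1) + sigma2/sqrt(n2))^2.
theoretical_var = 8**2 / 10 + 6**2 / 15

print("simulated mean of difference:", round(mean_diff, 3))
print("simulated variance:", round(var_diff, 3))
print("theoretical variance:", round(theoretical_var, 3))
```

The simulated mean should land near $\mu_1 - \mu_2$ and the simulated variance near the sum of the two variances, up to simulation noise.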

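When the statistic $\bar{x}_1 - \bar{x}_2$ has a Normal distribution, we can standardize it to obtain a standard Normal z statistic: $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$. It is very rare that we would know both population standard deviations, so in practice we use the t procedures instead.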

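The Two-Sample t Procedures. The standard error, or estimated standard deviation, of $\bar{x}_1 - \bar{x}_2$ is $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$. When we standardize our estimate $\bar{x}_1 - \bar{x}_2$, the result is the two-sample t statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$.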
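As a minimal sketch of the standardization just described, the function below computes the standard error and the two-sample t statistic from summary statistics. The function name and the input values are hypothetical and serve only to show the arithmetic.

```python
import math


def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, null_diff=0.0):
    """Return (t, SE) for the difference of two sample means.

    null_diff is the hypothesized value of mu1 - mu2 (0 under "no difference").
    """
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)  # variances add, then take the root
    t = ((xbar1 - xbar2) - null_diff) / se
    return t, se


# Hypothetical summary statistics (not taken from the examples in this chapter).
t_stat, se = two_sample_t(xbar1=10.0, s1=2.0, n1=4, xbar2=8.0, s2=2.0, n2=4)
print("SE =", round(se, 4), " t =", round(t_stat, 4))
```

With these made-up inputs each sample contributes a variance of 1 to the difference, so the standard error is the square root of 2.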

The Two-Sample t Procedures. The two-sample t statistic has approximately a t distribution. It does not have exactly a t distribution even if the populations are both exactly Normal. The approximation, however, is very accurate. The catch is calculating the degrees of freedom, which can be messy.

Calculating degrees of freedom. There are three options:
1. Use technology to calculate the degrees of freedom.
2. If $n_1 = n_2$, use df = $n_1 + n_2 - 2$. (This matches the software value only when the two sample standard deviations are also equal; otherwise it overstates the degrees of freedom.)
3. Use df = the smaller of $n_1 - 1$ and $n_2 - 1$. This is a very conservative method.
There is a much more complicated formula for calculating the degrees of freedom by hand. Most statistical software programs have this formula built in.
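The options above can be compared side by side. The sketch below assumes that the "much more complicated" software formula is the Welch-Satterthwaite approximation, which is what most packages use for unpooled two-sample t procedures; the function names and the sample values are hypothetical.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (assumed software method)."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def conservative_df(n1, n2):
    """Option 3: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


# Hypothetical summary statistics for illustration only.
s1, n1, s2, n2 = 8.0, 10, 6.0, 11

df_software = welch_df(s1, n1, s2, n2)
df_safe = conservative_df(n1, n2)
print("software df:", round(df_software, 3))   # usually not a whole number
print("conservative df:", df_safe)

# With equal sample sizes AND equal standard deviations, the software
# value coincides with n1 + n2 - 2.
df_equal = welch_df(3.0, 6, 3.0, 6)
print("equal n and s:", round(df_equal, 3))
```

The software value always falls between the conservative choice and $n_1 + n_2 - 2$, which is why the conservative choice can only make the test harder to pass.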

Two-sample t procedures that use the smaller of $n_1 - 1$ and $n_2 - 1$ as the degrees of freedom always err on the safe side: they report higher P-values and lower confidence than may actually be true. The gap between what is reported and the truth is quite small unless the sample sizes are both small and unequal. As the sample sizes increase, probability values based on t with degrees of freedom equal to the smaller of $n_1 - 1$ and $n_2 - 1$ become more accurate.

Example. Calcium and blood pressure. Does increasing the amount of calcium in our diet reduce blood pressure? Examination of a large sample of people revealed a relationship between calcium intake and blood pressure. The relationship was strongest for black men. Such observational studies do not establish causation. Researchers therefore designed a randomized comparative experiment. The subjects in part of the experiment were 21 healthy black men. A randomly chosen group of 10 men received a calcium supplement for 12 weeks, and the remaining 11 men received a placebo pill that looked identical. The experiment was double-blind. The response variable is the decrease in systolic (top number) blood pressure for a subject after 12 weeks, in millimeters of mercury. An increase appears as a negative response.

Take Group 1 to be the calcium group and Group 2 the placebo group. [Data values for Group 1 and Group 2 missing.] From the data, calculate the summary statistics:

Group  Treatment  n   $\bar{x}$   s
1      Calcium    10  5.000       [missing]
2      Placebo    11  [missing]   [missing]

The calcium group shows a drop in blood pressure, $\bar{x}_1 = 5.000$, while the placebo group shows a small increase, $\bar{x}_2 =$ [missing]. Is this outcome good evidence that calcium decreases blood pressure in the entire population of healthy black men more than a placebo does? Step 1: Hypotheses. $H_0: \mu_1 = \mu_2$ versus $H_a: \mu_1 > \mu_2$ (equivalently, $H_0: \mu_1 - \mu_2 = 0$ versus $H_a: \mu_1 - \mu_2 > 0$). In words, $H_0$: there is no difference in the mean decrease in blood pressure between the two treatments; $H_a$: the calcium treatment produces a greater mean decrease in blood pressure than the placebo.

Step 2: Conditions. SRS – The 21 subjects were not obtained by random selection from a larger population, so it may be difficult to generalize our findings to all healthy black men. However, the random assignment of subjects to treatments should help ensure that any significant difference in mean blood pressure between the two groups is due to the treatment. Independence – Because of the randomization, the calcium group and the placebo group can be treated as two independent samples. The 10n ≤ N check does not apply here because the groups were formed by random assignment rather than by sampling from two populations. Normality – We must check for serious non-Normality (outliers). We will use Normal probability plots.

Although the calcium group shows a slightly irregular distribution, there are no outliers. We should feel comfortable using t procedures because they are robust against non-Normality.

Step 3: Calculations. Test statistic. The two-sample t statistic is t = 1.604. P-value. With df = 9, the P-value is tcdf(1.604, 1000, 9) ≈ 0.0716.
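The calculator call tcdf(1.604, 1000, 9) returns the area under the t(9) density to the right of 1.604. A rough standard-library sketch of the same upper-tail area is shown below; it uses Simpson's rule rather than whatever algorithm the calculator uses, and the function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) by integrating the density from 0 to |t|."""
    if t < 0:
        return 1.0 - t_upper_tail(-t, df, steps)
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t  # the density is symmetric about 0


p_value = t_upper_tail(1.604, 9)
print("one-sided P-value:", round(p_value, 4))
```

Because the alternative hypothesis is one-sided ($\mu_1 > \mu_2$), only the upper tail is needed; a two-sided test would double this area.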

Step 4: Interpretation. The experiment provided some evidence that calcium reduces blood pressure, but the evidence falls short of the traditional 5% and 1% levels. We would fail to reject $H_0$ at either of these significance levels. We can estimate the difference in the mean decrease in blood pressure for the hypothetical calcium and placebo populations using a two-sample t interval.
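For reference, a two-sample t interval is usually written in the form below. The notation follows the symbols used earlier in this chapter; $t^*$ is the critical value for the chosen confidence level, taken from the t distribution with whichever degrees of freedom option is in use.

```latex
\[
(\bar{x}_1 - \bar{x}_2) \;\pm\; t^{*}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\]
```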

For a 90% confidence interval and df = 9, $t^* = 1.833$. We are 90% confident that the mean advantage of calcium over placebo, $\mu_1 - \mu_2$, lies between [missing] and [missing]. Since the 90% confidence interval includes 0, we would fail to reject $H_0: \mu_1 - \mu_2 = 0$ against the two-sided alternative at the α = 0.10 level of significance. HW: pg. 785 #13.1, 13.2, 13.5 / pg. 791 #13.7, 13.9 (goes with 13.5)

Software Approximation for the Degrees of Freedom Note: The degrees of freedom do not have to be a whole number.
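The approximation most software uses for unpooled two-sample t procedures is commonly the Welch-Satterthwaite formula. It is given here as an assumption about what the slide showed, written in this chapter's notation:

```latex
\[
df = \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
          {\dfrac{1}{n_1 - 1}\left(\dfrac{s_1^2}{n_1}\right)^{2}
           + \dfrac{1}{n_2 - 1}\left(\dfrac{s_2^2}{n_2}\right)^{2}}
\]
```

The result always lies between the smaller of $n_1 - 1$ and $n_2 - 1$ and $n_1 + n_2 - 2$, and it is generally not a whole number.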

Calcium and blood pressure, continued. Here is the data summary again.

Group  Treatment  n   $\bar{x}$   s
1      Calcium    10  5.000       [missing]
2      Placebo    11  [missing]   [missing]

Computer Outputs

Using your Graphing Calculator for a Two-Sample T Test.
1. Enter the data for the calcium group in L1 and the placebo group in L2.
2. Go to STAT/TESTS and choose 4: 2-SampTTest.
3. In the 2-SampTTest screen, specify “Data” and adjust your inequality to match your alternative hypothesis. Always pick “No” for pooling.
4. Arrow down, highlight “Calculate”, and press ENTER.
If you pick “Draw” instead, the t(k) distribution will be displayed. It will only display the t test statistic and the P-value.

Using your Graphing Calculator for a Two-Sample T Interval.
1. Enter the data for the calcium group in L1 and the placebo group in L2.
2. Go to STAT/TESTS and choose 2-SampTInt.
3. In the 2-SampTInt screen, specify “Data” and the desired level of confidence. Always pick “No” for pooling.
4. Arrow down, highlight “Calculate”, and press ENTER.
If you are given a data summary instead of actual data values, select the “Stats” option instead, then provide the values requested.

Example. DDT poisoning. Poisoning by the pesticide DDT causes convulsions in humans and other mammals. Researchers seek to understand how the convulsions are caused. In a randomized comparative experiment, they compared 6 white rats poisoned with DDT with a control group of 6 unpoisoned rats. Electrical measurements of nerve activity are the main clue to the nature of DDT poisoning. When a nerve is stimulated, its electrical response shows a sharp spike followed by a much smaller second spike. The experiment found that the second spike is larger in rats fed DDT than in normal rats. This finding helped biologists understand how DDT poisoning works.

The researchers measured the height of the second spike as a percent of the first spike when a nerve in the rat’s leg was stimulated. For the poisoned rats the results were [values missing]. The control group data were [values missing].

The researchers did not conjecture in advance that the second spike would be larger in rats fed DDT; they conjectured only that it would be different. Step 1: Hypotheses. $H_0: \mu_{DDT} = \mu_{NORMAL}$ versus $H_a: \mu_{DDT} \neq \mu_{NORMAL}$.

Step 2: Conditions. SRS – The researchers used a randomized comparative experiment. The rats were randomly assigned to the two treatments. Independence – Due to the random assignment, the researchers can treat the two groups of rats as independent samples. Normality – Normal probability plots show no outliers.

Step 3: Calculations. Use your calculator to determine the missing values. $\bar{x}_1 = 17.6$, $\bar{x}_2 =$ ____, $s_1 = 6.340$, $s_2 =$ ____, t = ____, P-value = ____, df = 5.938

Step 4: Interpretation. The low P-value provides strong evidence against the null hypothesis. We can reject $H_0$ at the 5% significance level. We conclude that the mean size of the secondary spike is larger in rats fed DDT. HW: pg. 801 #13.13, 13.16

13.2 – Comparing Two Proportions

Population  Population proportion  Sample size  Sample proportion
1           $p_1$                  $n_1$        $\hat{p}_1$
2           $p_2$                  $n_2$        $\hat{p}_2$

We do inference about the difference $p_1 - p_2$ between the population proportions to compare the populations. The statistic that estimates this difference is the difference between the sample proportions, $\hat{p}_1 - \hat{p}_2$.

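The Sampling Distribution of $\hat{p}_1 - \hat{p}_2$. Center: the mean of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$. Spread: the standard deviation of $\hat{p}_1 - \hat{p}_2$ is $\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$. Shape: when the samples are large, the distribution of $\hat{p}_1 - \hat{p}_2$ is approximately Normal. This will happen if $n_1 p_1$, $n_1(1 - p_1)$, $n_2 p_2$, and $n_2(1 - p_2)$ are all ≥ 10.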

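Confidence Intervals for $p_1 - p_2$. To obtain a confidence interval, replace the population proportions $p_1$ and $p_2$ in the standard deviation with the sample proportions. The result is the standard error, $SE = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$. The confidence interval again has the form estimate ± z* × SE of the estimate, that is, $(\hat{p}_1 - \hat{p}_2) \pm z^* \, SE$.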

Example. How much does preschool help? To study the long-term effects of preschool programs for poor children, the High/Scope Educational Research Foundation has followed two groups of Michigan children since early childhood. A control group of 61 children represents Population 1, poor children with no preschool. Another group of 62 children from the same area and similar backgrounds attended preschool as 3- and 4-year-olds. This is a sample from Population 2, poor children who attend preschool. The response variable of interest is the need for social services as adults. In the past 10 years, 38 of the preschool sample and 49 of the control group have needed social services (mainly welfare). Does this study provide significant evidence that preschool reduces the later need for social services?

Step 1: Hypotheses. $H_0: p_1 = p_2$ versus $H_a: p_1 > p_2$, where $p_1$ = proportion of poor children who don’t attend preschool and who need social services as adults, and $p_2$ = proportion of poor children who attend preschool and who need social services as adults. We will start by calculating a two-proportion z interval.

Step 2: Conditions. SRS: We are not told how the two samples were selected. We must use caution when drawing conclusions about the corresponding population. Normality: $n_1\hat{p}_1 = 49$, $n_1(1 - \hat{p}_1) = 12$, $n_2\hat{p}_2 = 38$, and $n_2(1 - \hat{p}_2) = 24$. These are all at least 5, so the interval based on Normal calculations will be reasonably accurate. Independence: We can be fairly confident that there are at least 610 poor children who did not attend preschool and 620 poor children who did in our population of interest.

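Step 3: Calculations. To compute a 95% confidence interval, first calculate the standard error: $SE = \sqrt{\frac{(0.803)(0.197)}{61} + \frac{(0.613)(0.387)}{62}} \approx 0.0801$. The 95% confidence interval is $(0.803 - 0.613) \pm (1.960)(0.0801) = 0.190 \pm 0.157$, or (0.033, 0.347).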
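The interval calculation can be reproduced from the counts given in the example (49 of 61 in the control group, 38 of 62 in the preschool group). This is a minimal sketch using only the standard library; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_z_interval(x1, n1, x2, n2, confidence=0.95):
    """Return (low, high, SE) for a z interval for p1 - p2."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Unpooled standard error: each sample uses its own proportion.
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se, se


# Counts from the preschool example: Population 1 = control, Population 2 = preschool.
low, high, se = two_proportion_z_interval(49, 61, 38, 62)
print("SE =", round(se, 4))
print("95% CI: (", round(low, 3), ",", round(high, 3), ")")
```

Changing the `confidence` argument gives intervals at other levels; the standard error stays the same and only z* changes.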

Computer Outputs

Step 4: Interpretation. We are 95% confident that the percent needing social services is somewhere between 3.3 and 34.7 percentage points lower among people who attended preschool. The confidence interval is wide because the sample sizes are a bit small for estimating an unknown proportion with precision. The researchers selected two separate samples from the two populations they wanted to compare. Many comparative studies start with just one sample, then divide it into two groups based on data gathered from the subjects. The two-proportion z procedures are valid in such situations. HW: pg. 813 #13.27

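Significance Tests for $p_1 - p_2$. The null hypothesis says that there is no difference between the two populations: $H_0: p_1 = p_2$. The alternative hypothesis says what kind of difference we expect. Checking Normality: $n_1\hat{p}_c$, $n_1(1 - \hat{p}_c)$, $n_2\hat{p}_c$, and $n_2(1 - \hat{p}_c)$ must all be at least 10. The test statistic formula uses the combined sample proportion $\hat{p}_c = \frac{x_1 + x_2}{n_1 + n_2}$, where $x_1$ and $x_2$ are the numbers of successes in the two samples: $z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1 - \hat{p}_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$.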

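Notice how this formula is different from the SE for confidence intervals: the combined sample proportion $\hat{p}_c$ has replaced both $\hat{p}_1$ and $\hat{p}_2$ in the formula. When checking Normality, you can check that $n_1\hat{p}_1$, $n_1(1 - \hat{p}_1)$, $n_2\hat{p}_2$, and $n_2(1 - \hat{p}_2)$ are all greater than 10 (some books use 5).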

Example. How much does preschool help? Continued. Recall our $H_0: p_1 = p_2$ and $H_a: p_1 > p_2$.

Population  Description  Sample size  Number needing social services
1           Control      61           49
2           Preschool    62           38

Check Normality condition.

Calculations. P-value: P(z > 2.31) = P(z < -2.31) ≈ 0.0104
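The test statistic and one-sided P-value can be recomputed from the counts in the example. The sketch below uses the combined (pooled) sample proportion in the standard error, as the significance test requires; the function name is hypothetical. Carried at full precision, the statistic may differ slightly in the last digit from a hand calculation that rounds intermediate values.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """One-sided test of H0: p1 = p2 against Ha: p1 > p2. Returns (z, P-value)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # combined sample proportion
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 1 - NormalDist().cdf(z)  # upper-tail area
    return z, p_value


# Counts from the preschool example: 49 of 61 controls, 38 of 62 preschool.
z, p_value = two_proportion_z_test(49, 61, 38, 62)
print("z =", round(z, 2), " P-value =", round(p_value, 4))

# Normality check with the combined proportion: expected successes and failures.
pooled = (49 + 38) / (61 + 62)
counts = [61 * pooled, 61 * (1 - pooled), 62 * pooled, 62 * (1 - pooled)]
print("expected counts:", [round(c, 1) for c in counts])
```

For a two-sided alternative the upper-tail area would be doubled.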

Interpretation. Our P-value, about 0.0104, tells us that it is unlikely that we would obtain a difference in sample proportions as large as we did if the null hypothesis is true. Since our P-value is less than 0.05, we can reject $H_0$. We can conclude that poor children who did not attend preschool are more likely to need social services than poor children who did attend preschool.

Using your Graphing Calculator for a Two-Proportion Z Test.
1. Go to STAT/TESTS and choose 6: 2-PropZTest.
2. In the 2-PropZTest screen, enter $x_1$, $n_1$, $x_2$, $n_2$ (where $x_1$ and $x_2$ represent the successes for the samples) and adjust your inequality to match your alternative hypothesis.
3. Arrow down, highlight “Calculate”, and press ENTER.
If you pick “Draw” instead, the z distribution will be displayed. It will only display the z test statistic and the P-value.
Now try it on Example [missing] on pg. [missing]. HW: pg. 819 #13.30