AP STATISTICS LESSON 11 – 2 (DAY 1) Comparing Two Means
ESSENTIAL QUESTION: When can procedures for comparing two means be used, and what are those procedures? Objectives: To determine whether procedures for comparing two means should be used. To construct significance tests for two means. To construct confidence intervals to make inferences when comparing two samples.
Comparing Two Means Comparing two populations or two treatments is one of the most common situations encountered in statistical practice. We call such situations two-sample problems. A two-sample problem can arise from a randomized comparative experiment that randomly divides subjects into two groups and exposes each group to a different treatment. Comparing random samples separately selected from two populations is also a two-sample problem. Unlike the matched pairs designs studied earlier, there is no matching of the units in the two samples, and the two samples can be of different sizes.
Two-Sample Problems The goal of inference is to compare the responses to two treatments or to compare the characteristics of two populations. We have a separate sample from each treatment or each population.
Example 11.9 Page 648 Two-Sample Problems 1. A medical researcher is interested in the effect on blood pressure of added calcium in our diet. She conducts a randomized comparative experiment in which one group of subjects receives a calcium supplement and a control group receives a placebo. 2. A psychologist develops a test that measures social insight. He compares the social insight of male college students with that of female college students by giving the test to a sample of students of each gender. 3. A bank wants to know which of two incentive plans will most increase the use of its credit cards. It offers each incentive to a random sample of credit card customers and compares the amount charged during the following six months.
Conditions for Comparing Two Means We have two SRSs from two distinct populations. The samples are independent (that is, one sample has no influence on the other); matching, for example, violates independence. We measure the same variable for both samples. Both populations are normally distributed. The means and standard deviations of the populations are unknown.
Organizing the Data Call the variable we measure $x_1$ in the first population and $x_2$ in the second. The parameters that describe the two populations are:
Population | Variable | Mean | Standard deviation
1 | $x_1$ | $\mu_1$ | $\sigma_1$
2 | $x_2$ | $\mu_2$ | $\sigma_2$
Organizing Data (part 2) There are four unknown parameters: the two means and the two standard deviations. The sample statistics are:
Population | Sample size | Sample mean | Sample standard deviation
1 | $n_1$ | $\bar{x}_1$ | $s_1$
2 | $n_2$ | $\bar{x}_2$ | $s_2$
Example 11.10 Page 650 Calcium and Blood Pressure Does increasing the amount of calcium in our diet reduce blood pressure? A randomized comparative experiment was designed. Subjects: 21 healthy black men. A randomly chosen group of 10 of the men received a calcium supplement for 12 weeks. The control group of 11 men received a placebo. The experiment was double-blind.
The Sampling Distribution of $\bar{x}_1 - \bar{x}_2$ The mean of $\bar{x}_1 - \bar{x}_2$ is $\mu_1 - \mu_2$. That is, the difference of sample means is an unbiased estimator of the difference of population means. The variance of the difference is the sum of the variances of $\bar{x}_1$ and $\bar{x}_2$, which is $\sigma_1^2/n_1 + \sigma_2^2/n_2$. Note that the variances add; the standard deviations do not. If the two populations are both normal,
The Sampling Distribution of $\bar{x}_1 - \bar{x}_2$ (continued…) then the distribution of $\bar{x}_1 - \bar{x}_2$ is also normal. Standardizing $\bar{x}_1 - \bar{x}_2$ gives the two-sample z statistic: $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}}$
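The claim that the variances add can be checked by simulation. The sketch below is only an illustration: the population means, standard deviations, and sample sizes are hypothetical values, not data from the lesson, and the function name is made up for this example.

```python
import random
import statistics

def simulate_difference_of_means(mu1, sigma1, n1, mu2, sigma2, n2,
                                 reps=20000, seed=1):
    """Draw `reps` pairs of independent normal samples and return the
    simulated differences of sample means (x-bar-1 minus x-bar-2)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        sample1 = [rng.gauss(mu1, sigma1) for _ in range(n1)]
        sample2 = [rng.gauss(mu2, sigma2) for _ in range(n2)]
        diffs.append(statistics.fmean(sample1) - statistics.fmean(sample2))
    return diffs

# Hypothetical population values, chosen only for illustration.
mu1, sigma1, n1 = 5.0, 8.0, 10
mu2, sigma2, n2 = 0.0, 6.0, 11

diffs = simulate_difference_of_means(mu1, sigma1, n1, mu2, sigma2, n2)
theoretical_variance = sigma1**2 / n1 + sigma2**2 / n2

print("simulated mean of differences:", round(statistics.fmean(diffs), 3))
print("theoretical mean (mu1 - mu2): ", mu1 - mu2)
print("simulated variance:           ", round(statistics.variance(diffs), 3))
print("theoretical variance:         ", round(theoretical_variance, 3))
```

The simulated mean and variance should land close to $\mu_1 - \mu_2$ and $\sigma_1^2/n_1 + \sigma_2^2/n_2$, while the sum of the two standard deviations of the sample means would overstate the spread.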
Standard Deviation of Two-Sample Means Whether an observed difference between two samples is surprising depends on the spread of the observations as well as on the two means. The standard deviation of $\bar{x}_1 - \bar{x}_2$ is $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
Standard Error Because we don't know the population standard deviations, we estimate them by the sample standard deviations from our two samples. The standard error is $SE = \sqrt{s_1^2/n_1 + s_2^2/n_2}$. The two-sample t statistic: $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
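A minimal sketch of the standard error and the two-sample t statistic computed from summary statistics. The function names and the numbers in the usage lines are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import math

def standard_error(s1, n1, s2, n2):
    """Standard error of x-bar-1 minus x-bar-2: sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2, null_difference=0.0):
    """Two-sample t statistic; `null_difference` is the hypothesized
    value of mu1 - mu2 (0 when testing mu1 = mu2)."""
    se = standard_error(s1, n1, s2, n2)
    return ((xbar1 - xbar2) - null_difference) / se

# Hypothetical summary statistics: 36/9 + 100/20 = 9, so SE = 3.
se = standard_error(s1=6.0, n1=9, s2=10.0, n2=20)
t = two_sample_t(xbar1=16.0, s1=6.0, n1=9, xbar2=10.0, s2=10.0, n2=20)
print("SE =", se)
print("t  =", t)
```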
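Two-Sample t Distributions The statistic t has the same interpretation as any z or t statistic: it says how far $\bar{x}_1 - \bar{x}_2$ is from its mean in standard deviation units. When we replace just one standard deviation in a z statistic by a standard error, we must replace the standard normal distribution with a t distribution. Here both standard deviations are replaced, so the two-sample t statistic has only approximately a t distribution, and the degrees of freedom must be chosen with care.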
Degrees of Freedom for Two-Sample Problems Two methods for calculating degrees of freedom: Option 1: Use procedures based on the statistic t with critical values from a t distribution whose degrees of freedom are computed from the data (used by the calculator). Option 2: Use procedures based on the statistic t with critical values from the t distribution with degrees of freedom equal to the smaller of $n_1 - 1$ and $n_2 - 1$.
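The two options can be compared side by side. The slide does not give the formula behind Option 1; the sketch below assumes it is the usual Welch-Satterthwaite approximation, which is what calculators and software commonly use. The sample sizes 10 and 11 come from the calcium example; the standard deviations are hypothetical.

```python
def welch_df(s1, n1, s2, n2):
    """Option 1 (assumed here to be the Welch-Satterthwaite approximation):
    degrees of freedom computed from the data."""
    v1 = s1**2 / n1
    v2 = s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def conservative_df(n1, n2):
    """Option 2: the smaller of n1 - 1 and n2 - 1."""
    return min(n1, n2) - 1

# Sample sizes from the calcium example; standard deviations are hypothetical.
s1, n1 = 8.7, 10
s2, n2 = 5.9, 11
print("Option 1 df:", round(welch_df(s1, n1, s2, n2), 2))
print("Option 2 df:", conservative_df(n1, n2))
```

Option 1 always gives at least as many degrees of freedom as Option 2, so Option 2 produces wider intervals and larger P-values; that is why it is called conservative.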
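Confidence Interval for a Two-Sample t A confidence interval for $\mu_1 - \mu_2$ is $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$. To test the hypothesis $H_0: \mu_1 = \mu_2$, compute the two-sample t statistic $t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$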
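A short sketch of the interval calculation. The critical value $t^*$ is passed in rather than computed; 2.262 is the standard table value for 95% confidence with 9 degrees of freedom. The function name and the summary statistics are hypothetical.

```python
import math

def two_sample_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_star):
    """Confidence interval for mu1 - mu2:
    (xbar1 - xbar2) +/- t_star * sqrt(s1^2/n1 + s2^2/n2)."""
    diff = xbar1 - xbar2
    margin = t_star * math.sqrt(s1**2 / n1 + s2**2 / n2)
    return diff - margin, diff + margin

# Hypothetical summary statistics with n1 = 10 and n2 = 11, so the
# conservative method uses df = 9 and t* = 2.262 for 95% confidence.
low, high = two_sample_t_interval(xbar1=5.0, s1=8.7, n1=10,
                                  xbar2=-0.3, s2=5.9, n2=11,
                                  t_star=2.262)
print("95% interval for mu1 - mu2:", (round(low, 2), round(high, 2)))
```

An interval that contains 0 is consistent with no difference between the two population means at the matching significance level.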
Example 11.11 Page 655 Calcium and Blood Pressure, continued The P-value: this example uses the conservative method, which leads to the t distribution with 9 degrees of freedom (the smaller of 10 – 1 and 11 – 1).
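A one-sided P-value from a t distribution can be approximated without a table. The sketch below integrates the t density numerically with Simpson's rule; this is one possible approach under that assumption, not the method used in the textbook, and the t value in the usage line is hypothetical.

```python
import math

def t_density(x, df):
    """Density of the t distribution with `df` degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p_value(t, df, steps=2000):
    """Approximate P(T >= t) for T ~ t(df) using Simpson's rule on [0, |t|].
    `steps` must be even."""
    a = abs(t)
    h = a / steps
    total = t_density(0.0, df) + t_density(a, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_density(i * h, df)
    area = total * h / 3   # P(0 <= T <= |t|)
    upper = 0.5 - area     # P(T >= |t|), by symmetry about 0
    return upper if t >= 0 else 1.0 - upper

# Hypothetical t statistic, evaluated with the conservative df = 9.
print("one-sided P-value:", round(upper_tail_p_value(1.6, 9), 4))
```

As a check, the table values $t^* = 1.833$ and $t^* = 2.262$ for 9 degrees of freedom should give upper-tail areas close to 0.05 and 0.025.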
Example 11.12 Page 656 Two-Sample t Confidence Interval Sample size strongly influences the P-value of a test. An effect that fails to be significant at a specified level α in a small sample can be significant in a larger sample.
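The effect of sample size can be seen by holding the observed difference and the sample standard deviations fixed while the sample sizes grow. The numbers below are hypothetical and the helper function is defined only for this illustration.

```python
import math

def t_for_equal_sizes(diff, s1, s2, n):
    """Two-sample t statistic for testing mu1 = mu2 when both samples
    have size n and the observed difference of means is `diff`."""
    return diff / math.sqrt(s1**2 / n + s2**2 / n)

# Hypothetical observed difference and sample standard deviations.
diff, s1, s2 = 3.0, 8.0, 6.0
for n in (10, 40, 160):
    print("n =", n, " t =", round(t_for_equal_sizes(diff, s1, s2, n), 3))
```

Quadrupling both sample sizes halves the standard error and doubles t, so the same observed difference yields a smaller P-value in the larger samples.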
Robustness Again The two-sample t procedures are more robust than the one-sample t methods, particularly when the distributions are not symmetric. When the sizes of the two samples are equal and the two populations being compared have distributions with similar shapes, probability values from the t table are quite accurate. When the two population distributions have different shapes, larger samples are needed.