 # Statistical Inference About Means and Proportions With Two Populations


STATISTICS in PRACTICE
Statistics plays a major role in pharmaceutical research. Statistical methods are used to test and develop new drugs. In most studies, the statistical method involves hypothesis testing for the difference between the means of the new drug population and the standard drug population.

STATISTICS in PRACTICE
In this chapter you will learn how to construct interval estimates and make hypothesis tests about means and proportions with two populations.

Contents
Inferences About the Difference Between Two Population Means: σ1 and σ2 Known
Inferences About the Difference Between Two Population Means: σ1 and σ2 Unknown
Inferences About the Difference Between Two Population Means: Matched Samples
Inferences About the Difference Between Two Population Proportions

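Inferences About the Difference Between Two Population Means: σ1 and σ2 Known
Point and Interval Estimation of μ1 - μ2
The point estimator of μ1 - μ2 is $\bar{x}_1 - \bar{x}_2$.
The interval estimate of μ1 - μ2 is $\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$.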

Inferences About the Difference Between Two Population Means: σ1 and σ2 Known
Hypothesis Tests About μ1 - μ2
D0: hypothesized difference between μ1 and μ2

Estimating the Difference Between Two Population Means
Let μ1 equal the mean of population 1 and μ2 equal the mean of population 2. The difference between the two population means is μ1 - μ2. To estimate μ1 - μ2, we will select a simple random sample of size n1 from population 1 and a simple random sample of size n2 from population 2.

Estimating the Difference Between Two Population Means
Let $\bar{x}_1$ equal the mean of sample 1 and $\bar{x}_2$ equal the mean of sample 2. The point estimator of the difference between the means of populations 1 and 2 is $\bar{x}_1 - \bar{x}_2$.

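Sampling Distribution of $\bar{x}_1 - \bar{x}_2$
Expected value: $E(\bar{x}_1 - \bar{x}_2) = \mu_1 - \mu_2$
Standard deviation (standard error): $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
where: σ1 = standard deviation of population 1, σ2 = standard deviation of population 2, n1 = sample size from population 1, n2 = sample size from population 2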

Interval Estimation of μ1 - μ2: σ1 and σ2 Known
Interval estimate: $\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
where: 1 - α is the confidence coefficient

Interval Estimation of μ1 - μ2: σ1 and σ2 Known
Example: Par, Inc. Par, Inc. is a manufacturer of golf equipment and has developed a new golf ball that has been designed to provide “extra distance.” In a test of driving distance using a mechanical driving device, a sample of Par golf balls was compared with a sample of golf balls made by Rap, Ltd., a competitor. The sample statistics appear on the next slide.

Interval Estimation of μ1 - μ2: σ1 and σ2 Known
Example: Par, Inc.

| | Sample #1: Par, Inc. | Sample #2: Rap, Ltd. |
|---|---|---|
| Sample size (balls) | 120 | |
| Sample mean (yards) | 275 | 258 |

Based on data from previous driving distance tests, the two population standard deviations are known with σ1 = 15 yards and σ2 = 20 yards.

Interval Estimation of μ1 - μ2: σ1 and σ2 Known
Example: Par, Inc. Let us develop a 95% confidence interval estimate of the difference between the mean driving distances of the two brands of golf ball.

Estimating the Difference Between Two Population Means
Population 1: Par, Inc. golf balls; μ1 = mean driving distance of Par golf balls
Population 2: Rap, Ltd. golf balls; μ2 = mean driving distance of Rap golf balls
μ1 - μ2 = difference between the mean distances
Simple random sample of n1 Par golf balls; $\bar{x}_1$ = sample mean distance for the Par golf balls
Simple random sample of n2 Rap golf balls; $\bar{x}_2$ = sample mean distance for the Rap golf balls
$\bar{x}_1 - \bar{x}_2$ = point estimate of μ1 - μ2

Point Estimate of μ1 - μ2
Point estimate of μ1 - μ2 = $\bar{x}_1 - \bar{x}_2$ = 275 - 258 = 17 yards
where: μ1 = mean distance for the population of Par, Inc. golf balls, μ2 = mean distance for the population of Rap, Ltd. golf balls

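Interval Estimation of μ1 - μ2: σ1 and σ2 Known
$\bar{x}_1 - \bar{x}_2 \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$, with $\bar{x}_1 - \bar{x}_2 = 17$ yards and $z_{.025} = 1.96$. We are 95% confident that the difference between the mean driving distances of Par, Inc. balls and Rap, Ltd. balls lies within the resulting interval.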
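The interval calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `z_interval_two_means` is made up for this sketch, and the Rap, Ltd. sample size is not visible in the slides, so `n2 = 80` below is an assumed value, not a figure taken from the source.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_two_means(x1, x2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 when both population sigmas are known."""
    point = x1 - x2
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    margin = z * std_error
    return point - margin, point + margin


# n2 = 80 is an ASSUMED value: the Rap, Ltd. sample size is not shown in the slides.
lower, upper = z_interval_two_means(275, 258, 15, 20, 120, 80)
print(f"{lower:.2f} to {upper:.2f} yards")
```

Changing `n2` changes the standard error and therefore the width of the interval, so the printed limits hold only for the assumed sample size.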

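Hypothesis Tests About μ1 - μ2: σ1 and σ2 Known
Hypotheses
Left-tailed: H0: μ1 - μ2 ≥ D0, Ha: μ1 - μ2 < D0
Right-tailed: H0: μ1 - μ2 ≤ D0, Ha: μ1 - μ2 > D0
Two-tailed: H0: μ1 - μ2 = D0, Ha: μ1 - μ2 ≠ D0
Test statistic: $z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$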

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Known
Example: Par, Inc. Can we conclude, using α = .01, that the mean driving distance of Par, Inc. golf balls is greater than the mean driving distance of Rap, Ltd. golf balls?

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Known
p-Value and Critical Value Approaches
1. Develop the hypotheses. H0: μ1 - μ2 ≤ 0, Ha: μ1 - μ2 > 0
where: μ1 = mean distance for the population of Par, Inc. golf balls, μ2 = mean distance for the population of Rap, Ltd. golf balls
2. Specify the level of significance. α = .01

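Hypothesis Tests About μ1 - μ2: σ1 and σ2 Known
p-Value and Critical Value Approaches
3. Compute the value of the test statistic. $z = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = 6.49$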

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Known
p-Value Approach
4. Compute the p-value. For z = 6.49, the p-value is less than .0001.
5. Determine whether to reject H0. Because p-value < α = .01, we reject H0. At the .01 level of significance, the sample evidence indicates the mean driving distance of Par, Inc. golf balls is greater than the mean driving distance of Rap, Ltd. golf balls.

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Known
Critical Value Approach
4. Determine the critical value and rejection rule. For α = .01, z.01 = 2.33. Reject H0 if z > 2.33.
5. Determine whether to reject H0. Because z = 6.49 > 2.33, we reject H0.

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Known
Critical Value Approach The sample evidence indicates the mean driving distance of Par, Inc. golf balls is greater than the mean driving distance of Rap, Ltd. golf balls.
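Both approaches to the right-tailed test can be sketched in Python as follows. The function name `z_test_two_means` is hypothetical, and `n2 = 80` is again an assumed sample size for Rap, Ltd. (the slides do not show it). With that assumption the statistic comes out near 6.48, close to the 6.49 reported above, which appears to use a rounded standard error.

```python
from math import sqrt
from statistics import NormalDist


def z_test_two_means(x1, x2, sigma1, sigma2, n1, n2, d0=0.0):
    """Return (z, upper-tail p-value) for H0: mu1 - mu2 <= d0 vs Ha: mu1 - mu2 > d0."""
    std_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z = ((x1 - x2) - d0) / std_error
    p_value = 1 - NormalDist().cdf(z)
    return z, p_value


# n2 = 80 is an ASSUMED value: the Rap, Ltd. sample size is not shown in the slides.
z, p_value = z_test_two_means(275, 258, 15, 20, 120, 80)

alpha = 0.01
critical = NormalDist().inv_cdf(1 - alpha)  # z_{.01}, about 2.33
reject_h0 = z > critical                    # critical value approach
reject_h0_by_p = p_value < alpha            # p-value approach
```

The two rules always agree for the same α, since `z > critical` holds exactly when the upper-tail area beyond `z` is smaller than α.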

Inferences About the Difference Between Two Population Means: σ1 and σ2 Unknown
Interval Estimation of μ1 - μ2
Hypothesis Tests About μ1 - μ2
H0: μ1 - μ2 = 0
Ha: μ1 - μ2 ≠ 0

Interval Estimation of μ1 - μ2: σ1 and σ2 Unknown
When σ1 and σ2 are unknown, we will use the sample standard deviations s1 and s2 as estimates of σ1 and σ2, and replace $z_{\alpha/2}$ with $t_{\alpha/2}$.

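Interval Estimation of μ1 - μ2: σ1 and σ2 Unknown
Interval estimate: $\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
where the degrees of freedom for $t_{\alpha/2}$ are:
$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{1}{n_1 - 1}\left(\frac{s_1^2}{n_1}\right)^2 + \frac{1}{n_2 - 1}\left(\frac{s_2^2}{n_2}\right)^2}$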
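A minimal Python sketch of the unknown-σ interval follows, assuming the usual unequal-variance degrees-of-freedom formula for this setting. The function names and all numeric inputs are hypothetical and chosen only for illustration; they are not the data of any example in these slides. The critical value `t_crit` is passed in by the caller, as if read from a t table.

```python
from math import sqrt


def welch_df(s1, n1, s2, n2):
    """Degrees of freedom for t_{alpha/2} when sigma1 and sigma2 are unknown."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))


def t_interval_two_means(x1, s1, n1, x2, s2, n2, t_crit):
    """Interval for mu1 - mu2; t_crit is t_{alpha/2} read from a t table for the df."""
    margin = t_crit * sqrt(s1**2 / n1 + s2**2 / n2)
    return (x1 - x2) - margin, (x1 - x2) + margin


# HYPOTHETICAL inputs: two samples of 16 with equal standard deviations of 3.0.
df = welch_df(s1=3.0, n1=16, s2=3.0, n2=16)  # equal s and n give 2 * (n - 1) = 30
# t_{.05} for 30 degrees of freedom is 1.697, giving a 90% interval.
lower, upper = t_interval_two_means(50.0, 3.0, 16, 47.0, 3.0, 16, t_crit=1.697)
```

When the formula returns a non-integer, a common convention is to round the degrees of freedom down before looking up the table value, which gives a slightly wider, more conservative interval.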

Difference Between Two Population Means: σ1 and σ2 Unknown
Example: Specific Motors Specific Motors of Detroit has developed a new automobile known as the M car. 24 M cars and 28 J cars (from Japan) were road tested to compare miles-per-gallon (mpg) performance. The sample statistics are shown on the next slide.

Difference Between Two Population Means: σ1 and σ2 Unknown
Example: Specific Motors

| | Sample #1: M Cars | Sample #2: J Cars |
|---|---|---|
| Sample Size | 24 cars | 28 cars |
| Sample Mean | 29.8 mpg | 27.3 mpg |
| Sample Std. Dev. | 2.56 mpg | |

Difference Between Two Population Means: σ1 and σ2 Unknown
Example: Specific Motors Let us develop a 90% confidence interval estimate of the difference between the mpg performances of the two models of automobile.

Point Estimate of μ1 - μ2
Point estimate of μ1 - μ2 = $\bar{x}_1 - \bar{x}_2$ = 29.8 - 27.3 = 2.5 mpg
where: μ1 = mean miles-per-gallon for the population of M cars, μ2 = mean miles-per-gallon for the population of J cars

Interval Estimation of μ1 - μ2: σ1 and σ2 Unknown
The degrees of freedom for $t_{\alpha/2}$ are df = 24. With α/2 = .05 and df = 24, $t_{\alpha/2}$ = 1.711.

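Interval Estimation of μ1 - μ2: σ1 and σ2 Unknown
$\bar{x}_1 - \bar{x}_2 \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$, with $\bar{x}_1 - \bar{x}_2 = 29.8 - 27.3$ mpg and $t_{.05} = 1.711$ (df = 24). We are 90% confident that the difference between the miles-per-gallon performances of M cars and J cars lies within the resulting interval.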

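Hypothesis Tests About μ1 - μ2: σ1 and σ2 Unknown
Hypotheses
Left-tailed: H0: μ1 - μ2 ≥ D0, Ha: μ1 - μ2 < D0
Right-tailed: H0: μ1 - μ2 ≤ D0, Ha: μ1 - μ2 > D0
Two-tailed: H0: μ1 - μ2 = D0, Ha: μ1 - μ2 ≠ D0
Test statistic: $t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$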

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Unknown
Example: Specific Motors Can we conclude, using a .05 level of significance, that the miles-per-gallon (mpg) performance of M cars is greater than the miles-per-gallon performance of J cars?

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Unknown
p-Value and Critical Value Approaches
1. Develop the hypotheses. H0: μ1 - μ2 ≤ 0, Ha: μ1 - μ2 > 0
where: μ1 = mean mpg for the population of M cars, μ2 = mean mpg for the population of J cars

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Unknown
p-Value and Critical Value Approaches
2. Specify the level of significance. α = .05
3. Compute the value of the test statistic. $t = \frac{(\bar{x}_1 - \bar{x}_2) - D_0}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Unknown
p-Value Approach
4. Compute the p-value. The degrees of freedom for the test statistic are df = 24. Because the computed t exceeds t.005 = 2.797, the p-value < .005.

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Unknown
p-Value Approach
5. Determine whether to reject H0. Because p-value < α = .05, we reject H0. We are at least 95% confident that the miles-per-gallon (mpg) performance of M cars is greater than the miles-per-gallon performance of J cars.

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Unknown
Critical Value Approach
4. Determine the critical value and rejection rule. For α = .05 and df = 24, t.05 = 1.711. Reject H0 if t > 1.711.

Hypothesis Tests About μ1 - μ2: σ1 and σ2 Unknown
Critical Value Approach
5. Determine whether to reject H0. Because the computed t > 1.711, we reject H0. We are at least 95% confident that the miles-per-gallon (mpg) performance of M cars is greater than the miles-per-gallon performance of J cars.
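The critical value rule for a right-tailed test with unknown σ1 and σ2 can be sketched as below. The helper names and the sample figures are hypothetical (two samples of 16, each with standard deviation 3.0, for which the degrees-of-freedom formula gives 30); they are not the Specific Motors data. The critical value 1.697 is the table value of t.05 for 30 degrees of freedom.

```python
from math import sqrt


def t_statistic_two_means(x1, s1, n1, x2, s2, n2, d0=0.0):
    """Test statistic for mu1 - mu2 when sigma1 and sigma2 are estimated by s1 and s2."""
    return ((x1 - x2) - d0) / sqrt(s1**2 / n1 + s2**2 / n2)


def reject_upper_tail(t, t_crit):
    """Critical value rule for a right-tailed test: reject H0 when t > t_crit."""
    return t > t_crit


# HYPOTHETICAL data: sample means 50.0 and 47.0, s = 3.0 and n = 16 in both samples.
t = t_statistic_two_means(50.0, 3.0, 16, 47.0, 3.0, 16)
decision = reject_upper_tail(t, t_crit=1.697)  # t_{.05} for df = 30, from a t table
```

Passing the critical value in as an argument keeps the sketch dependency-free; a statistics library with a t distribution could compute it, and the p-value, directly.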

Inferences About the Difference Between Two Population Means: Matched Samples
With a matched-sample design each sampled item provides a pair of data values. This design often leads to a smaller sampling error than the independent-sample design because variation between sampled items is eliminated as a source of sampling error.

Inferences About the Difference Between Two Population Means: Matched Samples
Example: Express Deliveries A Chicago-based firm has documents that must be quickly distributed to district offices throughout the U.S. The firm must decide between two delivery services, UPX (United Parcel Express) and INTEX (International Express), to transport its documents.

Inferences About the Difference Between Two Population Means: Matched Samples
Example: Express Deliveries In testing the delivery times of the two services, the firm sent two reports to a random sample of its district offices with one report carried by UPX and the other report carried by INTEX. Do the data on the next slide indicate a difference in mean delivery times for the two services? Use a .05 level of significance.

Inferences About the Difference Between Two Population Means: Matched Samples
Delivery Time (Hours)
Columns: District Office, UPX, INTEX, Difference
District offices: Seattle, Los Angeles, Boston, Cleveland, New York, Houston, Atlanta, St. Louis, Milwaukee, Denver
Values: 32 30 19 16 15 18 14 10 7 25 24 15 13 8 9 11 7 6 4 1 2 3 -1 -2 5

Inferences About the Difference Between Two Population Means: Matched Samples
p-Value and Critical Value Approaches
1. Develop the hypotheses. H0: μd = 0, Ha: μd ≠ 0
Let μd = the mean of the difference values for the two delivery services for the population of district offices

Inferences About the Difference Between Two Population Means: Matched Samples
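p-Value and Critical Value Approaches
2. Specify the level of significance. α = .05
3. Compute the value of the test statistic. $t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = 2.94$, where $\bar{d}$ and $s_d$ are the mean and standard deviation of the n = 10 sample differences.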

Inferences About the Difference Between Two Population Means: Matched Samples
p-Value Approach
4. Compute the p-value. For t = 2.94 and df = 9, the p-value is between .01 and .02. (This is a two-tailed test, so we double the upper-tail areas of .01 and .005.)

Inferences About the Difference Between Two Population Means: Matched Samples
p-Value Approach
5. Determine whether to reject H0. Because p-value < α = .05, we reject H0. We are at least 95% confident that there is a difference in mean delivery times for the two services.

Inferences About the Difference Between Two Population Means: Matched Samples
Critical Value Approach
4. Determine the critical value and rejection rule. For α = .05 and df = 9, t.025 = 2.262. Because the test is two-tailed, reject H0 if t > 2.262 or t < -2.262.

Inferences About the Difference Between Two Population Means: Matched Samples
Critical Value Approach
5. Determine whether to reject H0. Because t = 2.94 > 2.262, we reject H0. We are at least 95% confident that there is a difference in mean delivery times for the two services.
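The matched-sample statistic is computed from the paired differences alone, which the following Python sketch illustrates. The function name `paired_t_statistic` and the delivery times below are hypothetical; they are not the values from the slide table, which did not survive intact.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(sample_a, sample_b, mu_d0=0.0):
    """Return (t, df) for matched samples, testing H0: mu_d = mu_d0."""
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    n = len(diffs)
    d_bar = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation (n - 1 in the denominator)
    return (d_bar - mu_d0) / (s_d / sqrt(n)), n - 1


# HYPOTHETICAL delivery times in hours for five offices (not the slide data).
service_a = [30, 28, 20, 17, 16]
service_b = [26, 25, 18, 16, 17]
t, df = paired_t_statistic(service_a, service_b)
```

For a two-tailed test the resulting `t` would be compared, in absolute value, with the table value $t_{\alpha/2}$ for `df` degrees of freedom.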

Inferences About the Difference Between Two Population Proportions
Interval Estimation of p1 - p2 Hypothesis Tests About p1 - p2

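Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
Expected value: $E(\bar{p}_1 - \bar{p}_2) = p_1 - p_2$
Standard deviation (standard error): $\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$
where: n1 = size of sample taken from population 1, n2 = size of sample taken from population 2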

Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
If the sample sizes are large, the sampling distribution of $\bar{p}_1 - \bar{p}_2$ can be approximated by a normal probability distribution. The sample sizes are sufficiently large if all of these conditions are met: n1p1 ≥ 5, n1(1 - p1) ≥ 5, n2p2 ≥ 5, n2(1 - p2) ≥ 5
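The four large-sample conditions are easy to check mechanically. In the Python sketch below, the function name `large_enough` and the inputs are hypothetical, and treating the threshold as "at least 5" is an assumption of this sketch. In practice p1 and p2 are unknown, so sample proportions would have to stand in for them.

```python
def large_enough(n1, p1, n2, p2, threshold=5):
    """Check that n*p and n*(1 - p) reach the threshold for both samples."""
    checks = [n1 * p1, n1 * (1 - p1), n2 * p2, n2 * (1 - p2)]
    return all(value >= threshold for value in checks)


# HYPOTHETICAL inputs chosen only for illustration.
ok = large_enough(200, 0.30, 150, 0.40)        # smallest product is 60
too_small = large_enough(20, 0.10, 150, 0.40)  # 20 * 0.10 = 2, below the threshold
```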

Sampling Distribution of $\bar{p}_1 - \bar{p}_2$
[Figure: sampling distribution of $\bar{p}_1 - \bar{p}_2$, centered at p1 - p2]

Interval Estimation of p1 - p2
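Interval estimate: $\bar{p}_1 - \bar{p}_2 \pm z_{\alpha/2}\sqrt{\frac{\bar{p}_1(1 - \bar{p}_1)}{n_1} + \frac{\bar{p}_2(1 - \bar{p}_2)}{n_2}}$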

Interval Estimation of p1 - p2
Example: Market Research Associates Market Research Associates is conducting research to evaluate the effectiveness of a client’s new advertising campaign. Before the new campaign began, a telephone survey of 150 households in the test market area showed 60 households “aware” of the client’s product.

Interval Estimation of p1 - p2
Example: Market Research Associates The new campaign has been initiated with TV and newspaper advertisements running for three weeks.

Interval Estimation of p1 - p2
Example: Market Research Associates A survey conducted immediately after the new campaign showed 120 of 250 households “aware” of the client’s product. Do the data support the position that the advertising campaign has provided an increased awareness of the client’s product?

Point Estimator of the Difference Between Two Population Proportions
p1 = proportion of the population of households “aware” of the product after the new campaign
p2 = proportion of the population of households “aware” of the product before the new campaign
$\bar{p}_1$ = sample proportion of households “aware” of the product after the new campaign
$\bar{p}_2$ = sample proportion of households “aware” of the product before the new campaign
$\bar{p}_1 - \bar{p}_2 = \frac{120}{250} - \frac{60}{150} = .48 - .40 = .08$

Interval Estimation of p1 - p2
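For α = .05, z.025 = 1.96:
$.48 - .40 \pm 1.96\sqrt{\frac{.48(.52)}{250} + \frac{.40(.60)}{150}} = .08 \pm 1.96(.0510) = .08 \pm .10$
Hence, the 95% confidence interval for the difference in before and after awareness of the product is -.02 to +.18.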
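The interval can be reproduced with a short Python sketch using the survey counts from the example (120 of 250 after the campaign, 60 of 150 before). The function name `proportion_diff_interval` is made up for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def proportion_diff_interval(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 based on the two sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    std_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    margin = z * std_error
    return (p1 - p2) - margin, (p1 - p2) + margin


# Sample 1 = after the campaign, sample 2 = before the campaign.
lower, upper = proportion_diff_interval(120, 250, 60, 150)
print(f"{lower:.2f} to {upper:.2f}")
```

Because the lower limit is slightly below zero, the interval by itself does not rule out "no change" in awareness.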

Hypothesis Tests about p1 - p2
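Hypotheses: We focus on tests involving no difference between the two population proportions (i.e., p1 = p2).
Left-tailed: H0: p1 - p2 ≥ 0, Ha: p1 - p2 < 0
Right-tailed: H0: p1 - p2 ≤ 0, Ha: p1 - p2 > 0
Two-tailed: H0: p1 - p2 = 0, Ha: p1 - p2 ≠ 0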

Hypothesis Tests about p1 - p2
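Pooled estimate of the standard error of $\bar{p}_1 - \bar{p}_2$: $\sigma_{\bar{p}_1 - \bar{p}_2} = \sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
where: $\bar{p} = \frac{n_1\bar{p}_1 + n_2\bar{p}_2}{n_1 + n_2}$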

Hypothesis Tests about p1 - p2
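Test statistic: $z = \frac{\bar{p}_1 - \bar{p}_2}{\sqrt{\bar{p}(1 - \bar{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$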

Hypothesis Tests about p1 - p2
Example: Market Research Associates Can we conclude, using a .05 level of significance, that the proportion of households aware of the client’s product increased after the new advertising campaign?

Hypothesis Tests about p1 - p2
p-Value and Critical Value Approaches
1. Develop the hypotheses. H0: p1 - p2 ≤ 0, Ha: p1 - p2 > 0
p1 = proportion of the population of households “aware” of the product after the new campaign
p2 = proportion of the population of households “aware” of the product before the new campaign

Hypothesis Tests about p1 - p2
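p-Value and Critical Value Approaches
2. Specify the level of significance. α = .05
3. Compute the value of the test statistic.
$\bar{p} = \frac{250(.48) + 150(.40)}{250 + 150} = \frac{180}{400} = .45$
Pooled standard error $= \sqrt{.45(.55)\left(\frac{1}{250} + \frac{1}{150}\right)} = .0514$
$z = \frac{(.48 - .40) - 0}{.0514} = \frac{.08}{.0514} = 1.56$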

Hypothesis Tests about p1 - p2
p-Value Approach
4. Compute the p-value. For z = 1.56, the p-value = .0594.
5. Determine whether to reject H0. Because p-value > α = .05, we cannot reject H0. We cannot conclude that the proportion of households aware of the client’s product increased after the new campaign.

Hypothesis Tests about p1 - p2
Critical Value Approach
4. Determine the critical value and rejection rule. For α = .05, z.05 = 1.645. Reject H0 if z > 1.645.

Hypothesis Tests about p1 - p2
Critical Value Approach 5. Determine whether to reject H0. Because 1.56 < 1.645, we cannot reject H0. We cannot conclude that the proportion of households aware of the client’s product increased after the new campaign.
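The whole test can be sketched in Python from the survey counts in the example. The function name `pooled_proportion_z_test` is hypothetical. Working with the unrounded statistic (about 1.557) gives a p-value of roughly .06, marginally different from the .0594 obtained from the rounded z = 1.56, with the same conclusion.

```python
from math import sqrt
from statistics import NormalDist


def pooled_proportion_z_test(x1, n1, x2, n2):
    """Return (z, upper-tail p-value) for H0: p1 - p2 <= 0 vs Ha: p1 - p2 > 0."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled estimate of the common proportion
    std_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_error
    return z, 1 - NormalDist().cdf(z)


# Sample 1 = after the campaign (120 of 250), sample 2 = before (60 of 150).
z, p_value = pooled_proportion_z_test(120, 250, 60, 150)
reject_h0 = p_value < 0.05
```

The pooled estimate is used here because the null hypothesis treats the two population proportions as equal, so both samples estimate the same value.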