Copyright © Cengage Learning. All rights reserved. 10 Inferences Involving Two Populations.

1 Copyright © Cengage Learning. All rights reserved. 10 Inferences Involving Two Populations

2 10.2 Inferences Concerning the Mean Difference Using Two Dependent Samples

3 3 The procedures for comparing two population means are based on the relationship between two sets of sample data, one sample from each population. When dependent samples are involved, the data are thought of as “paired data.” The data may be paired as a result of being obtained from “before” and “after” studies or from matching two subjects with similar traits to form “matched pairs.”

4 4 Inferences Concerning the Mean Difference Using Two Dependent Samples The pairs of data values are compared directly to each other by using the difference in their numerical values. The resulting difference is called a paired difference. Using paired data this way has a built-in ability to remove the effect of otherwise uncontrolled factors.

5 5 Inferences Concerning the Mean Difference Using Two Dependent Samples The wearing ability of a tire is greatly affected by a multitude of factors: the size, weight, age, and condition of the car; the driving habits of the driver; the number of miles driven; the condition and types of roads driven on; the quality of the material used to make the tire; and so on. We create paired data by mounting one tire from each brand on the same car. Since one tire of each brand will be tested under the same conditions, using the same car, same driver, and so on, the extraneous causes of wear are neutralized.

6 6 Procedures and Assumptions for Inferences Involving Paired Data

7 7 Procedures and Assumptions for Inferences Involving Paired Data A test was conducted to compare the wearing quality of the tires produced by two tire companies. All the aforementioned factors had an equal effect on both brands of tires, car by car. One tire of each brand was placed on each of six test cars. The position (left or right side, front or back) was determined with the aid of a random-number table.

8 8 Procedures and Assumptions for Inferences Involving Paired Data Table 10.1 lists the amounts of wear (in thousandths of an inch) that resulted from the test. Since the various cars, drivers, and conditions were the same for each tire of a paired set of data, it makes sense to use a third variable, the paired difference d. Table 10.1: Amount of Tire Wear [TA10-01]

9 9 Procedures and Assumptions for Inferences Involving Paired Data Our two dependent samples of data may be combined into one set of d values, where d = B – A. The difference between the two population means, when dependent samples are used (often called dependent means), is equivalent to the mean of the paired differences.
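
The equivalence stated on this slide can be checked numerically. The sketch below uses made-up wear readings (they are not the values from Table 10.1, which is not reproduced in this transcript), and the names brand_a and brand_b are hypothetical.

```python
from statistics import mean

# Hypothetical wear readings for four cars (NOT the Table 10.1 data).
brand_a = [10, 20, 30, 40]
brand_b = [12, 25, 29, 47]

# One paired difference per car, using the slide's convention d = B - A.
d = [b - a for a, b in zip(brand_a, brand_b)]

mean_of_differences = mean(d)
difference_of_means = mean(brand_b) - mean(brand_a)

print(d)                    # the single set of d values
print(mean_of_differences)  # mean of the paired differences
print(difference_of_means)  # difference between the two sample means
```

Both printed means agree, which is why an inference about the difference of two dependent means can be carried out on the single set of d values.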

10 10 Procedures and Assumptions for Inferences Involving Paired Data Therefore, when an inference is to be made about the difference of two means and paired differences are used, the inference will in fact be about the mean of the paired differences. The sample mean of the paired differences, $\bar{d}$, will be used as the point estimate for these inferences. In order to make inferences about the mean of all possible paired differences $\mu_d$, we need to know the sampling distribution of $\bar{d}$.

11 11 Procedures and Assumptions for Inferences Involving Paired Data When paired observations are randomly selected from normal populations, the paired difference, $d = x_1 - x_2$, will be approximately normally distributed about a mean $\mu_d$ with a standard deviation of $\sigma_d$. This is another situation in which the t-test for one mean is applied; namely, we wish to make inferences about an unknown mean ($\mu_d$) where the random variable (d) involved has an approximately normal distribution with an unknown standard deviation ($\sigma_d$).

12 12 Procedures and Assumptions for Inferences Involving Paired Data Inferences about the mean of all possible paired differences $\mu_d$ are based on samples of n dependent pairs of data and the t-distribution with n – 1 degrees of freedom (df), under the following assumption: Assumption for inferences about the mean of paired differences $\mu_d$: The paired data are randomly selected from normally distributed populations.

13 13 Confidence Interval Procedure

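14 14 Confidence Interval Procedure The $1 - \alpha$ confidence interval for estimating the mean difference $\mu_d$ is found using this formula: $\bar{d} - t(\mathrm{df}, \alpha/2) \cdot \frac{s_d}{\sqrt{n}}$ to $\bar{d} + t(\mathrm{df}, \alpha/2) \cdot \frac{s_d}{\sqrt{n}}$, where $\mathrm{df} = n - 1$ (10.2) where $\bar{d}$ is the mean of the sample differences: $\bar{d} = \frac{\sum d}{n}$ (10.3)

15 15 Confidence Interval Procedure and $s_d$ is the standard deviation of the sample differences: $s_d = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}}$ (10.4)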
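
As a rough illustration of how the mean and standard deviation of the sample differences might be computed, here is a minimal sketch. It assumes the usual computational (shortcut) form of the sample standard deviation; the function name paired_summary and the input values are hypothetical.

```python
import math
from statistics import stdev

def paired_summary(differences):
    """Return (n, d_bar, s_d) for a list of paired differences."""
    n = len(differences)
    sum_d = sum(differences)
    sum_d_squared = sum(x * x for x in differences)
    d_bar = sum_d / n                                   # mean of the differences
    s_d = math.sqrt((sum_d_squared - sum_d ** 2 / n) / (n - 1))  # shortcut form
    return n, d_bar, s_d

# Hypothetical differences, chosen only to keep the arithmetic easy to follow.
n, d_bar, s_d = paired_summary([2, 5, -1, 7])
print(n, d_bar, s_d)

# Cross-check against the standard library's sample standard deviation.
print(stdev([2, 5, -1, 7]))
```

The cross-check with statistics.stdev is there because the shortcut form and the definitional form are algebraically the same quantity.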

16 16 Example 4 – Constructing a Confidence Interval for $\mu_d$ Construct the 95% confidence interval for the mean difference in the paired data on tire wear, as reported in Table 10.1. The sample information is n = 6 pieces of paired data, $\bar{d} = 6.3$, and $s_d = 5.1$. Assume the amounts of wear are approximately normally distributed for both brands of tires. Table 10.1: Amount of Tire Wear [TA10-01]

17 17 Example 4 – Solution Step 1 Parameter of interest: $\mu_d$, the mean difference in the amounts of wear between the two brands of tires Step 2 a. Assumptions: Both sampled populations are approximately normal. b. Probability distribution: The t-distribution with df = 6 – 1 = 5 and formula (10.2) will be used. c. Level of confidence: $1 - \alpha = 0.95$.

18 18 Example 4 – Solution Step 3 Sample information: n = 6, $\bar{d} = 6.3$, and $s_d = 5.1$ The mean: $\bar{d} = \frac{\sum d}{n} = 6.3$ The standard deviation: $s_d = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}} = 5.1$

19 19 Example 4 – Solution Step 4 a. Confidence coefficient: This is a two-tailed situation with $\alpha/2 = 0.025$ in one tail. From Table 6 in Appendix B, $t(\mathrm{df}, \alpha/2) = t(5, 0.025) = 2.57$. b. Maximum error of estimate: Using the maximum error part of formula (10.2), we have $E = t(\mathrm{df}, \alpha/2) \cdot \frac{s_d}{\sqrt{n}} = 2.57 \cdot \frac{5.1}{\sqrt{6}} \approx 5.4$

20 20 Example 4 – Solution c. Lower/upper confidence limits: $\bar{d} \pm E = 6.3 \pm 5.4$, that is, 6.3 – 5.4 = 0.9 to 6.3 + 5.4 = 11.7

21 21 Example 4 – Solution Step 5 a. Confidence interval: 0.9 to 11.7 is the 95% confidence interval for $\mu_d$. b. That is, with 95% confidence we can say that the mean difference in the amounts of wear is between 0.9 and 11.7 thousandths of an inch. Or, in other words, the population mean tire wear for brand B is between 0.9 and 11.7 thousandths of an inch greater than the population mean tire wear for brand A.
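
The arithmetic of Example 4 can be reproduced in a few lines. This is only a sketch: the function name is hypothetical, and the critical value 2.57 is taken from the example (Table 6) rather than computed.

```python
import math

def paired_confidence_interval(n, d_bar, s_d, t_crit):
    """Confidence limits d_bar -/+ t_crit * s_d / sqrt(n)."""
    max_error = t_crit * s_d / math.sqrt(n)
    return d_bar - max_error, d_bar + max_error

# Summary statistics and critical value as given in Example 4.
lower, upper = paired_confidence_interval(n=6, d_bar=6.3, s_d=5.1, t_crit=2.57)
print(round(lower, 1), round(upper, 1))
```

Passing the critical value in as an argument keeps the sketch independent of any particular table or statistics library.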

22 22 Confidence Interval Procedure Note: This confidence interval is quite wide, in part because of the small sample size. We know from the central limit theorem that as the sample size increases, the standard error (estimated by $s_d/\sqrt{n}$) decreases.
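
To make the note concrete, the sketch below evaluates the estimated standard error for a few sample sizes. It assumes, purely for illustration, that the sample standard deviation stays at 5.1 as n grows; the larger sample sizes are hypothetical.

```python
import math

def standard_error(s_d, n):
    """Estimated standard error of the mean paired difference."""
    return s_d / math.sqrt(n)

s_d = 5.1  # value from Example 4; held fixed here as a simplifying assumption
for n in (6, 24, 96):
    print(n, round(standard_error(s_d, n), 3))
```

Each fourfold increase in n halves the standard error, so the interval narrows only with the square root of the sample size.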

23 23 Hypothesis-Testing Procedure

24 24 Hypothesis-Testing Procedure When we test a null hypothesis about the mean difference, the test statistic used will be the difference between the sample mean and the hypothesized value of $\mu_d$, divided by the estimated standard error. This statistic is assumed to have a t-distribution when the null hypothesis is true and the assumptions for the test are satisfied.

25 25 Hypothesis-Testing Procedure The value of the test statistic is calculated as follows: $t^\star = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$, where $\mathrm{df} = n - 1$ (10.5) Note: A hypothesized mean difference, $\mu_d$, can be any specified value. The most common value specified is zero; however, the difference can be nonzero.
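
A minimal sketch of the test statistic described here, written so that the hypothesized mean difference can be nonzero. The function name and the sample values in the call are hypothetical.

```python
import math

def paired_t_statistic(d_bar, s_d, n, mu_d0=0.0):
    """(sample mean - hypothesized mean) / estimated standard error."""
    return (d_bar - mu_d0) / (s_d / math.sqrt(n))

# Hypothetical case: testing against a hypothesized mean difference of 4.
t_star = paired_t_statistic(d_bar=5.0, s_d=2.0, n=16, mu_d0=4.0)
print(t_star)  # compare against the t-distribution with n - 1 = 15 df
```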

26 26 Example 5 – One-Tailed Hypothesis Test for $\mu_d$ In a study on high blood pressure and the drugs used to control it, the effect of calcium channel blockers on pulse rate was one of many specific concerns. Twenty-six patients were randomly selected from a large pool of potential subjects, and their pulse rates were recorded. A calcium channel blocker was administered to each patient for a fixed period of time, and then each patient’s pulse rate was again determined.

27 27 Example 5 – One-Tailed Hypothesis Test for $\mu_d$ The two resulting sets of data appeared to have approximately normal distributions, and the statistics were $\bar{d} = 1.07$ and $s_d = 1.74$ (d = before – after). Does the sample information provide sufficient evidence to show that the pulse rate is lower after the medication is taken? Use $\alpha = 0.05$.

28 28 Example 5 – Solution Step 1 a. Parameter of interest: $\mu_d$, the mean difference (reduction) in pulse rate from before to after using the calcium channel blocker for the time period of the test b. Statement of hypotheses: $H_0$: $\mu_d = 0$ ($\le$) (did not lower rate) Remember: d = before – after $H_a$: $\mu_d > 0$ (did lower rate)

29 29 Example 5 – Solution Step 2 a. Assumptions: Since the data in both sets are approximately normal, it seems reasonable to assume that the two populations are approximately normally distributed. b. Test statistic: The t-distribution with df = n – 1 = 25, and the test statistic is $t^\star$ from formula (10.5). c. Level of significance: $\alpha = 0.05$.

30 30 Example 5 – Solution Step 3 a. Sample information: n = 26, $\bar{d} = 1.07$, and $s_d = 1.74$ b. Calculated test statistic: $t^\star = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{1.07 - 0}{1.74/\sqrt{26}} \approx 3.14$

31 31 Example 5 – Solution Step 4 The Probability Distribution: p-Value: a. Use the right-hand tail because $H_a$ expresses concern for values related to “greater than.” P = P(t > 3.14, with df = 25), as shown in the figure.

32 32 Example 5 – Solution To find the p-value, you have three options: 1. Use Table 6 (Appendix B): P < 0.005. 2. Use Table 7 (Appendix B) to read the value directly: P = 0.002. 3. Use a computer or calculator to find the p-value: P = 0.0022. b. The p-value is smaller than the level of significance, $\alpha$.

33 33 Example 5 – Solution Classical: a. The critical region is the right-hand tail because $H_a$ expresses concern for values related to “greater than.” The critical value is obtained from Table 6: t(25, 0.05) = 1.71. b. $t^\star$ is in the critical region, as shown in red in the figure.

34 34 Example 5 – Solution Step 5 a. Decision: Reject $H_0$. b. Conclusion: At the 0.05 level of significance, we can conclude that the average pulse rate is lower after the administration of the calcium channel blocker.
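
The "computer or calculator" option can be sketched without a statistics library by numerically integrating the Student t density. This is one possible approach under stated assumptions, not the method used in the example: the helper names t_pdf and t_upper_tail are hypothetical, and Simpson's rule stands in for whatever routine a calculator or package would use.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t_value, df, steps=2000):
    """P(T > t_value) for t_value >= 0, via Simpson's rule on [0, t_value]."""
    h = t_value / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t_value, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # By symmetry, half of the probability lies above zero.
    return 0.5 - total * h / 3

# Example 5: n = 26, mean difference 1.07, standard deviation 1.74, one-tailed.
t_star = (1.07 - 0) / (1.74 / math.sqrt(26))
p_value = t_upper_tail(t_star, df=25)
print(round(t_star, 2), round(p_value, 4))
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

Where a statistics package is available, its t-distribution survival function would normally replace the hand-rolled integration.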

35 35 Hypothesis-Testing Procedure Statistical significance does not always have the same meaning when the “practical” application of the results is considered. In the preceding detailed hypothesis test, the results showed a statistical significance with a p-value of 0.002—that is, 2 chances in 1000. However, a more practical question might be: “Is lowering the pulse rate by this small average amount, estimated to be 1.07 beats per minute, worth the risks of possible side effects of this medication?” Actually, the whole issue is much broader than just this one issue of pulse rate.

36 36 Example 6 – Two-Tailed Hypothesis Test for $\mu_d$ Suppose the sample data in Table 10.1 were collected with the hope of showing that the two tire brands do not wear equally. Do the data provide sufficient evidence for us to conclude that the two brands show unequal wear, at the 0.05 level of significance? Assume the amounts of wear are approximately normally distributed for both brands of tires. Table 10.1: Amount of Tire Wear [TA10-01]

37 37 Example 6 – Solution Step 1 a. Parameter of interest: $\mu_d$, the mean difference in the amounts of wear between the two brands of tires b. Statement of hypotheses: $H_0$: $\mu_d = 0$ (no difference) Remember: d = B – A $H_a$: $\mu_d \ne 0$ (difference) Step 2 a. Assumptions: The assumption of normality is included in the statement of this problem.

38 38 Example 6 – Solution b. Test statistic: The t-distribution with df = n – 1 = 6 – 1 = 5 and $t^\star = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$ c. Level of significance: $\alpha = 0.05$. Step 3 a. Sample information: n = 6, $\bar{d} = 6.3$, and $s_d = 5.1$ b. Calculated test statistic: $t^\star = \frac{6.3 - 0}{5.1/\sqrt{6}} \approx 3.03$

39 39 Example 6 – Solution Step 4 The Probability Distribution: p-Value: a. Use both tails because $H_a$ expresses concern for values related to “different from.” P = p-value = P(t < –3.03) + P(t > 3.03) = 2 × P(t > 3.03), as shown in the figure.

40 40 Example 6 – Solution To find the p-value, you have three options: 1. Use Table 6 (Appendix B): 0.02 < P < 0.05. 2. Use Table 7 (Appendix B) to place bounds on the p-value: 0.026 < P < 0.030. 3. Use a computer or calculator to find the p-value: P = 2 × 0.0145 = 0.029 b. The p-value is smaller than $\alpha$.

41 41 Example 6 – Solution Classical: a. The critical region is two-tailed because $H_a$ expresses concern for values related to “different from.” The critical values are obtained from Table 6: ±t(5, 0.025) = ±2.57. b. $t^\star$ is in the critical region, as shown in red in the figure.

42 42 Example 6 – Solution Step 5 a. Decision: Reject $H_0$. b. Conclusion: There is a significant mean difference in the amounts of wear at the 0.05 level of significance.
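
Example 6 can be traced end to end from its summary statistics. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the tail probability is obtained by Simpson's rule rather than a table, and the critical value 2.57 is the one quoted from Table 6.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t_value, df, steps=2000):
    """2 * P(T > |t_value|), integrating the density with Simpson's rule."""
    x = abs(t_value)
    h = x / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * (0.5 - total * h / 3)

n, d_bar, s_d, alpha = 6, 6.3, 5.1, 0.05   # values stated in Example 6
t_star = (d_bar - 0) / (s_d / math.sqrt(n))
p_value = two_tailed_p(t_star, df=n - 1)

# p-value approach and classical approach should lead to the same decision.
reject_by_p = p_value < alpha
reject_by_critical = abs(t_star) > 2.57   # critical value quoted from Table 6
print(round(t_star, 2), round(p_value, 3), reject_by_p, reject_by_critical)
```

Taking the absolute value of the statistic is what makes the check two-tailed: a large difference in either direction counts as evidence of unequal wear.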

