Published by Wesley Chapman. Modified over 4 years ago.

1
Chapter 19: Two-Sample Problems STAT 1450

2
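Connecting Chapter 18 to our Current Knowledge of Statistics
▸ Remember that these formulas are only valid when the appropriate simple conditions apply!
Population Parameter | Point Estimate | Confidence Interval | Test Statistic
$\mu$ ($\sigma$ known) | $\bar{x}$ | $\bar{x} \pm z^* \, \sigma/\sqrt{n}$ | $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
$\mu$ ($\sigma$ unknown) | $\bar{x}$ | $\bar{x} \pm t^* \, s/\sqrt{n}$ | $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
19.0 Two-Sample Problems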

3
Connecting Chapter 19 to our Current Knowledge of Statistics
▸ Matched pairs were covered at the end of Chapter 18. A common situation requiring matched pairs is when before-and-after measurements are taken on individual subjects.
▸ Example: Prices for a random sample of tickets to a 2008 Katy Perry concert were compared with the ticket prices (for the same seats) to her 2013 concert. The data could be consolidated into one column of differences in ticket prices. A test of significance or a confidence interval would then be carried out on that single sample of differences.
19.0 Two-Sample Problems
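
As a rough illustration of why matched pairs reduce to a one-sample problem, the sketch below collapses two columns of paired prices into one column of differences and computes a one-sample t statistic on it. The ticket prices are invented purely for illustration (the slide does not give the data), and all variable names are hypothetical.

```python
import math
import statistics

# Hypothetical prices for the same five seats (NOT the real concert data).
prices_2008 = [40, 55, 62, 75, 90]
prices_2013 = [48, 60, 70, 85, 97]

# Matched pairs: one difference per seat, so two columns become one sample.
diffs = [after - before for before, after in zip(prices_2008, prices_2013)]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)      # sample standard deviation of the differences
se_diff = sd_diff / math.sqrt(n)       # one-sample standard error
t_stat = (mean_diff - 0) / se_diff     # H0: mean difference = 0
df = n - 1

print(f"differences = {diffs}")
print(f"mean = {mean_diff:.2f}, t = {t_stat:.2f}, df = {df}")
```

Because each seat contributes exactly one difference, the Chapter 18 one-sample procedures apply directly to `diffs`.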

4
The Two-Sample Problems ▸ Two-sample problems require us to compare: the response to two treatments - or - the characteristics of two populations. ▸ We have a separate sample from each treatment or population. 19.1 The Two-Sample Problem

5
Two-Sample Problems
▸ The end of Chapter 18 described inference procedures for the mean difference in two measurements on one group of subjects (e.g., pulse rates for 12 students before and after listening to music).
▸ Because two-sample problems involve separate samples that are likely to have different sample sizes, variances, etc., Chapter 19 focuses on the difference in means for 2 different groups.
Population Parameter | Point Estimate | Confidence Interval | Test Statistic
19.1 The Two-Sample Problem

6
Sampling Distribution of Two Sample Means 19.2 Comparing Two Population Means

7
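Sampling Distribution of Two Sample Means
▸ The following table stems from the above comment on standard error and statistical theory.
Variable | Parameter | Point Estimate | Population Standard Deviation | Standard Error
$x_1$ | $\mu_1$ | $\bar{x}_1$ | $\sigma_1$ | $s_1/\sqrt{n_1}$
$x_2$ | $\mu_2$ | $\bar{x}_2$ | $\sigma_2$ | $s_2/\sqrt{n_2}$
Diff = $\bar{x}_1 - \bar{x}_2$ | $\mu_1 - \mu_2$ | $\bar{x}_1 - \bar{x}_2$ | $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ | $\sqrt{s_1^2/n_1 + s_2^2/n_2}$
19.2 Comparing Two Population Means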

8
Example: SSHA Scores
▸ The Survey of Study Habits and Attitudes (SSHA) is a psychological test designed to measure various academic behaviors (motivation, study habits, attitudes, etc.) of college students. Scores on the SSHA range from 0 to 200. The data for random samples of 17 women (**the outlier from the original data set was removed**) and 20 men yielded the following summary statistics.
▸ Is there a difference in SSHA performance based upon gender?
19.2 Comparing Two Population Means

10
Example: SSHA Scores
▸ Summary statistics for the two groups are below:
Group | Sample Mean | Sample Standard Deviation | Sample Size
Women** | 139.588 | 20.363 | 17
Men | 122.5 | 32.132 | 20
There is a difference between the two sample means: the women’s average was about 17 points higher than the men’s average. Yet, the standard deviations are larger than this sample difference, and the sample sizes are about the same.
19.2 Comparing Two Population Means
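
To make the comparison concrete, the point estimate and its standard error can be computed directly from the summary statistics. This is a minimal sketch that assumes the unpooled standard error $\sqrt{s_1^2/n_1 + s_2^2/n_2}$; the variable names are illustrative.

```python
import math

# Summary statistics from the slide (women: outlier removed).
mean_w, sd_w, n_w = 139.588, 20.363, 17
mean_m, sd_m, n_m = 122.5, 32.132, 20

# Point estimate of mu_W - mu_M and its (unpooled) standard error.
diff = mean_w - mean_m
se_diff = math.sqrt(sd_w**2 / n_w + sd_m**2 / n_m)

print(f"difference = {diff:.3f}, standard error = {se_diff:.3f}")
```

The difference of about 17 points is roughly two standard errors from zero, which is why a formal procedure is needed before drawing a conclusion.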

13
The Two-sample t Procedures: Derived
▸ Now that we have a point estimate and a formula for the standard error, we can determine the confidence interval for the difference in two population means.
Chapter | Parameter of Interest | Point Estimate | Standard Error | Confidence Interval
18 | $\mu$ ($\sigma$ unknown; 1-sample) | $\bar{x}$ | $s/\sqrt{n}$ | $\bar{x} \pm t^* \, s/\sqrt{n}$
19 | $\mu_1 - \mu_2$ ($\sigma_1$, $\sigma_2$ unknown; 2-samples) | $\bar{x}_1 - \bar{x}_2$ | $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ | pt. estimate ± t*(standard error)
19.3 Two-Sample t Procedures

14
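The Two-sample t Procedures: Derived
▸ Now that we have a point estimate and a formula for the standard error, we can determine the confidence interval for the difference in two population means.
Chapter | Parameter of Interest | Point Estimate | Standard Error | Confidence Interval
18 | $\mu$ ($\sigma$ unknown; 1-sample) | $\bar{x}$ | $s/\sqrt{n}$ | $\bar{x} \pm t^* \, s/\sqrt{n}$
19 | $\mu_1 - \mu_2$ ($\sigma_1$, $\sigma_2$ unknown; 2-samples) | $\bar{x}_1 - \bar{x}_2$ | $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ | $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$
19.3 Two-Sample t Procedures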

15
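The Two-sample t Procedures: Derived
Chapter | Parameter of Interest | Point Estimate | Standard Error | Test Statistic
18 | $\mu$ ($\sigma$ unknown; 1-sample) | $\bar{x}$ | $s/\sqrt{n}$ | $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
19 | $\mu_1 - \mu_2$ ($\sigma_1$, $\sigma_2$ unknown; 2-samples) | $\bar{x}_1 - \bar{x}_2$ | $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ | (pt. estimate − 0) / standard error
Note: $H_0$ for our purposes will be that $\mu_1 = \mu_2$, which is equivalent to there being a mean difference of ‘0.’
19.3 Two-Sample t Procedures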

16
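The Two-sample t Procedures: Derived
Chapter | Parameter of Interest | Point Estimate | Standard Error | Test Statistic
18 | $\mu$ ($\sigma$ unknown; 1-sample) | $\bar{x}$ | $s/\sqrt{n}$ | $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
19 | $\mu_1 - \mu_2$ ($\sigma_1$, $\sigma_2$ unknown; 2-samples) | $\bar{x}_1 - \bar{x}_2$ | $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ | $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
Note: $H_0$ for our purposes will be that $\mu_1 = \mu_2$, which is equivalent to there being a mean difference of ‘0.’
19.3 Two-Sample t Procedures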

17
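The Two-sample t Procedures
▸ Now we can complete the table from earlier:
Population Parameter | Point Estimate | Confidence Interval | Test Statistic
$\mu_1 - \mu_2$ | $\bar{x}_1 - \bar{x}_2$ | $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{s_1^2/n_1 + s_2^2/n_2}$ | $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - 0}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
$t^*$ is the critical value for confidence level C for the t distribution with df = smaller of $(n_1 - 1)$ and $(n_2 - 1)$.
Find P-values from the t distribution with df = smaller of $(n_1 - 1)$ and $(n_2 - 1)$.
19.3 Two-Sample t Procedures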

20
The Two-sample t Procedures: Confidence Intervals 19.3 Two-Sample t Procedures

21
The Two-sample t Procedures: Significance Tests 19.3 Two-Sample t Procedures

22
Conditions for Inference Comparing Two-Sample Means and Robustness of t Procedures
▸ The general structure of our necessary conditions is an extension of the one-sample cases.
Simple Random Samples: Do we have 2 simple random samples?
Population : Sample Ratio: The samples must be independent and from two large populations of interest.
19.0 Two-Sample Problems

23
Conditions for Inference Comparing Two-Sample Means and Robustness of t Procedures
Large enough sample: Both populations will be assumed to be from a Normal distribution, and
▸ when the sum of the sample sizes is less than 15, t procedures can be used if the data are close to Normal (roughly symmetric, single peak, no outliers). If there is clear skewness or outliers, do not use t.
▸ when the sum of the sample sizes is between 15 and 40, t procedures can be used except in the presence of outliers or strong skewness.
▸ when the sum of the sample sizes is at least 40, the t procedures can be used even for clearly skewed distributions.
19.0 Two-Sample Problems
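
The three sample-size tiers can be written as a small decision helper. This is only a sketch of the rule of thumb as stated on the slide; the function name and its boolean inputs are hypothetical, and judging "close to Normal" or "strong skewness" still requires looking at the data.

```python
def t_procedures_ok(n_total, close_to_normal, outliers, strong_skew):
    """Rule of thumb from the slide; n_total is the sum of the two sample sizes."""
    if n_total < 15:
        # Small samples: the data must look close to Normal, with no outliers.
        return close_to_normal and not outliers
    if n_total < 40:
        # Moderate samples: fine unless there are outliers or strong skewness.
        return not (outliers or strong_skew)
    # Large samples: usable even for clearly skewed distributions.
    return True


# SSHA example: 17 + 20 = 37 observations, outlier removed, no skewness.
print(t_procedures_ok(17 + 20, close_to_normal=True, outliers=False, strong_skew=False))
```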

24
Conditions for Inference Comparing Two-Sample Means and Robustness of t Procedures ▸ Note: In practice it is enough that the two distributions have similar shape with no strong outliers. The two-sample t procedures are even more robust against non-Normality than the one-sample procedures. 19.0 Two-Sample Problems

25
Example: SSHA Scores
▸ The summary statistics for the SSHA scores for random samples of men and women are below. There was neither significant skewness nor strong outliers in either data set. Use this information to construct a 90% confidence interval for the mean difference.
Group | Sample Mean | Sample Standard Deviation | Sample Size
Women | 139.588 | 20.363 | 17
Men | 122.5 | 32.132 | 20
19.3 Two-Sample t Procedures

26
Example: 90% CI for SSHA Scores
Steps for Success: Constructing Confidence Intervals for $\mu_1 - \mu_2$.
1. Confirm that the 3 key conditions are satisfied (SRS?, N:n?, t-distribution?).
Components
▸ Do we have two simple random samples? Yes. It was stated.
▸ Large enough population : sample ratio? Yes. $N_W > 20 \times 17 = 340$ and $N_M > 20 \times 20 = 400$ (independence).
▸ Large enough sample? Yes. $n_W + n_M = 37 < 40$, but the outlier has been removed and there is no skewness.
19.3 Two-Sample t Procedures
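
Once the conditions are confirmed, the interval is the point estimate plus or minus $t^*$ times the standard error. The sketch below assumes the conservative degrees of freedom (smaller of $n_1 - 1$ and $n_2 - 1$) and hard-codes the t-table critical value 1.746 for df = 16 at 90% confidence; the variable names are illustrative.

```python
import math

mean_w, sd_w, n_w = 139.588, 20.363, 17   # women (outlier removed)
mean_m, sd_m, n_m = 122.5, 32.132, 20     # men

diff = mean_w - mean_m
se = math.sqrt(sd_w**2 / n_w + sd_m**2 / n_m)

df = min(n_w - 1, n_m - 1)   # conservative df: smaller of (n1 - 1) and (n2 - 1)
t_star = 1.746               # t-table critical value for 90% confidence, df = 16

margin = t_star * se
lower, upper = diff - margin, diff + margin
print(f"df = {df}, 90% CI for mu_W - mu_M: ({lower:.2f}, {upper:.2f})")
```

With these inputs the interval runs from roughly 1.9 to 32.3 points, so it does not contain 0.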

27
Example: 90% CI for SSHA Scores 19.3 Two-Sample t Procedures

37
Example: SSHA Scores
▸ Let’s continue with this example by now conducting a test of significance for the mean difference in SSHA by gender at $\alpha = 0.10$.
Group | Sample Mean | Sample Standard Deviation | Sample Size
Women | 139.588 | 20.363 | 17
Men | 122.5 | 32.132 | 20
19.3 Two-Sample t Procedures

38
Example: SSHA Scores 19.3 Two-Sample t Procedures

43
Example: SSHA Scores Plan: f) Sketch the region(s) of “extremely unlikely” test statistics. 19.3 Two-Sample t Procedures

45
Example: SSHA Scores
Solve: a) Check the conditions for the test you plan to use.
▸ Two Simple Random Samples? Yes. Stated as random samples.
▸ Large enough population : sample ratios? Yes. Both populations are arbitrarily large, much greater than the required sizes: $N_W > 20 \times 17 = 340$; $N_M > 20 \times 20 = 400$.
▸ Large enough samples? Yes. $n_W + n_M = 37 < 40$, but the outlier has been removed and there is no skewness.
19.3 Two-Sample t Procedures

48
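Example: SSHA Scores
Solve: b) Calculate the test statistic.
$t = \dfrac{(139.588 - 122.5) - 0}{\sqrt{20.363^2/17 + 32.132^2/20}} \approx 1.96$
c) Determine (or approximate) the P-value.
df = 17 − 1 = 16. From the t table, $1.746 < 1.96 < 2.120$, so $0.05 <$ P-value $< 0.10$.
19.3 Two-Sample t Procedures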
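
The table bracketing can be checked numerically. The sketch below recomputes the test statistic and then approximates the two-sided P-value by integrating the t density with Simpson's rule, using only the standard library. The helper names `t_pdf` and `two_sided_p_value` are hypothetical, and the conservative df = 16 is assumed.

```python
import math

def t_pdf(x, df):
    """Density of the t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t, df, steps=2000):
    """Two-sided P-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    upper = abs(t)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total     # density is symmetric about 0
    return 1 - central_area

mean_w, sd_w, n_w = 139.588, 20.363, 17
mean_m, sd_m, n_m = 122.5, 32.132, 20

se = math.sqrt(sd_w**2 / n_w + sd_m**2 / n_m)
t_stat = ((mean_w - mean_m) - 0) / se
df = min(n_w - 1, n_m - 1)
p_value = two_sided_p_value(t_stat, df)

print(f"t = {t_stat:.2f}, df = {df}, two-sided P-value = {p_value:.3f}")
```

The computed P-value falls between 0.05 and 0.10, in agreement with the table lookup.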

51
Example: SSHA Scores
Conclude: a) Make a decision about the null hypothesis (Reject $H_0$ or Fail to reject $H_0$). Because the approximate P-value is smaller than 0.10, we reject the null hypothesis.
b) Interpret the decision in the context of the original claim. There is enough evidence (at $\alpha = 0.10$) that there is a difference in the mean SSHA score between men and women.
19.3 Two-Sample t Procedures

53
▸ JMP
1. Enter the quantitative data into one of the columns. In the next column, enter an abridged description of the categorical variable associated with each row of quantitative data. (Note: Pay attention to the spelling and capitalization of the abridged descriptions.)
2. Analyze → Fit Y by X.
3. “Click-and-Drag” (the quantitative variable) into the ‘Y, Response’ box. “Click-and-Drag” (the categorical variable) into the ‘X, Factor’ box. Click on OK.
4. Click on the red upside-down triangle next to the title “Oneway Analysis of …” Proceed to ‘Means and Std Dev.’
5. Click on the red upside-down triangle next to the title “Oneway Analysis of …” Proceed to ‘t Test.’
19.3 Two-Sample t Procedures

56
SSHA Scores (via Technology)
▸ Use technology to compute a 98% confidence interval for the mean difference in SSHA scores between women and men.
▸ Use technology to conduct the test of significance for the mean difference in SSHA scores at $\alpha = 0.02$.
19.3 Two-Sample t Procedures

57
98% confidence interval for the mean difference in SSHA scores between women & men. 19.3 Two-Sample t Procedures

59
Test of Significance for the mean difference in SSHA scores between women & men. 19.3 Two-Sample t Procedures

61
Example: SSHA Scores ▸ Technology output for Two Sample Means: 19.3 Two-Sample t Procedures

62
Closing Caveats and Comments
▸ The two-sample t statistic has an approximate (but accurate) t distribution. The approximating distribution has an elaborate degrees-of-freedom computation (p.480). Computers use this formula in determining degrees of freedom.
▸ We will use Option 2 (p.470). This has df = smaller of $(n_1 - 1)$ and $(n_2 - 1)$.
▸ Because of this, output in JMP (or other software packages) might have different df and P-values from manual analyses.
19.3 Two-Sample t Procedures
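
To see how far the two choices of df can differ, the sketch below compares the conservative df with the Welch–Satterthwaite approximation. It assumes that this approximation is the "elaborate" computation the slide refers to (the slide only cites a page number); the function names are hypothetical.

```python
def conservative_df(n1, n2):
    """Option used in these notes: smaller of (n1 - 1) and (n2 - 1)."""
    return min(n1 - 1, n2 - 1)

def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (assumed software formula)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

# SSHA summary statistics: women (s = 20.363, n = 17), men (s = 32.132, n = 20).
print("conservative df:", conservative_df(17, 20))
print("software-style df:", round(welch_df(20.363, 17, 32.132, 20), 2))
```

The conservative df is never larger than the approximate df, so hand calculations give somewhat wider intervals and larger P-values than software output.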

63
Closing Caveats and Comments
▸ We will not use “pooled” two-sample procedures. These assume that the two populations have the same (unknown) variance. Use of our “Option 1” for two-sample t procedures yields more accurate results than the “pooled t.” The only caveat is when the sample sizes are equal; then our t statistic and the “pooled t” statistic would be equal.
▸ Do not use two-sample t procedures for inference regarding standard deviations. The F-test is more appropriate in those cases.
19.3 Two-Sample t Procedures
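
The equal-sample-size remark can be checked numerically. The sketch below compares the unpooled standard error used in this chapter with the pooled one; the function names are hypothetical, and the equal-size case reuses the SSHA standard deviations with an assumed common sample size of 20.

```python
import math

def unpooled_se(s1, n1, s2, n2):
    """Standard error used in this chapter: sqrt(s1^2/n1 + s2^2/n2)."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def pooled_se(s1, n1, s2, n2):
    """Pooled standard error, which assumes equal population variances."""
    # The pooled variance weights each sample variance by its degrees of freedom.
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2 * (1 / n1 + 1 / n2))

# SSHA summary statistics (unequal sample sizes): the two standard errors differ.
se_unpooled = unpooled_se(20.363, 17, 32.132, 20)
se_pooled = pooled_se(20.363, 17, 32.132, 20)
print(f"unpooled = {se_unpooled:.3f}, pooled = {se_pooled:.3f}")

# Same standard deviations with equal (hypothetical) sample sizes: they agree.
same = math.isclose(unpooled_se(20.363, 20, 32.132, 20), pooled_se(20.363, 20, 32.132, 20))
print("equal when n1 == n2:", same)
```

Even when the two statistics coincide, the degrees of freedom attached to them can still differ between the procedures.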

64
Closing Caveats and Comments ▸ Practitioners prefer having equal sample sizes for the two groups when possible. ▸ Exercises for this chapter will all assume that the SRS is from a Normal distribution. 19.3 Two-Sample t Procedures

65
Five-Minute Summary ▸ List at least 3 concepts that had the most impact on your knowledge of two-sample problems. _________________________________________
