
1 Chapter 19: Two-Sample Problems STAT 1450

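2 Connecting Chapter 18 to our Current Knowledge of Statistics
▸ Remember that these formulas are only valid when appropriate simple conditions apply!
Population Parameter | Point Estimate | Confidence Interval | Test Statistic
$\mu$ ($\sigma$ known) | $\bar{x}$ | $\bar{x} \pm z^{*}\,\sigma/\sqrt{n}$ | $z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$
$\mu$ ($\sigma$ unknown) | $\bar{x}$ | $\bar{x} \pm t^{*}\,s/\sqrt{n}$ | $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$
19.0 Two-Sample Problems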

3 Connecting Chapter 19 to our Current Knowledge of Statistics
▸ Matched pairs were covered at the end of Chapter 18. A common situation requiring matched pairs is when before-and-after measurements are taken on individual subjects.
▸ Example: Prices for a random sample of tickets to a 2008 Katy Perry concert were compared with the ticket prices (for the same seats) to her 2013 concert.
• The data could be consolidated into 1 column of differences in ticket prices.
• A test of significance or a confidence interval would then be carried out on this “1 sample of data.”
19.0 Two-Sample Problems
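
The consolidation into one column of differences can be sketched in a few lines of Python. The ticket prices below are invented purely for illustration (the slides do not list the data), and the variable names are assumptions.

```python
from statistics import mean, stdev

# Hypothetical prices (in dollars) for the same five seats; not data from the slides.
prices_2008 = [45.0, 60.0, 75.0, 90.0, 120.0]
prices_2013 = [55.0, 72.0, 80.0, 110.0, 150.0]

# Matched pairs: subtract within each seat to get ONE column of differences.
differences = [after - before for before, after in zip(prices_2008, prices_2013)]

# From here on, the one-sample t procedures of Chapter 18 apply to `differences`.
n = len(differences)
d_bar = mean(differences)          # point estimate of the mean difference
s_d = stdev(differences)           # sample standard deviation of the differences
se = s_d / n ** 0.5                # one-sample standard error

print(differences)
print(f"mean difference = {d_bar:.2f}, standard error = {se:.2f}")
```

Because each seat contributes a single difference, there is only one sample here, which is what separates matched pairs from the two-sample setting of this chapter.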

4 The Two-Sample Problems
▸ Two-sample problems require us to compare:
• the response to two treatments, or
• the characteristics of two populations.
▸ We have a separate sample from each treatment or population.
19.1 The Two-Sample Problem

5 Two-Sample Problems
▸ The end of Chapter 18 described inference procedures for the mean difference in two measurements on one group of subjects (e.g., pulse rates for 12 students before and after listening to music).
▸ Given our answer from above, and the likelihood that the two samples have different sample sizes, variances, etc., Chapter 19 focuses on the difference in means for 2 different groups.
Population Parameter | Point Estimate | Confidence Interval | Test Statistic
19.1 The Two-Sample Problem

6 Sampling Distribution of Two Sample Means 19.2 Comparing Two Population Means

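7 Sampling Distribution of Two Sample Means
▸ The following table stems from the above comment on standard error and statistical theory.
Variable | Parameter | Point Estimate | Population Standard Deviation | Standard Error
$x_1$ | $\mu_1$ | $\bar{x}_1$ | $\sigma_1$ | $s_1/\sqrt{n_1}$
$x_2$ | $\mu_2$ | $\bar{x}_2$ | $\sigma_2$ | $s_2/\sqrt{n_2}$
Diff = $x_1 - x_2$ | $\mu_1 - \mu_2$ | $\bar{x}_1 - \bar{x}_2$ | $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$ | $\sqrt{s_1^2/n_1 + s_2^2/n_2}$
19.2 Comparing Two Population Means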
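
The standard error in the difference row of the table follows from a short derivation, sketched here on the assumption that the two samples are independent, so that the variances of the two sample means add:

```latex
\begin{align*}
\operatorname{Var}(\bar{x}_1 - \bar{x}_2)
  &= \operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x}_2)
   = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \\
\operatorname{SD}(\bar{x}_1 - \bar{x}_2)
  &= \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} \\
\operatorname{SE}(\bar{x}_1 - \bar{x}_2)
  &= \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
  \quad \text{(replace each unknown } \sigma_i \text{ with } s_i\text{)}
\end{align*}
```

Note that the variances add even though the means are subtracted; the standard errors themselves do not simply add.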

8 Example: SSHA Scores
▸ The Survey of Study Habits and Attitudes (SSHA) is a psychological test designed to measure various academic behaviors (motivation, study habits, attitudes, etc.) of college students. Scores on the SSHA range from 0 to 200. Random samples of 17 women (**the outlier from the original data set was removed**) and 20 men yielded the following summary statistics.
▸ Is there a difference in SSHA performance based upon gender?
19.2 Comparing Two Population Means

10 Example: SSHA Scores
▸ Summary statistics for the two groups are below:
Group | Sample Mean | Sample Standard Deviation | Sample Size
Women** | 139.588 | 20.363 | 17
Men | 122.5 | 32.132 | 20
• There is a difference in these two groups. The women’s average was about 17 points higher than the men’s average.
• Yet, the standard deviations are larger than this sample difference, and the sample sizes are about the same.
19.2 Comparing Two Population Means
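
As a rough check on these numbers, the point estimate and its standard error can be computed from the summary statistics alone. This is a minimal sketch; the function name `standard_error` is an assumption, not something defined in the slides.

```python
import math

# Summary statistics from the SSHA slide (women listed first).
mean_w, sd_w, n_w = 139.588, 20.363, 17
mean_m, sd_m, n_m = 122.5, 32.132, 20

def standard_error(s1, n1, s2, n2):
    """Standard error of the difference between two independent sample means."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

diff = mean_w - mean_m                      # point estimate of mu_W - mu_M
se = standard_error(sd_w, n_w, sd_m, n_m)   # sqrt(s_W^2/n_W + s_M^2/n_M)

print(f"point estimate = {diff:.3f}")
print(f"standard error = {se:.3f}")
```

The observed difference is roughly twice its standard error, which is why a formal procedure is needed before deciding whether the gap is more than sampling variability.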


13 The Two-sample t Procedures: Derived
▸ Now that we have a point estimate and a formula for the standard error, we can determine the confidence interval for the difference in two population means.
▸ General form: pt. estimate ± t*(standard error)
Chapter | Parameter of Interest | Point Estimate | Standard Error | Confidence Interval
18 | $\mu$ ($\sigma$ unknown; 1-sample) | $\bar{x}$ | $s/\sqrt{n}$ | $\bar{x} \pm t^{*}\,s/\sqrt{n}$
19 | $\mu_1 - \mu_2$ ($\sigma_1$, $\sigma_2$ unknown; 2-samples) | $\bar{x}_1 - \bar{x}_2$ | $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ | $(\bar{x}_1 - \bar{x}_2) \pm t^{*}\sqrt{s_1^2/n_1 + s_2^2/n_2}$
19.3 Two-Sample t Procedures

15 The Two-sample t Procedures: Derived
▸ General form: (pt. estimate − $\mu_0$) / (standard error)
Chapter | Parameter of Interest | Point Estimate | Standard Error | Test Statistic
18 | $\mu$ ($\sigma$ unknown; 1-sample) | $\bar{x}$ | $s/\sqrt{n}$ | $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$
19 | $\mu_1 - \mu_2$ ($\sigma_1$, $\sigma_2$ unknown; 2-samples) | $\bar{x}_1 - \bar{x}_2$ | $\sqrt{s_1^2/n_1 + s_2^2/n_2}$ | $t = (\bar{x}_1 - \bar{x}_2)/\sqrt{s_1^2/n_1 + s_2^2/n_2}$
Note: $H_0$ for our purposes will be that $\mu_1 = \mu_2$, which is equivalent to there being a mean difference of ‘0.’
19.3 Two-Sample t Procedures

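17 The Two-sample t Procedures
▸ Now we can complete the table from earlier:
Population Parameter | Point Estimate | Confidence Interval | Test Statistic
$\mu_1 - \mu_2$ | $\bar{x}_1 - \bar{x}_2$ | $(\bar{x}_1 - \bar{x}_2) \pm t^{*}\sqrt{s_1^2/n_1 + s_2^2/n_2}$ | $t = (\bar{x}_1 - \bar{x}_2)/\sqrt{s_1^2/n_1 + s_2^2/n_2}$
• $t^{*}$ is the critical value for confidence level C for the t distribution with df = smaller of $(n_1 - 1)$ and $(n_2 - 1)$.
• Find P-values from the t distribution with df = smaller of $(n_1 - 1)$ and $(n_2 - 1)$.
19.3 Two-Sample t Procedures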

20 The Two-sample t Procedures: Confidence Intervals 19.3 Two-Sample t Procedures

21 The Two-sample t Procedures: Significance Tests 19.3 Two-Sample t Procedures

22 Conditions for Inference Comparing Two-Sample Means and Robustness of t Procedures
▸ The general structure of our necessary conditions is an extension of the one-sample cases.
• Simple Random Samples: Do we have 2 simple random samples?
• Population : Sample Ratio: The samples must be independent and from two large populations of interest.
19.0 Two-Sample Problems

23 Conditions for Inference Comparing Two-Sample Means and Robustness of t Procedures
• Large enough sample: Both populations will be assumed to be Normal, and
  • when the sum of the sample sizes is less than 15, t procedures can be used if the data are close to Normal (roughly symmetric, single peak, no outliers). If there is clear skewness or there are outliers, do not use t.
  • when the sum of the sample sizes is between 15 and 40, t procedures can be used except in the presence of outliers or strong skewness.
  • when the sum of the sample sizes is at least 40, the t procedures can be used even for clearly skewed distributions.
19.0 Two-Sample Problems

24 Conditions for Inference Comparing Two-Sample Means and Robustness of t Procedures
▸ Note: In practice it is enough that the two distributions have similar shape with no strong outliers. The two-sample t procedures are even more robust against non-Normality than the one-sample procedures.
19.0 Two-Sample Problems
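
The sample-size guideline for two-sample t procedures can be written as a small decision helper. This is only a sketch of the rule of thumb as stated on the slides: the function name and its boolean inputs are hypothetical, a total of exactly 40 is treated as "at least 40", and the large-sample branch ignores outliers because the slide mentions only skewness there.

```python
def t_procedures_reasonable(total_n, close_to_normal, outliers, strong_skew):
    """Rule of thumb keyed on n1 + n2 (hypothetical helper, not from the slides)."""
    if total_n < 15:
        # Small samples: data must look close to Normal, with no outliers.
        return close_to_normal and not outliers
    if total_n < 40:
        # Moderate samples: fine unless there are outliers or strong skewness.
        return not (outliers or strong_skew)
    # Large samples: the slide allows t even for clearly skewed distributions.
    return True

# SSHA example: 17 + 20 = 37 observations, outlier removed, no notable skewness.
print(t_procedures_reasonable(37, close_to_normal=True, outliers=False, strong_skew=False))
```

The judgments about outliers and skewness still come from looking at plots of the data; the helper only encodes the thresholds.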

25 Example: SSHA Scores
▸ The summary statistics for the SSHA scores for random samples of men and women are below. There was neither significant skewness nor strong outliers in either data set. Use this information to construct a 90% confidence interval for the mean difference.
Group | Sample Mean | Sample Standard Deviation | Sample Size
Women | 139.588 | 20.363 | 17
Men | 122.5 | 32.132 | 20
19.3 Two-Sample t Procedures

26 Example: 90% CI for SSHA Scores
Steps for Success: Constructing Confidence Intervals for $\mu_1 - \mu_2$.
1. Confirm that the 3 key conditions are satisfied (SRS?, N:n?, t-distribution?).
1. Components
• Do we have two simple random samples? Yes. It was stated.
• Large enough population : sample ratio? Yes. $N_W > 20 \times 17 = 340$ and $N_M > 20 \times 20 = 400$ (independence).
• Large enough sample? Yes. $n_W + n_M = 37 < 40$, but the outlier has been removed and there is no skewness.
19.3 Two-Sample t Procedures
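
Once the conditions are confirmed, the interval itself is a short computation. The sketch below assumes the conservative degrees of freedom (smaller of $n_1 - 1$ and $n_2 - 1$) and takes $t^{*} = 1.746$ as the table value for 90% confidence with df = 16; the function name `two_sample_t_interval` is an assumption.

```python
import math

def two_sample_t_interval(mean1, s1, n1, mean2, s2, n2, t_star):
    """Confidence interval for mu1 - mu2: (x1bar - x2bar) +/- t* * SE."""
    estimate = mean1 - mean2
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    margin = t_star * se
    return estimate - margin, estimate + margin

# Conservative degrees of freedom: smaller of (n1 - 1) and (n2 - 1).
df = min(17 - 1, 20 - 1)

# t* for 90% confidence with df = 16, read from a t table (assumed value).
T_STAR_90_DF16 = 1.746

low, high = two_sample_t_interval(139.588, 20.363, 17, 122.5, 32.132, 20, T_STAR_90_DF16)
print(f"df = {df}, 90% CI for mu_W - mu_M: ({low:.2f}, {high:.2f})")
```

Under these assumptions the interval runs from roughly 1.9 to 32.3 points, so it lies entirely above 0.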

37 Example: SSHA Scores
▸ Let’s continue with this example by now conducting a test of significance for the mean difference in SSHA by gender at $\alpha = 0.10$.
Group | Sample Mean | Sample Standard Deviation | Sample Size
Women | 139.588 | 20.363 | 17
Men | 122.5 | 32.132 | 20
19.3 Two-Sample t Procedures

43 Example: SSHA Scores Plan: f) Sketch the region(s) of “extremely unlikely” test statistics. 19.3 Two-Sample t Procedures

45 Example: SSHA Scores
Solve:
a) Check the conditions for the test you plan to use.
• Two Simple Random Samples? Yes. Stated as a random sample.
• Large enough population : sample ratios? Yes. Both populations are arbitrarily large, much greater than the required sizes: $N_W > 20 \times 17 = 340$; $N_M > 20 \times 20 = 400$.
• Large enough samples? Yes. $n_W + n_M = 37 < 40$, but the outlier has been removed and there is no skewness.
19.3 Two-Sample t Procedures

48 Example: SSHA Scores
Solve:
b) Calculate the test statistic.
$t = 1.96$
c) Determine (or approximate) the P-value.
$df = 17 - 1 = 16$. From the t table, $1.746 < 1.96 < 2.120$, so $0.05 <$ P-value $< 0.10$.
19.3 Two-Sample t Procedures
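
The test statistic and the table-based bracket on the P-value can be checked numerically. This sketch uses only the Python standard library: the two-sided P-value is approximated by integrating the t density with Simpson's rule, a stand-in for the table lookup on the slide rather than the course's method. The function names are assumptions.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p_value(t_stat, df, steps=10_000):
    """Two-sided P-value: 1 minus the area between -|t| and |t| (Simpson's rule)."""
    b = abs(t_stat)
    h = b / steps                          # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_b = total * h / 3
    return 1 - 2 * area_zero_to_b          # the density is symmetric about 0

# SSHA summary statistics (women first, then men).
se = math.sqrt(20.363**2 / 17 + 32.132**2 / 20)
t_stat = (139.588 - 122.5) / se            # H0: mu1 = mu2, so the hypothesized difference is 0
df = min(17 - 1, 20 - 1)                   # conservative df

p_value = two_sided_p_value(t_stat, df)
print(f"t = {t_stat:.2f}, df = {df}, two-sided P-value = {p_value:.4f}")
```

The computed P-value should land inside the bracket obtained from the table, since 1.746 and 2.120 are the df = 16 critical values for two-sided areas of 0.10 and 0.05.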

51 Example: SSHA Scores
Conclude:
a) Make a decision about the null hypothesis (Reject $H_0$ or Fail to reject $H_0$).
Because the approximate P-value is smaller than 0.10, we reject the null hypothesis.
b) Interpret the decision in the context of the original claim.
There is enough evidence (at $\alpha = 0.10$) to conclude that there is a difference in the mean SSHA score between men and women.
19.3 Two-Sample t Procedures

53 ▸ JMP
• Enter the quantitative data into one of the columns.
• In the next column, enter an abridged description of the categorical variable associated with each row of quantitative data. (Note: Pay attention to the spelling and capitalization of the abridged descriptions.)
• Analyze → Fit Y by X.
• “Click-and-Drag” (the quantitative variable) into the ‘Y, Response’ box.
• “Click-and-Drag” (the categorical variable) into the ‘X, Factor’ box. Click on OK.
• Click on the red upside-down triangle next to the title “Oneway Analysis of …”
• Proceed to ‘Means and Std Dev.’
• Click on the red upside-down triangle next to the title “Oneway Analysis of …”
• Proceed to ‘t Test.’
19.3 Two-Sample t Procedures

56 SSHA Scores (via Technology)
▸ Use technology to compute a 98% confidence interval for the mean difference in SSHA scores between women and men.
▸ Use technology to conduct the test of significance for the mean difference in SSHA scores at $\alpha = 0.02$.
19.3 Two-Sample t Procedures

57 98% confidence interval for the mean difference in SSHA scores between women & men. 19.3 Two-Sample t Procedures

59 Test of Significance for the mean difference in SSHA scores between women & men. 19.3 Two-Sample t Procedures

61 Example: SSHA Scores ▸ Technology output for Two Sample Means: 19.3 Two-Sample t Procedures

62 Closing Caveats and Comments
▸ The two-sample t statistic has an approximate (but accurate) t distribution. The approximate distribution of the two-sample t has an elaborate degrees of freedom computation (p.480). Computers use this formula in determining degrees of freedom.
▸ We will use Option 2 (p.470). This has df = smaller of $(n_1 - 1)$ and $(n_2 - 1)$.
▸ Because of the above fact, output in JMP (or other software packages) might have different df and P-values from manual analyses.
19.3 Two-Sample t Procedures
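
To see why software output can differ from a hand analysis, the two degrees-of-freedom choices can be compared directly. The sketch assumes that the "elaborate" formula software uses is the usual Welch-Satterthwaite approximation; the slides only cite a textbook page for it, so treat that identification and the function names as assumptions.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom (assumed software formula)."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    return (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

def conservative_df(n1, n2):
    """Option 2 from the slides: smaller of (n1 - 1) and (n2 - 1)."""
    return min(n1 - 1, n2 - 1)

# SSHA example: women (s = 20.363, n = 17) and men (s = 32.132, n = 20).
df_software = welch_df(20.363, 17, 32.132, 20)
df_by_hand = conservative_df(17, 20)

print(f"software-style df = {df_software:.2f}, conservative df = {df_by_hand}")
```

The software-style df is never smaller than the conservative choice, so hand calculations with Option 2 give somewhat wider intervals and larger P-values.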

63 Closing Caveats and Comments
▸ We will not use “pooled” two-sample procedures. These assume that the two population variances are equal. Use of our “Option 1” for two-sample t procedures yields more accurate results than the “pooled t.” The only caveat is when the sample sizes are equal; then the two-sample t statistic and the “pooled t” statistic are equal.
▸ Do not use two-sample t procedures for inference regarding standard deviations. The F-test is more appropriate in those cases.
19.3 Two-Sample t Procedures
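
The caveat about equal sample sizes can be checked with a quick comparison of the two standard errors. This is an illustrative sketch only: the pooled formula is the textbook pooled-variance standard error, the equal-size case uses a hypothetical n = 20 for both groups, and the function names are assumptions.

```python
import math

def unpooled_se(s1, n1, s2, n2):
    """Standard error used by the two-sample t procedures in this chapter."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)

def pooled_se(s1, n1, s2, n2):
    """Standard error of the pooled t, which assumes equal population variances."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2 * (1 / n1 + 1 / n2))

s_w, s_m = 20.363, 32.132

# Hypothetical equal sample sizes: the standard errors (and so the t statistics) agree.
print(unpooled_se(s_w, 20, s_m, 20), pooled_se(s_w, 20, s_m, 20))

# Actual SSHA sample sizes (17 and 20): the two standard errors differ.
print(unpooled_se(s_w, 17, s_m, 20), pooled_se(s_w, 17, s_m, 20))
```

Even when the statistics agree, the degrees of freedom used by the two procedures are not the same, so the resulting P-values can still differ.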

64 Closing Caveats and Comments ▸ Practitioners prefer having equal sample sizes for the two groups when possible. ▸ Exercises for this chapter will all assume that the SRS is from a Normal distribution. 19.3 Two-Sample t Procedures

65 Five-Minute Summary ▸ List at least 3 concepts that had the most impact on your knowledge of two-sample problems. _________________________________________

