# Stat 301 - Day 32: More on two-sample t-procedures

Last Time – Comparing Two Means

What did the simulation do?

- Repeatedly took independent random samples from each population and calculated the difference in the sample means
- Repeated many times → sampling distribution

Last Time – Comparing Two Means

The theory: When sampling from normal populations, the distribution of the difference in sample means follows a normal distribution with mean $\mu_1 - \mu_2$ and standard deviation $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.
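The simulation and the theory can be compared directly. The sketch below is only an illustration: the population means, standard deviations, and sample sizes are hypothetical values, and the function name `simulate_diff_means` is made up for this example. It draws independent samples from two normal populations many times and checks the simulated mean and SD of the difference in sample means against $\mu_1 - \mu_2$ and $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$.

```python
import math
import random
import statistics


def simulate_diff_means(mu1, sigma1, n1, mu2, sigma2, n2, reps=10000, seed=1):
    """Return simulated differences in sample means (sample 1 minus sample 2)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Independent random samples from each (normal) population
        xbar1 = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        xbar2 = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(xbar1 - xbar2)
    return diffs


# Hypothetical population values, chosen only for illustration.
mu1, sigma1, n1 = 10.0, 2.0, 20
mu2, sigma2, n2 = 8.0, 3.0, 15

diffs = simulate_diff_means(mu1, sigma1, n1, mu2, sigma2, n2)
theory_mean = mu1 - mu2
theory_sd = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

print("simulated mean:", round(statistics.fmean(diffs), 3), "| theory:", theory_mean)
print("simulated SD:  ", round(statistics.stdev(diffs), 3), "| theory:", round(theory_sd, 3))
```

With enough repetitions the simulated values should land close to the theoretical ones; swapping `rng.gauss` for a skewed generator is one way to explore how well the normal model holds up for non-normal populations.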

Last Time – Comparing Two Means

In practice:

- Don’t know $\sigma_1$ and $\sigma_2$
- So use $s_1$ and $s_2$, and then use the t distribution
- Tends to work pretty well with non-normal populations as long as the sample sizes are large (especially when the population shapes and the sample sizes are similar)

PP 5.3.1 (p. 437)

$n_1 = 5$ and $n_2 = 5$: works pretty well even with small sample sizes!

PP 5.3.1

$n_1 = 20$ and $n_2 = 5$: works better when the populations are similar in shape and the sample sizes are similar.

Last Time – Comparing Two Means

In practice:

- Don’t know $\sigma_1$ and $\sigma_2$
- So use $s_1$ and $s_2$, and then use the t distribution
- Tends to work pretty well with non-normal populations as long as the sample sizes are large (especially when the population shapes and the sample sizes are similar)
- Valid for infinite populations; works pretty well if $N > 20n$
- Also consider whether a comparison of means is really the right question! Better if shape and spread are similar!

PP 5.3.2

- $H_0: \mu_M - \mu_F = 0$ (the average body temperature is the same for men and women)
- $H_a: \mu_M - \mu_F \neq 0$ (the average body temperature differs for men and women)

Sample sizes are both large. Might worry about which populations of adults we are willing to generalize these results to.

- Difference = mu (female) - mu (male)
- Estimate for difference: -0.289
- 95% CI for difference: (-0.540, -0.039)
- T-Test of difference = 0 (vs not =): T-Value = -2.29, P-Value = 0.024, DF = 127
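As a rough check on how the pieces of this output fit together, the reported numbers can be run backwards. The sketch below uses only the values printed above; it assumes the usual relationships $t = \text{estimate}/SE$ and $\text{CI} = \text{estimate} \pm t^* \cdot SE$, and the variable names are made up for this example. Because the inputs are rounded, the results are approximate.

```python
# Values copied from the two-sample t output above.
estimate = -0.289
t_value = -2.29
ci_lower, ci_upper = -0.540, -0.039

# t = estimate / SE, so the standard error can be backed out.
std_error = estimate / t_value

# The interval is estimate +/- t* x SE, so half its width is the margin of error.
margin = (ci_upper - ci_lower) / 2
t_star = margin / std_error

# The midpoint of the interval should reproduce the estimate (up to rounding).
midpoint = (ci_lower + ci_upper) / 2

print("standard error   ~", round(std_error, 4))
print("implied t*       ~", round(t_star, 3))
print("interval midpoint ~", round(midpoint, 4))
```

The implied multiplier comes out close to 2, which is what one would expect for a 95% interval with a large number of degrees of freedom, and the interval excludes 0, in line with the two-sided p-value being below .05.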

Conclusions

There is significant evidence (p-value < .05) that the average body temperature differs between men and women in these populations. We are 95% confident that the (population) mean body temperature for men is 0.036 to 0.542 degrees (Fahrenheit) higher than the mean body temperature for women.

By the way (m) Medians

Investigation 5.4.1 (p. 446)

What did the (Chapter 2) simulation do?

- Took the existing sample data
- Repeatedly shuffled the “group A” and “group B” designations and computed the difference in group means with each random shuffle
- Repeated many times → randomization distribution

Can we apply the normal model?

- Advantages?
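The shuffling procedure can be sketched in a few lines. This is a minimal illustration, not the applet or software used in the course: the data values are hypothetical, the function name `randomization_test` is made up here, and the two-sided p-value is approximated as the proportion of shuffles whose difference in group means is at least as extreme as the observed one.

```python
import random
import statistics


def randomization_test(group_a, group_b, reps=10000, seed=1):
    """Return (observed difference in means, approximate two-sided p-value)."""
    rng = random.Random(seed)
    observed = statistics.fmean(group_a) - statistics.fmean(group_b)
    combined = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(reps):
        # Shuffle the group designations: the first n_a values play "group A"
        rng.shuffle(combined)
        diff = statistics.fmean(combined[:n_a]) - statistics.fmean(combined[n_a:])
        # Small tolerance guards against floating-point ties
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / reps


# Hypothetical data, for illustration only.
group_a = [12.1, 9.8, 11.4, 10.9, 13.2, 12.7, 10.3, 11.8]
group_b = [9.4, 10.1, 8.7, 9.9, 10.8, 8.9, 9.6, 10.4]

observed, p_value = randomization_test(group_a, group_b)
print("observed difference in means:", round(observed, 3))
print("approximate two-sided p-value:", p_value)
```

The list of shuffled differences is the randomization distribution; when it looks roughly normal, a t procedure gives a similar p-value without the simulation and also supplies a confidence interval.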

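Pooling the variance estimate

Want to know the standard deviation of the difference in sample means, $\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$. Can estimate it with $\sqrt{s_1^2/n_1 + s_2^2/n_2}$. If we believe $\sigma_1 = \sigma_2 = \sigma$, then a better estimate of $\sigma$ is to combine the information from the two samples (p. 437) → “Pooled t-test”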

Pooling the variance estimate

But then you need to add “population variances equal” to your list of technical conditions, and that condition is rather difficult to verify.

- Ad hoc: ratio of SDs less than 2

The only time I would say it’s easily justified to pool is when you know the data are from the same population… a randomized experiment.

- Check the “assume equal variance” box
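To make the pooled and unpooled calculations concrete, here is a minimal sketch. It assumes the standard textbook pooled estimator $s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}}$, which the slides do not spell out; the summary statistics are hypothetical and the function names are made up for this example.

```python
import math


def unpooled_se(s1, n1, s2, n2):
    """Standard error of the difference in means without pooling."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)


def pooled_sd(s1, n1, s2, n2):
    """Pooled estimate of the common SD (assumes equal population variances)."""
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def pooled_se(s1, n1, s2, n2):
    """Standard error of the difference in means using the pooled SD."""
    return pooled_sd(s1, n1, s2, n2) * math.sqrt(1 / n1 + 1 / n2)


def sd_ratio_ok(s1, s2, limit=2.0):
    """Ad hoc check: is the larger SD less than `limit` times the smaller?"""
    return max(s1, s2) / min(s1, s2) < limit


# Hypothetical summary statistics, for illustration only.
s1, n1 = 2.0, 12
s2, n2 = 3.0, 20

print("ratio of SDs under 2?", sd_ratio_ok(s1, s2))
print("pooled SD:  ", round(pooled_sd(s1, n1, s2, n2), 3))
print("pooled SE:  ", round(pooled_se(s1, n1, s2, n2), 3))
print("unpooled SE:", round(unpooled_se(s1, n1, s2, n2), 3))
```

When the two sample sizes are equal the two standard errors coincide, so pooling mainly changes the degrees of freedom; the standard errors can differ noticeably when the sample sizes and SDs are both unequal.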

Investigation 5.3.2 (p. 438)