Copyright (c) Bani K. Mallick
STAT 651 Lecture 9
Topics in Lecture #9
  Comparing two population means
  Output: detailed look
  The t-test
Book Sections Covered in Lecture #9
  Chapter 6.2
Relevant SPSS Tutorials
  Transformations of Data
  2-sample t-test
  Paired t-test
Lecture 8 Review: Comparing Two Populations
  There are two populations.
  Take a sample from each population.
  The sample sizes need not be the same. Population 1: $n_1$. Population 2: $n_2$.
  Each sample will have a sample standard deviation. Population 1: $s_1$. Population 2: $s_2$.
  Each sample will have a sample mean. Population 1: $\bar{X}_1$. Population 2: $\bar{X}_2$.
  Those are the statistics. What are the parameters?
  Each population has a population standard deviation. Population 1: $\sigma_1$. Population 2: $\sigma_2$.
  Each population has a population mean. Population 1: $\mu_1$. Population 2: $\mu_2$.
  How do we compare the population means $\mu_1$ and $\mu_2$? The usual way is to take their difference: $\mu_1 - \mu_2$.
  If the population means are equal, their difference is 0.
  Suppose we form a confidence interval for the difference. From it we learn whether 0 is in the confidence interval, and hence can make decisions about the hypothesis that the population means are equal.
NHANES Comparison
NHANES Comparison: what the output looks like
NHANES Comparison: the variable
  The variable analyzed is Log(Saturated Fat).
  [SPSS output: Independent Samples Test. Rows: Equal variances assumed; Equal variances not assumed. Columns: Levene's Test for Equality of Variances (F, Sig.); t-test for Equality of Means (t, df, Sig. (2-tailed), Mean Difference, Std. Error Difference, 95% Confidence Interval of the Difference: Lower, Upper).]
NHANES Comparison: the method
  The method is given by the row label: "Equal variances assumed" or "Equal variances not assumed".
  If you think the variances are wildly different, try a transformation.
  [SPSS output: Independent Samples Test for Log(Saturated Fat)]
NHANES Comparison: the p-value
  The p-value is the "Sig. (2-tailed)" column.
  [SPSS output: Independent Samples Test for Log(Saturated Fat)]
NHANES Comparison: the difference in sample means
  This is the "Mean Difference" column.
  [SPSS output: Independent Samples Test for Log(Saturated Fat)]
NHANES Comparison: the standard error of the difference in sample means
  This is the "Std. Error Difference" column.
  [SPSS output: Independent Samples Test for Log(Saturated Fat)]
NHANES Comparison: the 95% confidence interval
  These are the "Lower" and "Upper" columns under "95% Confidence Interval of the Difference".
  [SPSS output: Independent Samples Test]
NHANES Comparison
  The “Mean Difference” is ...
  Since the healthy group had the higher mean, this is Mean(Healthy) – Mean(Cancer).
  The 95% CI is from ... to ...
  What is this a CI for? The difference in population mean log(saturated fat) intake between healthy controls and cancer cases: $\mu_{\text{Healthy}} - \mu_{\text{Cancer}}$.
  The null hypothesis of interest is that the population means are equal, i.e., $\mu_{\text{Healthy}} - \mu_{\text{Cancer}} = 0$.
  Is the p-value p < 0.05?
  [Diagram: the hypothesized value shown against the confidence interval]
  Answer: p < 0.05, since the 95% CI does not cover zero.
  Is the p-value p < 0.01? Answer: You cannot tell from a 95% CI. However, from the SPSS output, p = ... (see next slide).
NHANES Comparison: the 95% confidence interval
  [SPSS output: Independent Samples Test, showing the "Sig. (2-tailed)" column alongside the "Lower" and "Upper" columns of the 95% Confidence Interval of the Difference]
NHANES Comparison
  Mean(Healthy) – Mean(Cancer): the 95% CI is from ... to ...
  What do we conclude from this confidence interval?
  With 95% confidence, the population mean log(saturated fat) intake is greater in the healthy group by between ... and ...
  Exponentiate the endpoints to return to the scale of grams of saturated fat, where a difference of logs becomes a ratio of intakes.
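The exponentiation step can be sketched in Python. This is only an illustration: the endpoints below are hypothetical placeholders, not the NHANES values, and the sketch assumes natural logs were used. Because a difference of logs is the log of a ratio, the back-transformed interval describes the ratio of typical (geometric mean) intakes, Healthy over Cancer.

```python
import math

def ratio_interval(log_lower, log_upper):
    """Back-transform a CI for a difference of mean natural logs into a CI for a ratio."""
    return math.exp(log_lower), math.exp(log_upper)

# Hypothetical endpoints for mean log(Healthy) - mean log(Cancer); NOT the NHANES values.
lower, upper = ratio_interval(0.10, 0.30)
print(f"ratio of geometric means (Healthy / Cancer): {lower:.3f} to {upper:.3f}")
```

A log-scale interval that excludes 0 always back-transforms to a ratio interval that excludes 1, so the conclusion about equality of means is unchanged.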
Comparing Two Population Means: the Formulas
  The data: $n_1, \bar{X}_1, s_1$ from sample 1 and $n_2, \bar{X}_2, s_2$ from sample 2.
  The populations: $\mu_1, \sigma_1$ for population 1 and $\mu_2, \sigma_2$ for population 2.
  The aim: a CI for $\mu_1 - \mu_2$.
Comparing Two Populations
  Does it matter which one you call population 1 and which one you call population 2? Not at all. The key is to interpret the difference properly.
  The aim: a CI for $\mu_1 - \mu_2$. This is the difference in population means.
  The estimate of the difference in population means is the difference in sample means, $\bar{X}_1 - \bar{X}_2$.
  This is a random variable: it has sample-to-sample variability.
  The "population" mean of $\bar{X}_1 - \bar{X}_2$ from repeated sampling is $\mu_1 - \mu_2$.
  The s.d. of $\bar{X}_1 - \bar{X}_2$ from repeated sampling is
  $$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
  You need reasonably large samples from BOTH populations.
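The idea of "s.d. from repeated sampling" can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not part of the lecture: it assumes normal populations and independent samples, takes the standard formula sqrt(sigma1^2/n1 + sigma2^2/n2) for the s.d. of the difference in sample means, and uses made-up population values.

```python
import math
import random
import statistics

def sd_of_difference(sigma1, n1, sigma2, n2):
    """Theoretical s.d. of (sample mean 1 - sample mean 2) for independent samples."""
    return math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

def simulate_sd_of_difference(mu1, sigma1, n1, mu2, sigma2, n2, reps=4000, seed=651):
    """Draw many pairs of normal samples and return the s.d. of the mean differences."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        mean1 = statistics.fmean(rng.gauss(mu1, sigma1) for _ in range(n1))
        mean2 = statistics.fmean(rng.gauss(mu2, sigma2) for _ in range(n2))
        diffs.append(mean1 - mean2)
    return statistics.stdev(diffs)

# Hypothetical population values, chosen only for illustration.
theory = sd_of_difference(2.0, 30, 3.0, 40)
simulated = simulate_sd_of_difference(10.0, 2.0, 30, 9.0, 3.0, 40)
print(f"theory {theory:.3f}, simulation {simulated:.3f}")
```

The two numbers should agree closely, and the agreement does not depend on the population means, only on the standard deviations and sample sizes.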
Comparing Two Populations
  If you can reasonably believe that the population s.d.'s are nearly equal, it is customary to adopt the equal variance assumption and estimate the common standard deviation by
  $$s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$
  The standard error of $\bar{X}_1 - \bar{X}_2$ is then $s_p\sqrt{1/n_1 + 1/n_2}$.
  The number of degrees of freedom is $n_1 + n_2 - 2$.
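As a worked illustration, the pooled standard deviation, the standard error, and the degrees of freedom can be computed from summary statistics as follows. The formulas are the standard equal-variance ones; the function names and the input numbers are hypothetical.

```python
import math

def pooled_sd(s1, n1, s2, n2):
    """Pooled estimate of the common s.d. under the equal-variance assumption."""
    df = n1 + n2 - 2
    return math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)

def standard_error(s1, n1, s2, n2):
    """Standard error of the difference in sample means, using the pooled s.d."""
    return pooled_sd(s1, n1, s2, n2) * math.sqrt(1 / n1 + 1 / n2)

# Hypothetical summary statistics, for illustration only.
s1, n1, s2, n2 = 2.0, 12, 3.0, 10
sp = pooled_sd(s1, n1, s2, n2)
se = standard_error(s1, n1, s2, n2)
df = n1 + n2 - 2
print(f"pooled sd {sp:.3f}, standard error {se:.3f}, df {df}")
```

The pooled value always lies between the two sample standard deviations and is pulled toward the one from the larger sample.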
Comparing Two Populations
  A $(1-\alpha)100\%$ CI for $\mu_1 - \mu_2$ is
  $$\bar{X}_1 - \bar{X}_2 \pm t_{\alpha/2}(n_1 + n_2 - 2)\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
  Note how the sample sizes determine the CI length.
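A minimal sketch of the equal-variance confidence interval, assuming the usual form (difference in sample means) ± (t table value) × (pooled s.d.) × sqrt(1/n1 + 1/n2). The critical value is passed in as an argument, as if read from a t table, because the Python standard library has no t distribution; the function name and the input numbers are hypothetical.

```python
import math

def pooled_ci(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Equal-variance CI for mu1 - mu2; t_crit is the t table value for n1 + n2 - 2 df."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin

# Hypothetical summary statistics; 2.086 is the tabled 97.5th percentile of t with 20 df.
lower, upper = pooled_ci(5.0, 2.0, 12, 3.0, 3.0, 10, t_crit=2.086)
print(f"95% CI for mu1 - mu2: {lower:.3f} to {upper:.3f}")
```

With these made-up numbers the interval covers 0, so equality of the population means would not be rejected at the 5% level.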
Comparing Two Populations
  Generally, you should make your sample sizes nearly equal, or at least not wildly unequal.
  Consider a total sample size of 100:
  $\sqrt{1/n_1 + 1/n_2} \approx 1$ if $n_1 = 1$, $n_2 = 99$
  $\sqrt{1/n_1 + 1/n_2} = 0.20$ if $n_1 = 50$, $n_2 = 50$
  Thus, in the former case, your CI would be about 5 times longer!
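The arithmetic behind this comparison can be reproduced directly. The sketch assumes the quantity being compared is the factor sqrt(1/n1 + 1/n2) that multiplies the pooled s.d. in the CI, with the pooled s.d. and the t value held fixed; the helper name is hypothetical.

```python
import math

def size_factor(n1, n2):
    """The factor sqrt(1/n1 + 1/n2) by which sample sizes scale the CI length."""
    return math.sqrt(1 / n1 + 1 / n2)

unbalanced = size_factor(1, 99)
balanced = size_factor(50, 50)
print(f"n1=1, n2=99: {unbalanced:.3f}")
print(f"n1=50, n2=50: {balanced:.3f}")
print(f"ratio of CI lengths: {unbalanced / balanced:.2f}")

# Among all splits of 100 observations, find the one with the smallest factor.
best_n1 = min(range(1, 100), key=lambda n1: size_factor(n1, 100 - n1))
print(f"shortest CI at n1 = {best_n1}")
```

The search over all splits confirms that, for a fixed total, the even split gives the shortest interval.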
Comparing Two Populations
  The CI can of course be used to test the hypothesis $H_0: \mu_1 = \mu_2$.
  This is the same as $H_0: \mu_1 - \mu_2 = 0$.
  So we just need to check whether 0 is in the interval, just as we have done.
Comparing Two Populations: The t-test
  There is something called a t-test, which tells you whether 0 is in the CI.
  It does not tell you where the means lie, however, so it is of limited use.
  P-values tell you the same thing.
  The t-statistic is defined by
  $$t = \frac{\bar{X}_1 - \bar{X}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
  You reject equality of means if $|t| > t_{\alpha/2}(n_1 + n_2 - 2)$.
  In this case, is $p < \alpha$? Yes: you reject equality of means if $p < \alpha$.
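The link between the t-test and the confidence interval can be sketched as follows. This is an illustration with hypothetical numbers, assuming the pooled (equal-variance) t-statistic: the difference in sample means divided by its standard error. The critical value is supplied from a t table; computing the p-value itself would need a t distribution routine, which is not included here.

```python
import math

def pooled_t_test(mean1, s1, n1, mean2, s2, n2, t_crit):
    """Return the pooled t-statistic, the CI for mu1 - mu2, and the reject decision."""
    df = n1 + n2 - 2
    sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df)
    se = sp * math.sqrt(1 / n1 + 1 / n2)
    diff = mean1 - mean2
    t_stat = diff / se
    ci = (diff - t_crit * se, diff + t_crit * se)
    reject = abs(t_stat) > t_crit
    return t_stat, ci, reject

# Hypothetical summary statistics; 2.086 is the tabled 97.5th percentile of t with 20 df.
for mean1 in (5.0, 6.0):
    t_stat, (lower, upper), reject = pooled_t_test(mean1, 2.0, 12, 3.0, 3.0, 10, t_crit=2.086)
    zero_in_ci = lower <= 0 <= upper
    print(f"t = {t_stat:.3f}, CI = ({lower:.3f}, {upper:.3f}), "
          f"reject = {reject}, 0 in CI = {zero_in_ci}")
```

In both cases the test rejects exactly when 0 falls outside the interval, which is why the t-test adds no information beyond the CI.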
NHANES Comparison: the t-test
  The t-statistic and its degrees of freedom are the "t" and "df" columns.
  [SPSS output: Independent Samples Test for Log(Saturated Fat)]
Comparing Two Populations
  SPSS Demonstrations: bluebonnets and Framingham Heart Disease and Blood Pressure, as time permits