2 Comparing Means for Several Populations
When we wish to test for differences in means for only 1 or 2 populations, we use one- or two-sample t inference. (We did two-sample t inference in MAT 212.)
Testing for differences in 2 or more populations, or at several different levels (values) of a variable, involves a different approach. This is called Analysis of Variance, or ANOVA.
ANOVA partitions the total sum of squares into two parts:
- within-treatment variability
- between-treatment variability
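The partition described above can be checked numerically. The following is a minimal sketch, assuming three hypothetical treatments with made-up measurements (the group names and values are illustrative only and do not come from the slides). It computes the total, between-treatment, and within-treatment sums of squares and shows that the first equals the sum of the other two.

```python
# Hypothetical measurements for three treatments (illustrative values only).
groups = {
    "A": [10.0, 12.0, 11.0],
    "B": [14.0, 15.0, 16.0],
    "C": [20.0, 19.0, 21.0],
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


all_values = [x for g in groups.values() for x in g]
grand_mean = mean(all_values)

# Total variability: every observation against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Within-treatment variability: each observation against its own group mean.
ss_within = sum((x - mean(g)) ** 2 for g in groups.values() for x in g)

# Between-treatment variability: each group mean against the grand mean,
# weighted by the group size.
ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())

print("SS total  :", round(ss_total, 6))
print("SS between:", round(ss_between, 6))
print("SS within :", round(ss_within, 6))
```

With these made-up numbers the between-treatment part dominates, which is the situation in which ANOVA tends to find a difference among the means.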
3 Comparing Means for Several Populations
Example: Test 5 types of concrete for differences in moisture absorption.
The 5 types of concrete are the five levels of the treatment.
Within Variability: this seeks to quantify the variability in absorption for one particular type of concrete.
Between Variability: this seeks to quantify the differences between the types of concrete.
ANOVA seeks to answer the question “Are the differences between the 5 sample means what is expected from random variation alone?”
4 Definitions
An experimental unit is an object, or subject, that produces a sample measurement.
The experimental conditions that define the different populations in a completely randomized design are called treatments.
Testing for differences in the treatments is equivalent to testing for differences in the population means.
5 Graphical demonstration: Employing two types of variability
6 Graphical demonstration: Employing two types of variability
[Figure: two panels comparing Treatment 1, Treatment 2, and Treatment 3]
First panel: A small variability within the samples makes it easier to draw a conclusion about the population means.
Second panel: The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.
7 Assumptions for ANOVA
1. The samples are independent and random.
Selection of objects from any one population is unrelated to the selection of objects from any of the other populations. Selections are random (one individual has as much chance of being selected as another).
Examples:
- Different groups of people (no person in more than one group)
- Different types of music
- Different concentrations of chemicals
- Different models of automobiles
8 Assumptions for ANOVA
2. All populations are normal.
3. Each population has the same standard deviation, σ (which implies the same variance, σ²). However, the value of the common population standard deviation is not known before testing.
4. Each sample has a mean that can be calculated. This sample mean is used as an estimate of the mean of the population it was drawn from.
9 Assumptions for ANOVA
The following assumptions are required for a 1-way ANOVA:
- The k populations are independent.
- Each population is normally distributed.
- Each population has common standard deviation, σ.
- Each population has a mean, μi for i = 1, 2, …, k.
So we now are testing whether all the treatment means are equal.
H0: μ1 = μ2 = … = μk
Ha: At least two of the population means are not equal
10 Test Statistic
If the null hypothesis is true, we expect the k sample means to have reasonably similar values.
In other words, if the population means are equal, we would expect the variability among the sample means to be relatively small.
Variability among the sample means is one of the things we will be testing for.
11 Test Statistic
Even if the null hypothesis is true, we do not expect the sample means to be exactly the same, because there is a chance factor in our choice of sample experimental units.
We need to take into account the variability due to chance among the sample means.
12 Test Statistic
This method is called “analysis of variance”, or ANOVA, because we are comparing two sources of variance: the variance among the sample means and the variance expected by chance among the sample means when the null hypothesis is true.
13 Test Statistic
Our test statistic is called F.
F = (Variability among the sample means) / (Variability expected by chance)
14 Degrees of freedom
For a single sample of n units: df = n - 1
Group df (k groups) = k - 1
Total df = total number of units in the experiment - 1 = N - 1
Error df = Total df - Group df
Or
Error df = N - k
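The degrees-of-freedom bookkeeping can be sketched in a few lines. This is a minimal illustration, assuming the group df is k - 1 (so that Error df = Total df - Group df reduces to N - k); the function name `anova_degrees_of_freedom` is hypothetical, and the sample sizes mirror a design with four groups of 10 units each.

```python
def anova_degrees_of_freedom(group_sizes):
    """Return group, error, and total df for a one-way ANOVA.

    group_sizes: list with the number of experimental units in each group.
    Assumes group df = k - 1, where k is the number of groups.
    """
    k = len(group_sizes)          # number of groups (treatments)
    n_total = sum(group_sizes)    # N, total number of units
    group_df = k - 1
    total_df = n_total - 1
    error_df = total_df - group_df  # equals N - k
    return {"group": group_df, "error": error_df, "total": total_df}


# Four groups with 10 units each.
df = anova_degrees_of_freedom([10, 10, 10, 10])
print(df)
```

The group and error df always add up to the total df, which is a quick check on any ANOVA table.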
15 Technology
We will use Minitab (or StatCrunch or Excel) to do our calculations.
A typical Minitab display is on the next slide.
16 ANOVA Table: Tensile Strength for 6 Machines
Analysis of Variance for Tensile-Strength
Source    DF    SS    MS    F    P
Machine
Error
Total
SST = Sum of squares of treatment = SSMachine = 5.34 (sample mean variability), k = 6 machines
SSError = (variability due to chance)
Notice how much larger the “chance” variability is than the other.
There is little to no evidence that the machines differ in mean tensile strength. Look at that HUGE p-value!
17 Another Example
A sociologist conducts an experiment to compare the mean grade-point averages of first-year college students associated with four socioeconomic groups. The sociologist defines the four categories of interest to be: Poor, Lower Middle Class, Upper Middle Class, and Well-to-do. The experimenter knows that the populations of grade-point averages are normally distributed with equal standard deviations. At the end of the school year, the sociologist selects independent random samples of 10 grade-point averages for first-year students in each of the four socioeconomic groups.
Do the data provide sufficient evidence to indicate a difference in mean grade-point averages for at least two of the four socioeconomic groups?
18 Socioeconomics and GPA
Treatments = 4 SOE groups
Response variable = GPA
H0: μ1 = μ2 = μ3 = μ4
H1: At least two of the population means are not equal
Decision rule: Accept H1 if the p-value < .05
Test statistic: F
F = (Variability among the sample means) / (Variability expected by chance)
19 Socioeconomics and GPA
Variability among sample means = MST = SST / (k - 1)
Variability due to chance = MSE = SSE / (N - k)
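The two mean squares and their ratio can be computed directly from the sums of squares. This is a minimal sketch, assuming the error degrees of freedom are N - k as on the degrees-of-freedom slide; the function name `f_statistic` and the input numbers are hypothetical and are not taken from the GPA output.

```python
def f_statistic(sst, sse, k, n_total):
    """Return (MST, MSE, F) for a one-way ANOVA.

    sst: sum of squares for treatments (between groups)
    sse: sum of squares for error (within groups)
    k: number of groups
    n_total: total number of experimental units, N
    """
    mst = sst / (k - 1)          # variability among the sample means
    mse = sse / (n_total - k)    # variability expected by chance
    return mst, mse, mst / mse


# Illustrative values only: 3 groups, 9 units in total.
mst, mse, f_value = f_statistic(sst=122.0, sse=6.0, k=3, n_total=9)
print("MST =", mst, "MSE =", mse, "F =", f_value)
```

A large F means the sample means vary much more than chance alone would explain; the p-value reported by Minitab or StatCrunch converts that F into a probability.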
20 One-way ANOVA: GPA versus Group
Source    DF    SS    MS    F    P
Group
Error
Total
21 Socioeconomics and GPA
F =
P-value =
p-value < .05
There is sufficient evidence to say that there is a difference in the mean grade-point averages for at least two of the socioeconomic groups. We reach this conclusion at the 0.05 level of significance.
Since we accepted the alternative hypothesis, we now need to state which means are different.
22 Socioeconomics and GPA
We already have enough data to say that of the four groups, the Well-to-do have the highest mean GPA with 2.576, the Upper Middle is next with 2.717, followed by the Lower Middle with
The Poor have the lowest mean GPA with
But are these differences statistically significant?
23 Which means are different?
We need to test each of the following pairs of hypotheses.
Pair 1: H0: μ1 - μ2 = 0   Ha: μ1 - μ2 ≠ 0
Pair 2: H0: μ1 - μ3 = 0   Ha: μ1 - μ3 ≠ 0
Pair 3: H0: μ1 - μ4 = 0   Ha: μ1 - μ4 ≠ 0
Pair 4: H0: μ2 - μ3 = 0   Ha: μ2 - μ3 ≠ 0
Pair 5: H0: μ2 - μ4 = 0   Ha: μ2 - μ4 ≠ 0
Pair 6: H0: μ3 - μ4 = 0   Ha: μ3 - μ4 ≠ 0
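The six pairs above are simply every way of choosing 2 of the 4 means, so in general k groups give k(k - 1)/2 pairs. A minimal sketch that enumerates them, using only the indices 1 through 4 as labels (the variable names are hypothetical):

```python
from itertools import combinations

k = 4  # number of groups in the GPA example

# Every unordered pair (i, j) with i < j, in the same order as the slide.
pairs = list(combinations(range(1, k + 1), 2))

for number, (i, j) in enumerate(pairs, start=1):
    print(f"Pair {number}: H0: mu{i} - mu{j} = 0   Ha: mu{i} - mu{j} != 0")

print("Number of pairs:", len(pairs))
```

The count grows quickly with k, which is one reason a multiple comparison procedure is preferred over running many separate tests by hand.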
24 Which means are different?
To test each pair of hypotheses, we are only testing two means for a difference between them.
This is the two-sample t-statistic that we used in Chapter 13 in MAT 212.
25 Which means are different?
However, it takes less time to calculate the confidence intervals for each pair and use these to make our inferences.
- If a confidence interval contains only positive numbers, we may conclude that the first mean is larger than the second.
- If a confidence interval contains only negative numbers, we may conclude that the first mean is smaller than the second.
- If a confidence interval contains the number zero, there is insufficient evidence to conclude which mean is larger.
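The three rules above can be written as a small decision function. This is a minimal sketch, assuming the interval is for (first mean - second mean); the function name `interpret_interval` and the example endpoints are hypothetical and do not come from the StatCrunch output.

```python
def interpret_interval(lower, upper):
    """Interpret a confidence interval for (first mean - second mean)."""
    if lower > upper:
        raise ValueError("lower endpoint must not exceed upper endpoint")
    if lower > 0:
        # Interval contains only positive numbers.
        return "first mean is larger"
    if upper < 0:
        # Interval contains only negative numbers.
        return "first mean is smaller"
    # Interval contains zero.
    return "not significant"


# Illustrative endpoints only.
print(interpret_interval(0.12, 0.95))    # both endpoints positive
print(interpret_interval(-0.80, -0.05))  # both endpoints negative
print(interpret_interval(-0.30, 0.40))   # interval straddles zero
```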
26 Which means are different?
To do this, we use StatCrunch and Tukey’s Multiple Comparisons.
(The notes in blue on the following slide are the conclusions drawn; they are not part of the StatCrunch output.)
27 Tukey’s Multiple Comparisons
Group = Lower Middle subtracted from:
Group           (Lower, Upper)   Conclusion
Poor            (-, +)           Not significant
Upper Middle    (-, +)           Not significant
Well-to-do      (-, +)           Not significant

Group = Poor subtracted from:
Group           (Lower, Upper)   Conclusion
Upper Middle    (+, +)           μ1 > μ2
Well-to-do      (+, +)           μ1 > μ2

Group = Upper Middle subtracted from:
Group           (Lower, Upper)   Conclusion
Well-to-do      (-, +)           Not significant
28 Socioeconomics and GPA This shows that both Upper Middle Class and Well-to-do have higher mean GPA than Poor. There are no other statistically significant differences.
29 ANOVA: What is expected from you?
Be able to complete each of the following exercises:
- State the two hypotheses.
- State the decision rule.
- What is the test statistic, and what is its formula?
- What is the observed value of this test statistic?
- Is this valid?
- What is the p-value?
- State a conclusion.
- If you accepted the alternate hypothesis, you then need to find out which means are different.
30 Another ExampleIs hair color related to pain sensitivity? To study this, an experimenter divides men and women of various ages into four hair color categories: light blond, dark blond, light brunette, and dark brunette. There are six people in each of the four categories. Each participant in the study receives a pain threshold score based upon his or her performance in a pain sensitivity test (the higher the score, the lower the person’s pain tolerance.)
31 Hair Color vs Pain Sensitivity
The treatments are the hair colors.
The response variable is the sensitivity to pain score.
H0: μ1 = μ2 = μ3 = μ4
H1: At least two of the population means of the scores are not equal
Decision rule: Accept H1 if the p-value < .05
Test statistic: F
F = (Variability among the sample means) / (Variability expected by chance)
32 Hair Color vs Pain Sensitivity
One-way ANOVA: Score versus Hair Color
Source        DF    SS    MS    F    P
Hair Color
Error
Total
H0: μlight_blond = μdark_blond = … = μdark_brunette
Ha: At least two population means are different.
F = 5.44, p-value = 0.007
At the .05 level of significance, there is overwhelming evidence to conclude that there is a difference among mean pain thresholds for people possessing these four hair colors.
33 Minitab
One-way ANOVA: Score versus Hair Color
Source    DF    SS    MS    F    P
Error
Total
S =     R-Sq = 44.95%    R-Sq(adj) = 36.69%
34 Hair Color = Dark Blond subtracted from:
Hair Color        Lower    Upper
Dark Brunette
Light Blond
Light Brunette

Hair Color = Dark Brunette subtracted from:
Hair Color        Lower    Upper
Light Blond
Light Brunette

Hair Color = Dark Brunette subtracted from:
Hair Color        Lower    Upper
Light Blond
Light Brunette
35 Hair Color vs Pain Sensitivity
Examine Minitab’s output to make the following table:
Pair               From    To    Conclusion
D Brun - D Blon                  NS (No difference)
L Blon - D Blon                  L Blon > D Blon
L Brun - D Blon                  NS (No difference)
L Blon - D Brun                  L Blon > D Brun
L Brun - D Brun                  L Brun > D Brun
L Brun - L Blon                  NS (No difference)
Summarize the results.
36 When should we use the multiple comparison method?
- The sample data are obtained from the k populations using a completely randomized design.
- An analysis of variance F-test indicates that there are some differences among the k population means.
- The objective is to determine which of the k population means differ. It is usually of interest to determine which mean might be the largest (or smallest).