Chapter 15 ANOVA
Comparing Means for Several Populations When we wish to test for differences in means for only 1 or 2 populations, we use one- or two-sample t inference. (We did two-sample t inference in MAT 212.) Testing for differences in 2 or more populations, or at several different levels (values) of a variable, involves a different approach. This is called Analysis of Variance, or ANOVA. ANOVA partitions the total sum of squares into two parts: 1. within-treatment variability 2. between-treatment variability
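The partition of the total sum of squares can be checked numerically. The sketch below uses made-up measurements for three hypothetical treatments (the data and the names `groups`, `ss_total`, `ss_between`, and `ss_within` are assumptions for illustration, not values from this chapter) and shows that the two parts add up to the total.

```python
from statistics import mean

# Hypothetical measurements for three treatments (illustrative only).
groups = {
    "A": [4.0, 5.0, 6.0],
    "B": [7.0, 8.0, 9.0],
    "C": [1.0, 2.0, 3.0],
}

all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Total sum of squares: every observation against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Between-treatment part: each group mean against the grand mean,
# weighted by the group size.
ss_between = sum(
    len(values) * (mean(values) - grand_mean) ** 2
    for values in groups.values()
)

# Within-treatment part: each observation against its own group mean.
ss_within = sum(
    (x - mean(values)) ** 2
    for values in groups.values()
    for x in values
)

print(ss_total, ss_between, ss_within)  # 60.0 54.0 6.0
```

With these numbers most of the total variability sits between the treatments (54 of 60), which is the situation in which ANOVA tends to find a difference.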
Comparing Means for Several Populations Example: Test 5 types of concrete for differences in moisture absorption. The 5 types of concrete are the five levels of the treatment. Within Variability – this seeks to quantify the variability in absorption for one particular type of concrete. Between Variability – this seeks to quantify the differences between the types of concrete. ANOVA seeks to answer the question “Are the differences between the 5 sample means what is expected purely from random variation alone?”
Definitions An experimental unit is an object, or subject, that produces a sample measurement. The experimental conditions that define the different populations in a completely randomized design are called treatments. Testing for differences in the treatments is equivalent to testing for differences in the population means.
Graphical demonstration: Employing two types of variability
Graphical demonstration: Employing two types of variability [Figure: two panels of samples plotted by treatment.] The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means. A small variability within the samples makes it easier to draw a conclusion about the population means.
Assumptions for ANOVA 1. The samples are independent and random.
– Selection of objects from any one population is unrelated to the selection of objects from any of the other populations. Selections are random (one individual has as much chance of being selected as another).
– Examples: different groups of people (no person in more than one group); different types of music; different concentrations of chemicals; different models of automobiles.
Assumptions for ANOVA 2. All populations are normal. 3. Each population has the same standard deviation, σ (which implies the same variance, σ²), but the values of the population standard deviations are not known before testing. 4. Each sample has a mean that can be calculated. This sample mean estimates the mean of its population.
Assumptions for ANOVA The following assumptions are required for a 1-way ANOVA: The k populations are independent. Each population is normally distributed. Each population has a common standard deviation, σ. Each population has a mean, μ i, for i = 1, 2, …, k. So we now are testing whether all the treatment means are equal. H 0 : μ 1 = μ 2 = … = μ k H a : At least two of the population means are not equal
Test Statistic If the null hypothesis is true, we expect the k sample means to have reasonably similar values. In other words, if the population means are equal, we would expect the variability among the sample means to be relatively small. Variability among the sample means is one of the things we will be testing for.
Test Statistic Even if the null hypothesis is true, we do not expect the sample means to be exactly the same, because there is a chance factor in our choice of sample experimental units. We need to take into account the variability due to chance among the sample means.
Test Statistic This method is called “analysis of variance”, or ANOVA, because we are comparing two sources of variance: the variance among the sample means and the variance expected by chance among the sample means when the null hypothesis is true.
Test Statistic Our test statistic is called F. F = (Variability among the sample means) / (Variability expected by chance)
Degrees of freedom Group df = k - 1, where k is the number of samples (groups). Total df = N - 1, where N is the total number of units in the experiment. Error df = Total df - Group df, or equivalently Error df = N - k.
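As a quick check on the bookkeeping, here is a minimal sketch that computes the three degrees of freedom from the group sizes, taking the group df to be the number of groups minus 1. The function name `anova_degrees_of_freedom` is a hypothetical helper, and the sizes passed in (four groups of 10 units) are just an example input.

```python
def anova_degrees_of_freedom(group_sizes):
    """Return (group_df, error_df, total_df) for a one-way ANOVA."""
    k = len(group_sizes)             # number of groups (treatments)
    n_total = sum(group_sizes)       # total number of units, N
    group_df = k - 1
    total_df = n_total - 1
    error_df = total_df - group_df   # same as N - k
    return group_df, error_df, total_df


# Example input: four groups with 10 units each.
print(anova_degrees_of_freedom([10, 10, 10, 10]))  # (3, 36, 39)
```

Note that the group and error degrees of freedom always add up to the total, just as the sums of squares do.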
Technology We will use Minitab (or StatCrunch or Excel) to do our calculations. A typical Minitab display is on the next slide.
ANOVA Table: Tensile Strength for 6 Machines Analysis of Variance for Tensile-Strength [Minitab output: ANOVA table with columns Source, DF, SS, MS, F, P and rows Machine, Error, Total.] SST = Sum of squares of treatment = SSMachine = 5.34 (sample mean variability), k = 6 machines. SSError = (variability due to chance). Notice how much larger the “chance” variability is than the other. There is little to no evidence that the machines differ in mean tensile strength. Look at that HUGE p-value!
Another Example A sociologist conducts an experiment to compare the mean grade-point averages of first-year college students associated with four socioeconomic groups. The sociologist defines the four categories of interest to be: Poor, Lower Middle Class, Upper Middle Class, and Well-to-do. The experimenter knows that the populations of grade-point averages are normally distributed with equal standard deviations. At the end of the school year, the sociologist selects independent random samples of 10 grade-point averages for first year students in each of the four socioeconomic groups. Do the data provide sufficient evidence to indicate a difference in mean grade-point averages for at least two of the four socioeconomic groups?
Socioeconomics and GPA Treatments = 4 socioeconomic groups Response variable = GPA H 0 : μ 1 = μ 2 = μ 3 = μ 4 H 1 : At least two of the population means are not equal Decision rule: Accept H 1 if the p-value < .05 Test statistic: F = (Variability among the sample means) / (Variability expected by chance)
Socioeconomics and GPA Variability among sample means = MST = SST / (k - 1) Variability due to chance = MSE = SSE / (N - k), where k is the number of groups and N is the total number of observations
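To make the two mean squares concrete, the following sketch computes MST, MSE, and F directly from raw samples, dividing SST by k - 1 and SSE by N - k (the treatment and error degrees of freedom). The function `one_way_anova` and the sample values are assumptions made up for illustration; they are not the GPA data, and in this course the calculation would normally be left to Minitab or StatCrunch.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (MST, MSE, F) for a list of samples, one list per treatment."""
    k = len(groups)                              # number of treatments
    n_total = sum(len(g) for g in groups)        # total observations, N
    grand_mean = mean([x for g in groups for x in g])

    # SST: variability among the sample means.
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # SSE: variability within the samples (due to chance).
    sse = sum((x - mean(g)) ** 2 for g in groups for x in g)

    mst = sst / (k - 1)          # treatment df = k - 1
    mse = sse / (n_total - k)    # error df = N - k
    return mst, mse, mst / mse


# Hypothetical data: three treatments, three observations each.
samples = [[4.0, 5.0, 6.0], [7.0, 8.0, 9.0], [1.0, 2.0, 3.0]]
mst, mse, f_stat = one_way_anova(samples)
print(mst, mse, f_stat)  # 27.0 1.0 27.0
```

A large F, as here, means the sample means vary far more than chance alone would explain; the p-value reported by the software turns that into a decision.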
One-way ANOVA: GPA versus Group [ANOVA table with columns Source, DF, SS, MS, F, P and rows Group, Error, Total.]
Socioeconomics and GPA F = P-value = .044 Since the p-value < .05, there is sufficient evidence to say that there is a difference in the mean grade-point averages for at least two of the socioeconomic groups. We reach this conclusion at the 0.05 level of significance. Since we accepted the alternative hypothesis, we now need to determine which means are different.
Socioeconomics and GPA We already have enough data to say that of the four groups, the Well-to-do have the highest mean GPA with 2.576, the Upper Middle is next with 2.717, followed by the Lower Middle with The Poor have the lowest mean GPA with But are these differences statistically significant?
Which means are different? We need to test each of the following pairs of hypotheses.
Pair 1: H 0 : μ 1 - μ 2 = 0 H a : μ 1 - μ 2 ≠ 0
Pair 2: H 0 : μ 1 - μ 3 = 0 H a : μ 1 - μ 3 ≠ 0
Pair 3: H 0 : μ 1 - μ 4 = 0 H a : μ 1 - μ 4 ≠ 0
Pair 4: H 0 : μ 2 - μ 3 = 0 H a : μ 2 - μ 3 ≠ 0
Pair 5: H 0 : μ 2 - μ 4 = 0 H a : μ 2 - μ 4 ≠ 0
Pair 6: H 0 : μ 3 - μ 4 = 0 H a : μ 3 - μ 4 ≠ 0
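The six pairs are simply every way of choosing 2 groups out of 4. A short sketch of that enumeration follows; the order in which the group names are listed (and so which group plays the role of μ 1, μ 2, and so on) is an assumption, since the slides do not say which index belongs to which group.

```python
from itertools import combinations

# Group names from the GPA example; the ordering is an assumption.
groups = ["Poor", "Lower Middle Class", "Upper Middle Class", "Well-to-do"]

# Every unordered pair of distinct groups: k * (k - 1) / 2 of them.
pairs = list(combinations(groups, 2))

for number, (first, second) in enumerate(pairs, start=1):
    print(f"Pair {number}: H0: mu({first}) - mu({second}) = 0, "
          f"Ha: mu({first}) - mu({second}) != 0")

print(len(pairs))  # 6
```

The number of pairs grows quickly with the number of groups (10 pairs for 5 groups, 15 for 6), which is one reason a multiple comparison procedure is used rather than many separate tests.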
Which means are different? To test each pair of hypotheses, we are only testing two means for a difference between them. This calls for the two-sample t-statistic that we used in Chapter 13 in MAT 212.
Which means are different? However, it takes less time to calculate the confidence intervals for each pair and use these to make our inferences. If a confidence interval contains only positive numbers, we may conclude that the first mean is larger than the second. If a confidence interval contains only negative numbers, we may conclude that the first mean is smaller than the second. If a confidence interval contains the number zero, there is insufficient evidence to conclude which mean is larger.
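The three rules above amount to checking the signs of the interval endpoints. A minimal sketch, assuming the interval is for (first mean - second mean); the function name `interpret_interval` and the endpoint values in the example calls are hypothetical.

```python
def interpret_interval(lower, upper):
    """Interpret a confidence interval for (first mean - second mean)."""
    if lower > 0:
        # Both endpoints positive: the whole interval is above zero.
        return "first mean is larger"
    if upper < 0:
        # Both endpoints negative: the whole interval is below zero.
        return "first mean is smaller"
    # Otherwise the interval contains zero.
    return "insufficient evidence of a difference"


# Hypothetical endpoints, for illustration only.
print(interpret_interval(0.12, 0.85))    # first mean is larger
print(interpret_interval(-0.90, -0.10))  # first mean is smaller
print(interpret_interval(-0.30, 0.45))   # insufficient evidence of a difference
```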
Which means are different? To do this, we use StatCrunch and Tukey’s Multiple Comparisons. (The notes in blue on the following slide are the conclusions drawn; they are not part of the StatCrunch output.)
Group = Lower Middle subtracted from:
  Group           (Lower, Upper)   Conclusion
  Poor            (-, +)           Not significant
  Upper Middle    (-, +)           Not significant
  Well-to-do      (-, +)           Not significant
Group = Poor subtracted from:
  Group           (Lower, Upper)   Conclusion
  Upper Middle    (+, +)           μ1 > μ2
  Well-to-do      (+, +)           μ1 > μ2
Group = Upper Middle subtracted from:
  Group           (Lower, Upper)   Conclusion
  Well-to-do      (-, +)           Not significant
Socioeconomics and GPA This shows that both Upper Middle Class and Well-to-do have higher mean GPA than Poor. There are no other statistically significant differences.
ANOVA – What is expected from you? Be able to complete each of the following exercises:
– State the two hypotheses.
– State the decision rule.
– What is the test statistic, and what is its formula?
– What is the observed value of this test statistic? Is this valid?
– What is the p-value?
– State a conclusion.
– If you accepted the alternative hypothesis, you then need to find out which means are different.
Another Example Is hair color related to pain sensitivity? To study this, an experimenter divides men and women of various ages into four hair color categories: light blond, dark blond, light brunette, and dark brunette. There are six people in each of the four categories. Each participant in the study receives a pain threshold score based upon his or her performance in a pain sensitivity test (the higher the score, the lower the person’s pain tolerance.)
Hair Color vs Pain Sensitivity The treatments are the hair colors. The response variable is the sensitivity-to-pain score. H 0 : μ 1 = μ 2 = μ 3 = μ 4 H 1 : At least two of the population means of the scores are not equal Decision rule: Accept H 1 if the p-value < .05 Test statistic: F = (Variability among the sample means) / (Variability expected by chance)
Hair Color vs Pain Sensitivity One-way ANOVA: Score versus Hair Color [ANOVA table with columns Source, DF, SS, MS, F, P and rows Hair Color, Error, Total.] H 0 : μ light_blond = μ dark_blond = … = μ dark_brunette H a : At least two population means are different. F = 5.44 p-value = At the .05 level of significance, there is overwhelming evidence to conclude that there is a difference among mean pain thresholds for people possessing these four hair colors.
Minitab One-way ANOVA: Score versus Hair Color [ANOVA table with columns Source, DF, SS, MS, F, P and rows Hair Color, Error, Total.] S = R-Sq = 44.95% R-Sq(adj) = 36.69%
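The R-Sq figure ties back to the F statistic. In a one-way ANOVA, R-Sq is the treatment sum of squares divided by the total sum of squares, so F can be recovered from R-Sq and the degrees of freedom. The sketch below assumes k = 4 hair colors and N = 24 participants (six per category, as described in the example); the function names are hypothetical helpers, and the formulas are the standard identities rather than anything taken from Minitab itself.

```python
def f_from_r_squared(r_squared, k, n_total):
    """F = [R^2 / (k - 1)] / [(1 - R^2) / (N - k)] for a one-way ANOVA."""
    return (r_squared / (k - 1)) / ((1 - r_squared) / (n_total - k))


def adjusted_r_squared(r_squared, k, n_total):
    """Adjusted R^2 = 1 - (1 - R^2) * (N - 1) / (N - k)."""
    return 1 - (1 - r_squared) * (n_total - 1) / (n_total - k)


# R-Sq = 44.95% from the output; 4 hair colors, 6 people in each.
f_value = f_from_r_squared(0.4495, k=4, n_total=24)
adj = adjusted_r_squared(0.4495, k=4, n_total=24)

print(round(f_value, 2))    # 5.44
print(round(adj * 100, 2))  # 36.69
```

Both results agree with the reported output: F = 5.44 and R-Sq(adj) = 36.69%.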
[Tukey comparison output; columns Hair Color, Lower, Upper.]
Hair Color = Dark Blond subtracted from: Dark Brunette, Light Blond, Light Brunette
Hair Color = Dark Brunette subtracted from: Light Blond, Light Brunette
Hair Color vs Pain Sensitivity Examine Minitab’s output to make the following table:
  Pair               From   To   Conclusion
  D Brun – D Blon                NS (No difference)
  L Blon – D Blon                L Blon > D Blon
  L Brun – D Blon                NS (No difference)
  L Blon – D Brun                L Blon > D Brun
  L Brun – D Brun                L Brun > D Brun
  L Brun – L Blon                NS (No difference)
Summarize the results.
When should we use the multiple comparison method? The sample data are obtained from the k populations using a completely randomized design. An analysis of variance F-test indicates that there are some differences among the k population means. The objective is to determine which of the k population means differ. It is usually of interest to determine which mean might be the largest (or smallest).