1
Comparing Many Group Means One Way Analysis of Variance
2
Data Situation
The data situation has k populations, and we wish to determine if there are any differences in the population means.
If there are differences, we need to describe them.
Each population is sampled so that n_1 to n_k observations are obtained.
3
Examples
An agricultural researcher wishes to know what color of bug trap catches the most bugs. Four colors are considered: yellow, white, green, blue.
An education researcher wishes to know if where a student sits in a classroom has any relationship to the grade the student will receive in the class. Three seating locations are considered: front, middle, and back.
4
Hypotheses
The null hypothesis is that all population means are equal. That is, no differences.
The alternative hypothesis is that not all of the population means are equal.
Note in this situation the Ha cannot be represented by using just symbols.
5
Hypotheses
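In standard notation, assuming mu_i denotes the mean of population i (a symbol the slide text itself does not define), the two hypotheses can be written as follows. The alternative stays in words because "not all equal" covers many different patterns of inequality.

```latex
H_0:\ \mu_1 = \mu_2 = \cdots = \mu_k
\qquad \text{versus} \qquad
H_a:\ \text{not all of } \mu_1, \ldots, \mu_k \text{ are equal}
```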
6
Test Statistic
The idea is that if the population means differ, the variation between the group sample means will be large relative to the variation within the groups.
The test statistic is based on this idea: it is the ratio of the variation between groups to the variation within groups.
This ratio has an F distribution under Ho.
7
F Distribution
The F distribution is a right-skewed distribution with minimum value zero, and can extend out to infinity.
The F distribution is indexed by two degrees-of-freedom values: the first is the numerator degrees of freedom, and the second is the denominator degrees of freedom.
8
Degrees of Freedom
The numerator degrees of freedom are the number of groups, k, minus one, giving k-1.
The denominator degrees of freedom are n-k, the total number of observations across all groups minus the number of groups.
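The counting rule can be checked with a few lines of Python. This is a minimal sketch; the function name `anova_degrees_of_freedom` and its list-of-sample-sizes input are illustrative assumptions, not something the presentation defines.

```python
def anova_degrees_of_freedom(group_sizes):
    """Return (numerator df, denominator df) for a one-way ANOVA."""
    k = len(group_sizes)   # number of groups
    n = sum(group_sizes)   # total number of observations
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and more observations than groups")
    return k - 1, n - k


# Four groups of six observations each: k = 4, n = 24.
print(anova_degrees_of_freedom([6, 6, 6, 6]))  # (3, 20)
```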
9
Hypothesis Test Formula
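In symbols, the standard one-way ANOVA test statistic can be written as below, with y_ij the j-th observation in group i, y-bar_i the sample mean of group i, y-bar the overall mean, and n the total number of observations. Weighting each squared deviation of a group mean by its group size n_i is the usual convention and is assumed here.

```latex
F \;=\;
\frac{\displaystyle \sum_{i=1}^{k} n_i \,(\bar{y}_i - \bar{y})^2 \,\Big/\, (k-1)}
     {\displaystyle \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_i)^2 \,\Big/\, (n-k)}
```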
10
Test Statistic
The test statistic formula looks complex, but is pretty easy to understand: the denominator measures the squared deviation of each observation y_ij from its group average y-bar_i. These squared deviations are summed over all observations within each group, and then over all groups. This gives the total squared variation within groups.
The numerator measures the squared deviation of each group sample mean y-bar_i from the overall average y-bar.
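The description above can be turned into a short computation. The sketch below is one possible implementation, not the presentation's own code: the name `one_way_f` is hypothetical, the data are invented for illustration, and each group mean's squared deviation is weighted by its group size, which is the standard form.

```python
def one_way_f(groups):
    """Compute the one-way ANOVA F statistic from raw observations.

    `groups` is a list of lists, one inner list of observations per group.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Numerator: squared deviations of the group means from the overall
    # mean, each weighted by its group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )

    # Denominator: squared deviations of each observation from its own
    # group mean, summed over all groups.
    ss_within = sum(
        (y - m) ** 2 for g, m in zip(groups, group_means) for y in g
    )

    ms_between = ss_between / (k - 1)   # numerator df = k - 1
    ms_within = ss_within / (n - k)     # denominator df = n - k
    return ms_between / ms_within


# Hypothetical data, invented for illustration only.
toy = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_f(toy))
```

For this toy data the between-group sum of squares is 26 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, so F = 13 / 1 = 13.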
11
P-Value
The p-value is found using an F-table with the appropriate degrees of freedom.
If the null hypothesis is not true, the test statistic F will tend to be large, and the probability of getting a value beyond a large positive F will be small.
This means that a small p-value implies evidence to doubt the Ho, just like it always does.
The test will always be one-sided, using the upper (positive) tail.
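A table lookup can also be replaced by a direct computation of the upper-tail area. The sketch below uses only the Python standard library and evaluates the regularized incomplete beta function by a continued fraction; all function names are illustrative assumptions, and in practice a statistics library would normally do this job. The F statistics and degrees of freedom in the usage lines are taken from the ANOVA tables later in this presentation.

```python
import math


def _beta_continued_fraction(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    if abs(d) < tiny:
        d = tiny
    d = 1.0 / d
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        if abs(d) < tiny:
            d = tiny
        c = 1.0 + aa / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log(1.0 - x)
    )
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where it converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


def f_upper_tail(f_stat, df_num, df_den):
    """Upper-tail p-value P(F > f_stat) for an F(df_num, df_den) distribution."""
    if f_stat <= 0.0:
        return 1.0
    x = df_den / (df_den + df_num * f_stat)
    return regularized_incomplete_beta(df_den / 2.0, df_num / 2.0, x)


print(f_upper_tail(3.428296, 2, 76))             # about 0.0375
print(f_upper_tail(30.551935, 3, 20) < 0.0001)   # True
```

When the numerator degrees of freedom equal 2, the tail area has the closed form (d / (d + 2F))^(d/2), with d the denominator degrees of freedom, which gives an independent check on the first result.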
12
F-Distribution
13
Color Boxplots
14
Trap Color Problem
15
Conclusion
The data are unlikely to occur if the Ho were true. The data are inconsistent with the Ho.
There is evidence to doubt the Ho.
There is evidence to support the Ha.
There is evidence that not all of the trap color means are equal. There are differences between the colors in how many bugs they catch.
16
Conclusion
The sample summary shows how the colors differ.
Sample means:
Treatment   n   Mean        Std. Error
Yellow      6   47.166668   2.7738862
White       6   15.666667   1.3581033
Green       6   31.5        4.047633
Blue        6   14.833333   2.181997
17
Conclusion
The means show that yellow is by far the best color at catching bugs.
The next best color is green.
18
ANOVA Table
ANOVA table:
Source       df   SS          MS          F-Stat      P-value
Treatments    3   4218.4585   1406.1528   30.551935   <0.0001
Error        20    920.5        46.025
Total        23   5138.9585
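The entries of this table can be rebuilt from the sample summary alone (group size, mean, and standard error for each trap color). The sketch below assumes each reported standard error is the group standard deviation divided by the square root of the group size, so the group variance is n_i times the squared standard error; the function name `anova_from_summary` is hypothetical.

```python
def anova_from_summary(summary):
    """Rebuild one-way ANOVA quantities from (n, mean, std_error) per group.

    Assumes std_error = s_i / sqrt(n_i), so that s_i**2 = n_i * std_error**2.
    """
    k = len(summary)
    n = sum(n_i for n_i, _, _ in summary)
    grand_mean = sum(n_i * mean for n_i, mean, _ in summary) / n

    # Treatment (between-group) sum of squares.
    ss_treat = sum(n_i * (mean - grand_mean) ** 2 for n_i, mean, _ in summary)
    # Error (within-group) sum of squares: (n_i - 1) * s_i**2 per group.
    ss_error = sum((n_i - 1) * n_i * se ** 2 for n_i, _, se in summary)

    df_treat, df_error = k - 1, n - k
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "df": (df_treat, df_error),
        "ss": (ss_treat, ss_error),
        "ms": (ms_treat, ms_error),
        "f": ms_treat / ms_error,
    }


# Trap color summary from the presentation: (n, mean, std. error).
traps = [
    (6, 47.166668, 2.7738862),  # yellow
    (6, 15.666667, 1.3581033),  # white
    (6, 31.5, 4.047633),        # green
    (6, 14.833333, 2.181997),   # blue
]
result = anova_from_summary(traps)
print(result["df"])           # (3, 20)
print(round(result["f"], 2))  # 30.55
```

Up to rounding in the reported summary values, the sums of squares come out near 4218.46 and 920.5, in agreement with the table.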
19
Means Plot
20
Row Seating Analysis
The response variable is grade in the course (grade points).
The grouping variable is seating location at three levels: front, middle, back.
The idea is that academic performance may be related to seating location in the classroom.
21
Row Seating Analysis
23
Sample means:
Treatment   n    Mean        Std. Error
Front       35   14.285714   0.9886511
Middle      25   11.12       1.3245376
Back        19   10.105263   1.4446812

ANOVA table:
Source       df   SS          MS          F-Stat     P-value
Treatments    2    264.3011   132.15054   3.428296   0.0375
Error        76   2929.5723    38.547005
Total        78   3193.8735
24
Conclusion
The data are unlikely to occur if the Ho is true. The data are inconsistent with the Ho.
There is evidence to doubt the Ho.
There is evidence to support the Ha.
There is evidence the group means differ by seating location. That is, the mean grades differ across seating locations.
25
Conclusion
The table of means and the plots show clearly that the front of the classroom has higher grades than the other locations.
From the graphs it is clear that the middle and the back of the room have similar grades.
Where do you sit?
26
Math Scores
27
Sample means:
Group      n    Mean         Std. Error
Computer   20    0.95        0.45581043
None       14    0.78571427  0.85278195
Piano      34    3.6176472   0.52396184
Singing    10   -0.3         0.47258157

ANOVA table:
Source       df   SS          MS          F-Stat     P-value
Treatments    3   184.10191   61.367302   8.418377   <0.0001
Error        74   539.4366     7.2896833
Total        77   723.53845
28
Math Scores