 # Two Groups Too Many? Try Analysis of Variance (ANOVA)

T-test: Compare two groups. Test the null hypothesis that two populations have the same average. ANOVA: Compare more than two groups. Test the null hypothesis that all of several populations have the same average. The test statistic for ANOVA is the F statistic (named for R. A. Fisher, the creator of the statistic).

Three types of ANOVA
- One-way ANOVA
- Within-subjects ANOVA (repeated measures, randomized complete block). Will not be covered in this class.
- Factorial ANOVA (two-way ANOVA)

One-way ANOVA example
Example: Curricula A, B, C. You want to know whether the population average score on the test of computer operations differs among children taught using Curricula A, B, and C. Null Hypothesis: The population averages are identical regardless of the curriculum used. Alternative Hypothesis: The population averages differ for at least one pair of populations.

Verbal Explanation of the Logic of Comparing Means
If two or more populations have identical averages, the averages of random samples selected from those populations ought to be fairly similar as well. Sample statistics vary from one sample to the next; however, large differences among the sample averages would cause us to question the hypothesis that the samples were selected from populations with identical averages. How much should the sample averages differ before we conclude that the null hypothesis of equal population averages should be rejected?

Logic of ANOVA
In the t-test, we calculated the “t statistic”. In ANOVA, we calculate the “F statistic”. Okay, we’ve dealt with this logic in the t-test, too. How is it different in ANOVA?

Logic of ANOVA
The F statistic is obtained by comparing “the variation among the sample averages” to “the variation among observations within each of the samples”.
F = (variation among the sample averages) / (variation among observations within each of the samples)
Only if the variation among the sample averages is substantially larger than the variation within the samples (in other words, only if the F statistic is substantially large) do we conclude that the populations must have had different averages.

Sources of Variation
Three sources of variation: 1) Total, 2) Between groups (“the variation among the sample averages”), 3) Within groups (“the variation among observations within each of the samples”).
Sum of Squares (SS): Reflects variation. Depends on sample size.
Degrees of freedom (df): The number of independent pieces of information behind each SS. It depends on the number of groups being compared and on the sample size.
Mean Square (MS): SS adjusted by df (SS/df). Mean squares can be compared with each other.

Computing F-statistic
- SS total: total variation in the data
- df total: total sample size (N) - 1
- MS total: SS total / df total
- SS between: variation among the groups compared
- df between: number of groups - 1
- MS between: SS between / df between
- SS within: variation among the scores within the same group
- df within: total sample size - number of groups
- MS within: SS within / df within
- F statistic = MS between / MS within
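The computation above can be sketched in a few lines of plain Python. This is a minimal illustration, not the procedure SPSS uses internally: the function name `one_way_anova` and the three small score lists are hypothetical, made up only so the arithmetic is easy to follow by hand. The within-groups df is taken as N minus the number of groups, which agrees with the SPSS output later in the transcript (474 - 3 = 471).

```python
def one_way_anova(groups):
    """Return SS, df, MS and F for a list of groups of scores (one-way ANOVA)."""
    all_scores = [x for g in groups for x in g]
    n_total = len(all_scores)          # N, total sample size
    k = len(groups)                    # number of groups
    grand_mean = sum(all_scores) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between groups: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within groups: how far each score is from its own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    # Total: how far each score is from the grand mean.
    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

    df_between = k - 1
    df_within = n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within, "ss_total": ss_total,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }


# Hypothetical test scores for three curricula (illustration only).
curriculum_a = [4, 5, 6]
curriculum_b = [6, 7, 8]
curriculum_c = [8, 9, 10]

result = one_way_anova([curriculum_a, curriculum_b, curriculum_c])
print(result)
```

With these made-up scores the group means are 5, 7 and 9, so SS between is 24, SS within is 6, and F = (24/2) / (6/6) = 12. SS between and SS within add up to SS total, which is a useful check on any hand calculation.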

Interpreting SPSS output
Univariate Analysis of Variance

Interpreting SPSS output
Tests of Between-Subjects Effects
Dependent Variable: Current Salary

| Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared |
|---|---|---|---|---|---|---|
| Corrected Model | 8.944E+10 (a) | 2 | 4.472E+10 | | .000 | .648 |
| Intercept | 2.915E+11 | 1 | 2.915E+11 | | .000 | .857 |
| JOBCAT | 8.944E+10 | 2 | 4.472E+10 | | .000 | .648 |
| Error | 4.848E+10 | 471 | | | | |
| Total | 6.995E+11 | 474 | | | | |
| Corrected Total | 1.379E+11 | 473 | | | | |

a. R Squared = .648 (Adjusted R Squared = .647)
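The F column and the Error mean square did not survive in this transcript of the table, but the surviving entries are enough to see how the columns relate. The sketch below recomputes the mean squares, the F ratio and partial eta squared for JOBCAT from the printed sums of squares and df. The variable names are illustrative, and because the printed values are rounded to four significant digits the results are approximations, not the exact figures SPSS reported.

```python
# Values as printed in the SPSS table (rounded to four significant digits).
ss_jobcat = 8.944e10   # Type III sum of squares for JOBCAT (between groups)
ss_error = 4.848e10    # sum of squares for Error (within groups)
df_jobcat = 2          # number of groups - 1
df_error = 471         # total sample size - number of groups

# Mean square = SS / df
ms_jobcat = ss_jobcat / df_jobcat
ms_error = ss_error / df_error

# F = MS between / MS within
f_stat = ms_jobcat / ms_error

# Partial eta squared = SS effect / (SS effect + SS error)
partial_eta_sq = ss_jobcat / (ss_jobcat + ss_error)

print(f"MS JOBCAT = {ms_jobcat:.4g}")            # matches the printed 4.472E+10
print(f"MS Error  = {ms_error:.4g}")
print(f"F         = {f_stat:.1f}")               # about 434.5 with these rounded inputs
print(f"partial eta squared = {partial_eta_sq:.3f}")  # matches the printed .648
```

Recovering the printed Mean Square (4.472E+10) and Partial Eta Squared (.648) from the other columns is a quick way to confirm that the table is being read correctly.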

Interpreting Significance
The probability of observing an F statistic at least this large by chance, if the null hypothesis were true, is less than .05. Therefore, we infer that the differences observed in the sample reflect differences in the population. Reject the null hypothesis that there are no differences among the population means, and accept the research hypothesis that at least one pair of population means differs.

Writing up the result
“A one-way ANOVA was conducted to evaluate the relationship between salary and job category. The result of the one-way ANOVA was significant, F(2, 471) = , p < .001, partial η² = .65, which indicated that at least one pair of job categories differed significantly in mean salary.”
Report the “descriptive statistics” after this. If you are not doing a follow-up test, describe and summarize the general conclusions of the analysis.

Follow-up test
But we don’t know which pairs are significantly different from each other! Conduct a “follow-up test” to see specifically which means differ from which other means. Instead of repeating a t-test for each combination (which can lead to alpha inflation), use one of the modified versions of the t-test that adjust for alpha inflation. Most recommended: Tukey HSD test (when equal variances are assumed), Dunnett’s C test (when equal variances are not assumed). Other popular tests: Bonferroni test, Scheffe test.

What’s Alpha Inflation?
Conducting multiple tests incurs a large risk that at least one of them will be statistically significant just by chance (Type I error). Example: 2 tests, each at an alpha (= probability of a Type I error) of .05. The probability of no Type I error in a single test is .95, so across both tests it is .95 × .95 = .9025. The probability of at least one Type I error is 1 − .9025 = .0975, close to 10%. Therefore, when you repeat the same test several times, use a more stringent criterion, e.g. .001.
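The arithmetic in this example generalizes to any number of tests. The sketch below is a minimal illustration with hypothetical function names; it assumes the tests are independent, which is only an approximation for pairwise comparisons that share groups. It also shows the Bonferroni adjustment, which divides alpha by the number of tests.

```python
from math import comb


def familywise_error(alpha, n_tests):
    """Probability of at least one Type I error across n independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha that keeps the familywise error rate at or below alpha."""
    return alpha / n_tests


# The slide's example: 2 tests at alpha = .05.
two_tests = familywise_error(0.05, 2)

# Three groups (e.g. three job categories) give 3 pairwise comparisons.
n_pairs = comb(3, 2)
three_pairs = familywise_error(0.05, n_pairs)
adjusted = bonferroni_alpha(0.05, n_pairs)

print(f"2 tests:  {two_tests:.4f}")    # 1 - .95 * .95
print(f"{n_pairs} tests:  {three_pairs:.4f}")
print(f"Bonferroni per-test alpha for {n_pairs} tests: {adjusted:.4f}")
```

With three groups there are already three pairwise comparisons, so the chance of at least one false positive rises to roughly 14% if each comparison is run at .05.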

Interpreting SPSS output
Post Hoc Tests Employment Category

Writing up the result of the follow-up test
“A follow-up test was conducted to determine which job categories differed from the others. Because Levene’s test indicated that equal variances could not be assumed across the groups, Dunnett’s C test was used for the follow-up test in order to control for Type I error across the pairwise comparisons. The result of the follow-up test indicated that the mean salaries of all three job categories differ significantly from each other.”

Relation between t-test and F-test
When two groups are compared, the t-test and the F-test lead to the same answer: t² = F. So squaring t gives F (equivalently, the absolute value of t is the square root of F).
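The relation t² = F can be checked numerically. The sketch below is a minimal illustration with hypothetical data and function names; it uses the pooled-variance (equal variances assumed) independent-samples t statistic, which is the version of the t-test for which the identity holds exactly.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def sum_sq_dev(xs):
    """Sum of squared deviations of the scores from their own mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


def pooled_t(a, b):
    """Independent-samples t statistic with pooled variance."""
    pooled_var = (sum_sq_dev(a) + sum_sq_dev(b)) / (len(a) + len(b) - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / len(a) + 1 / len(b)))


def f_two_groups(a, b):
    """One-way ANOVA F statistic for exactly two groups."""
    grand = mean(a + b)
    ss_between = (len(a) * (mean(a) - grand) ** 2
                  + len(b) * (mean(b) - grand) ** 2)
    ss_within = sum_sq_dev(a) + sum_sq_dev(b)
    df_between = 1                      # number of groups - 1
    df_within = len(a) + len(b) - 2     # total sample size - number of groups
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores for two groups (illustration only).
group_1 = [4, 5, 6, 7]
group_2 = [6, 8, 9, 9]

t = pooled_t(group_1, group_2)
f = f_two_groups(group_1, group_2)
print(f"t = {t:.4f}, t squared = {t ** 2:.4f}, F = {f:.4f}")
```

The sign of t is lost when it is squared, so F shows that the two means differ but not which one is larger.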

Formula for Sum of Squares in ANOVA
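The formulas from this slide did not survive in the transcript. For reference, the standard textbook definitions are given below. The notation is an assumption and may differ from the original slide: k is the number of groups, n_j the size of group j, X_ij the i-th score in group j, X̄_j the mean of group j, and X̄ the grand mean of all scores.

```latex
\begin{aligned}
SS_{\text{total}}   &= \sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(X_{ij}-\bar{X}\right)^{2} \\
SS_{\text{between}} &= \sum_{j=1}^{k} n_j\left(\bar{X}_j-\bar{X}\right)^{2} \\
SS_{\text{within}}  &= \sum_{j=1}^{k}\sum_{i=1}^{n_j}\left(X_{ij}-\bar{X}_j\right)^{2} \\
SS_{\text{total}}   &= SS_{\text{between}} + SS_{\text{within}}
\end{aligned}
```

The last line is the partition that makes the F ratio possible: the total variation splits exactly into a between-groups part and a within-groups part.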