Chapter 17 Comparing Multiple Population Means: One-factor ANOVA
What if we have more than 2 conditions/groups?
Interest: the effects of 3 drugs on depression (Prozac, Zoloft, and Elavil)
–Select 24 people with depression and randomly (and blindly) assign each to one of four conditions: 1) Prozac, 2) Zoloft, 3) Elavil, and 4) Placebo
–After 1 month of drug therapy, we measure depression
Research Design and Data
Prozac | Zoloft | Elavil | Placebo
Multiple t-tests?
Differences between drugs?
–Prozac vs. Zoloft
–Prozac vs. Elavil
–Prozac vs. Placebo
–Zoloft vs. Elavil
–Zoloft vs. Placebo
–Elavil vs. Placebo
6 separate t-tests
Probability Theory (Revisited)
–The probability of making a correct decision when the null is true is 1 - α (generally .95)
–Treat each test as independent
–The probability of making the correct decision across all 6 tests is then the product of those probabilities: (.95)(.95)(.95)(.95)(.95)(.95) = .735
Type 1 error & multiple t-tests
–Thus, the probability of at least one Type 1 error is not α, but $1 - (1 - \alpha)^C$, where C is the number of comparisons
–Or, in the present case, $1 - (.95)^6 = 1 - .735 = .265$
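The inflation of the Type 1 error rate can be checked numerically. This is a minimal sketch that assumes the tests are independent, as the slide does; the function name `familywise_error_rate` is made up for illustration.

```python
from math import comb


def familywise_error_rate(alpha: float, comparisons: int) -> float:
    """Probability of at least one Type 1 error across independent tests."""
    return 1 - (1 - alpha) ** comparisons


groups = 4                        # Prozac, Zoloft, Elavil, Placebo
comparisons = comb(groups, 2)     # number of distinct pairs of groups
rate = familywise_error_rate(0.05, comparisons)
print(comparisons, round(rate, 3))  # 6 0.265
```

With only one comparison the function returns α itself, so the inflation comes entirely from repeating the test.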
t statistic as a ratio
$$t = \frac{\text{obtained difference}}{\text{difference expected by chance (“error”)}}$$
–Denominator: easy, pool the variance
–Numerator: hmmm…
Differences in the t test
–$M_1 - M_2$, or $M_D$
–Can we subtract multiple means from one another?
  $M_1 - M_2 - M_3 - M_4 = ????$
  $M_4 - M_1 - M_2 - M_3 = ????$
–Is there another statistic that tells us how much things differ from one another?
What statistic describes how scores differ from one another?
–Variance
How does a set of means differ from one another?
–Answer: variance between means/groups
t statistic as a ratio
$$t = \frac{\text{obtained difference}}{\text{difference expected by chance (“error”)}}$$
$$t = \frac{\text{variance between means/groups}}{\text{pooled variance}}$$
F statistic
$$F = \frac{\text{between-groups variance estimate}}{\text{within-groups variance estimate}}$$
$$F = \frac{\text{Mean-square Treatment (MST or MSB)}}{\text{Mean-square Error (MSE or MSW)}} = \frac{s_B^2}{s_W^2}$$
ANOVA
–Analysis of Variance, or ANOVA, allows us to compare multiple group means without compromising α
–Even though an ANOVA uses variances and the F statistic, it tests hypotheses about means
F statistic
Between-groups variance (MST or MSB) is based on the variability between the groups
Within-groups variance (MSE or MSW) is a measure of the variability within the groups
–if there is no difference between these 2 measures of variability (because there are no differences between groups), F will be close to 1
–if there is greater variability between groups (because of differences between groups), F will be greater than 1
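Between-groups variance (MST, MSB, or $s_B^2$)
$$MSB = s_B^2 = \frac{SSB}{df_B} = \frac{\sum_{i=1}^{k} n_i (M_i - M_G)^2}{k - 1}$$
where k is the number of groups, $n_i$ is the size of the i th group, $M_i$ is the mean of the i th group, and $M_G$ is the grand mean (the mean of all scores)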
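Within-groups variance (MSE, MSW, or $s_W^2$)
$$MSW = s_W^2 = \frac{SSW}{df_W} = \frac{SS(x_1) + SS(x_2) + \cdots + SS(x_k)}{N - k}$$
where k is the number of groups and N is the total number of scores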
SST (Sums of Squares Total)
$$SST = \sum x^2 - \frac{(\sum x)^2}{N}$$
The sums of squares total can be used either as a check (SST = SSB + SSW), or to calculate SSW (SSW = SST - SSB)
An ANOVA Table
The results of an ANOVA are often presented in a table:
Source    SS    df    MS    F
Between
Within
Total
Procedure for Completing an ANOVA
1. Arrange data by group
2. Compute for each group (k groups): $\sum x$, $\sum x^2$, $M$, $SS(x)$, $n$
3. Compute the grand mean ($M_G$) by adding all the scores and dividing by N: $M_G = \sum x / N$
4. Compute $SSB = \sum n_i (M_i - M_G)^2$
5. Compute $SSW = SS(x_1) + SS(x_2) + \cdots + SS(x_k)$
6. Compute $SST = \sum x^2 - (\sum x)^2 / N$
7. Compute df: $df_B = k - 1$, $df_W = N - k$, $df_T = N - 1$
8. Fill in the ANOVA table
9. Compute MS (SS/df)
10. Compute F = MSB/MSW
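The ten steps can be sketched in a few lines of plain Python. The function name `one_way_anova` and the `demo` scores are invented for illustration; they are not the drug data from this chapter.

```python
def one_way_anova(groups):
    """Follow the procedure: sums of squares, df, mean squares, and F."""
    k = len(groups)
    scores = [x for g in groups for x in g]
    N = len(scores)
    grand_mean = sum(scores) / N                      # step 3

    means = [sum(g) / len(g) for g in groups]         # step 2
    ssb = sum(len(g) * (m - grand_mean) ** 2          # step 4
              for g, m in zip(groups, means))
    ssw = sum((x - m) ** 2                            # step 5
              for g, m in zip(groups, means) for x in g)
    sst = sum(x * x for x in scores) - sum(scores) ** 2 / N   # step 6

    df_b, df_w, df_t = k - 1, N - k, N - 1            # step 7
    msb, msw = ssb / df_b, ssw / df_w                 # step 9
    return {"SSB": ssb, "SSW": ssw, "SST": sst,
            "dfB": df_b, "dfW": df_w, "dfT": df_t,
            "MSB": msb, "MSW": msw, "F": msb / msw}   # step 10


# Hypothetical scores for three groups (illustration only).
demo = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
table = one_way_anova(demo)
print(table)  # SSB is about 26, SSW = 6, SST = 32, F is about 13
```

SST is computed independently here, so comparing it with SSB + SSW works as the check described in the procedure.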
1. ANOVA Calculations
Prozac | Zoloft | Elavil | Placebo
2. ANOVA Calculations
Prozac (Group 1)
Scores (partial): 10, 8, 9, 6
$\sum X_1 = 60$, $M_1 = 10.00$, $SS(X_1) = 50$, $n_1 = 6$
2. ANOVA Calculations
Zoloft (Group 2)
Scores (partial): 14, 12, 13, 17
$\sum X_2 = 90$, $M_2 = 15.00$, $SS(X_2) = 28$, $n_2 = 6$
2. ANOVA Calculations
Elavil (Group 3)
Scores (partial): 19, 18, 20
$SS(X_3) = 28$, $n_3 = 6$
2. ANOVA Calculations
Placebo (Group 4)
Scores (partial): 21, 22
$n_4 = 6$
3. ANOVA Calculations
$M_G = \sum x / N = (\sum X_1 + \sum X_2 + \sum X_3 + \sum X_4) / (n_1 + n_2 + n_3 + n_4) = 380/24 = 15.83$
4. ANOVA Calculations
$SSB = \sum n_i (M_i - M_G)^2$
= 6(34.03) + 6(.69) + 6(1.36) + 6(30.25)
= 204.18 + 4.14 + 8.16 + 181.50
= 397.98
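The SSB arithmetic can be verified directly from the squared deviations shown in step 4. This is only a check of the sums; the variable names are made up for illustration, and the squared deviations are the rounded values from the slide.

```python
n_per_group = 6
# Squared deviations (M_i - M_G)^2 as printed in step 4, already rounded.
squared_deviations = [34.03, 0.69, 1.36, 30.25]

grand_mean = 380 / 24
ssb = sum(n_per_group * d for d in squared_deviations)
print(round(grand_mean, 2), round(ssb, 2))  # 15.83 397.98
```

Because the squared deviations are rounded to two decimals, the total carries a small rounding error.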
5. ANOVA Calculations
$SSW = SS(X_1) + SS(X_2) + \cdots + SS(X_k)$
6. ANOVA Calculations
$SST = \sum X^2 - (\sum X)^2 / N = \sum X^2 - (380)^2 / 24 = \sum X^2 - 6016.67$
Check
SST = SSB + SSW
7. ANOVA Calculations
$df_B = k - 1 = 4 - 1 = 3$
$df_W = N - k = 24 - 4 = 20$
$df_T = N - 1 = 24 - 1 = 23$
8. ANOVA Calculations
Source    SS    df    MS    F
Between         3
Within          20
Total           23
Hypothesis test of Anti-depressants
1. State and Check Assumptions
–About the population
  Normally distributed? – don’t know
  Homogeneity of variance – we’ll check
–About the sample
  Independent random sample? – yes
  Independent samples
–About the data
  Interval level
Hypothesis test of Anti-depressants
2. Hypotheses
$H_O: \mu_{\text{Prozac}} = \mu_{\text{Zoloft}} = \mu_{\text{Elavil}} = \mu_{\text{Placebo}}$
$H_A$: the null is wrong
That’s an Odd $H_A$
You might think that the alternative hypothesis should look like this:
$H_A: \mu_{\text{Prozac}} \neq \mu_{\text{Zoloft}} \neq \mu_{\text{Elavil}} \neq \mu_{\text{Placebo}}$
Accepting this alternative indicates that all of the means are unequal, which is not what ANOVA determines
What does ANOVA determine?
–That at least one of the means is different from at least one other mean
–Since that is a difficult statement to write, we say “the null is wrong”
Hypothesis test of Anti-depressants
3. Choose test statistic
–4 groups
–independent samples
One-factor ANOVA
Hypothesis test of Anti-depressants
4. Set Significance Level
α = .05
Critical Value
–Non-directional hypothesis with $df_B = k - 1$ and $df_W = N - k$
–$df_B = 4 - 1 = 3$ and $df_W = 24 - 4 = 20$
–From Table D, $F_{crit}(3, 20) = 3.10$, so we reject $H_O$ if F ≥ 3.10
Hypothesis test of Anti-depressants
5. Compute Statistic
Source    SS    df    MS    F
Between
Within
Total
Hypothesis test of Anti-depressants
6. Draw Conclusions
–because our F falls within the rejection region, we reject $H_O$, and
–conclude that at least one treatment condition differs from at least one other in its effect on depression
Violations of Assumptions
As with t-tests, ANOVA is fairly robust to violations of normality and homogeneity of variance. If there are severe violations of these assumptions, use a Kruskal-Wallis H test (a non-parametric alternative).
Procedure for completing a Kruskal-Wallis H test
1. Arrange data in columns, 1 group per column, skipping columns between groups
2. Rank all the scores together, assigning the lowest rank (1) to the lowest score (put ranks in the column next to the raw scores)
3. Sum the ranks in each column ($T_j$)
4. Square the sum of the ranks of each column ($T_j^2$)
5. Compute SSB
6. Compute H
7. Compute df = k - 1
8. H is distributed as a $\chi^2$: look up the critical value in a $\chi^2$ (chi-square) table with the appropriate df
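The Kruskal-Wallis steps can be sketched as follows. The formulas for SSB and H are not shown above, so this sketch assumes the standard rank-based forms, $SSB = \sum T_j^2 / n_j - N(N+1)^2/4$ and $H = 12 \cdot SSB / (N(N+1))$. It also assumes tied scores receive the average of their ranks and applies no tie correction. The function names and the `demo` scores are invented for illustration.

```python
def average_ranks(values):
    """Rank from 1 (lowest); tied values share the average of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]


def kruskal_wallis_h(groups):
    """Return (H, df) using rank sums T_j, assuming no tie correction."""
    pooled = [x for g in groups for x in g]
    N = len(pooled)
    ranks = average_ranks(pooled)          # step 2: rank all scores together

    rank_sums, start = [], 0
    for g in groups:                       # step 3: sum the ranks per group
        rank_sums.append(sum(ranks[start:start + len(g)]))
        start += len(g)

    # Steps 4-5: squared rank sums, then SSB computed on the ranks.
    ssb = sum(t * t / len(g) for t, g in zip(rank_sums, groups)) \
        - N * (N + 1) ** 2 / 4
    h = 12 * ssb / (N * (N + 1))           # step 6
    return h, len(groups) - 1              # step 7: df = k - 1


# Hypothetical scores for three groups (illustration only).
demo = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
h, df = kruskal_wallis_h(demo)
print(round(h, 2), df)  # 7.2 2
```

The resulting H is compared with the chi-square critical value for df = k - 1.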
Dependent Samples (more than 2 conditions)
Experiments are often conducted comparing more than 2 conditions
–ANOVA
–Kruskal-Wallis H
Samples are often related: “dependent samples” (within-subjects, repeated measures, etc.)
Dependent Samples ANOVA
SS(T) = SS(B) + SS(Bl) + SS(E)
–Calculate SS(T), SS(B), and SS(Bl)
–SS(E) = SS(T) - SS(B) - SS(Bl)
Why “Blocks”?
–A dependent samples ANOVA is sometimes referred to as a “Randomized-Block” design
–Each group of related measurements, either within-subject or with matching, is a “Block” of measurements
SS(Bl)
Sum of Squares Blocks: the sum of the squared deviations of each block mean from the grand mean
$SS(Bl) = \sum k (M_{Bl} - M_G)^2$, or
$SS(Bl) = \sum Bl^2 / k - N M_G^2$,
where Bl = the sum of the scores in a block, $M_{Bl}$ = the mean of a block (Bl/k), and k = the number of conditions
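The two forms of SS(Bl) give the same value, which a short check makes concrete. The `blocks` scores are invented for illustration: each inner list is one block (for example, one subject) measured under k = 3 conditions.

```python
# Hypothetical scores: rows are blocks, positions are conditions.
blocks = [
    [1, 2, 4],
    [2, 3, 4],
    [3, 5, 6],
]
k = len(blocks[0])                 # number of conditions
N = k * len(blocks)                # total number of scores
grand_mean = sum(sum(b) for b in blocks) / N

# Definitional form: squared deviation of each block mean from the grand mean.
ss_bl_definitional = sum(k * (sum(b) / k - grand_mean) ** 2 for b in blocks)

# Computational form: squared block totals divided by k, minus N * M_G^2.
ss_bl_computational = sum(sum(b) ** 2 for b in blocks) / k - N * grand_mean ** 2

print(round(ss_bl_definitional, 4), round(ss_bl_computational, 4))  # 8.6667 8.6667
```

The computational form is the one used in the hand-calculation procedure because it needs only the block totals.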
Procedure for Completing a Dependent Samples ANOVA
1. Arrange data where columns are conditions and rows are blocks (subjects or matched subjects)
2. Compute for each column (condition): $n$, $\sum X$, $\sum X^2$, $M$, $SS(X)$, $s^2$
3. Total the scores in the rows in a new column to the right (Block Totals)
4. Square the block totals in the next column
5. Compute the grand mean ($M_G$) by adding all the scores and dividing by N: $M_G = \sum X / N$
6. Compute $SS(B) = \sum n_i (M_i - M_G)^2$
7. Compute $SS(T) = \sum X^2 - N M_G^2$
8. Compute $SS(Bl) = \sum Bl^2 / k - N M_G^2$
9. Compute SS(E) = SS(T) - SS(B) - SS(Bl)
10. Compute df: $df_B = k - 1$, $df_{Bl} = n - 1$, $df_E = (N - k) - (n - 1)$, $df_T = N - 1$
11. Fill in the ANOVA table
12. Compute MS (SS/df)
13. Compute F = MSB/MSE
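The dependent samples procedure can be sketched the same way. The function name `dependent_samples_anova` and the `demo` scores are invented for illustration; rows are blocks and columns are conditions, as in step 1.

```python
def dependent_samples_anova(rows):
    """Randomized-block ANOVA: rows are blocks, columns are conditions."""
    n, k = len(rows), len(rows[0])
    N = n * k
    scores = [x for r in rows for x in r]
    grand_mean = sum(scores) / N                               # step 5
    correction = N * grand_mean ** 2                           # N * M_G^2

    col_means = [sum(r[j] for r in rows) / n for j in range(k)]
    ss_b = sum(n * (m - grand_mean) ** 2 for m in col_means)   # step 6
    ss_t = sum(x * x for x in scores) - correction             # step 7
    ss_bl = sum(sum(r) ** 2 for r in rows) / k - correction    # step 8
    ss_e = ss_t - ss_b - ss_bl                                 # step 9

    df_b, df_bl, df_e = k - 1, n - 1, (N - k) - (n - 1)        # step 10
    ms_b, ms_e = ss_b / df_b, ss_e / df_e                      # step 12
    return {"SSB": ss_b, "SSBl": ss_bl, "SSE": ss_e, "SST": ss_t,
            "dfB": df_b, "dfBl": df_bl, "dfE": df_e,
            "F": ms_b / ms_e}                                  # step 13


# Hypothetical scores: 3 blocks measured under 3 conditions.
demo = [[1, 2, 4], [2, 3, 4], [3, 5, 6]]
result = dependent_samples_anova(demo)
print(result)  # SST = 20, F is about 32
```

Removing SS(Bl) from the error term is what distinguishes this design from the independent samples ANOVA: consistent differences between blocks no longer count as error.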
Dependent Samples ANOVA table
Source    SS    df    MS    F
Between
Blocks
Error
Total
Example
A researcher is interested in the effects of three new sleep-aids: Sleep E-Z, Zonked, and NockOut. He selects 5 subjects, and each subject takes all 3 drugs in a random order. The number of hours slept per night on each of the new sleep-aids is recorded.
Data
Subject | Sleep E-Z | Zonked | NockOut
Hypothesis Test – Sleep aids
1. State and Check Assumptions
–Population
  Normally distributed – not sure, assume for the time being
  Homogeneity of variance – not sure, but we’ll check the sample variances
–Sample
  Dependent samples
  Random assignment
–Data
  Interval/Ratio
Hypothesis Test – Sleep aids
2. State Null and Alternative Hypotheses
$H_O: \mu_1 = \mu_2 = \mu_3$ (the population means are equal)
$H_A$: $H_O$ is wrong (at least one of the means differs; we can’t say “$\mu_1 \neq \mu_2 \neq \mu_3$” because this means “all the means differ from one another”)
Hypothesis Test – Sleep aids
3. Choose Test Statistic
–Parameter of interest – means
–Number of groups – 3
–One factor (or IV being manipulated)
–Dependent samples
One-factor ANOVA for Dependent Samples (F)
Hypothesis Test – Sleep aids
4. Set Significance Level
α = .05
F = MSB/MSE, $df_B = k - 1$, $df_E = (N - k) - (n - 1)$, where N = total number of observations, k = number of groups/conditions, n = number of subjects/blocks
$df_B = 3 - 1 = 2$, $df_E = (15 - 3) - (5 - 1) = 8$
$F_{crit}(2, 8) = 4.46$
If our F ≥ 4.46, we reject $H_O$
Hypothesis Test – Sleep aids 5. Compute test Statistic
Computations
Columns: Sub | S E-Z | Z | NO | Bl | Bl²
Summary rows for each condition: n = 5, 5, 5; ΣX; ΣX²; M; SS(X); s²
ΣBl² = 1815
The sample variances are similar, so homogeneity of variance is OK
Computations
$M_G = \sum X / N = 95/15 = 6.333$
$SS(B) = \sum n_i (M_i - M_G)^2 = 12.13$
$SS(T) = \sum X^2 - (\sum X)^2 / N = 627 - (95)^2/15 = 627 - 601.67 = 25.33$
$SS(Bl) = \sum Bl^2 / k - N M_G^2 = 1815/3 - 15(6.333)^2 = 605 - 601.67 = 3.33$
$SS(E) = SS(T) - SS(B) - SS(Bl) = 25.33 - 12.13 - 3.33 = 9.87$
$df_B = k - 1 = 3 - 1 = 2$
$df_{Bl} = n - 1 = 5 - 1 = 4$
$df_E = (N - k) - (n - 1) = (15 - 3) - (5 - 1) = 12 - 4 = 8$
$df_T = N - 1 = 15 - 1 = 14$
Computations
Source    SS       df    MS      F
Between   12.13    2     6.07    4.92
Blocks    3.33     4     0.83
Error     9.87     8     1.23
Total     25.33    14
Hypothesis Test
6. Draw Conclusions
–Since our F > 4.46, we reject $H_O$ and accept $H_A$
–We conclude that at least one of the medications resulted in more sleep than at least one of the others
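The F ratio behind this conclusion can be recomputed from the sums of squares reported in the computations: SS(T) = 25.33, SS(Bl) = 3.33, and SS(E) = 9.87. This is a sketch of the arithmetic only, with made-up variable names; SS(B) is recovered by subtraction.

```python
# Values reported in the sleep-aid computations.
ss_t, ss_bl, ss_e = 25.33, 3.33, 9.87
k, n = 3, 5                      # conditions, blocks (subjects)
N = k * n

ss_b = ss_t - ss_bl - ss_e       # SS(T) = SS(B) + SS(Bl) + SS(E)
df_b = k - 1
df_e = (N - k) - (n - 1)

ms_b = ss_b / df_b
ms_e = ss_e / df_e
f = ms_b / ms_e

f_crit = 4.46                    # F_crit(2, 8) at alpha = .05, from the slide
print(round(ss_b, 2), round(f, 2), f >= f_crit)  # 12.13 4.92 True
```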
Dependent samples ANOVA
What if we violate one of the assumptions?
Friedman test
–means (or distributions) are of interest
–more than 2 groups/conditions
–dependent samples
–concerns about normality, homogeneity of variance, etc.
Friedman $F_r$
1. Arrange data in columns, 1 group/condition per column (conditions = columns = k)
2. Place correlated measures (matched, repeated, etc.) across conditions in the same rows (n rows)
3. Rank the scores in each row from 1 to k, assigning the lowest rank (1) to the lowest score (put ranks in the column next to the raw scores)
4. Sum the ranks of each column ($T_k$)
5. Compute the mean of the Ts, $\bar{T}$
6. Compute S
7. Compute the Friedman test statistic $F_r$
8. Compute df = k - 1
9. Look up the critical value in a $\chi^2$ table or use Excel to find p
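The Friedman steps can be sketched in plain Python. The formulas for S and $F_r$ are not shown above, so this sketch assumes the standard forms, $S = \sum (T_j - \bar{T})^2$ and $F_r = 12S / (nk(k+1))$. It also assumes tied scores within a row share the average rank and applies no tie correction. The function names and `demo` scores are invented for illustration.

```python
def average_ranks(values):
    """Rank from 1 (lowest); tied values share the average of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]


def friedman_statistic(rows):
    """Return (F_r, df); rows are blocks, columns are conditions."""
    n, k = len(rows), len(rows[0])

    rank_sums = [0.0] * k
    for row in rows:                        # step 3: rank within each row
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r               # step 4: column rank sums T_j

    t_bar = sum(rank_sums) / k              # step 5: mean of the Ts
    s = sum((t - t_bar) ** 2 for t in rank_sums)    # step 6
    f_r = 12 * s / (n * k * (k + 1))        # step 7
    return f_r, k - 1                       # step 8


# Hypothetical scores: 4 blocks measured under 3 conditions.
demo = [[1, 2, 3], [1, 2, 3], [1, 2, 3], [1, 3, 2]]
f_r, df = friedman_statistic(demo)
print(f_r, df)  # 6.5 2
```

As in step 9, the statistic is then compared with the chi-square critical value for df = k - 1.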