Two Factor ANOVA Copyright (c) 2008 by The McGraw-Hill Companies. This material is intended solely for educational purposes by licensed users of LearningStats.


1 Two Factor ANOVA Copyright (c) 2008 by The McGraw-Hill Companies. This material is intended solely for educational purposes by licensed users of LearningStats. It may not be copied or resold for profit.

2 When Do We Need Two-Factor ANOVA?
When we believe that the response variable Y is affected by more than one factor.
Y = response variable
A = first factor
B = second factor
Model: Y = f(A, B)

3 Two-Factor ANOVA (Randomized Block)
Linear Model Form: $Y_{jk} = \mu + \tau_j + \phi_k + \varepsilon_{jk}$
Hypotheses
H0: $\tau_j = 0$ (no treatment effect exists)
H1: $\tau_j \neq 0$ (treatment effect exists)
H0: $\phi_k = 0$ (no block effect exists)
H1: $\phi_k \neq 0$ (block effect exists)
Definitions
$Y_{jk}$ = data in treatment j and block k
$\mu$ = common mean
$\tau_j$ = effect due to treatment j
$\phi_k$ = effect due to block k
$\varepsilon_{jk}$ = random error
Note: This notation is for a fixed effects model. If $\tau_j = 0$ and $\phi_k = 0$, the model collapses to $Y_{jk} = \mu + \varepsilon_{jk}$, which says that each observed data value is the common mean perturbed by some random error.
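The model above can be turned into an F test by splitting the total variation into treatment, block, and error sums of squares. The following is a minimal sketch in plain Python, assuming treatments are in columns and blocks are in rows. The function name `two_factor_unreplicated` and the 3x3 data are made up for illustration and do not come from the slides.

```python
def two_factor_unreplicated(table):
    """table[k][j] is the single observation in block (row) k, treatment (column) j."""
    r = len(table)       # number of rows (blocks)
    c = len(table[0])    # number of columns (treatments)
    n = r * c
    grand = sum(sum(row) for row in table) / n
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[k][j] for k in range(r)) / r for j in range(c)]

    # Partition the total sum of squares into blocks, treatments, and error.
    ss_blocks = c * sum((m - grand) ** 2 for m in row_means)
    ss_treat = r * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((y - grand) ** 2 for row in table for y in row)
    ss_error = ss_total - ss_blocks - ss_treat

    df_blocks, df_treat, df_error = r - 1, c - 1, (r - 1) * (c - 1)
    ms_error = ss_error / df_error
    return {
        "F_treatments": (ss_treat / df_treat) / ms_error,
        "F_blocks": (ss_blocks / df_blocks) / ms_error,
        "df": (df_treat, df_blocks, df_error),
    }


# Hypothetical 3x3 data: rows are blocks, columns are treatments.
example = [
    [10, 12, 14],
    [11, 13, 16],
    [13, 14, 18],
]
result = two_factor_unreplicated(example)
print(result)
```

Each F statistic would then be compared with the critical value for its degrees of freedom at the chosen significance level.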

4 Example of 3x3 Format (two-factor unreplicated)
Data Format
Layouts shown: Row-Column Format (Excel) and Stacked Format (Minitab).
Note: Often only one factor (the treatment) is of research interest, while the other variable (the block) serves only to control for a second factor. The calculations are the same no matter how you view the design. Usually, the blocking factor is placed in the rows.
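The two layouts hold the same information. As a rough illustration, the sketch below converts a row-column table into stacked records, one per cell. The function name `stack` and the labels are hypothetical and chosen only for this example.

```python
def stack(table, row_labels, col_labels):
    """Turn a row-column table into stacked (y, row factor, column factor) records."""
    return [
        (table[i][j], row_labels[i], col_labels[j])
        for i in range(len(row_labels))
        for j in range(len(col_labels))
    ]


# Hypothetical 2x2 table: rows are blocks, columns are treatments.
records = stack([[1, 2], [3, 4]], ["Block 1", "Block 2"], ["Trt A", "Trt B"])
for y, block, treatment in records:
    print(y, block, treatment)
```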

5 Example: Pollution

6 ANOVA Table: 2-Factor
Panels: General format, Illustration
Definitions:
c = number of columns
r = number of rows
n = number of observations
If F exceeds Fcrit, there is a significant difference between treatment groups at the chosen α.
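For reference, the entries of an unreplicated two-factor ANOVA table can be written out as follows. This is a standard textbook decomposition, not a reproduction of the slide's own table. It assumes treatments (index j) are in the c columns and blocks (index k) are in the r rows, so that n = rc.

```latex
\begin{aligned}
SS_{\text{treatments}} &= r \sum_{j=1}^{c} \left(\bar{Y}_{j\cdot} - \bar{Y}_{\cdot\cdot}\right)^2,
  & \text{d.f.} &= c - 1 \\
SS_{\text{blocks}} &= c \sum_{k=1}^{r} \left(\bar{Y}_{\cdot k} - \bar{Y}_{\cdot\cdot}\right)^2,
  & \text{d.f.} &= r - 1 \\
SS_{\text{error}} &= \sum_{j=1}^{c} \sum_{k=1}^{r}
  \left(Y_{jk} - \bar{Y}_{j\cdot} - \bar{Y}_{\cdot k} + \bar{Y}_{\cdot\cdot}\right)^2,
  & \text{d.f.} &= (r - 1)(c - 1) \\
SS_{\text{total}} &= \sum_{j=1}^{c} \sum_{k=1}^{r} \left(Y_{jk} - \bar{Y}_{\cdot\cdot}\right)^2,
  & \text{d.f.} &= rc - 1 = n - 1 \\
F_{\text{treatments}} &= \frac{SS_{\text{treatments}} / (c - 1)}{SS_{\text{error}} / \big((r - 1)(c - 1)\big)},
  & F_{\text{blocks}} &= \frac{SS_{\text{blocks}} / (r - 1)}{SS_{\text{error}} / \big((r - 1)(c - 1)\big)}
\end{aligned}
```

The degrees of freedom quoted for the pollution example (3 and 4 for the two factors, 12 for error) fit this pattern, since 3 × 4 = 12.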

7 Interpretation
F Statistics: For freeway, the F statistic exceeds Fcrit = 3.490 (for d.f. = 3, 12), so there is a significant difference between freeways at α = 0.05. For time of day, the F statistic exceeds Fcrit = 3.259 (for d.f. = 4, 12), so there is a significant difference between times of day at α = 0.05.
p-Values: Both p-values are reported as 0.000 (that is, they round to zero at three decimals), which says that F statistics as large as these would be very unlikely to arise by chance if the null hypothesis were true (i.e., if the treatment means were the same). We could reject the hypothesis of equal group means even at α = 0.001.
Bottom Line: Both freeway and time of day affect the pollution level. Both effects are highly significant.

8 Two-Factor ANOVA (replicated)
Model Form: $X_{ijk} = \mu + \alpha_j + \beta_k + (\alpha\beta)_{jk} + \varepsilon_{ijk}$
Note: $\alpha_j$ and $\beta_k$ are called the main effects.
Hypotheses
H0: $\alpha_j = 0$ (no row effect exists)
H1: $\alpha_j \neq 0$ (row effect exists)
H0: $\beta_k = 0$ (no column effect exists)
H1: $\beta_k \neq 0$ (column effect exists)
H0: $(\alpha\beta)_{jk} = 0$ (no interaction exists)
H1: $(\alpha\beta)_{jk} \neq 0$ (interaction effect exists)
Definitions
$X_{ijk}$ = ith obs. in row j and col. k
$\mu$ = common mean
$\alpha_j$ = effect due to row j
$\beta_k$ = effect due to column k
$(\alpha\beta)_{jk}$ = effect due to interaction
$\varepsilon_{ijk}$ = random error
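With replication, the error term is estimated from variation within each cell, which frees the cell means to measure interaction. The sketch below shows one way to compute the three F statistics in plain Python for a balanced design (the same number of observations in every cell). The function name `two_factor_replicated` and the 2x2 data are made up for illustration.

```python
def mean(values):
    return sum(values) / len(values)


def two_factor_replicated(cells):
    """cells[j][k] is the list of m observations in row j and column k (balanced)."""
    r, c, m = len(cells), len(cells[0]), len(cells[0][0])
    grand = mean([x for row in cells for cell in row for x in cell])
    row_means = [mean([x for cell in row for x in cell]) for row in cells]
    col_means = [mean([x for row in cells for x in row[k]]) for k in range(c)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_a = c * m * sum((rm - grand) ** 2 for rm in row_means)    # row main effect
    ss_b = r * m * sum((cm - grand) ** 2 for cm in col_means)    # column main effect
    ss_ab = m * sum(                                             # interaction
        (cell_means[j][k] - row_means[j] - col_means[k] + grand) ** 2
        for j in range(r) for k in range(c)
    )
    ss_e = sum(                                                  # within-cell error
        (x - cell_means[j][k]) ** 2
        for j in range(r) for k in range(c) for x in cells[j][k]
    )

    df_a, df_b, df_ab, df_e = r - 1, c - 1, (r - 1) * (c - 1), r * c * (m - 1)
    ms_e = ss_e / df_e
    return {
        "F_A": (ss_a / df_a) / ms_e,
        "F_B": (ss_b / df_b) / ms_e,
        "F_AB": (ss_ab / df_ab) / ms_e,
        "df": (df_a, df_b, df_ab, df_e),
    }


# Hypothetical 2x2 design with two observations per cell.
example = [
    [[1, 3], [5, 7]],
    [[2, 4], [10, 12]],
]
print(two_factor_replicated(example))
```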

9 Two-Factor ANOVA (replicated)
Example with 3 Groups
Layouts shown: Row-Column Format (Excel) and Stacked Format (Minitab).
For clarity, only two observations per cell are shown, but you can have as many as you want. Subscripts and symbols are omitted for clarity.

10 Replicated: DVD Sales

11 ANOVA Table: Replicated
Panels: General format, Illustration
Definitions:
c = number of columns
r = number of rows
m = number of observations per cell
If F exceeds Fcrit, there is a significant difference between treatment groups at the chosen α.
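As a reminder of how r, c, and m enter the replicated table, the degrees of freedom and F ratios for a balanced design can be summarized as below. This is the standard textbook layout rather than a copy of the slide's table, and the labels A (rows) and B (columns) are an assumption about which factor sits where.

```latex
\begin{aligned}
\text{d.f.}_{A} &= r - 1, &
\text{d.f.}_{B} &= c - 1, &
\text{d.f.}_{AB} &= (r - 1)(c - 1), \\
\text{d.f.}_{E} &= rc(m - 1), &
\text{d.f.}_{T} &= rcm - 1, \\
F_{A} &= \frac{SSA / (r - 1)}{SSE / \big(rc(m - 1)\big)}, &
F_{B} &= \frac{SSB / (c - 1)}{SSE / \big(rc(m - 1)\big)}, &
F_{AB} &= \frac{SSAB / \big((r - 1)(c - 1)\big)}{SSE / \big(rc(m - 1)\big)}
\end{aligned}
```

The degrees of freedom quoted in the DVD sales interpretation (2 and 9 for each main effect, 4 and 9 for the interaction) are consistent with, for example, r = c = 3 and m = 2.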

12 Interpretation
F Statistics: Both main effects are significant at α = 0.05, since their F statistics exceed the critical value (4.256 in both cases, using d.f. = 2, 9). For the interaction effect, the F statistic does not exceed the critical value Fcrit = 3.633 (for d.f. = 4, 9) at α = 0.05.
P-Values: The p-values for both main effects are highly significant, but the interaction p-value is significant only at α = 0.20, not at α = 0.10.
Bottom Line: Both store size and display location affect weekly sales. Both effects are highly significant, though store size shows the stronger evidence (smaller p-value). The interaction is significant only at a very weak level of significance.

13 ANOVA Notation
Textbooks and computer software use various symbols for treatments and subscripts. We follow Excel's practice of using r for rows and c for columns. In one-factor ANOVA, we use SSB for variation between columns and SSW for variation within columns. In two-factor ANOVA, we use SSA and SSB for the main effects and SSAB for the interaction.

14 Stacked Format
Most computer packages expect the dependent variable to be in one column and each factor to be in its own column, with one row per observation.
Columns: Obs (1, 2, 3, 4, ..., n), Y (y1, y2, y3, ..., yn), Factor 1, Factor 2

15 Example: One-Factor Stacked
Key
AmtPaid is the amount to be paid to the provider by the insurance carrier.
ProdID refers to the patient's insurance type:
M = Medicare
D = Medicaid
S = Commercial 1
P = Commercial 2
Note: These observations (n = 498) were chosen at random from a database containing 1,508 records.

16 Minitab Procedure
Copyright Notice: Portions of MINITAB Statistical Software input and output contained in this document are printed with permission of Minitab, Inc. MINITAB™ is a trademark of Minitab Inc. in the United States and other countries and is used herein with the owner's permission.

17 Minitab Results: Hospital Charges
The p-value indicates no significant group difference at α = 0.10. Minitab shows overlapping confidence intervals for the group means.

18 Stacked or Unstacked?
Comment: In Excel, the one-factor ANOVA test requires that the data be grouped into separate columns. In this case, there are 4 insurance types, so we would need 4 separate data columns. This is awkward for large data sets. Minitab can convert unstacked data into stacked data (and vice versa). In databases, the variables usually are coded in stacked format (one column for each factor).
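The stacked-to-unstacked conversion amounts to grouping the response values by factor level. The sketch below is one minimal way to do it in plain Python and does not describe how Minitab or Excel work internally. The function name `unstack` and the amounts are made up; only the ProdID codes M, D, S, and P come from the example.

```python
from collections import defaultdict


def unstack(values, groups):
    """Group stacked values into one list (column) per factor level."""
    columns = defaultdict(list)
    for y, g in zip(values, groups):
        columns[g].append(y)
    return dict(columns)


# Hypothetical stacked data: one response column and one factor column.
amt_paid = [120.0, 85.5, 240.0, 99.0, 310.25, 75.0]
prod_id = ["M", "D", "S", "M", "P", "D"]

for level, column in unstack(amt_paid, prod_id).items():
    print(level, column)
```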

19 Excel Procedure: Hospital Charges

20 Excel Results: Hospital Charges
The p-value is not significant even at α = 0.10. There are 4 insurance groups, so d.f. = 3 for the “treatment” effect. At the 5% level of significance, F does not exceed Fcrit.
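The degrees of freedom in a one-factor table follow directly from the number of groups c and the total number of observations n: c - 1 between groups and n - c within groups. The sketch below computes the one-factor F statistic from unstacked columns, using SSB and SSW for the between-group and within-group variation. The function name `one_factor_anova` and the data are hypothetical, not the hospital charges.

```python
def one_factor_anova(columns):
    """columns maps each group label to its list of observations."""
    groups = list(columns.values())
    c = len(groups)                         # number of groups
    n = sum(len(g) for g in groups)         # total observations
    grand = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    ssb = sum(len(g) * (gm - grand) ** 2 for g, gm in zip(groups, means))  # between
    ssw = sum((y - gm) ** 2 for g, gm in zip(groups, means) for y in g)    # within

    df_between, df_within = c - 1, n - c
    f_stat = (ssb / df_between) / (ssw / df_within)
    return f_stat, df_between, df_within


# Hypothetical data with three groups of three observations each.
f_stat, df1, df2 = one_factor_anova({"A": [1, 2, 3], "B": [2, 3, 4], "C": [6, 7, 8]})
print(f_stat, df1, df2)
```

By the same rule, 4 groups and n = 498 observations would give 3 and 494 degrees of freedom.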

21 More Than Two Factors?
ANOVA can have any number of factors (main effects) and their interactions. For example:
3-factor ANOVA model with all possible interactions: X = f(A, B, C, AB, AC, BC, ABC)
4-factor ANOVA model with all possible interactions: X = f(A, B, C, D, AB, AC, AD, BC, BD, CD, ABC, ABD, BCD, ACD, ABCD)
Sample sizes are rarely large enough to estimate such models, so higher-order interactions often are not examined.
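The number of terms grows quickly: a model with k factors and all possible interactions has 2^k - 1 terms (7 for three factors, 15 for four). A short sketch that enumerates them, with the helper name `model_terms` chosen only for this example:

```python
from itertools import combinations


def model_terms(factors):
    """List every main effect and interaction for the given factor labels."""
    return [
        "".join(combo)
        for size in range(1, len(factors) + 1)
        for combo in combinations(factors, size)
    ]


for labels in ("ABC", "ABCD", "ABCDE"):
    terms = model_terms(labels)
    print(len(labels), "factors:", len(terms), "terms")
```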

22 Summary
ANOVA compares means in several groups.
Each group (or combination of factors) is a treatment.
1-factor ANOVA is most common, comparing c groups.
2-factor ANOVA without replication omits interactions.
2-factor ANOVA with replication allows interactions.
k-factor ANOVA is conceptually simple (but Excel doesn’t do it).
One factor often suffices!

