Topic 24: Two-Way ANOVA. Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model.

Slides:



Advertisements
Similar presentations
Topic 12: Multiple Linear Regression
Advertisements

Topic 32: Two-Way Mixed Effects Model. Outline Two-way mixed models Three-way mixed models.
Topic 15: General Linear Tests and Extra Sum of Squares.
Analysis of Variance Compares means to determine if the population distributions are not similar Uses means and confidence intervals much like a t-test.
EPI 809/Spring Probability Distribution of Random Error.
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. Analysis of Variance Chapter 16.
Topic 3: Simple Linear Regression. Outline Simple linear regression model –Model parameters –Distribution of error terms Estimation of regression parameters.
Copyright ©2011 Brooks/Cole, Cengage Learning Analysis of Variance Chapter 16 1.
Statistics for Managers Using Microsoft® Excel 5th Edition
Chapter 11 Analysis of Variance
Statistics for Business and Economics
Statistics for Managers Using Microsoft® Excel 5th Edition
Analysis of Variance Chapter 15 - continued Two-Factor Analysis of Variance - Example 15.3 –Suppose in Example 15.1, two factors are to be examined:
Chapter 11 Analysis of Variance
Copyright ©2011 Pearson Education 11-1 Chapter 11 Analysis of Variance Statistics for Managers using Microsoft Excel 6 th Global Edition.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-1 Chapter 10 Analysis of Variance Statistics for Managers Using Microsoft.
Chap 10-1 Analysis of Variance. Chap 10-2 Overview Analysis of Variance (ANOVA) F-test Tukey- Kramer test One-Way ANOVA Two-Way ANOVA Interaction Effects.
Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall 11-1 Chapter 11 Analysis of Variance Statistics for Managers using Microsoft Excel.
Chapter 12: Analysis of Variance
Topic 16: Multicollinearity and Polynomial Regression.
Topic 28: Unequal Replication in Two-Way ANOVA. Outline Two-way ANOVA with unequal numbers of observations in the cells –Data and model –Regression approach.
1 Experimental Statistics - week 7 Chapter 15: Factorial Models (15.5) Chapter 17: Random Effects Models.
1 Experimental Statistics - week 6 Chapter 15: Randomized Complete Block Design (15.3) Factorial Models (15.5)
Inferences in Regression and Correlation Analysis Ayona Chatterjee Spring 2008 Math 4803/5803.
© 2002 Prentice-Hall, Inc.Chap 9-1 Statistics for Managers Using Microsoft Excel 3 rd Edition Chapter 9 Analysis of Variance.
Topic 7: Analysis of Variance. Outline Partitioning sums of squares Breakdown degrees of freedom Expected mean squares (EMS) F test ANOVA table General.
Topic 14: Inference in Multiple Regression. Outline Review multiple linear regression Inference of regression coefficients –Application to book example.
© 2001 Prentice-Hall, Inc. Statistics for Business and Economics Simple Linear Regression Chapter 10.
CHAPTER 12 Analysis of Variance Tests
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Experimental Design and Analysis of Variance Chapter 10.
Chapter 10 Analysis of Variance.
GENERAL LINEAR MODELS Oneway ANOVA, GLM Univariate (n-way ANOVA, ANCOVA)
5-5 Inference on the Ratio of Variances of Two Normal Populations The F Distribution We wish to test the hypotheses: The development of a test procedure.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-1 Chapter 10 Analysis of Variance Statistics for Managers Using Microsoft.
Topic 17: Interaction Models. Interaction Models With several explanatory variables, we need to consider the possibility that the effect of one variable.
STA305 week 51 Two-Factor Fixed Effects Model The model usually used for this design includes effects for both factors A and B. In addition, it includes.
Hormone Example: nknw892.sas Y = change in growth rate after treatment Factor A = gender (male, female) Factor B = bone development level (severely depressed,
Topic 23: Diagnostics and Remedies. Outline Diagnostics –residual checks ANOVA remedial measures.
1 Experimental Statistics - week 14 Multiple Regression – miscellaneous topics.
Topic 30: Random Effects. Outline One-way random effects model –Data –Model –Inference.
Topic 25: Inference for Two-Way ANOVA. Outline Two-way ANOVA –Data, models, parameter estimates ANOVA table, EMS Analytical strategies Regression approach.
Bread Example: nknw817.sas Y = number of cases of bread sold (sales) Factor A = height of shelf display (bottom, middle, top) Factor B = width of shelf.
Topic 26: Analysis of Covariance. Outline One-way analysis of covariance –Data –Model –Inference –Diagnostics and rememdies Multifactor analysis of covariance.
PSYC 3030 Review Session April 19, Housekeeping Exam: –April 26, 2004 (Monday) –RN 203 –Use pencil, bring calculator & eraser –Make use of your.
1 Experimental Statistics - week 9 Chapter 17: Models with Random Effects Chapter 18: Repeated Measures.
1 Experimental Statistics Spring week 6 Chapter 15: Factorial Models (15.5)
Topic 20: Single Factor Analysis of Variance. Outline Analysis of Variance –One set of treatments (i.e., single factor) Cell means model Factor effects.
Chapter 4 Analysis of Variance
Example 1 (knnl917.sas) Y = months survival Treatment (3 levels)
Experimental Statistics - week 9
Topic 29: Three-Way ANOVA. Outline Three-way ANOVA –Data –Model –Inference.
Interview Example: nknw964.sas Y = rating of a job applicant Factor A represents 5 different personnel officers (interviewers) n = 4.
Topic 27: Strategies of Analysis. Outline Strategy for analysis of two-way studies –Interaction is not significant –Interaction is significant What if.
Topic 22: Inference. Outline Review One-way ANOVA Inference for means Differences in cell means Contrasts.
Topic 21: ANOVA and Linear Regression. Outline Review cell means and factor effects models Relationship between factor effects constraint and explanatory.
1 Experimental Statistics - week 8 Chapter 17: Mixed Models Chapter 18: Repeated Measures.
CHAPTER 3 Analysis of Variance (ANOVA) PART 3 = TWO-WAY ANOVA WITH REPLICATION (FACTORIAL EXPERIMENT) MADAM SITI AISYAH ZAKARIA EQT 271 SEM /2015.
McGraw-Hill/IrwinCopyright © 2009 by The McGraw-Hill Companies, Inc. All Rights Reserved. Experimental Design and Analysis of Variance Chapter 11.
10-1 of 29 ANOVA Experimental Design and Analysis of Variance McGraw-Hill/Irwin Copyright © 2003 by The McGraw-Hill Companies, Inc. All rights reserved.
CHAPTER 3 Analysis of Variance (ANOVA) PART 3 = TWO-WAY ANOVA WITH REPLICATION (FACTORIAL EXPERIMENT)
Today: Feb 28 Reading Data from existing SAS dataset One-way ANOVA
MADAM SITI AISYAH ZAKARIA
Two-Way Analysis of Variance Chapter 11.
Factorial Experiments
Topic 31: Two-way Random Effects Models
Two-Factor Studies with Equal Replication
Two-Factor Studies with Equal Replication
ANOVA Analysis of Variance.
Experimental Statistics - week 8
Slides are available at:
Presentation transcript:

Topic 24: Two-Way ANOVA

Outline Two-way ANOVA –Data –Cell means model –Parameter estimates –Factor effects model

Two-Way ANOVA The response variable Y is continuous There are two categorical explanatory variables or factors

Data for two-way ANOVA Y is the response variable Factor A with levels i = 1 to a Factor B with levels j = 1 to b Y ijk is the k th observation in cell (i,j) In Chapter 19, we assume equal sample size in each cell (n ij =n)

KNNL Example KNNL p 833 Y is the number of cases of bread sold A is the height of the shelf display, a=3 levels: bottom, middle, top B is the width of the shelf display, b=2 levels: regular, wide n=2 stores for each of the 3x2=6 treatment combinations (n T =12)

Read the data data a1; infile ‘../data/ch19ta07.txt'; input sales height width; proc print data=a1; run;

The data Obs sales height width

Notation For Y ijk we use –i to denote the level of the factor A –j to denote the level of the factor B –k to denote the k th observation in cell (i,j) i = 1,..., a levels of factor A j = 1,..., b levels of factor B k = 1,..., n observations in cell (i,j)

Model We assume that the response variable observations are –Normally distributed With a mean that may depend on the levels of the factors A and B With a constant variance –Independent

Cell Means Model Y ijk = μ ij + ε ijk –where μ ij is the theoretical mean or expected value of all observations in cell (i,j) –the ε ijk are iid N(0, σ 2 ) This means Y ijk ~ N(μ ij, σ 2 ), independent The parameters of the model are – μ ij, for i = 1 to a and j = 1 to b –σ 2

Estimates Estimate μ ij by the mean of the observations in cell (i,j), For each (i,j) combination, we can get an estimate of the variance We need to combine these to get an estimate of σ 2

Pooled estimate of σ 2 In general we pool the s ij 2, using weights proportional to the df, n ij -1 The pooled estimate is s 2 = (Σ (n ij -1)s ij 2 ) / (Σ(n ij -1)) Here, n ij = n, so s 2 = (Σs ij 2 ) / (ab), which is the average sample variance

Run proc glm proc glm data=a1; class height width; model sales= height width height*width; means height width height*width; run;

Output Class Level Information ClassLevelsValues height width21 2 Number of Observations Read12 Number of Observations Used12

Means statement height Level of heightN sales MeanStd Dev

Means statement width Level of widthN sales MeanStd Dev

Means statement ht*w Level of height Level of widthN sales MeanStd Dev

Code the factor levels data a1; set a1; if height eq 1 and width eq 1 then hw='1_BR'; if height eq 1 and width eq 2 then hw='2_BW'; if height eq 2 and width eq 1 then hw='3_MR'; if height eq 2 and width eq 2 then hw='4_MW'; if height eq 3 and width eq 1 then hw='5_TR'; if height eq 3 and width eq 2 then hw='6_TW';

Plot the data symbol1 v=circle i=none; proc gplot data=a1; plot sales*hw/frame; run;

The plot

Put the means in a2 proc means data=a1; var sales; by height width; output out=a2 mean=avsales; proc print data=a2; run;

Output Data Set Obs height width _TYPE_ _FREQ_ avsales

Plot the means symbol1 v=square i=join c=black; symbol2 v=diamond i=join c=black; proc gplot data=a2; plot avsales*height=width/frame; run;

The interaction plot

Questions to consider Does the height of the display affect sales? If yes, compare top with middle, top with bottom, and middle with bottom Does the width of the display affect sales? If yes, compare regular and wide

But wait!!! Are these factor level comparisons meaningful? Does the effect of height on sales depend on the width? Does the effect of width on sales depend on the height? If yes, we have an interaction and we need to do some additional analysis

Factor effects model For the one-way ANOVA model, we wrote μ i = μ + α i Here we use μ ij = μ + α i + β j + (αβ) ij Under “common” formulation –μ (μ.. in KNNL) is the “overall mean” –α i is the main effect of A –β j is the main effect of B –(αβ) ij is the interaction between A and B

Factor effects model μ = (Σ ij μ ij )/(ab) μ i. = (Σ j μ ij )/b and μ.j = (Σ i μ ij )/a α i = μ i. – μ and β j = μ.j - μ (αβ) ij is difference between μ ij and μ + α i + β j (αβ) ij = μ ij - (μ + (μ i. - μ) + (μ.j - μ)) = μ ij – μ i. – μ.j + μ

Interpretation μ ij = μ + α i + β j + (αβ) ij μ is the “overall” mean α i is an adjustment for level i of A β j is an adjustment for level j of B (αβ) ij is an additional adjustment that takes into account both i and j that cannot be explained by the previous adjustments

Constraints for this framework α. = Σ i α i = 0 β. = Σ j β j = 0 (αβ).j = Σ i (αβ) ij = 0 for all j (αβ) i. = Σ j (αβ) ij = 0 for all i

Estimates for factor effects model

SS for ANOVA Table

df for ANOVA Table df A = a-1 df B = b-1 df AB = (a-1)(b-1) df E = ab(n-1) df T = abn-1 = n T -1

MS for ANOVA Table MSA = SSA/df A MSB = SSB/df B MSAB = SSAB/df AB MSE = SSE/df E MST = SST/df T

Hypotheses for two-way ANOVA H 0A : α i = 0 for all i H 1A : α i ≠ 0 for at least one i H 0B : β j = 0 for all j H 1B : β j ≠ 0 for at least one j H 0AB : (αβ) ij = 0 for all (i,j) H 1AB : (αβ) ij ≠ 0 for at least one (i,j)

F statistics H 0A is tested by F A = MSA/MSE; df=df A, df E H 0B is tested by F B = MSB/MSE; df=df B, df E H 0AB is tested by F AB = MSAB/MSE; df=df AB, df E

ANOVA Table Source df SS MS F A a-1 SSA MSA MSA/MSE B b-1 SSB MSB MSB/MSE AB (a-1)(b-1) SSAB MSAB MSAB/MSE Error ab(n-1) SSE MSE _ Total abn-1 SSTO MST

P-values P-values are calculated using the F(dfNumerator, dfDenominator) distributions If P ≤ 0.05 we conclude that the effect being tested is statistically significant

KNNL Example NKNW p 833 Y is the number of cases of bread sold A is the height of the shelf display, a=3 levels: bottom, middle, top B is the width of the shelf display, b=2: regular, wide n=2 stores for each of the 3x2 treatment combinations

PROC GLM proc glm data=a1; class height width; model sales= height width height*width; run;

Output Note that there are 6 cells in this design…(6-1)df for model SourceDF Sum of Squares Mean SquareF ValuePr > F Model Error Corrected Total

Output ANOVA Note Type I and Type III Analyses are the same because n ij is constant SourceDFType III SSMean SquareF ValuePr > F height <.0001 width height*width

Other output R-SquareCoeff VarRoot MSEsales Mean Commonly do not consider R-sq when performing ANOVA…interested more in difference in levels rather than the models predictive ability

Results The main effect of height is statistically significant (F=74.71; df=2,6; P<0.0001) The main effect of width is not statistically significant (F=1.16; df=1,6; P=0.32) The interaction between height and width is not statistically significant (F=1.16; df=2,6; P=0.37)

Interpretation The height of the display affects sales of bread The width of the display has no apparent effect The effect of the height of the display is similar for both the regular and the wide widths

Plot of the means

Additional analyses We will need to do additional analyses to explain the height effect (factor A) There were three levels: bottom, middle and top We could rerun the data with a one- way anova and use the methods we learned in the previous chapters Use means statement with lines

Run Proc GLM proc glm data=a1; class height width; model sales= height width height*width; means height / tukey lines; lsmeans height / adjust=tukey; run;

MEANS Output Alpha0.05 Error Degrees of Freedom6 Error Mean Square Critical Value of Studentized Range Minimum Significant Difference Means with the same letter are not significantly different. Tukey GroupingMeanNheight A B B B

LSMEANS Output heightsales LSMEAN LSMEAN Number Least Squares Means for effect height Pr > |t| for H0: LSMean(i)=LSMean(j) Dependent Variable: sales i/j < <.0001

Last slide We went over Chapter 19 We used program topic24.sas to generate the output for today.