EPI809/Spring 2008, Chapter 12. Multisample Inference: Analysis of Variance

1 Chapter 12 Multisample Inference: Analysis of Variance

2 Learning Objectives
1. Describe Analysis of Variance (ANOVA)
2. Explain the Rationale of ANOVA
3. Compare Experimental Designs
4. Test the Equality of 2 or More Means
   - Completely Randomized Design
   - Randomized Block Design
   - Factorial Design

3 Analysis of Variance
An analysis of variance is a technique that partitions the total sum of squares of deviations of the observations about their mean into portions associated with the independent variables in the experiment and a portion associated with error.

4 Analysis of Variance
The ANOVA table was previously discussed in the context of regression models with quantitative independent variables. In this chapter the focus is on nominal independent variables (factors).

5 Analysis of Variance
A factor is a categorical variable examined in an experiment as a possible cause of variation in the response variable.

6 Analysis of Variance
Levels are the categories, measurements, or strata of a factor of interest in the experiment.

7 Types of Experimental Designs
Experimental Designs:
- Completely Randomized (One-Way ANOVA)
- Randomized Block
- Factorial (Two-Way ANOVA)

8 Completely Randomized Design

9 Completely Randomized Design
1. Experimental Units (Subjects) Are Assigned Randomly to Treatments
   - Subjects are Assumed Homogeneous
2. One Factor or Independent Variable
   - 2 or More Treatment Levels or Groups
3. Analyzed by One-Way ANOVA

10 One-Way ANOVA F-Test
1. Tests the Equality of 2 or More (p) Population Means
2. Variables
   - One Nominal Independent Variable
   - One Continuous Dependent Variable

11 One-Way ANOVA F-Test Assumptions
1. Randomness & Independence of Errors
2. Normality
   - Populations (for each condition) are Normally Distributed
3. Homogeneity of Variance
   - Populations (for each condition) have Equal Variances

12 One-Way ANOVA F-Test Hypotheses
H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_p$
- All Population Means are Equal
- No Treatment Effect
Ha: Not All $\mu_j$ Are Equal
- At Least 1 Pop. Mean is Different
- Treatment Effect
- Ha does not mean $\mu_1 \ne \mu_2 \ne \dots \ne \mu_p$

13 One-Way ANOVA F-Test Hypotheses
H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_p$
- All Population Means are Equal
- No Treatment Effect
Ha: Not All $\mu_j$ Are Equal
- At Least 1 Pop. Mean is Different
- Treatment Effect
Either $\mu_1 = \mu_2 = \dots = \mu_p$, or $\mu_i \ne \mu_j$ for some i, j.
[Figure: two panels of density curves f(X), one with $\mu_1 = \mu_2 = \mu_3$ and one with $\mu_1 = \mu_2 \ne \mu_3$]

14 One-Way ANOVA Basic Idea
1. Compares 2 Types of Variation to Test Equality of Means
2. If Treatment Variation Is Significantly Greater Than Random Variation, then Means Are Not Equal
3. Variation Measures Are Obtained by 'Partitioning' Total Variation

15-20 One-Way ANOVA Partitions Total Variation
Total variation = Variation due to treatment + Variation due to random sampling
- Variation due to treatment: Sum of Squares Among, Sum of Squares Between, Sum of Squares Treatment (SST), Among Groups Variation
- Variation due to random sampling: Sum of Squares Within, Sum of Squares Error (SSE), Within Groups Variation

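21 Total Variation
[Figure: response Y for Group 1, Group 2, and Group 3, with deviations measured from the overall mean $\bar{Y}$]

22 Treatment Variation
[Figure: group means $\bar{Y}_1$, $\bar{Y}_2$, $\bar{Y}_3$ for Group 1, Group 2, and Group 3, with deviations measured from the overall mean $\bar{Y}$]

23 Random (Error) Variation
[Figure: response Y for Group 1, Group 2, and Group 3, with deviations measured from the group means $\bar{Y}_1$, $\bar{Y}_2$, $\bar{Y}_3$]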

24 One-Way ANOVA F-Test Test Statistic
1. Test Statistic
   - $F = MST / MSE$
   - MST Is Mean Square for Treatment
   - MSE Is Mean Square for Error
2. Degrees of Freedom
   - $\nu_1 = p - 1$
   - $\nu_2 = n - p$
   - p = # Populations, Groups, or Levels
   - n = Total Sample Size

25 One-Way ANOVA Summary Table
Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square (Variance) | F
Treatment | p - 1 | SST | MST = SST/(p - 1) | MST/MSE
Error | n - p | SSE | MSE = SSE/(n - p) |
Total | n - 1 | SS(Total) = SST + SSE | |
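The summary table can be sketched in a few lines of Python. This is an illustrative sketch only, not the SAS procedure used in the course: the function name `one_way_anova` and the three small samples are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from statistics import mean

def one_way_anova(groups):
    """Return the one-way ANOVA table quantities for a list of samples."""
    p = len(groups)                           # number of treatment levels
    n = sum(len(g) for g in groups)           # total sample size
    grand_mean = mean(y for g in groups for y in g)
    # Treatment (between-group) sum of squares
    sst = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error (within-group) sum of squares
    sse = sum((y - mean(g)) ** 2 for g in groups for y in g)
    mst = sst / (p - 1)                       # mean square for treatment
    mse = sse / (n - p)                       # mean square for error
    return {"df": (p - 1, n - p), "SST": sst, "SSE": sse,
            "MST": mst, "MSE": mse, "F": mst / mse}

# Hypothetical data: three groups of three observations each
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

For these made-up samples the group means are 2, 3, and 6, so SST = 26, SSE = 6, and F = 13 on (2, 6) degrees of freedom. The F value would then be compared with the upper-tail critical value of the F distribution.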

26 One-Way ANOVA F-Test Critical Value
If means are equal, $F = MST / MSE \approx 1$. Only reject large F! Always One-Tail!
Reject H0 if $F > F_{\alpha}(p - 1,\, n - p)$; otherwise do not reject H0.
[Figure: F distribution with the rejection region in the upper tail]

27 One-Way ANOVA F-Test Example
As a vet epidemiologist you want to see if 3 food supplements have different mean milk yields. You assign 15 cows, 5 per food supplement. Question: At the .05 level, is there a difference in mean yields?
Data columns: Food1, Food2, Food3

28 One-Way ANOVA F-Test Solution
H0: $\mu_1 = \mu_2 = \mu_3$
Ha: Not All Equal
$\alpha = .05$
$\nu_1 = 2$, $\nu_2 = 12$
Critical Value(s): [Figure: F distribution with upper-tail rejection region, $\alpha = .05$]
Test Statistic: $F = MST / MSE$
Decision: Reject at $\alpha = .05$
Conclusion: There Is Evidence Pop. Means Are Different

29 Summary Table Solution
Columns: Source of Variation, Degrees of Freedom, Sum of Squares, Mean Square (Variance), F
Rows: Food, Error, Total

30 SAS CODES FOR ANOVA
    data Anova;
      input group $ milk @@;   /* @@ reads several observations from one line */
      cards;
    food food food
    food food food
    food food food
    food food food
    food food food
    ;
    run;
    proc anova;   /* or PROC GLM */
      class group;
      model milk = group;
    run;

31 SAS OUTPUT - ANOVA
Columns: Source, DF, Sum of Squares, Mean Square, F Value, Pr > F
Rows: Model (Pr > F: <.0001), Error, Corrected Total

32 Pair-wise comparisons
- Needed when the overall F test is rejected
- Can be done without adjustment of type I error if the comparisons were planned in advance (least significant difference, LSD, method)
- Type I error needs to be adjusted if the comparisons were not planned in advance (Bonferroni's and Scheffé's methods)

33 Fisher's Least Significant Difference (LSD) Test
To compare level 1 and level 2, compare the test statistic to $t_{\alpha/2}$ (upper-tailed value) or $-t_{\alpha/2}$ (lower-tailed value) from Student's t-distribution with $(n - p)$ degrees of freedom.
- MSE = Mean square within from ANOVA table
- n = Number of subjects
- p = Number of levels
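The test statistic itself did not survive in the transcript. A common textbook form of Fisher's LSD statistic, given here as an assumption rather than as the slide's exact formula, is shown below; $\bar{y}_1$, $\bar{y}_2$ are the two sample means and $n_1$, $n_2$ the two group sizes (notation introduced here for illustration).

```latex
t = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{MSE\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}},
\qquad \text{reject } H_0:\ \mu_1 = \mu_2 \ \text{ if } |t| > t_{\alpha/2,\; n-p}
```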

34 Bonferroni's method
To compare level 1 and level 2, adjust the significance level $\alpha$ by using the new significance level $\alpha^*$.
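The definition of $\alpha^*$ is missing from the transcript. The usual Bonferroni adjustment for all pairwise comparisons among $p$ levels, stated here as an assumption, divides $\alpha$ by the number of pairs:

```latex
\alpha^{*} = \frac{\alpha}{\binom{p}{2}} = \frac{2\alpha}{p(p-1)},
\qquad \text{reject } H_0 \ \text{ if } |t| > t_{\alpha^{*}/2,\; n-p}
```

With $p = 3$ levels and $\alpha = .05$ this gives $\alpha^{*} = .05/3$, which matches the critical value of t labelled with $\alpha/3/2$ and 12 degrees of freedom in the Bonferroni output slide.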

35 SAS CODES FOR multiple comparisons
    proc anova;
      class group;
      model milk = group;
      means group / lsd bon;
    run;

36 SAS OUTPUT - LSD
t Tests (LSD) for milk
NOTE: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.
Alpha 0.05
Error Degrees of Freedom 12
Error Mean Square
Critical Value of t: $t_{.975,\,12}$
Least Significant Difference
Means with the same letter are not significantly different.
t Grouping | Mean | N | group
A | | | food1
B | | | food2
C | | | food3

37 SAS OUTPUT - Bonferroni
Bonferroni (Dunn) t Tests for milk
NOTE: This test controls the Type I experimentwise error rate
Alpha 0.05
Error Degrees of Freedom 12
Error Mean Square
Critical Value of t: $t_{1 - \alpha/3/2,\,12}$
Minimum Significant Difference
Means with the same letter are not significantly different.
Bon Grouping | Mean | N | group
A | | | food1
B | | | food2
C | | | food3

38 Randomized Block Design

39 Randomized Block Design
1. Experimental Units (Subjects) Are Assigned Randomly within Blocks
   - Blocks are Assumed Homogeneous
2. One Factor or Independent Variable of Interest
   - 2 or More Treatment Levels or Classifications
3. One Blocking Factor

40 Randomized Block Design
Factor Levels (Treatments): A, B, C, D
Treatments are randomly assigned to experimental units within blocks:
Block 1: A C D B
Block 2: C D B A
Block 3: B A D C
Block b: D C A B

41 Randomized Block F-Test
1. Tests the Equality of 2 or More (p) Population Means
2. Variables
   - One Nominal Independent Variable
   - One Nominal Blocking Variable
   - One Continuous Dependent Variable

42 Randomized Block F-Test Assumptions
1. Normality
   - Probability Distribution of each Block-Treatment combination is Normal
2. Homogeneity of Variance
   - Probability Distributions of all Block-Treatment combinations have Equal Variances

43 Randomized Block F-Test Hypotheses
H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_p$
- All Population Means are Equal
- No Treatment Effect
Ha: Not All $\mu_j$ Are Equal
- At Least 1 Pop. Mean is Different
- Treatment Effect
- $\mu_1 \ne \mu_2 \ne \dots \ne \mu_p$ is wrong

44 Randomized Block F-Test Hypotheses
H0: $\mu_1 = \mu_2 = \dots = \mu_p$
- All Population Means are Equal
- No Treatment Effect
Ha: Not All $\mu_j$ Are Equal
- At Least 1 Pop. Mean is Different
- Treatment Effect
- $\mu_1 \ne \mu_2 \ne \dots \ne \mu_p$ is wrong
[Figure: two panels of density curves f(X), one with $\mu_1 = \mu_2 = \mu_3$ and one with $\mu_1 = \mu_2 \ne \mu_3$]

45 The F Ratio for Randomized Block Designs
$SS(Total) = SST + SSB + SSE$, where SST is the treatment sum of squares, SSB the block sum of squares, and SSE the error sum of squares.

46 Randomized Block F-Test Test Statistic
1. Test Statistic
   - $F = MST / MSE$
   - MST Is Mean Square for Treatment
   - MSE Is Mean Square for Error
2. Degrees of Freedom
   - $\nu_1 = p - 1$
   - $\nu_2 = n - b - p + 1$
   - p = # Treatments, b = # Blocks, n = Total Sample Size

47 Randomized Block F-Test Critical Value
If means are equal, $F = MST / MSE \approx 1$. Only reject large F! Always One-Tail!
Reject H0 if $F > F_{\alpha}(p - 1,\, n - b - p + 1)$; otherwise do not reject H0.
[Figure: F distribution with the rejection region in the upper tail]

48 Randomized Block F-Test Example
You wish to determine which of four brands of tires has the longest tread life. You randomly assign one of each brand (A, B, C, and D) to a tire location on each of 5 cars. At the .05 level, is there a difference in mean tread life?
Block | Left Front | Right Front | Left Rear | Right Rear
Car 1 | A: 42,000 | C: 58,000 | B: 38,000 | D: 44,000
Car 2 | B: 40,000 | D: 48,000 | A: 39,000 | C: 50,000
Car 3 | C: 48,000 | D: 39,000 | B: 36,000 | A: 39,000
Car 4 | A: 41,000 | B: 38,000 | D: 42,000 | C: 43,000
Car 5 | D: 51,000 | A: 44,000 | C: 52,000 | B: 35,000

49 Randomized Block F-Test Solution
H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$
Ha: Not All Equal
$\alpha = .05$
$\nu_1 = 3$, $\nu_2 = 12$
Critical Value(s): [Figure: F distribution with upper-tail rejection region, $\alpha = .05$]
Test Statistic: $F = MST / MSE$
Decision: Reject at $\alpha = .05$
Conclusion: There Is Evidence Pop. Means Are Different
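The numeric value of the test statistic is not preserved on the solution slide, but it can be recomputed from the tire table in the example. The following Python sketch is an illustration, not the course's SAS analysis; the function name `randomized_block_anova` and the dictionary layout are assumptions made here.

```python
from statistics import mean

# Tread life from the tire example, stored as data[car][brand]
data = {
    "Car 1": {"A": 42000, "C": 58000, "B": 38000, "D": 44000},
    "Car 2": {"B": 40000, "D": 48000, "A": 39000, "C": 50000},
    "Car 3": {"C": 48000, "D": 39000, "B": 36000, "A": 39000},
    "Car 4": {"A": 41000, "B": 38000, "D": 42000, "C": 43000},
    "Car 5": {"D": 51000, "A": 44000, "C": 52000, "B": 35000},
}

def randomized_block_anova(data):
    """F test for treatments in a randomized block design (one observation per cell)."""
    blocks = list(data)
    treatments = sorted(next(iter(data.values())))
    b, p = len(blocks), len(treatments)
    n = b * p
    grand = mean(data[blk][t] for blk in blocks for t in treatments)
    # Treatment sum of squares: b observations per treatment mean
    sst = b * sum((mean(data[blk][t] for blk in blocks) - grand) ** 2
                  for t in treatments)
    # Block sum of squares: p observations per block mean
    ssb = p * sum((mean(data[blk].values()) - grand) ** 2 for blk in blocks)
    ss_total = sum((data[blk][t] - grand) ** 2
                   for blk in blocks for t in treatments)
    sse = ss_total - sst - ssb                # SS(Total) = SST + SSB + SSE
    df1, df2 = p - 1, n - b - p + 1
    mst, mse = sst / df1, sse / df2
    return {"df": (df1, df2), "SST": sst, "SSB": ssb, "SSE": sse, "F": mst / mse}

result = randomized_block_anova(data)
print(result["df"], round(result["F"], 2))
```

Under these assumptions the degrees of freedom come out as (3, 12), matching the slide, and the F ratio is about 11.99, which is consistent with the slide's decision to reject H0 at the .05 level.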

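50 SAS CODES FOR ANOVA
    data block;
      input Block $ trt $ resp @@;   /* @@ reads several observations from one line */
      cards;
    Car1 A: 42000 Car1 C: 58000 Car1 B: 38000 Car1 D: 44000
    Car2 B: 40000 Car2 D: 48000 Car2 A: 39000 Car2 C: 50000
    Car3 C: 48000 Car3 D: 39000 Car3 B: 36000 Car3 A: 39000
    Car4 A: 41000 Car4 B: 38000 Car4 D: 42000 Car4 C: 43000
    Car5 D: 51000 Car5 A: 44000 Car5 C: 52000 Car5 B: 35000
    ;
    run;
    proc anova;
      class trt block;
      model resp = trt block;
      means trt / lsd bon;
    run;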

51 SAS OUTPUT - ANOVA
Dependent Variable: resp
Overall table columns: Source, DF, Sum of Squares, Mean Square, F Value, Pr > F; rows: Model, Error, Corrected Total
Fit statistics: R-Square, Coeff Var, Root MSE, resp Mean
Effects table columns: Source, DF, Anova SS, Mean Square, F Value, Pr > F; rows: trt, Block

52 SAS OUTPUT - LSD
Means with the same letter are not significantly different.
t Grouping | Mean | N | trt
A | | | C:
B | | | D:
B, C | | | A:
C | | | B:

53 SAS OUTPUT - Bonferroni
Means with the same letter are not significantly different.
Bon Grouping | Mean | N | trt
A | | | C:
A, B | | | D:
B, C | | | A:
C | | | B:

54 Factorial Experiments

55 Factorial Design
1. Experimental Units (Subjects) Are Assigned Randomly to Treatments
   - Subjects are Assumed Homogeneous
2. Two or More Factors or Independent Variables
   - Each Has 2 or More Treatments (Levels)
3. Analyzed by Two-Way ANOVA

56 Advantages of Factorial Designs
1. Saves Time & Effort
   - e.g., Could Use Separate Completely Randomized Designs for Each Variable
2. Controls Confounding Effects by Putting Other Variables into Model
3. Can Explore Interaction Between Variables

57 Two-Way ANOVA
1. Tests the Equality of 2 or More Population Means When Several Independent Variables Are Used
2. Same Results as Separate One-Way ANOVA on Each Variable, But Interaction Can Be Tested

58 Two-Way ANOVA Assumptions
1. Normality
   - Populations are Normally Distributed
2. Homogeneity of Variance
   - Populations have Equal Variances
3. Independence of Errors
   - Independent Random Samples are Drawn

59 Two-Way ANOVA Data Table
$Y_{ijk}$: observation k at level i of Factor A and level j of Factor B.
Rows are the levels 1, 2, ..., a of Factor A and columns are the levels 1, 2, ..., b of Factor B. Each cell holds its replicate observations, e.g. $Y_{111}, Y_{112}$ in cell (1, 1) and $Y_{ab1}, Y_{ab2}$ in cell (a, b).

60 Two-Way ANOVA Null Hypotheses
1. No Difference in Means Due to Factor A
   - H0: $\mu_{1.} = \mu_{2.} = \dots = \mu_{a.}$
2. No Difference in Means Due to Factor B
   - H0: $\mu_{.1} = \mu_{.2} = \dots = \mu_{.b}$
3. No Interaction of Factors A & B
   - H0: $(AB)_{ij} = 0$

61 Two-Way ANOVA Total Variation Partitioning
Total Variation, SS(Total), splits into:
- Variation Due to Treatment A: SSA
- Variation Due to Treatment B: SSB
- Variation Due to Interaction: SS(AB)
- Variation Due to Random Sampling: SSE

62 Two-Way ANOVA Summary Table
Source of Variation | Degrees of Freedom | Sum of Squares | Mean Square | F
A (Row) | a - 1 | SS(A) | MS(A) | MS(A)/MSE
B (Column) | b - 1 | SS(B) | MS(B) | MS(B)/MSE
AB (Interaction) | (a-1)(b-1) | SS(AB) | MS(AB) | MS(AB)/MSE
Error | n - ab | SSE | MSE |
Total | n - 1 | SS(Total) | |
Same as Other Designs
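The two-way summary table can be sketched in Python for a balanced design with replication. This is a minimal illustration under stated assumptions: the function name `two_way_anova`, the nested-list layout `cells[i][j]`, and the 2 x 2 data with two replicates per cell are all hypothetical, and the formulas used apply only when every cell has the same number of observations.

```python
from statistics import mean

def two_way_anova(cells):
    """cells[i][j] lists the replicates at level i of A and level j of B (balanced)."""
    a, b, r = len(cells), len(cells[0]), len(cells[0][0])
    n = a * b * r
    grand = mean(y for row in cells for cell in row for y in cell)
    row_means = [mean(y for cell in row for y in cell) for row in cells]
    col_means = [mean(y for row in cells for y in row[j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]
    ssa = b * r * sum((m - grand) ** 2 for m in row_means)
    ssb = a * r * sum((m - grand) ** 2 for m in col_means)
    # Interaction: what remains of each cell mean after removing row and column effects
    ssab = r * sum((cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
                   for i in range(a) for j in range(b))
    sse = sum((y - cell_means[i][j]) ** 2
              for i in range(a) for j in range(b) for y in cells[i][j])
    mse = sse / (n - a * b)
    return {
        "SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse,
        "F_A": ssa / (a - 1) / mse,
        "F_B": ssb / (b - 1) / mse,
        "F_AB": ssab / ((a - 1) * (b - 1)) / mse,
    }

# Hypothetical 2 x 2 design with 2 replicates per cell
anova = two_way_anova([[[1, 3], [5, 7]], [[2, 4], [10, 12]]])
print(anova)
```

For these made-up numbers SSA = 18, SSB = 72, SS(AB) = 8, and SSE = 8, which add up to SS(Total) = 106. Unbalanced data need the more general machinery of a procedure such as PROC GLM.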

63 Interaction
1. Occurs When Effects of One Factor Vary According to Levels of Other Factor
2. When Significant, Interpretation of Main Effects (A & B) Is Complicated
3. Can Be Detected
   - In Data Table, Pattern of Cell Means in One Row Differs From Another Row
   - In Graph of Cell Means, Lines Cross

64 Graphs of Interaction
Effects of Gender (male or female) & dietary group (sv, lv, nor) on systolic blood pressure
[Figure: two plots of average response against dietary group (sv, lv, nor) with separate lines for male and female, one showing interaction and one showing no interaction]

65 Two-Way ANOVA F-Test Example
Effect of diet (sv: strict vegetarians, lv: lactovegetarians, nor: normal) and gender (female, male) on systolic blood pressure. Question: Test for interaction and main effects at the .05 level.

66 SAS CODES FOR ANOVA
    data factorial;
      input dietary $ sex $ sbp;
      cards;
    sv male sv male sv male sv male sv male sv male
    sv female sv female 99 sv female 83.6 sv female 99.6 sv female sv female
    lv male lv male lv male lv male lv male lv male
    nor male nor male nor male nor male nor male nor male
    nor female nor female nor female nor female nor female nor female
    ;
    run;

67 SAS CODES FOR ANOVA
    /* Model with interaction */
    proc glm;
      class dietary sex;
      model sbp = dietary sex dietary*sex;
    run;
    /* Main-effects-only model */
    proc glm;
      class dietary sex;
      model sbp = dietary sex;
    run;

68 SAS OUTPUT - ANOVA
Dependent Variable: sbp
Overall table columns: Source, DF, Sum of Squares, Mean Square, F Value, Pr > F; rows: Model, Error, Corrected Total
Fit statistics: R-Square, Coeff Var, Root MSE, sbp Mean
Type I SS table columns: Source, DF, Type I SS, Mean Square, F Value, Pr > F; rows: dietary, sex, dietary*sex
Type III SS table columns: Source, DF, Type III SS, Mean Square, F Value, Pr > F; rows: dietary, sex, dietary*sex

69 Linear Contrast
- A linear contrast is a linear combination of the population means, $L = \sum_i c_i \mu_i$, with coefficients that sum to zero, $\sum_i c_i = 0$
- Purpose: to test relationships among different group means
Example: 4 populations on treatments T1, T2, T3 and T4.
Contrast | T1 | T2 | T3 | T4 | relation to test
L1 | 1 | 0 | -1 | 0 | $\mu_1 - \mu_3 = 0$
L2 | 1 | -1/2 | -1/2 | 0 | $\mu_1 - \mu_2/2 - \mu_3/2 = 0$

70 T-test for Linear Contrast (LSD)
To test H0 that the contrast equals zero, construct a t statistic involving the k group means.
- Degrees of freedom of the t-test: df = n - k
- Compare with the critical value $t_{1-\alpha/2,\,n-k}$
- Reject H0 if $|t| \ge t_{1-\alpha/2,\,n-k}$
SAS uses the contrast statement to perform an F-test with df (1, n - k), or the estimate statement to perform a t-test with df (n - k).
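The hypothesis and the statistic are missing from the transcript. The standard form, given here as an assumption about what the slide showed, uses contrast coefficients $c_i$, sample means $\bar{y}_i$, and group sizes $n_i$ (the last two symbols are introduced here for illustration):

```latex
H_0:\ L = \sum_{i=1}^{k} c_i \mu_i = 0,
\qquad
t = \frac{\sum_{i=1}^{k} c_i \bar{y}_i}{\sqrt{MSE \sum_{i=1}^{k} \dfrac{c_i^{2}}{n_i}}}
```

Squaring this t statistic gives the F statistic with (1, n - k) degrees of freedom, which is why a contrast F-test and an estimate t-test lead to the same conclusion.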

71 T-test for Linear Contrast (Scheffé)
Construct multiple contrasts involving the k group means, searching for a significant contrast.
To test H0 that a contrast equals zero, construct the t statistic for that contrast and compare it with the critical value a.
Reject H0 if $|t| \ge a$
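The definition of the critical value $a$ is not preserved in the transcript. In the usual statement of Scheffé's method, offered here as an assumption, it is built from an F quantile:

```latex
a = \sqrt{(k - 1)\, F_{k-1,\; n-k,\; 1-\alpha}}
```

Because $a$ is larger than the LSD critical value $t_{1-\alpha/2,\,n-k}$ whenever $k > 2$, the method stays valid even when contrasts are chosen after looking at the data.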

72 SAS Code for contrast testing
    proc glm;
      class trt block;
      model resp = trt block;
      means trt / lsd bon scheffe;
      /* coefficients follow the sorted trt levels A:, B:, C:, D: */
      contrast 'A - B = 0' trt 1 -1 0 0;
      contrast 'A - B/2 - C/2 = 0' trt 1 -0.5 -0.5 0;
      contrast 'A - B/3 - C/3 - D/3 = 0' trt 3 -1 -1 -1;   /* same test, scaled by 3 */
      contrast 'A + B - C - D = 0' trt 1 1 -1 -1;
      lsmeans trt / stderr pdiff;
      lsmeans trt / stderr pdiff adjust=scheffe;   /* Scheffe's test */
      lsmeans trt / stderr pdiff adjust=bon;       /* Bonferroni's test */
      estimate 'A - B' trt 1 -1 0 0;
    run;

73 Regression representation of Anova

74 Regression representation of Anova
- One-way anova:
- Two-way anova:
- SAS uses a different constraint
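The model equations on this slide are lost. A conventional way to write them, assumed here rather than recovered from the slide, is the effects model with sum-to-zero constraints:

```latex
\text{One-way:}\quad y_{ij} = \mu + \alpha_i + e_{ij}, \qquad \sum_{i} \alpha_i = 0

\text{Two-way:}\quad y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}, \qquad
\sum_i \alpha_i = \sum_j \beta_j = \sum_i (\alpha\beta)_{ij} = \sum_j (\alpha\beta)_{ij} = 0
```

Here $\mu$ is the overall mean, $\alpha_i$ and $\beta_j$ are the main effects, $(\alpha\beta)_{ij}$ is the interaction, and the errors $e$ are assumed independent and normal with equal variance.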

75 Regression representation of Anova
- One-way anova: Dummy variables of factor with p levels
- This is the parameterization used by SAS
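Dummy-variable coding can be illustrated with a short Python sketch. The helper `dummy_code` is hypothetical, and the choice of the last sorted level as the reference category is an assumption made to mirror the common description of how PROC GLM reports its estimates (last level's coefficient set to zero).

```python
def dummy_code(observed_levels):
    """Return (kept_levels, rows) of 0/1 indicators, dropping the last sorted level."""
    levels = sorted(set(observed_levels))
    kept = levels[:-1]                        # last level acts as the reference
    rows = [[1 if obs == lev else 0 for lev in kept] for obs in observed_levels]
    return kept, rows

# Hypothetical factor with p = 3 levels
kept, rows = dummy_code(["food1", "food2", "food3", "food1"])
print(kept)   # indicator columns that remain in the design matrix
print(rows)   # one row of indicators per observation
```

With an intercept added, the coefficient on each indicator estimates the difference between that level's mean and the reference level's mean, so testing that all p - 1 coefficients are zero is the one-way ANOVA F test.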

76 Conclusion: should be able to
1. Recognize the applications that use ANOVA.
2. Understand the logic of analysis of variance.
3. Be aware of several different analysis of variance designs and understand when to use each one.
4. Perform a single-factor hypothesis test using analysis of variance manually and with the aid of SAS or any statistical software.

77 Conclusion: should be able to
5. Conduct and interpret post-analysis of variance pairwise comparison procedures.
6. Recognize when randomized block analysis of variance is useful and be able to perform the randomized block analysis.
7. Perform two-factor analysis of variance tests with replications using SAS and interpret the output.

78 Key Terms
- Between-Sample Variation
- Completely Randomized Design
- Experiment-Wide Error Rate
- Factor
- Levels
- One-Way Analysis of Variance
- Total Variation
- Treatment
- Within-Sample Variation

