Analysis of Variance (ANOVA)


1 Analysis of Variance (ANOVA)
Shibin Liu SAS Beijing R&D

2 Agenda
0. Lesson overview
1. Two-Sample t-Tests
2. One-Way ANOVA
3. ANOVA with Data from a Randomized Block Design
4. ANOVA Post Hoc Tests
5. Two-Way ANOVA with Interactions
6. Summary

4 Lesson overview
One-sample t-test: compare one population mean μ with a hypothesized value μ0.

5 Lesson overview
Two-sample t-test: compare two population means, μ1 and μ2.

6 Lesson overview
ANOVA: compare more than two population means, for example μ1, μ2, and μ3.

7 Lesson overview
ANOVA: one response variable and one predictor variable with several levels.

8 Lesson overview
ANOVA: one response variable and two predictor variables.

9 Lesson overview
One-sample t-test, two-sample t-test, ANOVA.

10 Lesson overview
What do you want to examine?
The location, spread, and shape of the data's distribution (descriptive statistics). Summary statistics or graphics? Summary statistics: SUMMARY STATISTICS task (descriptive statistics). Both: DISTRIBUTION ANALYSIS task (descriptive statistics, histogram, normal probability plots).
The difference between groups on one or more variables (inferential statistics). How many groups? Two: TTEST task. Two or more: LINEAR MODELS task (analysis of variance).
The relationship between variables (inferential statistics). Which kind of variables? Continuous only: CORRELATIONS and LINEAR REGRESSION tasks. Categorical response variable: ONE-WAY FREQUENCIES & TABLE ANALYSIS tasks (frequency tables, chi-square test) and LOGISTIC REGRESSION task.
Covered in: Lesson 1, Lesson 2, Lesson 3 & 4, Lesson 5.

11 Agenda 0. Lesson overview 1. Two-Sample t-Tests 2. One-Way ANOVA
3. ANOVA with Data from a Randomized Block Design 4. ANOVA Post Hoc Tests 5. Two-Way ANOVA with Interactions 6. Summary 11

12 The Two-Sample t-Test: Introduction
Are the two population means μ1 and μ2 different?

13 The Two-Sample t-Test: Introduction
Do two groups differ in, for example, salary or blood pressure?

14 The Two-Sample t-Test: Introduction
In this topic, we will learn to do the following:
Analyze differences between two population means using the t Test task.
Verify the assumptions of, and perform, a two-sample t-test.
Perform a one-sided t-test.

15 The Two-Sample t-Test: Introduction
Ha: μ1 ≠ μ2, or equivalently Ha: μ1 − μ2 ≠ 0

16 Two-Sample t-Tests: Assumptions
Before we start the analysis, examine the data to verify that the statistical assumptions are valid:
Independent observations: no observation provides information about, or has an impact on, any other observation.
Normally distributed data for each group.
Equal variances for each group: $\sigma_1^2 = \sigma_2^2$.
If one of the assumptions is not valid and no adjustments are made, the probability of drawing incorrect conclusions from the analysis could increase.

17 Two-Sample t-Tests: F-Test for Equality of Variance
H0: $\sigma_1^2 = \sigma_2^2$
H1: $\sigma_1^2 \neq \sigma_2^2$
F statistic: $F = \dfrac{\max(s_1^2, s_2^2)}{\min(s_1^2, s_2^2)}$
This test is valid only for independent samples from normal distributions. Normality is required even for large sample sizes.
If you reject the null hypothesis, it is recommended that you use the unequal-variance t-test to test the equality of the means.
When the null hypothesis is true, what value will the F statistic be close to? $F \approx 1$. A large value of F is evidence against the null hypothesis.
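The folded F statistic on this slide can be sketched in a few lines of Python. This is only an illustration: the two samples are made-up numbers, `folded_f` is a hypothetical helper name, and the p-value (which needs the F distribution) is left to the t Test task.

```python
from statistics import variance

def folded_f(sample1, sample2):
    """Folded F statistic: larger sample variance divided by the smaller one."""
    s1_sq = variance(sample1)  # sample variance, n - 1 denominator
    s2_sq = variance(sample2)
    return max(s1_sq, s2_sq) / min(s1_sq, s2_sq)

# Hypothetical data for two groups (illustrative only).
group1 = [10.0, 12.0, 14.0, 16.0, 18.0]
group2 = [11.0, 12.0, 13.0, 14.0, 15.0]

f_stat = folded_f(group1, group2)
print(f_stat)  # ratio of the two sample variances, always >= 1
```

Because the larger variance is always in the numerator, the statistic cannot fall below 1, which is why values near 1 are consistent with equal variances.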

18 Two-Sample t-Tests: Examining the Equal Variance t-Test and p-Values
F-test for equal variance, H0: $\sigma_1^2 = \sigma_2^2$. The p-value is 0.7446 > 0.05, so H0 is not rejected and the equal-variance t-test applies.

19 Two-Sample t-Tests: Examining the Equal Variance t-Test and p-Values
Assuming $\sigma_1^2 = \sigma_2^2$, test H0: μ1 − μ2 = 0 against Ha: μ1 − μ2 ≠ 0. The p-value is 0.0003 < 0.05, so H0 is rejected.

20 Two-Sample t-Tests: Examining the Unequal Variance t-Test and p-Values
F-test for equal variance, H0: $\sigma_1^2 = \sigma_2^2$. The p-value is 0.0185 < 0.05, so H0 is rejected and the unequal-variance t-test applies.

21 Two-Sample t-Tests: Examining the Unequal Variance t-Test and p-Values
With $\sigma_1^2 \neq \sigma_2^2$, test H0: μ1 − μ2 = 0 against Ha: μ1 − μ2 ≠ 0. The p-value is 0.0320 < 0.05, so H0 is rejected.
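The difference between the equal-variance and unequal-variance t-tests can be sketched as follows. This is a minimal illustration under assumptions: the data are made up, the function names are hypothetical, and the unequal-variance version uses the Satterthwaite approximation for the degrees of freedom. Only the test statistics and degrees of freedom are computed; p-values require the t distribution, which the t Test task supplies.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Equal-variance t statistic and its degrees of freedom."""
    n1, n2 = len(x), len(y)
    # Pooled variance: weighted average of the two sample variances.
    sp_sq = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    t = (mean(x) - mean(y)) / sqrt(sp_sq * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

def satterthwaite_t(x, y):
    """Unequal-variance t statistic with Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    v1, v2 = variance(x) / n1, variance(y) / n2
    t = (mean(x) - mean(y)) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical data (illustrative only).
group1 = [10.0, 12.0, 14.0, 16.0, 18.0]
group2 = [11.0, 12.0, 13.0, 14.0, 15.0]

t_pooled, df_pooled = pooled_t(group1, group2)
t_unequal, df_unequal = satterthwaite_t(group1, group2)
print(t_pooled, df_pooled)
print(t_unequal, df_unequal)
```

With equal group sizes the two t statistics coincide and only the degrees of freedom differ; with unequal group sizes both can differ.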

22 Two-Sample t-Tests: Demo
Scenario: compare the means of two groups, girls' and boys' SAT scores.
Identify the data: TestScores.
Classification variable?
Continuous variable to analyze?

23 Two-Sample t-Tests: Demo
Analyze > ANOVA > t Test 23

24 Two-Sample t-Tests: Demo
Result interpreting: Normal? 24

25 Two-Sample t-Tests: Demo
Result interpreting Normal? 25

26 Two-Sample t-Tests: Demo
Result interpreting: the output tests H0: μ1 − μ2 = 0 (equality of means) and H0: $\sigma_1^2 = \sigma_2^2$ (equality of variances).

27 Two-Sample t-Tests: Demo
Result interpreting: The confidence interval for the mean difference ( , 125.2) includes 0. This implies that you cannot say with 95% confidence that the difference between boys and girls is not zero. It also implies that the p-value is greater than 0.05.

28 Two-Sample t-Tests: One-Sided Tests
Two-sided test: Ha: μ1 ≠ μ2.
One-sided test: Ha: μ1 > μ2 (for example, a new drug can have only a positive effect), or Ha: μ1 − μ2 < 0.

29 Two-Sample t-Tests: One-Sided Tests
Upper-tailed test: H0: μ1 ≤ μ2. Lower-tailed test: H0: μ1 ≥ μ2. Each one-sided alternative looks in one direction only.

30 Two-Sample t-Tests: One-Sided Tests
H0: μ1 ≥ μ2 (with equality as the boundary case), Ha: μ1 < μ2.
Power is the probability that you reject the null hypothesis when it is false: Probability of correct rejection = 1 − β = Power.
Type II error: failing to reject the null hypothesis when H0 is false. Probability of Type II error = β.
Critical region: the values of the test statistic for which H0 is rejected.

31 Two-Sample t-Tests: One-Sided Tests
H0: μ1 ≥ μ2 (with equality as the boundary case), Ha: μ1 < μ2.
What is power? Power is the probability that your test rejects the null hypothesis when the null hypothesis is false, that is, the probability that you detect a difference when the difference actually exists.
Probability of correct rejection = 1 − β = Power.
Type II error: failing to reject the null hypothesis when H0 is false. Probability of Type II error = β.
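To make the relationship Power = 1 − β concrete, here is a rough sketch that approximates the power of a one-sided two-sample test. It swaps in a z-test with a known common standard deviation in place of the t-test, so it is an approximation, not the calculation SAS performs. The function name and all numeric inputs are hypothetical, and the upper-tailed case is shown; the lower-tailed case is symmetric.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_power(delta, sigma, n_per_group, alpha=0.05):
    """Approximate power of an upper-tailed two-sample z-test.

    delta: true difference mu1 - mu2 (assumed), sigma: common standard
    deviation (assumed known), n_per_group: observations in each group.
    """
    z = NormalDist()
    se = sigma * sqrt(2 / n_per_group)   # standard error of the mean difference
    z_crit = z.inv_cdf(1 - alpha)        # start of the critical region
    return 1 - z.cdf(z_crit - delta / se)

power = one_sided_power(delta=5.0, sigma=10.0, n_per_group=50)
beta = 1 - power                         # probability of a Type II error
print(power, beta)
```

When the true difference is zero, the function returns alpha, the Type I error rate; as the true difference or the sample size grows, the power rises toward 1.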

32 Two-Sample t-Tests: One-Sided Tests/Scenario
One-sided upper-tailed t-test on SAT scores: is the mean of Group 1 greater than the mean of Group 2? H0: μ1 − μ2 ≤ 0, Ha: μ1 − μ2 > 0.

33 Two-Sample t-Tests: One-Sided Tests/Scenario
One-sided upper-tailed t-test and its t statistic. H0: μ1 − μ2 ≤ 0, Ha: μ1 − μ2 > 0.

34 Two-Sample t-Tests: One-Sided Tests/Scenario
Analyze > ANOVA > t Test > Two sample > Preview Code> Insert code 34

35 Two-Sample t-Tests: One-Sided Tests/Scenario
The p-value is 0.0321 < 0.05, and 0 is not in the confidence interval.

36 Two-Sample t-Tests Question 1.
What justifies the choice of a one-sided test versus a two-sided test?
a. The need for more statistical power
b. Theoretical and subject-matter considerations
c. The non-significance of a two-sided test
d. The need for an unbiased test statistic
Answer: b

37 Two-Sample t-Tests Question 2.
A professor suspects her class is performing below the department average of 73%. She decides to test this claim. Which of the following is the correct alternative hypothesis?
a. μ < 0.73
b. μ > 0.73
c. μ ≠ 0.73
Answer: a. Because the professor suspects that her class is performing below the department average, this is a lower-tailed test, so the alternative hypothesis is μ < 0.73.

38 Agenda 0. Lesson overview 1. Two-Sample t-Tests 2. One-Way ANOVA
3. ANOVA with Data from a Randomized Block Design 4. ANOVA Post Hoc Tests 5. Two-Way ANOVA with Interactions 6. Summary 38

39 One-Way ANOVA: Introduction
Are the population means μ1, μ2, and μ3 different?

40 One-Way ANOVA: Objective
Analyze differences between population means using the Linear Models task.
Verify the assumptions of analysis of variance.

41 One-Way ANOVA: ANOVA overview
One-way ANOVA has one response variable and only one predictor variable, whose values are its levels.

42 One-Way ANOVA: ANOVA overview
Case 1: Do programmers earn more than teachers? Response variable: salary. Predictor variable: job title, with levels Teacher and Programmer.

43 One-Way ANOVA: ANOVA overview
Case 1: Do programmers earn more than teachers? The predictor variable has two levels (Teacher, Programmer), so a two-sample t-test applies.

44 One-Way ANOVA: ANOVA overview
Case 1: Do programmers earn more than teachers? With two levels (Teacher, Programmer), the one-way ANOVA and the equal-variance two-sample t-test are equivalent: F statistic = (t statistic)².
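The identity F = t² for two groups can be checked numerically. The sketch below uses made-up salary figures and hypothetical helper names; it compares the squared equal-variance (pooled) t statistic with the one-way ANOVA F statistic computed from the same data.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(x, y):
    """Equal-variance (pooled) two-sample t statistic."""
    n1, n2 = len(x), len(y)
    sp_sq = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    return (mean(x) - mean(y)) / sqrt(sp_sq * (1 / n1 + 1 / n2))

def one_way_f(groups):
    """One-way ANOVA F statistic: model mean square over error mean square."""
    values = [v for g in groups for v in g]
    grand_mean = mean(values)
    ssm = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    sse = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ssm / (len(groups) - 1)) / (sse / (len(values) - len(groups)))

# Hypothetical salaries in thousands (illustrative only).
teachers = [40.0, 45.0, 50.0]
programmers = [55.0, 60.0, 70.0, 75.0]

t_stat = pooled_t(teachers, programmers)
f_stat = one_way_f([teachers, programmers])
print(t_stat ** 2, f_stat)  # the two values agree up to rounding
```

The equality holds only for the pooled t-test; the unequal-variance t statistic does not in general square to the ANOVA F.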

45 One-Way ANOVA: ANOVA overview
Case 2: Response variable: T-cell counts (T cells are thymus-dependent lymphocytes). Predictor variable: medication, with levels Medication 1, Medication 2, and Placebo.

46 One-Way ANOVA: ANOVA overview
Case 2: A two-sample t-test compares the T-cell counts of only two groups at a time, for example Medication 1 and Placebo, not Medication 1, Medication 2, and Placebo together.

47 One-Way ANOVA: ANOVA overview
Case 2: One-way ANOVA compares Medication 1, Medication 2, and Placebo in a single test.

48 One-Way ANOVA: The ANOVA Hypothesis
ANOVA helps check whether the differences among the group means are large enough to reject the hypothesis that the population means are the same; small differences can arise by chance.
H0: μ1 = μ2 = μ3 = μ4

49 One-Way ANOVA: The ANOVA Hypothesis
Medication 1 Predictor Variable Placebo Medication 2 49

50 One-Way ANOVA: The ANOVA Hypothesis
Example of an alternative: Ha: μ1 = μ2 ≠ μ3, where one level of the predictor variable (Medication 1, Medication 2, Placebo) has a different mean.

51 One-Way ANOVA: The ANOVA model
One-way ANOVA model: $Y_{ik} = \mu + \tau_i + \varepsilon_{ik}$
$i$: indexes treatments (three types of medication)
$k$: observation number
$\mu$: overall population mean
$\tau_i$: effect of treatment $i$, with $\tau_1 = \mu_1 - \mu$, $\tau_2 = \mu_2 - \mu$, $\tau_3 = \mu_3 - \mu$
$\varepsilon_{ik}$: error term

52 One-Way ANOVA: Sums of Squares
H0: μ1 = μ2 = μ3. ANOVA compares the variability between groups with the variability within groups by taking their ratio.

53 One-Way ANOVA: Sums of Squares
SST Total Sum of Squares SSM Model Sum of Squares SSE Error Sum of Squares 53

54 One-Way ANOVA: Sums of Squares
Total variability = variability between groups + variability within groups: SST = SSM + SSE
Total Sum of Squares: $SST = \sum_i \sum_j (Y_{ij} - \bar{Y})^2$
Model Sum of Squares (between groups): $SSM = \sum_i n_i (\bar{Y}_i - \bar{Y})^2$
Error Sum of Squares (within groups): $SSE = \sum_i \sum_j (Y_{ij} - \bar{Y}_i)^2$
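The decomposition SST = SSM + SSE can be verified on a small example. The following sketch uses made-up data for three groups and a hypothetical helper name; it is an illustration of the formulas, not output from the Linear Models task.

```python
from statistics import mean

def sums_of_squares(groups):
    """Return (SST, SSM, SSE) for a one-way layout given as a list of groups."""
    values = [y for g in groups for y in g]
    grand_mean = mean(values)
    sst = sum((y - grand_mean) ** 2 for y in values)                  # total
    ssm = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)   # between groups
    sse = sum((y - mean(g)) ** 2 for g in groups for y in g)          # within groups
    return sst, ssm, sse

# Hypothetical responses for three treatment groups (illustrative only).
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [6.0, 7.0, 8.0]]

sst, ssm, sse = sums_of_squares(groups)
print(sst, ssm, sse)
```

Here most of the total variability is between groups, which is the pattern that leads to a large F statistic.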

55 One-Way ANOVA: F statistic
F statistic and critical value at α = 0.05
$F(df_M, df_E) = \dfrac{MSM}{MSE} = \dfrac{SSM / df_M}{SSE / df_E}$
MSM: Model Mean Square. MSE: Error Mean Square.
In general, degrees of freedom (DF) can be thought of as the number of independent pieces of information.
Model DF is the number of treatments minus 1.
Corrected total DF is the sample size minus 1.
Error DF is the sample size minus the number of treatments (or the difference between the corrected total DF and the model DF).
SST: Total Sum of Squares. SSM: Model Sum of Squares. SSE: Error Sum of Squares.
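As a worked illustration of the degrees of freedom and the F ratio, the sketch below starts from assumed sums of squares (SSM = 42, SSE = 6, from a made-up example with 9 observations in 3 treatment groups). The function name is hypothetical, and the critical value is not computed because it requires the F distribution.

```python
def f_statistic(ssm, sse, n_obs, n_treatments):
    """Return (F, model DF, error DF) from the sums of squares."""
    df_model = n_treatments - 1        # number of treatments minus 1
    df_error = n_obs - n_treatments    # sample size minus number of treatments
    msm = ssm / df_model               # model mean square
    mse = sse / df_error               # error mean square
    return msm / mse, df_model, df_error

# Assumed inputs (illustrative only).
f_value, df_model, df_error = f_statistic(ssm=42.0, sse=6.0, n_obs=9, n_treatments=3)
df_corrected_total = 9 - 1             # sample size minus 1

print(f_value, df_model, df_error)
```

The model DF and error DF add up to the corrected total DF, mirroring SST = SSM + SSE.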

56 One-Way ANOVA: Coefficient of Determination
The coefficient of determination is the "proportion of variance accounted for by the model".
SST: Total Sum of Squares. SSM: Model Sum of Squares. SSE: Error Sum of Squares.
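Written with the sums of squares listed on this slide, the coefficient of determination takes the standard form below. The symbol $R^2$ is the usual notation and is assumed here; the second equality follows from SST = SSM + SSE.

```latex
R^2 = \frac{SSM}{SST} = 1 - \frac{SSE}{SST}, \qquad 0 \le R^2 \le 1
```

For example, with SSM = 42 and SST = 48 (made-up values), $R^2 = 42/48 = 0.875$, so the model accounts for 87.5% of the total variability.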

57 One-Way ANOVA: Assumptions for ANOVA
Assumptions:
Independent observations
Error terms are normally distributed
Error terms have equal variances
Assessing ANOVA assumptions:
Good data collection methods help ensure the independence assumption.
Diagnostic plots can be used to verify the assumption that the error is approximately normally distributed.
The Linear Models task can produce a test of equal variances. H0 for this hypothesis test is that the variances are equal for all populations.

58 One-Way ANOVA: Predicted and Residual values
Residuals are estimates of the error term: Residual = Observation − Group mean.

59 One-Way ANOVA: Predicted and Residual values
Observation = Group mean + Residual. The predicted value in ANOVA is the group mean. A residual is the difference between the observed value of the response and the predicted value of the response variable.
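A short sketch of predicted values and residuals in one-way ANOVA, using made-up data and a hypothetical helper name. Each observation's predicted value is its group mean, and the residual is the observation minus that mean.

```python
from statistics import mean

def anova_residuals(groups):
    """Return residuals (observation minus group mean), grouped like the input."""
    return [[y - mean(g) for y in g] for g in groups]

# Hypothetical responses for three groups (illustrative only).
groups = [[0.21, 0.25, 0.20], [0.18, 0.22, 0.23, 0.19], [0.26, 0.24, 0.28]]

residuals = anova_residuals(groups)
total = sum(r for group in residuals for r in group)
print(total)  # zero up to floating-point rounding
```

Because each residual is a deviation from its own group mean, the residuals sum to zero within every group and therefore overall, whatever the number of observations.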

60 One-Way ANOVA: Scenario: Comparing Group Means with One-Way ANOVA
Garlic: three fertilizers + control. Response variable: bulb weight. Predictor variable: fertilizer.

61 One-Way ANOVA: Scenario: Comparing Group Means with One-Way ANOVA
H0: μ1 = μ2 = μ3 = μ4
Ha: at least one is different

62 One-Way ANOVA: Scenario: Comparing Group Means with One-Way ANOVA
Descriptive check of MGGARLIC with the Summary Statistics task, overall and across groups.

63 One-Way ANOVA: Question 1.
You have 20 observations in your ANOVA and you calculate the residuals. What will they sum to?
a. -20
b. 0
c. 20
d. 400
e. Need more information
Answer: b. Residuals are deviations from the group means, so they always sum to 0.

64 One-Way ANOVA: Question 2.
Which of the following phrases describes the model sum of squares, or SSM?
a. The variability between the groups
b. The variability within the groups
c. The variability explained by the error terms
Answer: a

65 One-Way ANOVA: Question 3.
Match the null hypothesis to the correct SAS output (outputs a and b on the slide).
H0: $\sigma_1^2 = \sigma_2^2$: output b
H0: $\mu_1 = \mu_2$: output a

66 One-Way ANOVA: Scenario: Comparing Group Means with One-Way ANOVA
Task> ANOVA>Linear Models, with MGGARLIC data 66

67 One-Way ANOVA: Scenario: Comparing Group Means with One-Way ANOVA
Task > ANOVA > Linear Models
Three assumptions: independence, normality, equal variance.
One-way ANOVA can say that at least one mean is different, but it cannot say which specific group is different. A post hoc test can do that.

68 One-Way ANOVA: Summary
Null hypothesis: all means are equal.
Alternative hypothesis: at least one mean is different.
Produce descriptive statistics.
Verify assumptions: independence, normality, equal variance.
Examine the p-value in the ANOVA table. If the p-value is less than alpha, reject the null hypothesis.

69 Agenda 0. Lesson overview 1. Two-Sample t-Tests 2. One-Way ANOVA
3. ANOVA with Data from a Randomized Block Design 4. ANOVA Post Hoc Tests 5. Two-Way ANOVA with Interactions 6. Summary 69

70 ANOVA with Data from a Randomized Block Design: Introduction
Brands age Placebo Medication 1 Medication 2 70

71 ANOVA with Data from a Randomized Block Design: Introduction
Age groups: under 30, 30-50, over 50. We are not interested in age itself, but we want to minimize its impact.

72 ANOVA with Data from a Randomized Block Design: Objective
Recognize the difference between a completely randomized design and a randomized block design. Differentiate between observed data and designed experiments. Use the Linear Models task to analyze data from a randomized block design 72

73 ANOVA with Data from a Randomized Block Design: Observational Studies
Groups can be naturally occurring: gender and ethnicity.
Random assignment might be unethical or untenable: smoking or credit risk groups.
In observational or retrospective studies, the data values are observed as they occur, not affected by an experimental design.

74 ANOVA with Data from a Randomized Block Design: Controlled Experiments
Random assignment might be desirable to eliminate selection bias.
You often want to look at the outcome measure prospectively.
You can manipulate the factors of interest and can more reasonably claim causation.
You can design your experiment to control for other factors contributing to the outcome measure.

75 ANOVA with Data from a Randomized Block Design
Question 3. Can you determine a cause-and-effect relationship in an observational study?
a. Yes
b. No
Answer: b. In an observational study, you often examine what already occurred, and therefore have little control over factors contributing to the outcome. In a controlled experiment, you can manipulate the factors of interest and can more reasonably claim causation.

76 ANOVA with Data from a Randomized Block Design: Nuisance Factors
Nuisance factors are factors that can affect the outcome but are not of interest in the experiment.
T-Cell Count = Medication + Age (randomized block design)

77 ANOVA with Data from a Randomized Block Design
Question 4. Which part of the ANOVA table contains the variation due to nuisance factors?
a. Sum of Squares Model
b. Sum of Squares Error
c. Degrees of Freedom
Answer: b

78 ANOVA with Data from a Randomized Block Design: Including a Blocking Variable in the Model
Age 78

79 ANOVA with Data from a Randomized Block Design: Including a Blocking Variable in the Model
+ Age 79

80 ANOVA with Data from a Randomized Block Design: More ANOVA Assumptions
Independent observations
Normally distributed data
Equal variances
The treatments are randomly assigned within each block (under 30, 30-50, over 50).
No interaction: the treatment effects do not change across the blocks. An interaction means that the treatment effect changes across the blocks.

81 ANOVA with Data from a Randomized Block Design
Scenario: Creating a Randomized Block Design
Garlic: H0: μ1 = μ2 = μ3 = μ4, Ha: at least one is different.
What are the nuisance factors in this case? Sun, pH level of the soil, rain.

82 ANOVA with Data from a Randomized Block Design
Scenario: Creating a Randomized Block Design
Bulb Weight = Fertilizer + Sun + pH of the soil + Rain

83 ANOVA with Data from a Randomized Block Design
Scenario: Creating a Randomized Block Design Randomized block design 83

84 ANOVA with Data from a Randomized Block Design
Scenario: Creating a Randomized Block Design 84

85 ANOVA with Data from a Randomized Block Design
Question 5. In a block design, which part of the ANOVA table contains the variation due to nuisance factors?
a. Sum of Squares Model
b. Sum of Squares Error
c. Degrees of Freedom
Answer: a

86 ANOVA with Data from a Randomized Block Design: Performing ANOVA with Blocking
Task> ANOVA>Linear Models, with MGGARLIC_BLOCK data 86

87 ANOVA with Data from a Randomized Block Design: Performing ANOVA with Blocking
Task > ANOVA > Linear Models, with MGGARLIC_BLOCK data
The F value of Sector, the blocking variable, is 6.53, which is greater than 1. Rule of thumb: if the F value of the blocking variable is greater than 1, the blocking variable is useful and should be kept in the model.
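Why a blocking variable with F greater than 1 helps can be sketched on a toy layout with one observation per treatment-block cell. The data and function name below are made up for illustration and are not the MGGARLIC_BLOCK data. The sketch moves the block variation out of the error sum of squares and compares the treatment F statistic with and without blocking.

```python
from statistics import mean

def block_design_anova(table):
    """table[i][j] is the response for treatment i in block j (one per cell)."""
    n_trt, n_blk = len(table), len(table[0])
    grand = mean(v for row in table for v in row)
    trt_means = [mean(row) for row in table]
    blk_means = [mean(table[i][j] for i in range(n_trt)) for j in range(n_blk)]

    ss_total = sum((v - grand) ** 2 for row in table for v in row)
    ss_trt = n_blk * sum((m - grand) ** 2 for m in trt_means)
    ss_blk = n_trt * sum((m - grand) ** 2 for m in blk_means)
    ss_err = ss_total - ss_trt - ss_blk          # error after removing blocks

    mse_blocked = ss_err / ((n_trt - 1) * (n_blk - 1))
    mse_unblocked = (ss_total - ss_trt) / (n_trt * n_blk - n_trt)
    return {
        "ss_error": ss_err,
        "f_block": (ss_blk / (n_blk - 1)) / mse_blocked,
        "f_treatment_blocked": (ss_trt / (n_trt - 1)) / mse_blocked,
        "f_treatment_unblocked": (ss_trt / (n_trt - 1)) / mse_unblocked,
    }

# Hypothetical responses: rows are treatments, columns are blocks.
table = [[10.0, 13.0, 16.0],
         [12.0, 14.0, 19.0],
         [14.0, 18.0, 19.0]]

result = block_design_anova(table)
print(result)
```

In this toy example the block F is well above 1, and removing the block variation from the error term raises the treatment F substantially.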

88 ANOVA with Data from a Randomized Block Design
My groups are different. What next?
The p-value for Fertilizer indicates that you should reject the H0 that all groups are the same.
For which pairs of fertilizers are garlic bulb weights different from one another?
Should you go back and do several t-tests?

89 Agenda 0. Lesson overview 1. Two-Sample t-Tests 2. One-Way ANOVA
3. ANOVA with Data from a Randomized Block Design 4. ANOVA Post Hoc Tests 5. Two-Way ANOVA with Interactions 6. Summary 89

90 ANOVA Post Hoc Tests: Introduction
μ1 One-way ANOVA μ2 μ? Randomized block design μ3 90

91 ANOVA Post Hoc Tests: Introduction
Pairwise tests: H0: μ1 = μ2, H0: μ1 = μ3, and H0: μ2 = μ3. Each test has its own p-value and its own Type I error.
ANOVA post hoc tests use a multiple comparison method to control the Type I error.

92 ANOVA Post Hoc Tests: Multiple Comparison Methods
Question 7. With a fair coin, your probability of getting heads on one flip is 0.5. If you flip a coin and get heads, what is the probability of getting heads on the second try?
a. 0.5
b. 0.25
c. 0.00
d. 1.00
e. 0.75
Answer: a

93 ANOVA Post Hoc Tests: Multiple Comparison Methods
Question 8. With a fair coin, your probability of getting heads on one flip is 0.5. If you flip a coin twice, what is the probability of getting at least one head out of two?
a. 0.5
b. 0.25
c. 0.00
d. 1.00
e. 0.75
Answer: e

94 ANOVA Post Hoc Tests: Multiple Comparison Methods
Pairwise t-tests of H0: μ1 = μ2, H0: μ1 = μ3, and H0: μ2 = μ3, each with a Type I error rate of α = 0.05.

95 ANOVA Post Hoc Tests: Multiple Comparison Methods
Comparisonwise Error Rate    Number of Comparisons    Experimentwise Error Rate
0.05                         1                        0.05
0.05                         3                        0.14
0.05                         6                        0.26
0.05                         10                       0.40
Comparisonwise Error Rate (CER): the probability of a Type I error on a single pairwise t-test.
Experimentwise Error Rate (EER): uses an alpha that takes into consideration all the pairwise comparisons you are making. With three pairwise t-tests, you may reject one of the three null hypotheses just by chance, even when all of them are true.
$EER = 1 - (1 - \alpha)^{n_c}$, where $n_c$ is the number of comparisons.
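The experimentwise error rates in the table can be reproduced with a few lines of Python. The function names are hypothetical. The formula treats the comparisons as independent, which is the assumption behind the values on this slide.

```python
def experimentwise_error_rate(alpha, n_comparisons):
    """EER = 1 - (1 - alpha) ** nc, assuming independent comparisons."""
    return 1 - (1 - alpha) ** n_comparisons

def n_pairwise_comparisons(n_groups):
    """Number of distinct pairs among n_groups group means."""
    return n_groups * (n_groups - 1) // 2

alpha = 0.05
for n_groups in (2, 3, 4, 5):
    nc = n_pairwise_comparisons(n_groups)
    eer = experimentwise_error_rate(alpha, nc)
    print(n_groups, nc, round(eer, 2))
```

The comparison counts 1, 3, 6, and 10 correspond to all pairwise comparisons among 2, 3, 4, and 5 groups.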

96 ANOVA Post Hoc Tests: Tukey's Multiple Comparison Method
Tukey Method EER H0: μ1 = μ2 H0: μ1 = μ3 Pairwise comparisons H0: μ2 = μ3 96

97 ANOVA Post Hoc Tests: Tukey's Multiple Comparison Method
Tukey Method EER=0.05 EER<0.05 H0: μ1 = μ2 H0: μ1 = μ3 Pairwise comparisons 97

98 ANOVA Post Hoc Tests: Tukey's Multiple Comparison Method
This method is appropriate when you consider pairwise comparisons only.
The Experimentwise Error Rate is:
Equal to alpha when all pairwise comparisons are considered
Less than alpha when fewer than all pairwise comparisons are considered

99 ANOVA Post Hoc Tests: scenario: determine which mean is different
Garlic: Ha: at least one is different. Fertilizers: three organics, one control. Which mean is different?

100 ANOVA Post Hoc Tests: Diffograms and the Tukey Method
Plot of the least squares means: difference between the means, and equality of the means.
The downward-sloping diagonal line segments show the confidence intervals for the differences. The upward-sloping line is a reference line showing where the group means would be equal. The intersection of a downward-sloping line segment for a pair with the upward-sloping, broken gray diagonal line implies that the confidence interval includes zero and that the mean difference between the two groups is not statistically significant.

101 ANOVA Post Hoc Tests: Diffograms and the Tukey Method
Is there a difference between treatments 1 and 2? Can you identify the pairwise comparisons whose means are not significantly different?

102 ANOVA Post Hoc Tests: Dunnett's Multiple Comparison Method
Special case of comparing to a control:
Comparing to a control is appropriate when there is a natural reference group, such as a placebo group in a drug trial.
The Experimentwise Error Rate is no greater than the stated alpha.
Comparing to a control takes into account the correlations among tests.
One-sided hypothesis tests against a control group can be performed.
Control comparison computes and tests k−1 groupwise differences, where k is the number of levels of the classification variable.
An example is the Dunnett method.
Dunnett's multiple comparison method is recommended when there is a true control group. When appropriate, it is more powerful than methods that control for all possible comparisons.

103 ANOVA Post Hoc Tests: Control Plots and the Dunnett Method
LS-mean control plots are produced only when you specify that you want to compare all other group means against a control group mean. The value of the control group mean is shown as a horizontal line. The shaded area is bounded by the upper decision limit (UDL) and the lower decision limit (LDL). If a vertical line extends past the shaded area, the group represented by that line is significantly different from the control group.

104 ANOVA Post Hoc Tests: Performing a Post Hoc Tests
104

105 ANOVA Post Hoc Tests: Performing a Post Hoc Tests: Tukey
105

106 ANOVA Post Hoc Tests: Performing a Post Hoc Tests: Dunnett
106

107 ANOVA Post Hoc Tests: Performing a Post Hoc Tests: t-test
107

108 Agenda 0. Lesson overview 1. Two-Sample t-Tests 2. One-Way ANOVA
3. ANOVA with Data from a Randomized Block Design 4. ANOVA Post Hoc Tests 5. Two-Way ANOVA with Interactions 6. Summary 108

109 Two-Way ANOVA with Interactions: Introduction
One-way ANOVA ? ? μ1 μ2 μ3 109

110 Two-Way ANOVA with Interactions: Introduction
One-way ANOVA Predictor Variable Response Variable + Levels 110

111 Two-Way ANOVA with Interactions: Introduction
Response Variable Predictor Variable Predictor Variable + Levels 111

112 Two-Way ANOVA with Interactions: Introduction
High alloy Low alloy Heat 1 Heat 2 Heat 3 Heat 4 112

113 Two-Way ANOVA with Interactions: Objective
Fit a two-way ANOVA Detect interactions between factors Analyze the treatments when there is a significant interaction 113

114 Two-Way ANOVA with Interactions: n-Way ANOVA
Response Variable Predictor Variable One-way ANOVA Response Variable Predictor Variable Predictor Variable N-way ANOVA More than one Predictor Variable More than one Predictor Variable 114

115 Two-Way ANOVA with Interactions: n-Way ANOVA
? Randomized block design N-way ANOVA blocking factor interested factor 115

116 Two-Way ANOVA with Interactions: interactions
No interaction 116

117 Two-Way ANOVA with Interactions: interactions
Interactions: alloys by heat setting.
When you analyze an n-way ANOVA with interactions, you should first look at any tests for interactions among factors.
If there are no interactions between the factors, the tests for the individual factor effects can be interpreted as true effects of those factors.
If an interaction exists between any factors, the tests for the individual factor effects might be misleading, due to masking of the effects by the interaction. This is especially true for unbalanced data.

118 Two-Way ANOVA with Interactions: The Two-Way ANOVA Model
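Two-way ANOVA model: $Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$
$\mu$: overall population mean, regardless of alloy and heating
$\alpha_i = \mu_i - \mu$: effect of level $i$ of the first factor
$\beta_j = \mu_j - \mu$: effect of level $j$ of the second factor
$(\alpha\beta)_{ij}$: effect of the interaction (slide note: when non-significant)
$\varepsilon_{ijk}$: error term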
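For a balanced design, the terms of the two-way model correspond to a decomposition of the total sum of squares into two main effects, an interaction, and error. The sketch below uses a made-up 2 x 2 layout with two replicates per cell and a hypothetical function name. It applies to balanced data only, where the sums of squares are additive.

```python
from statistics import mean

def two_way_ss(cells):
    """cells[i][j] lists the replicates for level i of factor A and level j of B."""
    a, b, n = len(cells), len(cells[0]), len(cells[0][0])
    values = [y for row in cells for cell in row for y in cell]
    grand = mean(values)
    a_means = [mean(y for cell in row for y in cell) for row in cells]
    b_means = [mean(y for i in range(a) for y in cells[i][j]) for j in range(b)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_a = b * n * sum((m - grand) ** 2 for m in a_means)
    ss_b = a * n * sum((m - grand) ** 2 for m in b_means)
    ss_ab = n * sum((cell_means[i][j] - a_means[i] - b_means[j] + grand) ** 2
                    for i in range(a) for j in range(b))
    ss_error = sum((y - cell_means[i][j]) ** 2
                   for i in range(a) for j in range(b) for y in cells[i][j])
    ss_total = sum((y - grand) ** 2 for y in values)
    return ss_a, ss_b, ss_ab, ss_error, ss_total

# Hypothetical balanced data: 2 levels of A, 2 levels of B, 2 replicates per cell.
cells = [[[1.0, 3.0], [5.0, 7.0]],
         [[4.0, 6.0], [2.0, 4.0]]]

ss_a, ss_b, ss_ab, ss_error, ss_total = two_way_ss(cells)
print(ss_a, ss_b, ss_ab, ss_error, ss_total)
```

In this toy data the main effect of factor A is zero while the interaction is large: the effect of A reverses direction across the levels of B, so a test of A alone would be misleading.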

119 Two-Way ANOVA with Interactions: Scenario: Using a Two-Way ANOVA
Response Variable Predictor Variable Predictor Variable Levels Levels 119

120 Two-Way ANOVA with Interactions: Scenario: Using a Two-Way ANOVA
Response Variable Predictor Variable Predictor Variable Blood pressure Disease types Drug doses A, B, C 100ml, 200ml, 300ml, placebo 120

121 Two-Way ANOVA with Interactions: Identify the data
Drug 121

122 Two-Way ANOVA with Interactions: Applying the model
Two-way ANOVA assumptions: Independent observations Normally distributed data Equal variances 122

123 Two-Way ANOVA with Interactions: The Two-Way ANOVA Model
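Observed BloodP for each patient = overall mean of BloodP + effect + effect + effect of interaction + error term:
$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}$, with $\alpha_i = \mu_i - \mu$ and $\beta_j = \mu_j - \mu$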

124 Two-Way ANOVA with Interactions: The Two-Way ANOVA Model
H0: None are statistically Different H0: ? 124

125 Two-Way ANOVA with Interactions: Examining Your Data
/* Create format, Method I, via EG UI */
/* Data lines are column-aligned so that $8. reads the whole dose label */
data drugdose;
   input dose $8. level;
   cards;
Placebo  1
50 mg    2
100 mg   3
200 mg   4
;
run;

/* Method II, by code */
proc format library=work;
   value dosefmt 1='Placebo'
                 2='50 mg'
                 3='100 mg'
                 4='200 mg';
run;

126 Two-Way ANOVA with Interactions: Examining Your Data
In which disease type does the drug dose appear to be most effective? 126

127 Two-Way ANOVA with Interactions: Performing Two-Way ANOVA with Interactions
Task> ANOVA> Linear Models 127

128 Two-Way ANOVA with Interactions: Performing Two-Way ANOVA with Interactions
The Type I SS are model-order dependent. Each effect is adjusted only for the preceding effects in the model. They are known as sequential sums of squares. They are useful in cases where the marginal effect of adding terms in a specific order is important. An example is a test of polynomials, where X, X*X, and X*X*X are in the model. Each term is tested controlling only for the lower-order terms. The Type I SS are additive: they sum to the Model Sum of Squares for the overall model.
The Type III sums of squares are commonly called partial sums of squares. The Type III sum of squares for a particular variable is the increase in the model sum of squares due to adding the variable to a model that already contains all the other variables in the model. The Type III sums of squares, therefore, do not depend on the order in which the explanatory variables are specified in the model. The Type III sums of squares are not generally additive (except in a completely balanced design). The values do not necessarily sum to the Model SS. You generally interpret and report results based on the Type III SS.

129 Two-Way ANOVA with Interactions: Performing a Post Hoc Pairwise Comparison
129

130 Two-Way ANOVA with Interactions: Performing a Post Hoc Pairwise Comparison
Given all of this information, it seems you would want to aggressively treat blood pressure in people with disease A with a high dose of the drug. For those with disease B, treating with the drug at all would be a mistake. For those with disease C, there seems to be no effect on blood pressure.

131 Agenda 0. Lesson overview 1. Two-Sample t-Tests 2. One-Way ANOVA
3. ANOVA with Data from a Randomized Block Design 4. ANOVA Post Hoc Tests 5. Two-Way ANOVA with Interactions 6. Summary 131

132 Question 9. If you want to compare the average monthly spending for males versus females, which statistical method should you choose?
a. One-Sample t-Tests
b. One-Way ANOVA
c. Two-Way ANOVA
Answer: b

133 Home Work: Exercise 1 1.1 Using the t Test for Comparing Groups
Elli Sageman, a Master of Education candidate in German Education at the University of North Carolina at Chapel Hill in 2000, collected data for a study: she looked at the effectiveness of a new type of foreign language teaching technique on grammar skills. She selected 30 students to receive tutoring; 15 received the new type of training during the tutorials and 15 received standard tutoring. Two students moved away from the district before completing the study. Scores on a standardized German grammar test were recorded immediately before the 12-week tutorials and then again 12 weeks later at the end of the trial. Sageman wanted to see the effect of the new technique on grammar skills.
The data are in the GERMAN data set:
Change: change in grammar test scores
Group: the assigned treatment, coded Treatment and Control
Assess whether the Treatment group changed the same amount as the Control group. Use a two-sided t-test.
Analyze the data using the t Test task. Assess whether the Treatment group improved more than the Control group.
Do the two groups appear to be approximately normally distributed?
Do the two groups have approximately equal variances?
Does the new teaching technique seem to result in significantly different change scores compared with the standard technique?

134 Home Work: Exercise 2 2.1 Analyzing Data in a Completely Randomized Design
Consider an experiment to study four types of advertising: local newspaper ads, local radio ads, in-store salespeople, and in-store displays. The country is divided into 144 locations, and 36 locations are randomly assigned to each type of advertising. The level of sales is measured for each region in thousands of dollars. You want to see whether the average sales are significantly different for various types of advertising.
The Ads data set contains data for these variables:
Ad: type of advertising
Sales: level of sales in thousands of dollars
Examine the data. Use the Summary Statistics task. What information can you obtain from looking at the data?
Test the hypothesis that the means are equal. Be sure to check that the assumptions of the analysis method that you choose are met. What conclusions can you reach at this point in your analysis?

135 Home Work: Exercise 3 3.1 Analyzing Data in a Randomized Block Design
When you design the advertising experiment in the first question, you are concerned that there is variability caused by the area of the country. You are not particularly interested in what differences are caused by Area, but you are interested in isolating the variability due to this factor.
The ads1 data set contains data for the following variables:
Ad: type of advertising
Area: area of the country
Sales: level of sales in thousands of dollars
Test the hypothesis that the means are equal. Include all of the variables in your model. What can you conclude from your analysis?
Was adding the blocking variable Area into the design and analysis detrimental to the test of Ad?

136 Home Work: Exercise 4 4.1 post Hoc Pairwise Comparisons
Consider again the analysis of the Ads1 data set. There was a statistically significant difference among means for sales for the different types of advertising. Perform a post hoc test to look at the individual differences among means for the advertising campaigns.
Conduct pairwise comparisons with an experimentwise error rate of α = 0.05 (use the Tukey adjustment). Which types of advertising are significantly different?
Use display (case sensitive) as the control group and do a Dunnett comparison of all other advertising methods to see whether those methods resulted in significantly different amounts of sales compared with display advertising in stores.

137 Home Work: Exercise 5 5.1 Performing Two-Way ANOVA
Consider an experiment to test three different brands of concrete and see whether an additive makes the cement in the concrete stronger. Thirty test plots are poured and the following features are recorded in the Concrete data set:
Strength: the measured strength of a concrete test plot
Additive: whether an additive was used in the test plot
Brand: the brand of concrete being tested
Use the Summary Statistics task to examine the data, with Strength as the analysis variable and Additive and Brand as the classification variables. What information can you obtain from looking at the data?
Test the hypothesis that the means are equal, making sure to include an interaction term if the results from the Summary Statistics output indicate that would be advisable. What conclusions can you reach at this point in your analysis?
Do the appropriate multiple comparisons test for statistically significant effects.

138 Thank you!

