Presentation is loading. Please wait.

Presentation is loading. Please wait.

Data Analysis Using R: 5. Analysis of Variance Tuan V. Nguyen Garvan Institute of Medical Research, Sydney, Australia.

Similar presentations


Presentation on theme: "Data Analysis Using R: 5. Analysis of Variance Tuan V. Nguyen Garvan Institute of Medical Research, Sydney, Australia."— Presentation transcript:

1 Data Analysis Using R: 5. Analysis of Variance Tuan V. Nguyen Garvan Institute of Medical Research, Sydney, Australia

2 ANOVA and the concept of “Effect” ABC There are differences between groups, but no differences within group. The model is now: –Y ij =  +  j where  = 40;  1 = -2,  2 = 6 and  3 = -4. Note that  1 +  2 +  3 = 0 ABC

3 ANOVA and the concept of “Effect” ABC ABC overall mean: 41.1 In reality, there is always random variation in a population, so that there is sampling error. The model now includes an error term: Y ij =  +  j +  ij Effect of product A: = -1.8 product B: = 5.8 product C: = -4.4

4 ANOVA Model Partition of variation into –Between groups –Within groups The model: Y ij =  j  ij Assumptions: Normality Independence Homogeneity Var(Y) = Var(  ) + Var(  ) + Var(  ) = Var(  ) + Var(  )

5 Variation between groups ABC Mean Overall mean: 41.1 The sum of squares for difference between groups: ( ) 2 + ( ) 2 + ( ) 2 = But the mean of each group is calculated from 3 observations. So the “true” sum of squares is: SSb = 3*( ) 2 + 3*( ) 2 + 3*( ) 2 = Degrees of freedom: (3 groups – 1) = 2.

6 Variation within groups ABC Mean SS for group A: SS1 = (43 – 39.3) 2 + (40 – 39.3) 2 + (35 – 39.3) 2 = 32.7 SS for group B: SS2 = (41 – 47.3) 2 + (47 – 47.3) 2 + (54 – 47.3) 2 = 84.7 SS for group C: SS3 = (39 – 36.7) 2 + (34 – 36.7) 2 + (37 – 36.7) 2 = 12.7 SS for within group: SSW = SS1+SS2+SS3 = Degrees of freedom: (3 – 1) + (3 – 1) + (3 – 1) = 6

7 Summary of Analysis Source of variationDFSSMS Among groups Within groups Total F statistic = MSa / MSw = 92.4 / 21.7 = 4.27 P value associated with (2, 6) df: 0.07

8 ANOVA by R group <- c(1,1,1,2,2,2,3,3,3) y <- c(43, 40, 35, 41, 47, 54, 39, 34, 37) group <- as.factor(group) analysis <- lm(y ~ group) summary(analysis) anova(analysis) ABC

9 Summary of Variation > anova(analysis) Response: y Df Sum Sq Mean Sq F value Pr(>F) group Residuals Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

10 Estimate of Treatment Effects > summary(analysis)... Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) e-06 *** group group Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 Residual standard error: on 6 degrees of freedom Multiple R-Squared: , Adjusted R-squared: F-statistic: on 2 and 6 DF, p-value:

11 Multiple Comparisons: Tukey’s Method res <- aov(y ~ group) TukeyHSD (res) Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = y ~ group) $group diff lwr upr p adj

12 Multiple Comparisons: Tukey’s Method plot(TukeyHSD(res), ordered=T)

13 Graphical Analysis average <- tapply(y, group, mean) std <- tapply(y, group, sd) ss <- tapply(y, group, length) sem <- std/sqrt(ss) stripchart(y ~ group, "jitter", jit=0.05, pch=16, vert=TRUE) arrows(1:3, average+sem, 1:3, average-sem, angle=90, code=3, length=0.1) lines(1:3, average, pch=4, type="b", cex=2)

14 Graphical Analysis

15 Factorial ANOVA VarietyPesticideTotal 1234 B B B Tổng số Model: product = a + b(variety) + g(pesticide) + e

16 Factorial ANOVA by R VarietyPesticideTotal 1234 B B B Tổng số variety <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3) pesticide <- c(1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4) product <- c(29,50,43,53,41,58,42,73,66,85,69,85) variety <- as.factor(variety) pesticide <- as.factor(pesticide) data <- data.frame(variety, pesticide, product)

17 Factorial ANOVA by R analysis <- aov(product ~ variety + pesticide) anova(analysis) Analysis of Variance Table Response: product Df Sum Sq Mean Sq F value Pr(>F) variety *** pesticide ** Residuals Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

18 Multiple Comparisons > TukeyHSD(analysis) Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = product ~ variety + pesticide) $variety diff lwr upr p adj $pesticide diff lwr upr p adj

19 Multiple Comparisons > plot(TukeyHSD(analysis), ordered=TRUE)

20 Latin-square ANOVA PlotVariety Aa 143 Ba 128 Bb 166 Ab 2170 Ab 178 Aa 140 Ba 131 Bb 3135 Bb 173 Ab 169 Aa 141 Ba 4145 Ba 136 Bb 165 Ab 173 Aa

21 Latin-square ANOVA: summary PlotVariety Aa 143 Ba 128 Bb 166 Ab 2170 Ab 178 Aa 140 Ba 131 Bb 3135 Bb 173 Ab 169 Aa 141 Ba 4145 Ba 136 Bb 165 Ab 173 Aa Mean by varietyMean by plotMean by method 1: : : : Overall mean: : : : : Overall mean: (Aa): (Ab): (Ba): (Bb): Overall mean:

22 Latin-square ANOVA by R PlotVariety Aa 143 Ba 128 Bb 166 Ab 2170 Ab 178 Aa 140 Ba 131 Bb 3135 Bb 173 Ab 169 Aa 141 Ba 4145 Ba 136 Bb 165 Ab 173 Aa y <- c(175, 143, 128, 166, 170, 178, 140, 131, 135, 173, 169, 141, 145, 136, 165, 173) variety <- c(1,2,3,4, 1,2,3,4, 1,2,3,4, 1,2,3,4,) sample <- c(1,1,1,1, 2,2,2,2, 3,3,3,3, 4,4,4,4) method <- c(1, 3, 4, 2, 2, 1, 3, 4, 4, 2, 1, 3, 3, 4, 2, 1) variety <- as.factor(variety) sample <- as.factor(sample) method <- as.factor(method)

23 Latin-square ANOVA by R latin <- aov(y ~ sample + variety + method) summary(latin) Df Sum Sq Mean Sq F value Pr(>F) sample variety *** method e-09 *** Residuals Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

24 Latin-square – Multiple Comparisons > TukeyHSD(latin) $variety diff lwr upr p adj $method diff lwr upr p adj

25 Graphical Analysis boxplot(y ~ method, xlab="Methods (1=Aa, 2=Ab, 3=Ba, 4=Bb", ylab="Production")

26 Cross-over Study ANOVA NhómMã số bệnh nhân số (id) Thời gian (phút) ra mồ hôi trên trán Tháng 1Tháng 2 ABAPlacebo BAPlaceboA

27 Cross-over Study ANOVA by R y <- c(6,8,12,7,9,6,11,8, 4,7,6,8,10,4,6,8, 5,9,7,4,9,5,8,9 7,6,11,7,8,4,9,13) seq <- c(1,1,1,1,1,1,1,1, 1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2, 2,2,2,2,2,2,2,2) period <- c(1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2, 2,2,2,2,2,2,2,2, 1,1,1,1,1,1,1,1) treat <- c(1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2, 1,1,1,1,1,1,1,1, 2,2,2,2,2,2,2,2) id <- c(1,3,5,6,9,10,13,15, 1,3,5,6,9,10,13,15, 2,4,7,8,11,12,14,16, 2,4,7,8,11,12,14,16) seq <- as.factor(seq) period <- as.factor(period) treat <- as.factor(treat) id <- as.factor(id) data <- data.frame(seq, period, treat, id, y)

28 Cross-over Study ANOVA by R xover <- lm(y ~ treat + seq + period) anova(xover) Analysis of Variance Table Response: y Df Sum Sq Mean Sq F value Pr(>F) treat * seq period id Residuals Signif. codes: 0 '***' '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

29 Cross-over Study ANOVA by R > TukeyHSD(aov(y ~ treat+seq+period+id)) Tukey multiple comparisons of means 95% family-wise confidence level Fit: aov(formula = y ~ treat + seq + period + id) $treat diff lwr upr p adj $seq diff lwr upr p adj $period diff lwr upr p adj


Download ppt "Data Analysis Using R: 5. Analysis of Variance Tuan V. Nguyen Garvan Institute of Medical Research, Sydney, Australia."

Similar presentations


Ads by Google