Comparing ≥ 3 Groups Analysis of Biological Data/Biometrics

1 Comparing ≥ 3 Groups Analysis of Biological Data/Biometrics
Dr. Ryan McEwan Department of Biology University of Dayton

2 So far in class we have looked at situations where we have two groups to compare.
Like LeBron’s assists or house prices.

3 What do we do when there are more than 2 groups?

4 You may be tempted to t-test everything one at a time.
This would be statistically invalid! Instead, you first need to ask: is there an overall effect of the treatment?
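
A rough calculation shows why one-at-a-time t-tests are a problem. The sketch below assumes five groups and, as a simplification, treats the pairwise tests as independent. Real pairwise tests share data, so the number is only indicative.

```r
# Illustrative assumptions (not from the slides): 5 groups, alpha = 0.05,
# and pairwise tests treated as if they were independent.
alpha <- 0.05
k <- 5

m <- choose(k, 2)            # number of pairwise comparisons among k groups
fwer <- 1 - (1 - alpha)^m    # chance of at least one false positive overall

m               # 10
round(fwer, 2)  # 0.4
```

With ten uncorrected tests, the chance of at least one false positive is roughly 40% rather than the nominal 5%.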

5 This is done using Analysis of Variance (ANalysis Of VAriance = ANOVA)

6 Let’s grab some data from the book
comp=read.csv("C:/Users/rmcewan1/Documents/R/A_BioData/Data/competition.csv")

7 First test for an overall treatment effect
summary(aov(comp$biomass~comp$clipping))
or
anova(lm(comp$biomass~comp$clipping))
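
The two calls give the same overall F test. As a self-contained sketch, the example below uses R's built-in `PlantGrowth` data (columns `weight` and `group`) as a stand-in for the course's competition data, which is an assumption made only so the code runs without the CSV file.

```r
# Stand-in data: PlantGrowth ships with R (30 plants, 3 groups).
data(PlantGrowth)

fit_aov <- aov(weight ~ group, data = PlantGrowth)
fit_lm  <- lm(weight ~ group, data = PlantGrowth)

summary(fit_aov)   # ANOVA table from aov()
anova(fit_lm)      # same table from the linear-model route

# Pull out the overall P-value from each route; they should agree.
p_aov <- summary(fit_aov)[[1]][["Pr(>F)"]][1]
p_lm  <- anova(fit_lm)[["Pr(>F)"]][1]
all.equal(p_aov, p_lm)
```

Passing `data =` instead of writing `dataset$column` on both sides of the formula is a style choice; both forms fit the same model.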

8 Second test for individual differences
PAIRWISE COMPARISONS aka “Compare all Pairs”

15 Second test for individual differences
PAIRWISE COMPARISONS aka “Compare all Pairs”
Note that a lot of comparisons are being made in this kind of process, and that has implications.

16 Bonferroni correction
Basic idea: as you increase the number of comparisons, you increase the probability of making at least one Type I error (false positive). So you need to make the criterion for each individual test more stringent. With m comparisons, either compare each P-value to α/m or, equivalently, multiply each P-value by m (capped at 1) and compare it to α. This means you cannot just run regular, uncorrected t-tests as a post-hoc procedure; you have to apply a correction. Post-hoc pairwise procedures that follow ANOVA have these corrections built in.
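
A small worked example of the correction, using three made-up P-values (hypothetical numbers, not results from any data set). Base R's `p.adjust()` does the multiplication.

```r
# Hypothetical raw P-values from three pairwise comparisons.
p_raw <- c(0.01, 0.02, 0.04)
m <- length(p_raw)

# Bonferroni: multiply each P-value by the number of tests (capped at 1).
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_bonf            # 0.03 0.06 0.12

# Equivalent view: keep the raw P-values and shrink the threshold to alpha / m.
p_bonf <= 0.05        # TRUE FALSE FALSE
p_raw  <= 0.05 / m    # TRUE FALSE FALSE
```

All three raw P-values are below 0.05, but only the first survives the correction.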

17 Second test for individual differences
PAIRWISE COMPARISONS aka “Compare all Pairs”
This is often referred to as a post-hoc procedure.
pairwise.t.test(comp$biomass, comp$clipping, p.adjust.method="bonferroni")
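
A runnable sketch of the Bonferroni-corrected pairwise t-tests, again using the built-in `PlantGrowth` data as a stand-in (an assumption, so the example does not depend on the course CSV). The correction is requested through the `p.adjust.method` argument; if that argument is misspelled, the function may fall back to its default correction without any error, so it is worth checking the method named in the output.

```r
data(PlantGrowth)

pw <- pairwise.t.test(PlantGrowth$weight, PlantGrowth$group,
                      p.adjust.method = "bonferroni")
pw                    # matrix of corrected P-values, one per pair of groups

pw$p.adjust.method    # confirms which correction was actually applied
```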

18 Second test for individual differences
PAIRWISE COMPARISONS aka “Compare all Pairs”
This is often referred to as a post-hoc procedure.
Another version is the Tukey test (also called the Tukey-Kramer Multiple Comparisons Test):
TukeyHSD(aov(comp$biomass~comp$clipping))
Generally, the Tukey HSD test is considered a superior method and we will adopt this for our class.
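
A minimal sketch of the Tukey test on stand-in data (the built-in `PlantGrowth` set, assumed here only to keep the example self-contained). The result holds one row per pair of groups, with the difference in means, a confidence interval, and an adjusted P-value.

```r
data(PlantGrowth)

fit <- aov(weight ~ group, data = PlantGrowth)
tk  <- TukeyHSD(fit)

tk            # printed table of all pairwise comparisons
tk$group      # same table as a matrix: diff, lwr, upr, p adj
```

A pair differs significantly when its `p adj` is at or below 0.05, which matches its confidence interval excluding zero.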

19 ANalysis Of Variance = ANOVA
Assumptions!
-Independence
-Normality
-Equal Variance

20 ANalysis Of Variance = ANOVA
Assumptions!
-Independence
-Normality
-Equal Variance
Guidelines
-One-way ANOVA is robust to minor violations of normality and equal variance.
-Assessing normality using tests (Shapiro-Wilk, etc.) is losing favor among some statisticians.
-Graphical methods (histogram, normal Q-Q plots) remain useful; however, they are also somewhat ambiguous.
General practice we will adopt for this class:
(a) Exploratory data analysis, including outlier scanning
(b) Graphical methods
(c) Tests of both normality and equal variance
(d) As the analyst, take all of this into account, make a decision, and report it
-Generally speaking, the data should be “a mess” before you resort to the non-parametric approach.
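
Steps (a) through (c) can be sketched as follows. The built-in `PlantGrowth` data stands in for a real data set (an assumption for illustration), and the Shapiro-Wilk test is applied to the response itself; some analysts apply it to the model residuals instead.

```r
data(PlantGrowth)
fit <- aov(weight ~ group, data = PlantGrowth)

# (a) Exploratory look: boxplots flag outliers and show spread per group.
boxplot(weight ~ group, data = PlantGrowth)

# (b) Graphical methods: histogram, then the four diagnostic plots.
hist(PlantGrowth$weight)
old_par <- par(mfrow = c(2, 2))   # 2 x 2 grid so all four plots fit
plot(fit)
par(old_par)                      # restore the previous plot layout

# (c) Formal tests: P <= 0.05 suggests the assumption is violated.
norm_test <- shapiro.test(PlantGrowth$weight)
var_test  <- bartlett.test(weight ~ group, data = PlantGrowth)
norm_test$p.value
var_test$p.value
```

Step (d) stays with the analyst: the plots and the two P-values are inputs to a judgment, not a verdict on their own.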

21 If you decide that you need to run a non-parametric test because your data do not meet the assumptions of ANOVA, run the Kruskal-Wallis test:
kruskal.test(dataset$measurement~dataset$group)
It is important to note that this test is less powerful for detecting differences. It is more conservative, so you run a greater risk of a Type II error (false negative).
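
A runnable sketch of the Kruskal-Wallis test, using the built-in `PlantGrowth` data as a stand-in (assumed only for illustration). The test works on ranks rather than raw values, and its degrees of freedom equal the number of groups minus one.

```r
data(PlantGrowth)

kw <- kruskal.test(weight ~ group, data = PlantGrowth)
kw               # chi-squared statistic, df, and overall P-value

kw$parameter     # df = number of groups - 1
kw$p.value       # overall P-value, read the same way as the ANOVA P-value
```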

22 For non-normal data, you can follow up with the pairwise Wilcoxon test.
pairwise.wilcox.test(data$measures, data$group, p.adjust.method="bonferroni")
It is important to note that this test is less powerful for detecting differences. It is more conservative, so you run a greater risk of a Type II error (false negative).
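
A minimal sketch of the pairwise Wilcoxon test on stand-in data (the built-in `PlantGrowth` set, an assumption for illustration). The quotes around `"bonferroni"` must be plain straight quotes; curly quotes pasted from a slide will cause a syntax error. `exact = FALSE` is passed through to `wilcox.test()` because these data contain tied values, for which an exact P-value cannot be computed.

```r
data(PlantGrowth)

pw_w <- pairwise.wilcox.test(PlantGrowth$weight, PlantGrowth$group,
                             p.adjust.method = "bonferroni",
                             exact = FALSE)   # normal approximation (ties)
pw_w                   # corrected P-value for each pair of groups
pw_w$p.adjust.method   # confirms the correction that was applied
```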

23 (1) Exploratory Data Analysis (screen for outliers, look at the range of variation, etc.)
(2) Test for a normal distribution and equal variance
-hist(DATA) (use this to take a look at the distributions)
-use plot(aov( )) to create 4 plots. FIRST set par(mfrow=c(2,2))
-Shapiro-Wilk test: shapiro.test(DATA) [P ≤ 0.05 = not normal]
-Bartlett test for homogeneity of variance: bartlett.test(Data ~ Site) [P ≤ 0.05 = unequal variance]
(3) Choose the overall test
-Reject the null hypothesis of a normal distribution (“Non-NORMAL”): kruskal.test(data$measures~data$group)
-Cannot reject the null of a normal distribution (“NORMAL”): anova(lm(data$measures~data$group))
(4) Read the overall P-value (same rule on both branches)
-P > 0.05: cannot reject the null; treatments are statistically indistinguishable. Analysis complete.
-P ≤ 0.05: significant treatment effect! What are the individual differences?
(5) Individual differences
-“NORMAL” branch: TukeyHSD(aov(data$measures~data$group))
-“Non-NORMAL” branch: pairwise.wilcox.test(data$measures, data$group, p.adjust.method="bonferroni")
This code will show you the difference between each pair of treatments in the data set. You need to report the result of the overall test (ANOVA or Kruskal-Wallis) and the individual comparisons.
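
The decision flow on this slide can be sketched as a small function. `compare_groups` is a hypothetical helper written for illustration, not part of R or the course materials, and it assumes that a failure of either the Shapiro-Wilk or the Bartlett test sends the analysis down the non-parametric branch. It applies the rule mechanically; it does not replace looking at the plots and making a judgment.

```r
# Hypothetical helper: mechanical version of the slide's flowchart.
compare_groups <- function(y, g, alpha = 0.05) {
  g <- factor(g)
  # Step 2: formal checks (P <= alpha means the assumption is rejected).
  normal_ok   <- shapiro.test(y)$p.value > alpha
  variance_ok <- bartlett.test(y ~ g)$p.value > alpha

  # Step 3: choose and run the overall test.
  if (normal_ok && variance_ok) {
    method    <- "ANOVA"
    overall_p <- anova(lm(y ~ g))[["Pr(>F)"]][1]
    posthoc   <- function() TukeyHSD(aov(y ~ g))
  } else {
    method    <- "Kruskal-Wallis"
    overall_p <- kruskal.test(y ~ g)$p.value
    posthoc   <- function() pairwise.wilcox.test(
      y, g, p.adjust.method = "bonferroni", exact = FALSE)
  }

  # Steps 4-5: pairwise comparisons only after a significant overall test.
  list(method = method,
       overall_p = overall_p,
       posthoc = if (overall_p <= alpha) posthoc() else NULL)
}

# Usage on stand-in data (built-in PlantGrowth, assumed for illustration).
data(PlantGrowth)
res <- compare_groups(PlantGrowth$weight, PlantGrowth$group)
res$method
res$overall_p
res$posthoc
```

Returning `NULL` for `posthoc` when the overall test is not significant mirrors the "Analysis complete" box on the slide.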

24 -hist(DATA) (use this to take a look at the distributions)
-use plot(aov( )) to create 4 plots. FIRST set par(mfrow=c(2,2))
-Shapiro-Wilk test: shapiro.test(DATA)
-Bartlett test for homogeneity of variance: bartlett.test(Data ~ Site)

25 Overall treatment effect P = 0.0087
[Figure: biomass by clipping treatment. Group letters are added to the five treatments one at a time across the build slides, ending with B, B, AB, AB, A.]

33 Overall treatment effect P = 0.0087
Group letters on the figure: B, B, AB, AB, A
boxplot(comp$biomass~comp$clipping,ylab="mean biomass", xlab="competition treatment",col="darkgreen")
hist(comp$biomass)
par(mfrow=c(2,2))
plot(aov(comp$biomass~comp$clipping))
shapiro.test(comp$biomass)
bartlett.test(comp$biomass~comp$clipping)
anova(lm(comp$biomass~comp$clipping))
TukeyHSD(aov(comp$biomass~comp$clipping))

35 [Figure: group letters B, B, A]

36 The ANalysis Of Variance = ANOVA Family
-Repeated Measures ANOVA: if you measure the same experimental units many times
-Two-way ANOVA: if you have more than one treatment influencing a single set of experimental units. Interaction term!!
-Randomized Complete Block ANOVA (RCB-ANOVA): if your experiment is a block design (more on this later)
-ANCOVA: if your experiment has a “Co-Variate”
-MANOVA: Multivariate ANOVA, for large, complex data sets with more than one response variable
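
As a brief illustration of the two-way case and its interaction term, the sketch below uses R's built-in `ToothGrowth` data (response `len`, treatments `supp` and `dose`). This data set is an assumption chosen for convenience and is not part of the course materials.

```r
data(ToothGrowth)
ToothGrowth$dose <- factor(ToothGrowth$dose)   # treat dose as a grouping factor

# supp * dose expands to supp + dose + supp:dose (the interaction term).
fit2 <- aov(len ~ supp * dose, data = ToothGrowth)
tab  <- summary(fit2)[[1]]
tab   # one row each for supp, dose, supp:dose, and Residuals
```

A significant `supp:dose` row would mean the effect of one treatment depends on the level of the other.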

