Design of experiment: ANOVA and testing hypotheses


Design of experiment: ANOVA and testing hypotheses
Interactions
Contrasts
Multiple hypothesis tests

ANOVA tables
A standard ANOVA table looks like the one below, where v1, …, vp are the values (effects) we want to test against zero, the df column gives the degrees of freedom corresponding to these values, and the SSh column the corresponding sums of squares (h denotes hypothesis). F is the F-value, referred to an F distribution with (di, de) degrees of freedom, and prob is the corresponding probability. If a particular probability is very low (say, less than 0.05) we reject the hypothesis that the corresponding value is 0. These values are calculated using the likelihood ratio test. Suppose we want to test the hypothesis H0: vi = 0 vs H1: vi ≠ 0. We maximise the likelihood under the null hypothesis to find the corresponding variance, then maximise the likelihood under the alternative hypothesis and find its variance. From these we calculate the sums of squares for the null and alternative hypotheses and form the F-statistic.

effect   df   SSh   MS            F         prob
v1       d1   SS1   MS1=SS1/d1    MS1/MSe   pr1
...
vp       dp   SSp   MSp=SSp/dp    MSp/MSe   prp
error    de   SSe   MSe=SSe/de
total    N    SSt

LR test for ANOVA
Let SSh denote the sum of squares for the hypothesis and SSe the error (residual) sum of squares, with degrees of freedom dfh and dfe. The mean sums of squares for the null and alternative hypotheses are MSh = SSh/dfh and MSe = SSe/dfe, and the test statistic is F = MSh/MSe. Since SSh/σ² is χ²-distributed with dfh degrees of freedom, SSe/σ² is χ²-distributed with dfe degrees of freedom, and the two are independent, their ratio of mean squares has an F-distribution with (dfh, dfe) degrees of freedom. In the simplest case the degrees of freedom of the hypothesis are the number of levels of the factor minus 1; in general they equal the number of constraints the hypothesis implies. The degrees of freedom of the error are, as usual, the number of observations minus the number of estimated parameters. Using this type of ANOVA table we can only tell whether there are significant differences between the means; it does not tell us which means differ.
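As an illustration, here is a minimal R sketch of how the F value and its p-value can be recomputed by hand from the sums of squares in an ANOVA table (the numbers are rounded from the poison example that follows):

SSh <- 1.04; dfh <- 2     # hypothesis sum of squares and its degrees of freedom
SSe <- 0.83; dfe <- 36    # error sum of squares and its degrees of freedom
MSh <- SSh/dfh            # mean square for the hypothesis
MSe <- SSe/dfe            # mean square for the error
F   <- MSh/MSe            # F statistic
pf(F, dfh, dfe, lower.tail = FALSE)   # p-value: P(F[dfh,dfe] > observed F)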

Example: Two-way ANOVA
Let us consider an example taken from Box, Hunter and Hunter. The experiment was done on animals: survival times were recorded for various combinations of poison and treatment. Each poison-by-treatment cell contains four replicate survival times (treatments A–D in columns, poisons I–III in rows):

              A      B      C      D
poison I    0.31   0.82   0.43   0.45
            0.45   1.10   0.45   0.71
            0.46   0.88   0.63   0.66
            0.43   0.72   0.76   0.62
poison II   0.36   0.92   0.44   0.56
            0.29   0.61   0.35   1.02
            0.40   0.49   0.31   0.71
            0.23   1.24   0.40   0.38
poison III  0.22   0.30   0.23   0.30
            0.21   0.37   0.25   0.36
            0.18   0.38   0.24   0.31
            0.23   0.29   0.22   0.33
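As a sketch (assuming the data above are stored in a data frame called poisons with a numeric column time and factor columns poison and treat, as in the R calls used later), the two-way model with interaction can be fitted like this:

lm0 <- aov(time ~ poison * treat, data = poisons)   # main effects plus interaction
summary(lm0)                                        # ANOVA table as on the next slide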

ANOVA table
ANOVA table produced by R:

            Df  Sum Sq  Mean Sq  F value    Pr(>F)
pois         2 1.03828  0.51914  22.5135 4.551e-07 ***
treat        3 0.92569  0.30856  13.3814 5.057e-06 ***
pois:treat   6 0.25580  0.04263   1.8489    0.1170
Residuals   36 0.83013  0.02306

The most important values are F and Pr(>F). In this table we have tests for the main effects, pois and treat, and for the "interaction" between these two factors. If there are significant interactions then it is difficult to separate the effects of the two factors; they should be considered simultaneously. In this case the probability for the interaction is not very small, but it is not large enough to discard interaction effects with confidence. In such situations a transformation of the observations might help. Let us consider the ANOVA table for the transformed observations, using the transformation 1/y. Now the ANOVA table looks like:

            Df  Sum Sq  Mean Sq  F value    Pr(>F)
pois         2  34.903   17.452  72.2347 2.501e-13 ***
treat        3  20.449    6.816  28.2131 1.457e-09 ***
pois:treat   6   1.579    0.263   1.0892    0.3874
Residuals   36   8.697    0.242
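A sketch of the fit on the transformed scale, continuing from the previous sketch and again assuming the poisons data frame:

lm1 <- aov(1/time ~ poison * treat, data = poisons)   # reciprocal of survival time
summary(lm1)                                          # second table above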

ANOVA table
According to this table the probability corresponding to the interaction term is high, meaning that the interaction for the transformed observations is not significant, so we can remove the interaction term. The ANOVA table without the interaction looks like:

            Df  Sum Sq  Mean Sq  F value    Pr(>F)
pois         2  34.903   17.452   71.326 3.124e-14 ***
treat        3  20.449    6.816   27.858 4.456e-10 ***
Residuals   42  10.276    0.245

Now we can say that there are significant differences between poisons as well as between treatments. It is sometimes useful to apply a transformation to reduce the effect of interactions; several different transformations (inverse, inverse square, log) can be tried.
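A sketch of dropping the interaction term, continuing from the fit lm1 in the previous sketch:

lm2 <- update(lm1, . ~ . - poison:treat)   # remove the poison:treat interaction
summary(lm2)                               # table without the interaction row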

Interactions
For one-way ANOVA we need 1) to find out whether there is a significant effect (difference between means); 2) to analyse where this difference is located. For two-way or higher designs we first need to consider interactions. We would like to remove interactions if we can; otherwise the tests for the main effects (differences between means of observations corresponding to different levels of the factors) may not be reliable. If we see significant interactions we can try transformations of the data; the Box and Cox family of variance-stabilising transformations is one way of removing interactions (a sketch is given below). If we find a suitable transformation we can carry out further analysis on the transformed data, or use a GLM with the corresponding link function. If the interactions cannot be removed then we may need to combine all pairs of levels together and work with them as a one-way ANOVA with IJ levels, where I is the number of levels of the first factor and J is the number of levels of the second factor.
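A minimal sketch of the Box and Cox approach using the boxcox function from the MASS package, again assuming the poisons data frame used above:

library(MASS)
bc <- boxcox(time ~ poison * treat, data = poisons)   # profile likelihood over lambda
bc$x[which.max(bc$y)]   # lambda with the highest likelihood; a value near -1
                        # supports the 1/y transformation used above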

What to do if there is an effect
An ANOVA table tells us whether there is an effect somewhere, but it does not say where that effect is. If we conclude that there are significant differences then we should carry on and find out where these differences are. One way (not recommended) is to look at the confidence intervals produced by confint. For example, for the poison data:

                   2.5 %      97.5 %
(Intercept)   0.22201644  0.40631690
poison1      -0.09299276  0.01986776
poison2      -0.13414253 -0.06898247
treatB        0.23217989  0.49282011
treatC       -0.05198677  0.20865344
treatD        0.08967989  0.35032011

Since these intervals are calculated after deciding that there are significant main effects, they are meant to be more reliable (we already know that there are effects, so the effects seen in this table may correspond to effects that really exist). This approach is called Fisher's least significant difference (Fisher's LSD). Confidence intervals based directly on the t distribution are very optimistic when many tests are performed simultaneously.
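A sketch of how such a table is obtained (assuming the additive model for the poisons data; the exact row labels depend on the contrast coding in use):

lmA <- lm(time ~ poison + treat, data = poisons)   # additive model, no interaction
confint(lmA)                                       # unadjusted 95% intervals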

Contrasts
Let us say we have m parameters β1, …, βm (with estimates) and a vector c with m elements satisfying Σi ci = 0; for unbalanced ANOVA the weighted condition Σi ni ci = 0 should be used instead. Then we can test the hypothesis H0: Σi ci βi = 0; the linear combination Σi ci βi is called a contrast. If we have an m×k matrix C then we can form k contrasts, and C is called a contrasts matrix. Usually this matrix is orthogonal, i.e. Σj cij = 0 and Σj cij ckj = 0 for i ≠ k. In ANOVA we are interested in effects, i.e. in differences between the effects of different levels of the factors, and these can be written using a contrasts matrix. The default in R, contr.treatment, compares each level with a baseline level. Note that R uses contrasts by default: during estimation it is usually not the original parameters that are estimated but those corresponding to the contrasts. Finding where the effect is becomes a multiple hypothesis testing problem.
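A short sketch of the built-in contrast matrices in R for a factor with four levels; contr.sum and contr.helmert produce columns that sum to zero, while contr.treatment compares each level with the first:

contr.treatment(4)           # dummy coding against the first level
contr.sum(4)                 # sum-to-zero coding
contr.helmert(4)             # orthogonal Helmert contrasts
colSums(contr.helmert(4))    # each column sums to zero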

Multiple comparison
One comparison: use a t-test or equivalent.
A few comparisons: use the Bonferroni correction.
Many comparisons: use Tukey's honest significant differences, Holm's procedure, Scheffe's method or others.

Bonferroni correction
If there is only one comparison we can use a t-test or intervals based on the t distribution. However, as the number of tests increases, the probability that a significant effect will be observed when there is no real effect becomes very large. It can be calculated as 1-(1-α)^n, where α is the significance level and n is the number of comparisons. For example, if the significance level is 0.05 and the number of comparisons (tests) is 10, the probability that at least one significant effect will be detected by chance is 1-(1-0.05)^10 ≈ 0.40. Bonferroni suggested using α/n instead of α when designing simultaneous confidence intervals, so the intervals are calculated as

   μ̂i - μ̂j ± t_{α/(2n)} × (standard error of the comparison)

Clearly, when n becomes very large these intervals become very conservative, so the Bonferroni correction is recommended only when a few effects are compared. pairwise.t.test in R can apply the Bonferroni correction: when reporting p-values, each p-value is multiplied by the number of comparisons. (Note that if we are testing the effects of I levels of a factor then the number of comparisons is I(I-1)/2.)
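A minimal sketch of the same adjustment done directly with p.adjust, using the three unadjusted p-values that appear on the next slide:

p.raw <- c(0.32844, 3.3e-05, 0.00074)     # unadjusted pairwise p-values
p.adjust(p.raw, method = "bonferroni")    # each multiplied by 3, capped at 1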

Bonferroni correction: Example
Let us take the example data set, poisons, and try the Bonferroni correction for the poison factor:

pairwise.t.test(poisons$time, poisons$poison, p.adjust.method = "none")
          1       2
2   0.32844       -
3   3.3e-05 0.00074

pairwise.t.test(poisons$time, poisons$poison, p.adjust.method = "bonferroni")
       1      2
2 0.9853      -
3  1e-04 0.0022

As can be seen, each p-value is multiplied by the number of comparisons, 3*2/2 = 3; if an adjusted p-value exceeds one it is truncated to one. The output says that there are significant differences between the effects of poisons 1 and 3 and between poisons 2 and 3, while the difference between the effects of poisons 1 and 2 is not significant. Note: the R command pairwise.t.test can be used for one-way ANOVA only.

Holm correction
Another correction for multiple tests, the Holm correction, is less conservative than the Bonferroni correction. It is a modification of the Bonferroni correction that works in a sequential manner. Suppose we need to make n comparisons at significance level α. We calculate the p-values for all of them, sort them in increasing order and apply the following procedure:

1) set i = 1
2) if p(i) < α/(n-i+1) then comparison i is significant, otherwise it is not and we stop
3) if comparison i is significant, increment i by one and, if i ≤ n, go to step 2

The number of significant effects is i - 1 at the point where the procedure stops. When reporting p-values the Holm correction works like Bonferroni but sequentially: with m comparisons, the smallest p-value is multiplied by m, the second smallest by m-1, and in general the j-th smallest by m-j+1 (see the sketch below).
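The same three p-values adjusted with Holm's method via p.adjust (a sketch; R additionally enforces monotonicity of the adjusted values):

p.raw <- c(0.32844, 3.3e-05, 0.00074)
p.adjust(p.raw, method = "holm")   # smallest multiplied by 3, next by 2, largest by 1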

Holm correction: example
Let us take the same data set, poisons, and try the Holm correction for the poison factor:

pairwise.t.test(poisons$time, poisons$poison, p.adjust.method = "none")
          1       2
2   0.32844       -
3   3.3e-05 0.00074

pairwise.t.test(poisons$time, poisons$poison, p.adjust.method = "holm")   # this correction is the default in R
       1      2
2 0.3284      -
3  1e-04 0.0015

The smallest p-value is multiplied by 3, the second smallest by 2 and the largest by 1. The conclusion is the same as before: there are significant differences between the effects of poisons 1 and 3 and between poisons 2 and 3, while the difference between the effects of poisons 1 and 2 is not significant.

Tukey's honest significant difference
This test is used to calculate simultaneous confidence intervals for the differences of all effects. It is based on Tukey's range distribution: if we have a random sample of size N from a normal distribution, then the distribution of the studentised range, (max_i(x_i) - min_i(x_i))/sd, is called Tukey's distribution. Suppose we want to test whether μi - μj is 0. For simultaneous 100(1-α)% confidence intervals we calculate, for all pairs, the lower and upper limits of the interval as

   difference ± q_{α,l,ν} × sd × (1/Ji + 1/Jj)^{1/2} / √2

where q_{α,l,ν} is the α-quantile of Tukey's distribution, Ji and Jj are the numbers of observations used to calculate μ̂i and μ̂j, sd is the (pooled) standard deviation, l is the number of levels being compared and ν is the degrees of freedom used to calculate sd.
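Quantiles of Tukey's studentised range distribution are available in R through qtukey and ptukey; a sketch with the numbers relevant to the poisons example (3 poison levels, 42 error degrees of freedom):

qtukey(0.95, nmeans = 3, df = 42)                       # 95% point of the studentised range
ptukey(4.0, nmeans = 3, df = 42, lower.tail = FALSE)    # upper tail probability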

Tukey's honest significant difference
The R command to perform this test is TukeyHSD. It takes an object produced by aov as input and gives confidence intervals for all possible differences. For example, for the poison data (to use this command you should fit the model with aov):

lm1 = aov(time~poison+treat,data=poisons)
TukeyHSD(lm1)
$poison
        diff        lwr        upr     p adj
2-1 -0.073125 -0.2089936  0.0627436 0.3989657   # insignificant
3-1 -0.341250 -0.4771186 -0.2053814 0.0000008   # significant
3-2 -0.268125 -0.4039936 -0.1322564 0.0000606   # significant

$treat
          diff         lwr         upr     p adj
B-A  0.36250000  0.18976135  0.53523865 0.0000083   # significant
C-A  0.07833333 -0.09440532  0.25107198 0.6221729   # insignificant
D-A  0.22000000  0.04726135  0.39273865 0.0076661   # significant
C-B -0.28416667 -0.45690532 -0.11142802 0.0004090   # significant
D-B -0.14250000 -0.31523865  0.03023865 0.1380432   # insignificant
D-C  0.14166667 -0.03107198  0.31440532 0.1416151   # insignificant

Scheffe's simultaneous confidence intervals
If we have a parameter vector β, a linear combination c^T β is estimable if there exists a vector a such that E(a^T y) = c^T β. Scheffe's theorem states that simultaneous 100(1-α)% confidence intervals for all estimable c^T β are given by

   c^T β̂ ± (q F_{q,n-r}(α))^{1/2} (var(c^T β̂))^{1/2}

where q is the dimension of the space of all possible contrasts considered, r is the rank of the design matrix X and n is the number of observations. The same idea can be applied to confidence intervals for the regression surface:

   x^T β̂ ± (q F_{q,n-r}(α))^{1/2} (σ̂² x^T (X^T X)^{-1} x)^{1/2}
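A sketch of a Scheffe interval computed by hand (all numbers here are hypothetical, chosen only to show the arithmetic):

q    <- 3      # dimension of the space of contrasts considered
n    <- 48     # number of observations
r    <- 6      # rank of the design matrix X
est  <- 0.22   # estimate of the linear combination c' beta
se   <- 0.07   # its standard error, sqrt(var(c' beta-hat))
half <- sqrt(q * qf(0.95, q, n - r)) * se   # Scheffe half-width
c(lower = est - half, upper = est + half)   # simultaneous 95% limits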

Testing for homogeneity of variances
One of the simplest tests for homogeneity of variances is Levene's test. It can be carried out with the following algorithm: 1) fit the linear model; 2) calculate the residuals; 3) fit a linear model using the absolute values of the residuals as the response vector; 4) test for differences. If the p-value is very small then the variances are not homogeneous; otherwise there is no evidence against homogeneity. There is also an R command levene.test (in the lawstat package; the car package provides leveneTest). Another test for homogeneity of variances is Bartlett's test, available as bartlett.test in R. Bartlett's test is sensitive to violations of the normality assumption; Levene's test is less sensitive to such violations.
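A sketch of Bartlett's test on the poisons data, treating every poison-by-treatment combination as a separate group:

with(poisons, bartlett.test(time, interaction(poison, treat)))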

Testing for homogeneity of variances
Example:

lm2 = lm(time~poison+treat)
summary(lm(abs(lm2$res)~poison+treat))
Residual standard error: 0.09448 on 42 degrees of freedom
Multiple R-squared: 0.2533, Adjusted R-squared: 0.1644
F-statistic: 2.849 on 5 and 42 DF, p-value: 0.02649

The p-value, 0.026, is fairly small (usually one would use a threshold around 0.01), so there is some evidence against homogeneity of the variances. Repeating the procedure for the transformed response:

lm2 = lm(1/time~poison+treat)
summary(lm(abs(lm2$res)~poison+treat))
Residual standard error: 0.2634 on 42 degrees of freedom
Multiple R-squared: 0.1494, Adjusted R-squared: 0.04812
F-statistic: 1.475 on 5 and 42 DF, p-value: 0.2183

Now the p-value is sufficiently large, so we can say that the variances are homogeneous.

References
Stuart, A., Ord, K.J. and Arnold, S. (1999) Kendall's Advanced Theory of Statistics, Volume 2A.
Box, G.E.P., Hunter, W.G. and Hunter, J.S. (1978) Statistics for Experimenters.
Berthold, M.J. and Hand, D.J. Intelligent Data Analysis.
Cox, G.M. and Cochran, W.G. Experimental Design.
Dalgaard, P. Introductory Statistics with R.