Presentation on theme: "Two and more factors in analysis of variance"— Presentation transcript:
1 Two and more factors in analysis of variance: Factorial and nested designs
2 Factorial design. Each level of the first factor is combined with each level of the second one. With two levels in each factor: 2 factors -> 4 combinations; 3 factors -> 8 combinations. Generally: the number of combinations is the product of the numbers of levels of the individual factors.
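A minimal sketch of how the combinations of a factorial design can be enumerated. The factor names and levels are only illustrative labels taken from the treatments mentioned in this talk; the code is not part of the original slides.

```python
from itertools import product

# Illustrative factors, two levels each (labels are assumptions).
factors = {
    "mowing": ["non-mown", "mown"],
    "fertilization": ["non-fertilized", "fertilized"],
    "dominant": ["present", "removed"],
}

# Every level of each factor is combined with every level of the others.
combinations = list(product(*factors.values()))

# The number of combinations is the product of the numbers of levels.
expected_count = 1
for levels in factors.values():
    expected_count *= len(levels)

for combo in combinations:
    print(combo)
print(len(combinations), expected_count)
```

With three two-level factors both numbers are 8; dropping one factor from the dictionary gives 4.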
3 Mowing, fertilization, removal of the dominant species. Usually each combination is present in several replications.
4 Factorial designs in the field - factors: shape and pattern
5 Another possibility - nested design. Levels of the hierarchy: factor A (locality), factor C (plant), single observations. Plant 1 from the first locality has nothing in common with plant 1 from any other locality.
7 Proportional design. The levels of each factor are replicated in the same proportions at every level of the other factor; for the contingency table of numbers of replications, χ2 equals zero, i.e. the factors are completely independent. Ideally there is the same number of observations in all combinations, but a proportional design is enough:
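8 Expected frequency in the contingency table = (row total × column total) / grand total. So, for example, for the non-fertilized, non-mown combination: (number of non-fertilized × number of non-mown) / total number of observations. I.e. the levels of the first factor are represented in the same proportions at each level of the second factor; then we consider the factors independent.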
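A small sketch of the proportionality check, assuming the usual expected frequency (row total times column total, divided by the grand total). The replication counts below are made up for illustration, and the function names are hypothetical.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Chi-square statistic of the table of replication counts."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )


# Made-up numbers of replications: rows = non-mown / mown,
# columns = non-fertilized / fertilized.
balanced = [[5, 5], [5, 5]]
proportional = [[4, 8], [2, 4]]
non_proportional = [[8, 2], [2, 8]]

for name, table in [("balanced", balanced),
                    ("proportional", proportional),
                    ("non-proportional", non_proportional)]:
    print(name, chi_square(table))
```

The first two tables give a chi-square of zero (the factors are independent in the design); the third one does not.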
9 When factors are “independent” and the design is balanced. Balanced design: weights of rats.
10 When factors are “independent” and the design is proportional. Proportional design: weights of rats.
11 When factors are “dependent”, i.e. the design is neither balanced nor proportional. Non-proportional design: weights of rats. Judging by the marginal means, it seems as if listening to music affects the weight of rats. (There are methods that can partly cope with this [LS means], but the power of the test is lowered for both factors.)
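A hedged illustration of why marginal means mislead in a non-proportional design. All numbers are made up, and the second factor (sex) is an assumption, since the original table of rat weights is not preserved. Music has no effect in any cell, yet the raw marginal means differ; LS means, taken here as the unweighted average of the cell means, do not.

```python
# Made-up cell means (g) and numbers of rats; keys are (sex, music).
cell_mean = {
    ("male", "music"): 300.0, ("female", "music"): 200.0,
    ("male", "silence"): 300.0, ("female", "silence"): 200.0,
}
cell_n = {
    ("male", "music"): 8, ("female", "music"): 2,
    ("male", "silence"): 2, ("female", "silence"): 8,
}


def marginal_mean(music_level):
    """Raw marginal mean: cell means weighted by the number of observations."""
    cells = [key for key in cell_mean if key[1] == music_level]
    total = sum(cell_mean[key] * cell_n[key] for key in cells)
    return total / sum(cell_n[key] for key in cells)


def ls_mean(music_level):
    """LS mean sketch: unweighted average of the cell means."""
    cells = [key for key in cell_mean if key[1] == music_level]
    return sum(cell_mean[key] for key in cells) / len(cells)


for level in ("music", "silence"):
    print(level, marginal_mean(level), ls_mean(level))
```

The apparent "music effect" in the marginal means comes only from the unequal representation of males and females in the two groups.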
12 Statistica can compute anything, but... If I have a proportional design, the result should always be the same. Two-way ANOVA can be computed even for a non-proportional design; the default there (Type III sum of squares - orthogonal) is all right, but depending on the experimental situation I can decide on another type (perhaps Type I - sequential), and I should know what each one means (and why the results differ).
13 Model of two-way ANOVA. Two factors (mown and fertilized): index i is the level of the first factor (non-mown, mown), index j is the level of the second one, k is the replication within the group; the response is e.g. the number of species. Xijk = μ + αi + βj + γij + εijk, where μ is the grand mean, αi the effect of mowing, βj the effect of fertilization, γij the interaction and εijk the error variability. The parameterisation is usually such that α, β and γ are balanced around zero (then μ really is the mean of everything).
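A minimal sketch of this parameterisation for a balanced design, with α, β and γ summing to zero. The cell means are invented, and the variable names simply mirror the symbols of the model.

```python
# Made-up cell means (number of species).
# Rows i: non-mown, mown; columns j: non-fertilized, fertilized.
cell_means = [[10.0, 6.0],
              [18.0, 8.0]]

a = len(cell_means)       # number of levels of the first factor
b = len(cell_means[0])    # number of levels of the second factor

# Grand mean of all cell means (valid as the overall mean in a balanced design).
mu = sum(sum(row) for row in cell_means) / (a * b)

# Main effects: deviations of the marginal means from the grand mean.
alpha = [sum(row) / b - mu for row in cell_means]
beta = [sum(cell_means[i][j] for i in range(a)) / a - mu for j in range(b)]

# Interaction: what is left after the purely additive prediction.
gamma = [[cell_means[i][j] - (mu + alpha[i] + beta[j]) for j in range(b)]
         for i in range(a)]

print("mu =", mu)
print("alpha =", alpha)
print("beta =", beta)
print("gamma =", gamma)
```

If all values of gamma were zero, every cell mean would equal mu + alpha[i] + beta[j], i.e. the effects would be purely additive.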
14 Three null hypotheses: αi = 0 for all i (mowing has no effect); βj = 0 for all j (fertilization has no effect); γij = 0 for all combinations of i and j (there is no interaction between mowing and fertilization). A null interaction means that the main effects are purely additive.
15 Null interaction: “The effect of each factor is independent of the level of the other factor.” ATTENTION: this means additivity.
17 Can be seen well in graphs (interaction plot). Do not forget to stress that connecting the means is not an interpolation here; we just want to visualize the interaction through the (non-)parallelism of the lines.
18 Can be seen well in a diagram (interaction plot). When I report the result, it is not enough to write that the interaction is significant; one needs to say why (where the deviation from additivity lies).
19 Null hypotheses of main effects are “averaged” over all levels of the other factor: αi = 0 for all i (mowing has no effect, on average over all levels of fertilization); βj = 0 for all j (fertilization has no effect, on average over all levels of mowing).
20 You have to use your head when interpreting results!!! (and look at the diagram). Administer two medicines separately and together (factorial design): the main effects are non-significant, but that does not mean the medicines are ineffective. Their effects just cancel out when they are applied together.
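The medicine example can be sketched with invented mean responses: each medicine helps on its own, the combination does nothing, and so both main effects (averaged over the other factor) come out as zero while the interaction is large. The function names are hypothetical.

```python
# Made-up mean responses; keys are (medicine A given?, medicine B given?).
mean_response = {
    (False, False): 0.0,   # no medicine
    (True, False): 10.0,   # only A
    (False, True): 10.0,   # only B
    (True, True): 0.0,     # both together: the effects cancel
}


def main_effect_a(m):
    """Effect of A averaged over both levels of B."""
    with_a = (m[(True, False)] + m[(True, True)]) / 2
    without_a = (m[(False, False)] + m[(False, True)]) / 2
    return with_a - without_a


def main_effect_b(m):
    """Effect of B averaged over both levels of A."""
    with_b = (m[(False, True)] + m[(True, True)]) / 2
    without_b = (m[(False, False)] + m[(True, False)]) / 2
    return with_b - without_b


def interaction_contrast(m):
    """Effect of B when A is given minus effect of B when A is not given."""
    return (m[(True, True)] - m[(True, False)]) - (m[(False, True)] - m[(False, False)])


print(main_effect_a(mean_response),
      main_effect_b(mean_response),
      interaction_contrast(mean_response))
```

Zero main effects together with a non-zero interaction contrast is exactly the situation where looking only at the main-effect tests would be misleading.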
21 It holds again that the total variability, expressed as SSTOT, can be partitioned: SSTOT = SSA + SSB + SSAB (interaction) + SSerror (residual), where SSA + SSB + SSAB is the part explained by the model. SSTOT = sum of squared deviations of the observations from the grand mean. SSA = sum of squared deviations of the marginal means of the factor A groups from the grand mean, weighted by the number of observations (SSB similarly). SSAB = weighted sum of squared deviations of the combination means from the means expected under pure additivity (i.e. without interaction).
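A sketch of this partition for a balanced two-way design, written in plain Python. The data are invented and the function name `two_way_ss` is hypothetical; the point is only to show that the four components add up to SSTOT.

```python
def mean(values):
    return sum(values) / len(values)


def two_way_ss(data):
    """data[i][j] = replicates for level i of A and level j of B (balanced)."""
    a, b = len(data), len(data[0])
    r = len(data[0][0])  # replicates per combination
    all_values = [x for row in data for group in row for x in group]
    grand = mean(all_values)
    cell_mean = [[mean(data[i][j]) for j in range(b)] for i in range(a)]
    mean_a = [mean([x for group in data[i] for x in group]) for i in range(a)]
    mean_b = [mean([x for i in range(a) for x in data[i][j]]) for j in range(b)]

    ss_tot = sum((x - grand) ** 2 for x in all_values)
    ss_a = b * r * sum((m - grand) ** 2 for m in mean_a)
    ss_b = a * r * sum((m - grand) ** 2 for m in mean_b)
    # Deviation of each combination mean from the purely additive expectation.
    ss_ab = r * sum(
        (cell_mean[i][j] - (mean_a[i] + mean_b[j] - grand)) ** 2
        for i in range(a) for j in range(b)
    )
    ss_err = sum(
        (x - cell_mean[i][j]) ** 2
        for i in range(a) for j in range(b) for x in data[i][j]
    )
    return {"A": ss_a, "B": ss_b, "AB": ss_ab, "error": ss_err, "total": ss_tot}


# Made-up numbers of species; A = mowing, B = fertilization, 3 replicates each.
data = [
    [[10, 12, 11], [6, 7, 5]],
    [[18, 17, 19], [8, 9, 7]],
]
ss = two_way_ss(data)
print(ss)
```

The weights (b·r, a·r, r) are the numbers of observations behind each mean, which is what "weighted by the number of observations" amounts to in a balanced design.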
22 Example: mown, fertilized, number of species as the response. Test of the null hypothesis that the mean number of species is zero everywhere.
23 a and b are the numbers of levels of factors A and B, n is the total number of observations in all groups. It holds that DFA = a-1, DFB = b-1, DFAB = (a-1)(b-1), DFTOT = n-1, and DFerror = DFTOT - DFA - DFB - DFAB. Again, each ratio MS = SS/DF is an estimate of the common variance if the null hypothesis is true.
24 If all the effects are fixed. Test: Feffect = MSeffect / MSerror
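The degrees of freedom, mean squares and F ratios for the all-fixed-effects case can be put together as below. The sums of squares are illustrative values for a 2 x 2 design with 12 observations, and `two_way_table` is a hypothetical helper, not output from any statistical package.

```python
def two_way_table(ss, a, b, n):
    """DF, MS and F for a two-way ANOVA with all effects fixed.

    ss: sums of squares with keys "A", "B", "AB", "error"
    a, b: numbers of levels of A and B; n: total number of observations
    """
    df = {"A": a - 1, "B": b - 1, "AB": (a - 1) * (b - 1)}
    df["error"] = (n - 1) - df["A"] - df["B"] - df["AB"]
    ms = {key: ss[key] / df[key] for key in df}
    # With fixed effects every effect is tested against the error mean square.
    f = {key: ms[key] / ms["error"] for key in ("A", "B", "AB")}
    return df, ms, f


# Illustrative sums of squares (assumed values).
ss = {"A": 60.75, "B": 168.75, "AB": 18.75, "error": 8.0}
df, ms, f = two_way_table(ss, a=2, b=2, n=12)
print(df)
print(ms)
print(f)
```

The F values would then be compared with the F distribution with (DFeffect, DFerror) degrees of freedom.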
25 Problem: what is in the denominator depends on which factor has a fixed effect and which a random effect (especially important if one of the factors is experimental, and thus of major interest to us, and the other is locality). Important for planning the experimental design!
26 I, the experimenter, am the one who decides which model I will use: “classic” factorial ANOVA, or ANOVA without interactions (also called main effects ANOVA), where “non-additivity” is part of the random variability; this makes it possible to work with data with a single observation for each factor combination (better to avoid it, though).
27 Experimental design. [Figure: layouts labelled C, RANDOMIZED BLOCKS and LATIN SQUARE; WRONG: pseudoreplications.]
28 Randomized complete blocks. I test them by two-way analysis of variance without replication (the error variability consists of deviations from additivity, i.e. the interaction between block and treatment). It can give a more powerful test if the blocks explain something, i.e. help to control variability.
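A sketch of the two-way analysis without replication for a randomized block layout. The measurements are invented and `randomized_block_anova` is a hypothetical name; the residual here is exactly the block-by-treatment deviation from additivity.

```python
def randomized_block_anova(y):
    """y[block][treatment] holds one observation per combination."""
    n_blocks = len(y)
    n_treat = len(y[0])
    grand = sum(sum(row) for row in y) / (n_blocks * n_treat)
    block_means = [sum(row) / n_treat for row in y]
    treat_means = [sum(y[i][j] for i in range(n_blocks)) / n_blocks
                   for j in range(n_treat)]

    ss_total = sum((x - grand) ** 2 for row in y for x in row)
    ss_treat = n_blocks * sum((m - grand) ** 2 for m in treat_means)
    ss_block = n_treat * sum((m - grand) ** 2 for m in block_means)
    # What is left is the deviation from additivity (block x treatment).
    ss_error = ss_total - ss_treat - ss_block

    df_treat = n_treat - 1
    df_block = n_blocks - 1
    df_error = df_treat * df_block
    f_treat = (ss_treat / df_treat) / (ss_error / df_error)
    return {"ss_treatment": ss_treat, "ss_block": ss_block,
            "ss_error": ss_error, "df_error": df_error, "F_treatment": f_treat}


# Made-up data: 4 blocks (rows) x 3 treatments (columns).
y = [
    [10.0, 12.0, 15.0],
    [8.0, 11.0, 13.0],
    [12.0, 13.0, 17.0],
    [9.0, 11.0, 13.0],
]
result = randomized_block_anova(y)
print(result)
```

If the block term were left out, its sum of squares would end up in the error term, which is why blocks that "explain something" make the test more powerful.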
29 Multiple comparisons. Similar to one-way analysis of variance: if I do them “on the interaction”, I compare all the factorially formed groups with each other; if I do them on a main effect, I compare the additive effects of the individual levels. I am the one who decides what will be compared.
30 Friedman test - nonparametric ANOVA for randomized complete blocks. Based on ranking the values within each block: χ2 = [12 / (b a (a + 1))] Σ Ri² − 3 b (a + 1), where a is the number of levels of the factor studied, b is the number of blocks and Ri is the sum of ranks for level i of the factor studied.
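A minimal sketch of the Friedman statistic in its usual form without a tie correction, 12 / (b·a·(a+1)) · Σ Ri² − 3·b·(a+1). It assumes there are no ties within a block; the data and the function name are made up.

```python
def friedman_statistic(blocks):
    """blocks[k][i] = value for level i of the studied factor in block k.

    Assumes no tied values within a block (no tie correction is applied).
    Returns the statistic and the rank sums R_i.
    """
    b = len(blocks)       # number of blocks
    a = len(blocks[0])    # number of levels of the factor studied
    rank_sums = [0] * a
    for block in blocks:
        # Rank the values inside this block: 1 = smallest.
        order = sorted(range(a), key=lambda i: block[i])
        for rank, i in enumerate(order, start=1):
            rank_sums[i] += rank
    stat = 12.0 / (b * a * (a + 1)) * sum(r ** 2 for r in rank_sums) - 3.0 * b * (a + 1)
    return stat, rank_sums


# Made-up data: 3 blocks, 3 levels, the same ordering in every block.
blocks = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [2.5, 3.5, 9.0],
]
stat, rank_sums = friedman_statistic(blocks)
print(stat, rank_sums)
```

Because only the order inside each block matters, the large differences between blocks do not influence the statistic at all.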
31 Two-factorial experiment: I compare daisy and sunflower and their response to the level of nutrients (the response is plant height). Three null hypotheses: 1. The heights of daisies and sunflowers do not differ (it sometimes happens that we test a totally unrealistic null hypothesis; obviously we did not need to test this one). 2. Plant height is independent of the level of nutrients. 3. The effect of the level of nutrients is the same for both species.
32 We have a problem. The data are positively skewed (the least important problem). There is marked inhomogeneity of variances (the CV could be constant, i.e. the SD depends linearly on the mean). The classic interaction tests additivity; thus if fertilization elongates daisies from 10 to 20 cm, sunflowers should be elongated from 100 to 110 cm. From a biological point of view this is by no means “the same effect” on both species.
33 Additive effect vs. multiplicative effect. In the multiplicative model every value is multiplied by the error, so the SD depends linearly on the mean; εijk has a lognormal distribution centered around 1. After log-transformation the multiplicative effect changes to an additive one.
34 Logarithmic transformation. It changes a lognormal distribution to a normal one. If the SD was linearly dependent on the mean, it leads to homogeneity of variances. It changes multiplicative effects to additive ones. ATTENTION: it does all of these at once; I cannot ask for just one of them.
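A sketch of how the logarithm turns a multiplicative effect into an additive one, using the daisy and sunflower heights from the example. The doubled sunflower height (100 to 200 cm) is an assumption added here to represent a purely multiplicative effect; error is left out.

```python
import math

# Mean heights in cm; fertilization doubles the height of both species.
heights = {
    ("daisy", "control"): 10.0,
    ("daisy", "fertilized"): 20.0,
    ("sunflower", "control"): 100.0,
    ("sunflower", "fertilized"): 200.0,   # assumed multiplicative response
}


def interaction_contrast(values):
    """Fertilization effect in sunflower minus fertilization effect in daisy."""
    sunflower = values[("sunflower", "fertilized")] - values[("sunflower", "control")]
    daisy = values[("daisy", "fertilized")] - values[("daisy", "control")]
    return sunflower - daisy


raw_contrast = interaction_contrast(heights)
log_heights = {key: math.log(value) for key, value in heights.items()}
log_contrast = interaction_contrast(log_heights)

print(raw_contrast)   # non-zero on the original scale
print(log_contrast)   # practically zero on the log scale
```

On the original scale the classic test would see an interaction (+100 cm versus +10 cm); on the log scale both species show the same additive shift of log 2.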
35 Many biological data contain zeroes. The frequently used transformation X´ = log(X+1) has similar properties, but not exactly the same, especially for low values of X. The change from multiplicativity to additivity can be particularly inaccurate!!! Sometimes X´ = log(bX+a) is used, where a and b are constants (but then the change from multiplicativity to additivity is never exactly achieved).
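A quick numerical check of how log(X+1) distorts a multiplicative effect at low values. The starting values 1 and 100 are arbitrary illustrations, and the helper name is hypothetical.

```python
import math


def shift_on_log_plus_one(before, factor):
    """Change on the log(X+1) scale when X is multiplied by `factor`."""
    return math.log(before * factor + 1) - math.log(before + 1)


# The same doubling applied to a low and to a high value.
low = shift_on_log_plus_one(1.0, 2.0)      # 1 -> 2
high = shift_on_log_plus_one(100.0, 2.0)   # 100 -> 200
exact = math.log(2.0)                      # shift under the plain log

print(round(low, 3), round(high, 3), round(exact, 3))
```

With the plain logarithm the two shifts would be identical; with log(X+1) the doubling of a small value looks like a much smaller effect than the doubling of a large one.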
36 Other transformations used: for the Poisson distribution (numbers of randomly placed individuals); for percentages (p as a number between 0 and 1).
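The formulas from this slide are not preserved in the transcript. The sketch below assumes the transformations most commonly used for these two cases, the square root for Poisson counts and the arcsine of the square root (angular transformation) for proportions; the slide may have shown a variant.

```python
import math


def sqrt_transform(count):
    """Square-root transformation, commonly applied to Poisson counts."""
    return math.sqrt(count)


def angular_transform(p):
    """Arcsine square-root transformation for a proportion p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    return math.asin(math.sqrt(p))


print(sqrt_transform(9))
print(angular_transform(0.5))
```

Percentages have to be divided by 100 first, since the angular transformation is defined only for values between 0 and 1.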
37 Nested design. We measure the length of corolla tubes. Levels of the hierarchy: factor A (locality), factor C (plant), single observations. Plant 1 from the first locality has nothing in common with plant 1 from any other locality.
38 The top factor in the hierarchy can have either a fixed or a random effect. Factors lower in the hierarchy almost always have random effects (it is possible to compute them as fixed as well, but that is a very unusual case). In the partitioning of the sum of squares we take the squared differences between each observation (or mean) and the nearest relevant mean above it in the hierarchy. If the hierarchically lower effects are random, then we test every effect against the nearest hierarchically lower effect.
39 Null hypothesis at the lower hierarchical levels: plants within a locality do not differ in the mean length of their tubes. Test of the null hypothesis that the mean tube length is zero. Flocality = MSlocality / MSplant; Fplant = MSplant / MSerror = 2.15 / 2.24 = 0.96. Ideal when the design is balanced; Statistica computes it even if it isn’t, but then the results are various approximations.
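A sketch of the nested analysis for a balanced design with random plant effects: each observation is compared with its plant mean, each plant mean with its locality mean, and each locality mean with the grand mean. The tube lengths are invented and `nested_anova` is a hypothetical name.

```python
def mean(values):
    return sum(values) / len(values)


def nested_anova(data):
    """data[locality][plant] = list of measurements (balanced design)."""
    a = len(data)            # localities
    b = len(data[0])         # plants per locality
    r = len(data[0][0])      # measurements per plant
    all_values = [x for loc in data for plant in loc for x in plant]
    grand = mean(all_values)
    loc_means = [mean([x for plant in loc for x in plant]) for loc in data]

    ss_loc = b * r * sum((m - grand) ** 2 for m in loc_means)
    ss_plant = r * sum((mean(plant) - loc_means[i]) ** 2
                       for i, loc in enumerate(data) for plant in loc)
    ss_err = sum((x - mean(plant)) ** 2
                 for loc in data for plant in loc for x in plant)
    ss_total = sum((x - grand) ** 2 for x in all_values)

    df = {"locality": a - 1, "plant": a * (b - 1), "error": a * b * (r - 1)}
    ms = {"locality": ss_loc / df["locality"],
          "plant": ss_plant / df["plant"],
          "error": ss_err / df["error"]}
    # Each effect is tested against the nearest lower level of the hierarchy.
    f = {"locality": ms["locality"] / ms["plant"],
         "plant": ms["plant"] / ms["error"]}
    ss = {"locality": ss_loc, "plant": ss_plant, "error": ss_err, "total": ss_total}
    return ss, df, ms, f


# Made-up corolla tube lengths: 2 localities x 2 plants x 3 flowers.
data = [
    [[10, 11, 12], [13, 14, 15]],
    [[20, 21, 22], [18, 19, 20]],
]
ss, df, ms, f = nested_anova(data)
print(ss)
print(df)
print(f)
```

Note that the locality effect is tested against the plant mean square, not against the error, so its test has only a*(b-1) denominator degrees of freedom.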
40 Most frequent use: analysis of variability among the individual hierarchical levels, e.g. in taxonomy. Often I am interested mainly (or only) in the hierarchically highest factor, and everything else is just there to increase the power of the test. E.g. I may have just 6 pounds (enclosures), three grazed and three ungrazed (I am not able to have more). In each of them I lay out 10 quadrats for biomass sampling, and I make three analytical determinations from each quadrat. Analysis of variability can help me plan the optimal sampling design.
41 Mind mixed (pooled) samples. They can save me work, but they must be independently replicated! These aren’t independent observations.
42 More complicated ANOVA models. Factorial and nested designs can be combined in various ways, with some of the factors having fixed effects and others random effects.
43 Split plot (main plots and split plots - two error levels). 6 plots (3 calcite, 3 granite), 3 types of impacts in each plot.
44 ANOVA - Repeated measures. I have some experimental design and I follow the state of individual objects over time, e.g. growing plants, etc.