2 What Technique?
Response variable(s) ... | Predictor(s): No | Predictor(s): Yes
... is one | • distribution summary | • regression models
... are many | • indirect gradient analysis (PCA, CA, DCA, MDS) • cluster analysis | • direct gradient analysis • constrained cluster analysis • discriminant analysis (CVA)
18 Managing Dimensionality (but not acronyms)
PCA, CA, RDA, CCA, MDS, NMDS, DCA, DCCA, pRDA, pCCA
19 Type of Data Matrix
[Figure: data-matrix diagram. Axis labels: sites, individuals, species, attributes. Other labels shown: desert, macroph, inverts, uses, watervar, rain, gulls.]
20 Models of Species Response
There are (at least) two models:
• Linear: species increase or decrease along the environmental gradient
• Unimodal: species rise to a peak somewhere along the environmental gradient and then fall again
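The two response shapes can be sketched in a few lines of base R. This is only an illustration: the Gaussian curve is one common way to write a unimodal response and is an assumption here, as are the function names and all parameter values.

```r
# Environmental gradient (hypothetical units)
gradient <- seq(0, 10, by = 0.5)

# Linear response: abundance changes at a constant rate along the gradient
linear_response <- function(x, intercept = 1, slope = 0.8) {
  intercept + slope * x
}

# Unimodal response, written here as a Gaussian curve (an assumption):
# peak height `h` at optimum `u`, with tolerance (width) `tol`
unimodal_response <- function(x, h = 10, u = 5, tol = 1.5) {
  h * exp(-(x - u)^2 / (2 * tol^2))
}

lin <- linear_response(gradient)
uni <- unimodal_response(gradient)

# Where along the gradient is each species most abundant?
gradient[which.max(lin)]  # the end of the gradient (10)
gradient[which.max(uni)]  # the optimum u (5)
```

With a linear response the maximum always sits at one end of the sampled gradient; with a unimodal response it sits at the optimum, provided the gradient is long enough to include it.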
25 Inferring Gradients from Species (or Attribute) Data
26 Indirect Gradient Analysis
Environmental gradients are inferred from species data alone.
Three methods:
• Principal Component Analysis (PCA): linear model
• Correspondence Analysis (CA): unimodal model
• Detrended CA (DCA): modified unimodal model
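As a rough illustration of the linear case, the sketch below runs a PCA with base R's `prcomp` on invented data in which four species respond linearly to a single hidden gradient. The data, the object names and the noise level are all assumptions made for the example.

```r
set.seed(1)

# Invented data: 10 sites along a hidden gradient, 4 species responding linearly
hidden <- seq(-1, 1, length.out = 10)
species <- cbind(sp1 = 3 * hidden,
                 sp2 = -2 * hidden,
                 sp3 = 1.5 * hidden,
                 sp4 = 0.5 * hidden) +
  matrix(rnorm(40, sd = 0.2), nrow = 10, ncol = 4)

# PCA on the centred species matrix (no environmental data used)
pca <- prcomp(species, center = TRUE, scale. = FALSE)

site_scores    <- pca$x[, 1]         # sample scores on the first axis
species_scores <- pca$rotation[, 1]  # species loadings on the first axis

# Share of the total variance carried by the first axis
summary(pca)$importance["Proportion of Variance", 1]

# The inferred axis should track the hidden gradient (sign is arbitrary)
abs(cor(site_scores, hidden))
```

The first axis is found from the species data alone, yet it lines up with the gradient that generated them, which is the idea behind indirect gradient analysis.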
32 Making Effective Use of Environmental Variables
33 Approaches
• Use single responses in linear models of environmental variables
• Use axes of a multivariate dimension reduction technique as responses in linear models of environmental variables
• Constrain the multivariate dimension reduction into the factor space defined by the environmental variables
34 Ordination Constrained by the Environmental Variables
36 Working with the Variability that we Can Explain
• Start with all the variability in the response variables.
• Replace the original observations with their fitted values from a model employing the environmental variables as explanatory variables (discarding the residual variability).
• Carry out gradient analysis on the fitted values.
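The three steps above can be sketched by hand in base R for the linear case (this is essentially what RDA does). The data are invented, and the names `env`, `Y`, `z1` and `z2` are assumptions for the example; in practice a package such as vegan would do this in one call.

```r
set.seed(42)
n <- 20

# Invented environmental variables and species responses
env <- data.frame(z1 = rnorm(n), z2 = rnorm(n))
Y <- cbind(sp1 = 2 * env$z1 + rnorm(n, sd = 0.5),
           sp2 = -1 * env$z1 + env$z2 + rnorm(n, sd = 0.5),
           sp3 = rnorm(n))

# Step 1: all the variability in the (centred) responses
Yc <- scale(Y, center = TRUE, scale = FALSE)
total_var <- sum(apply(Yc, 2, var))

# Step 2: replace observations by fitted values from a linear model on env
fit <- lm(Yc ~ z1 + z2, data = env)
Y_fitted <- fitted(fit)
explained_var <- sum(apply(Y_fitted, 2, var))

# Step 3: gradient analysis (here PCA) on the fitted values only
constrained <- prcomp(Y_fitted, center = FALSE)

explained_var / total_var          # proportion of variability explained
sum(constrained$sdev > 1e-8)       # number of constrained axes
```

Because the fitted values are linear combinations of two environmental variables, at most two constrained axes carry any variance, however many species there are.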
37 Unconstrained/Constrained
• Unconstrained ordination axes correspond to the directions of the greatest variability within the data set.
• Constrained ordination axes correspond to the directions of the greatest variability of the data set that can be explained by the environmental variables.
39 Direct Gradient Analysis
Environmental gradients are constructed from the relationship between species and environmental variables.
Three methods:
• Redundancy Analysis (RDA): linear model
• Canonical (or Constrained) Correspondence Analysis (CCA): unimodal model
• Detrended CCA (DCCA): modified unimodal model
40 Direct Gradient Analysis
Basic PCA:
$$y_{ik} = b_{0k} + b_{1k} x_i + e_{ik}$$
• $x_i$: the sample scores on the ordination axis
• $b_{1k}$: the regression coefficients for each species (the species scores on the ordination axis)
In RDA there is a further constraint on $x_i$:
$$x_i = c_1 z_{i1} + c_2 z_{i2}$$
Making:
$$y_{ik} = b_{0k} + b_{1k} c_1 z_{i1} + b_{1k} c_2 z_{i2} + e_{ik}$$
41 Direct Gradient Analysis
General form:
cca(species_data ~ e1 + e2 + ... + en, data = environmental_data)
Example:
cca(dune ~ Manure + Moisture + A1, data = dune.env)
48 Ways of Building Models
Automated environmental variable selection (stepwise addition or removal of variables from the model, as with multiple regression)
mod0 <- cca(nasser.inverts ~ 1, nasser.watervar)
mod1 <- cca(nasser.inverts ~ ., nasser.watervar)
op <- options(digits = 7)
mod <- step(mod0, scope = formula(mod1))
options(op)
mod
plot(mod)
49 Ways of Building Models
Manual selection of environmental variables using prior knowledge (e.g. starting with the full model and removing terms)
mod1 <- cca(nasser.inverts ~ ., nasser.watervar)
mod2 <- cca(nasser.inverts ~ . -WMg, nasser.watervar)
mod3 <- cca(nasser.inverts ~ . -WMg -WEC, nasser.watervar)
mod4 <- cca(nasser.inverts ~ . -WMg -WEC -WCa, nasser.watervar)
50 Ways of Evaluating Models
Graphically, using Procrustes rotation
plot(procrustes(mod2, mod1))
plot(procrustes(mod3, mod2))
plot(procrustes(mod4, mod3))
plot(procrustes(mod4, mod1))
52 Ways of Evaluating Models
Permutation tests can be used to assess the adequacy of the models, using a pseudo-ANOVA or permutest
anova(mod1)
anova(mod2)
anova(mod3)
anova(mod4)
permutest(mod1, permutations = 1000)
permutest(mod2, permutations = 1000)
permutest(mod3, permutations = 1000)
permutest(mod4, permutations = 1000)
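To show the idea behind such a permutation test, here is a hand-rolled sketch in base R for the linear (RDA-like) case. It is not vegan's procedure, which differs in detail; the data, the helper `pseudo_F` and the choice of 999 permutations are assumptions for the example.

```r
set.seed(7)
n <- 30

# Invented environmental variables and species responses
env <- data.frame(z1 = rnorm(n), z2 = rnorm(n))
Y <- cbind(sp1 = 1.5 * env$z1 + rnorm(n),
           sp2 = -env$z2 + rnorm(n),
           sp3 = rnorm(n))
X <- as.matrix(env)

# Pseudo-F (hypothetical helper): explained variability per constraint
# divided by residual variability per residual degree of freedom
pseudo_F <- function(Y, X) {
  Yc <- scale(Y, center = TRUE, scale = FALSE)
  fit <- lm(Yc ~ X)
  ss_fit <- sum(fitted(fit)^2)
  ss_res <- sum(residuals(fit)^2)
  q <- ncol(X)
  (ss_fit / q) / (ss_res / (nrow(Y) - q - 1))
}

observed <- pseudo_F(Y, X)

# Break the species-environment link by shuffling the rows of X
n_perm <- 999
permuted <- replicate(n_perm, pseudo_F(Y, X[sample(n), , drop = FALSE]))

# Proportion of statistics at least as large as the observed one
p_value <- (sum(permuted >= observed) + 1) / (n_perm + 1)
p_value
```

A small p-value indicates that the environmental variables explain more of the species variability than would be expected if sites and environmental measurements were matched at random.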
54 Getting rid of the Variability that is Not of Interest
Amongst the explanatory variables there may be variability attributable to:
• Blocks and other design strata
• Covariates that we can measure but are not the focus of interest
We may want to use only the variability attributable to:
• Meaningful environmental variables
55 Partial Analyses
• Remove the effect of covariates: variables that we can measure but which are of no interest (e.g. block effects, start values, etc.).
• Carry out the gradient analysis on what is left of the variation after removing the effect of the covariates.
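For the linear case, "removing the effect of the covariates" can be sketched in base R by regressing both the species data and the variable of interest on the covariate and keeping the residuals. The block design, the variable `moisture` and all values are invented for this illustration.

```r
set.seed(3)
n <- 40

# Invented design: 4 blocks of 10 sites, plus one variable of interest
block <- factor(rep(1:4, each = 10))
block_effect <- c(0, 2, 4, 6)[block]
moisture <- rnorm(n)

Y <- cbind(sp1 = block_effect + moisture + rnorm(n, sd = 0.3),
           sp2 = block_effect - moisture + rnorm(n, sd = 0.3))

# Remove the block effect from the species data and from the predictor
Y_resid <- residuals(lm(Y ~ block))
z_resid <- residuals(lm(moisture ~ block))

# Gradient analysis on what is left: fit, then PCA of the fitted values
partial_fit  <- lm(Y_resid ~ z_resid)
partial_axes <- prcomp(fitted(partial_fit), center = FALSE)

# Share of the remaining (non-block) variability explained by moisture
sum(fitted(partial_fit)^2) / sum(Y_resid^2)
```

The block differences dominate the raw data, but after partialling them out only the variation within blocks is analysed.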
57 Forest Data
• Sites: 28 sites in forests in Finland grazed by reindeer
• Species data: 44 heathland plant species (including many lichens and mosses that are very sensitive to their chemical environment)
• Environmental data: soil chemical composition (N P K Ca Mg S Al Fe Mn Zn Mo Baresoil Humdepth pH)
65 Different types of data
• Continuous data, e.g. height
• Categorical data
  • ordered (ordinal), e.g. growth rate: very slow, slow, medium, fast, very fast
  • not ordered (nominal), e.g. fruit colour: yellow, green, purple, red, orange
• Binary data, e.g. fruit / no fruit
66 Similarity matrix
• We define a similarity between units, like the correlation between continuous variables (this can also be a dissimilarity or distance matrix).
• A similarity can be constructed as an average of the similarities between the units on each variable (a weighted average can be used).
• This provides a way of combining different types of variables.
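One way to build such an averaged similarity is sketched below in base R, in the spirit of Gower's coefficient. The per-variable rules (range-scaled difference for continuous and ordered variables, exact match for unordered and binary ones), the helper names and the plant data are all assumptions for the example.

```r
# Invented data on three plants, mixing the variable types listed earlier
plants <- data.frame(
  height = c(1.2, 3.5, 2.0),
  growth = factor(c("slow", "fast", "medium"),
                  levels = c("very slow", "slow", "medium", "fast", "very fast"),
                  ordered = TRUE),
  colour = factor(c("red", "red", "green")),
  fruit  = c(TRUE, TRUE, FALSE)
)

# Continuous or ordered: 1 minus the difference scaled by the observed range
range_sim <- function(x) {
  x <- as.numeric(x)
  1 - abs(outer(x, x, "-")) / diff(range(x))
}

# Unordered or binary: 1 if the two units match, 0 otherwise
match_sim <- function(x) {
  x <- as.character(x)
  outer(x, x, "==") * 1
}

sims <- list(height = range_sim(plants$height),
             growth = range_sim(plants$growth),
             colour = match_sim(plants$colour),
             fruit  = match_sim(plants$fruit))

# Weighted average of the per-variable similarities (equal weights here)
weights <- c(height = 1, growth = 1, colour = 1, fruit = 1)
similarity <- Reduce(`+`, Map(`*`, sims, weights)) / sum(weights)
similarity
```

Plants 1 and 2 differ as much as possible in height and growth rate but match on colour and fruit, so their similarity is 0.5. Changing `weights` lets some variables count for more than others.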
67 Distance metrics relevant for continuous variables
• Euclidean
• city block or Manhattan
(also many other variations)
[Figure: two panels, each showing the distance between points A and B]
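Both metrics are available through base R's `dist`. The coordinates of A and B below are invented for the illustration.

```r
# Two points in a two-variable space (invented coordinates)
pts <- rbind(A = c(0, 0), B = c(3, 4))

# Euclidean: straight-line distance, sqrt(3^2 + 4^2)
euclidean <- dist(pts, method = "euclidean")

# City block (Manhattan): sum of absolute differences, 3 + 4
manhattan <- dist(pts, method = "manhattan")

as.numeric(euclidean)  # 5
as.numeric(manhattan)  # 7
```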
74 Similarity coefficients for binary data
• simple matching: count if both units 0 or both units 1
• Jaccard: count only if both units 1
(also many other variants)
simple matching can be extended to categorical data
[Figure: two panels, each showing the four possible pairs of values for two units: 0,1  1,1  0,0  1,0]
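The two coefficients can be written directly in base R. The function names and the two example units are assumptions; Jaccard is taken here in its usual form, joint presences divided by the number of variables where at least one unit is 1.

```r
# Two units scored on six binary variables (invented values)
u <- c(1, 1, 0, 0, 1, 0)
v <- c(1, 0, 0, 1, 1, 0)

# Simple matching: joint presences and joint absences both count as matches
simple_matching <- function(x, y) {
  sum(x == y) / length(x)
}

# Jaccard: only joint presences count, and joint absences are ignored
jaccard <- function(x, y) {
  both_present <- sum(x == 1 & y == 1)
  any_present  <- sum(x == 1 | y == 1)
  both_present / any_present
}

simple_matching(u, v)  # 4 matches out of 6
jaccard(u, v)          # 2 joint presences out of 4 variables with a presence
```

The choice matters when joint absences are uninformative, for example when two sites both lack a species for unrelated reasons.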
75 Clustering methods
hierarchical
• divisive: put everything together and split (monothetic / polythetic)
• agglomerative: keep everything separate and join the most similar points (classical cluster analysis)
non-hierarchical
• k-means clustering
76 Agglomerative hierarchical
Single linkage or nearest neighbour
• finds the minimum spanning tree: the shortest tree that connects all points
• chaining can be a problem
77 Agglomerative hierarchical
Complete linkage or furthest neighbour
• compact clusters of approximately equal size (makes compact groups even when none exist)
78 Agglomerative hierarchical
Average linkage methods
• between single and complete linkage
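The three linkage rules can be compared with base R's `hclust`. The six one-dimensional points below are invented so that two groups are obvious; the object names are assumptions.

```r
# Six invented points forming two well-separated groups
x <- c(a = 0, b = 1, c = 2.5, d = 10, e = 11, f = 12.5)
d <- dist(x)

single   <- hclust(d, method = "single")    # nearest neighbour
complete <- hclust(d, method = "complete")  # furthest neighbour
average  <- hclust(d, method = "average")   # mean of all between-group pairs

# Height at which the two groups are finally joined under each rule
final_heights <- c(single   = max(single$height),
                   complete = max(complete$height),
                   average  = max(average$height))
final_heights

# Group membership when the tree is cut into two clusters
cutree(average, k = 2)
```

For the final merge, single linkage uses the closest pair across the two groups (7.5), complete linkage the most distant pair (12.5), and average linkage the mean over all nine pairs (10), which lies between the other two.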
84 Species and Sites as Weighted Averages of each other
[Table: data table with a species column (SPP) listing Bel per, Jun buf, Jun art, Air pra, Ele pal, Rum ace, Vic lat, Bra rut, Ran fla, Hyp rad, Leo aut, Pot pal, Poa pra, Cal cus, Tri pra, Tri rep, Ant odo, Sal rep, Ach mil, Poa tri, Ely rep, Sag pro, Pla lan, Agr sto, Lol per, Alo gen, Bro hor; the remaining cell values are not recoverable.]
85 Species and Sites as Weighted Averages of each other
86 Reciprocal Averaging - unimodal
[Table: sites A, B, C, D, E, F (columns) by species Prunus serotina, Tilia americana, Acer saccharum, Quercus velutina, Juglans nigra (rows); cell values not recoverable.]
87-90 Reciprocal Averaging - unimodal
[Table, built up over four slides: the same sites-by-species table with added margins for the species score and the site score at each iteration; values not recoverable.]
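91 Reordered Sites and Species
[Table: the same data with sites reordered as A, C, E, B, D, F and species reordered as Quercus velutina, Prunus serotina, Juglans nigra, Tilia americana, Acer saccharum, with species score and site score margins; values not recoverable.]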
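The iteration shown on the preceding slides can be sketched in base R as follows. The abundance table is invented (it is not the table from the slides, whose values did not survive), and the function name, starting scores and convergence settings are assumptions.

```r
# Toy abundance table: invented values, NOT the data from the slides
Y <- matrix(c(5, 3, 1, 0, 0, 0,
              2, 4, 3, 1, 0, 0,
              0, 2, 4, 3, 1, 0,
              0, 0, 2, 4, 3, 1,
              0, 0, 0, 1, 3, 5),
            nrow = 5, byrow = TRUE,
            dimnames = list(paste0("sp", 1:5), LETTERS[1:6]))

reciprocal_averaging <- function(Y, tol = 1e-10, max_iter = 1000) {
  site_w <- colSums(Y) / sum(Y)   # site weights from the column totals
  site <- seq_len(ncol(Y))        # arbitrary starting site scores
  for (i in seq_len(max_iter)) {
    # Species score: abundance-weighted average of the site scores
    species <- as.vector(Y %*% site) / rowSums(Y)
    # Site score: abundance-weighted average of the species scores
    new_site <- as.vector(t(Y) %*% species) / colSums(Y)
    # Re-centre and re-scale so the scores do not collapse to a constant
    new_site <- new_site - sum(site_w * new_site)
    new_site <- new_site / sqrt(sum(site_w * new_site^2))
    change <- max(abs(new_site - site))
    site <- new_site
    if (change < tol) break
  }
  species <- as.vector(Y %*% site) / rowSums(Y)
  list(site = setNames(site, colnames(Y)),
       species = setNames(species, rownames(Y)))
}

ra <- reciprocal_averaging(Y)

# Reorder the table by the final scores, as on the "Reordered" slide
Y[order(ra$species), order(ra$site)]
```

Sorting rows and columns by their scores concentrates the abundances along the diagonal of the table. The converged scores correspond to the first axis of a correspondence analysis of the same table.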
93 Alpha and Beta Diversity
• Alpha diversity is the diversity of a community (measured either as a diversity index or as species richness).
• Beta diversity (also known as ‘species turnover’ or ‘differentiation diversity’) is the rate of change in species composition from one community to another along gradients.
• Gamma diversity is the diversity of a region or a landscape.
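A minimal numerical sketch of the three quantities, using species richness and Whittaker's multiplicative index (gamma divided by mean alpha) as one simple measure of beta diversity. The choice of index is an assumption, as is the invented presence/absence table.

```r
# Invented presence/absence table: 3 sites along a gradient, 5 species
comm <- rbind(site1 = c(1, 1, 1, 0, 0),
              site2 = c(0, 1, 1, 1, 0),
              site3 = c(0, 0, 1, 1, 1))
colnames(comm) <- paste0("sp", 1:5)

# Alpha: species richness of each community
alpha <- rowSums(comm > 0)

# Gamma: species richness of the whole region (all sites pooled)
gamma <- sum(colSums(comm) > 0)

# Beta (Whittaker's multiplicative form, an assumed choice of index):
# how many times richer the region is than an average community
beta <- gamma / mean(alpha)

alpha
gamma
beta
```

Here every site holds 3 species but the region holds 5, so beta is 5/3: composition turns over along the gradient even though local richness stays constant.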