Presentation on theme: "What we Measure vs. What we Want to Know"— Presentation transcript:
1What we Measure vs. What we Want to Know "Not everything that counts can be counted, and not everything that can be counted counts." - Albert Einstein
2Scales, Transformations, Vectors and Multi-Dimensional Hyperspace
All measurement is a proxy for what is really of interest - The Relationship between them
The scale of measurement and the scale of analysis and reporting are not always the same - Transformations
We often make measurements that are highly correlated - Multi-component Vectors
6Output
> summary(gulls.pca2)
Importance of components: Standard deviation, Proportion of Variance and Cumulative Proportion for each component (numeric values not preserved in this transcript)
> gulls.pca2$loadings
Loadings: Comp.1 to Comp.4 for the variables Weight, Wing, Bill and H.and.B (numeric values not preserved in this transcript)
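A minimal sketch of how output of this kind can be produced in R. The gulls data are not part of this transcript, so the data frame `gulls_sim` below is simulated and purely hypothetical; only the variable names are taken from the slide. Using `cor = TRUE` is also an assumption, since the slide does not say whether the original analysis standardised the variables.

```r
# Hypothetical stand-in for the gulls data: four correlated body measurements.
set.seed(1)
n <- 100
size <- rnorm(n)                      # latent "body size" shared by all variables
gulls_sim <- data.frame(
  Weight  = 900 + 80 * size + rnorm(n, sd = 40),
  Wing    = 420 + 12 * size + rnorm(n, sd = 6),
  Bill    = 18  + 1.0 * size + rnorm(n, sd = 0.6),
  H.and.B = 115 + 4 * size + rnorm(n, sd = 2)
)

# cor = TRUE works on the correlation matrix, i.e. on standardised variables
# (an assumption here; weight and bill are on very different scales).
pca_sim <- princomp(gulls_sim, cor = TRUE)

summary(pca_sim)      # standard deviation and proportion of variance per component
pca_sim$loadings      # weight of each measurement on each component
```

Because all four simulated measurements share one underlying size variable, most of the variance should fall on the first component.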
9Inferring Gradients from Attribute Data (e.g. species)
10Indirect Gradient Analysis
Environmental gradients are inferred from species data alone.
Three methods:
Principal Component Analysis - linear model
Correspondence Analysis - unimodal model
Detrended CA - modified unimodal model
15Approaches
Use single responses in linear models of environmental variables.
Use axes of a multivariate dimension reduction technique as responses in linear models of environmental variables.
Constrain the multivariate dimension reduction into the factor space defined by the environmental variables.
16Dimension Reduction (Ordination) ‘Constrained’ by the Environmental Variables
18Working with the Variability that we Can Explain
Start with all the variability in the response variables.
Replace the original observations with their fitted values from a model employing the environmental variables as explanatory variables (discarding the residual variability).
Carry out gradient analysis on the fitted values.
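The three steps can be sketched in base R as follows. This is a hand-rolled illustration in the spirit of redundancy analysis, not the presenter's own code. The environmental variables (`moisture`, `ph`) and the three "species" are simulated and hypothetical.

```r
set.seed(2)
n <- 40
env <- data.frame(moisture = runif(n), ph = rnorm(n, mean = 6))

# Step 1: all the variability in the (hypothetical) response variables.
Y <- cbind(
  sp1 = 2 * env$moisture + rnorm(n, sd = 0.3),
  sp2 = -1.5 * env$moisture + 0.5 * env$ph + rnorm(n, sd = 0.3),
  sp3 = rnorm(n)                       # unrelated to the environment
)
Yc <- scale(Y, center = TRUE, scale = FALSE)

# Step 2: replace observations by fitted values from a multivariate linear model.
fit <- lm(Yc ~ moisture + ph, data = env)
Y_fitted <- fitted(fit)

# Step 3: ordinate the fitted values (constrained) and, for contrast, the raw data.
constrained   <- prcomp(Y_fitted)
unconstrained <- prcomp(Yc)

# Share of the total variance that the environmental variables can explain.
explained <- sum(constrained$sdev^2) / sum(unconstrained$sdev^2)
explained
```

With two explanatory variables the fitted values span at most two dimensions, so the third constrained axis carries essentially no variance.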
19Unconstrained/Constrained
Unconstrained ordination axes correspond to the directions of the greatest variability within the data set.
Constrained ordination axes correspond to the directions of the greatest variability of the data set that can be explained by the environmental variables.
20Direct Gradient Analysis
Environmental gradients are constructed from the relationship between species and environmental variables.
Three methods:
Redundancy Analysis - linear model
Canonical (or Constrained) Correspondence Analysis - unimodal model
Detrended CCA - modified unimodal model
23How Similar are Objects/Samples/Individuals/Sites?
24Similarity approaches or what do we mean by similar?
25Different types of data (with examples)
Continuous data: height
Categorical data
  ordered (ordinal): growth rate
    very slow, slow, medium, fast, very fast
  not ordered (nominal): fruit colour
    yellow, green, purple, red, orange
Binary data: fruit / no fruit
26Different scales of measurement (with examples)
Large range: soil ion concentrations
Restricted range: air pressure
Constrained: proportions
Large numbers: altitude
Small numbers: attribute counts
Do we standardise measurement scales to make them equivalent? If so, what do we lose?
27Similarity matrix
We define a similarity between units, like the correlation between continuous variables (it can also be a dissimilarity or distance matrix).
A similarity can be constructed as an average of the similarities between the units on each variable (a weighted average can be used).
This provides a way of combining different types of variables.
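A minimal sketch of such an averaged similarity for mixed variable types, along the lines of Gower's coefficient. The three units and their values are hypothetical, and the per-variable similarities are assumptions: range-scaled difference for the continuous variable and simple matching for the categorical and binary ones.

```r
units <- data.frame(
  height = c(1.2, 3.4, 2.0),             # continuous
  colour = c("red", "green", "red"),     # unordered categorical
  fruit  = c(1, 0, 1)                    # binary
)

# Similarity between units i and j: each variable contributes a value in [0, 1].
sim_pair <- function(i, j, d) {
  s_height <- 1 - abs(d$height[i] - d$height[j]) / diff(range(d$height))
  s_colour <- as.numeric(d$colour[i] == d$colour[j])
  s_fruit  <- as.numeric(d$fruit[i] == d$fruit[j])
  mean(c(s_height, s_colour, s_fruit))   # unweighted average over variables
}

n <- nrow(units)
S <- outer(seq_len(n), seq_len(n),
           Vectorize(function(i, j) sim_pair(i, j, units)))
S
```

Replacing `mean()` with `weighted.mean()` would give the weighted version mentioned on the slide.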
28Distance metrics relevant for continuous variables:
Euclidean
city block or Manhattan
(diagrams on the slide show each distance between two points A and B)
(also many other variations)
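A small worked example of the two metrics. The coordinates of A and B are hypothetical, chosen so the arithmetic is easy to check by hand.

```r
A <- c(1, 2)
B <- c(4, 6)

euclidean <- sqrt(sum((A - B)^2))   # straight-line distance: sqrt(3^2 + 4^2) = 5
manhattan <- sum(abs(A - B))        # sum of axis-wise differences: 3 + 4 = 7

# The same results from base R's dist(), which works on the rows of a matrix.
pts <- rbind(A = A, B = B)
dist(pts, method = "euclidean")
dist(pts, method = "manhattan")
```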
29Similarity coefficients for binary data
simple matching: count if both units 0 or both units 1
Jaccard: count only if both units 1
(also many other variants, e.g. Bray-Curtis)
simple matching can be extended to categorical data
(diagrams on the slide show the four possible pairs 0,1; 1,1; 0,0; 1,0 for each coefficient)
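A short sketch of the two coefficients for a pair of units scored on six binary attributes. The vectors `x` and `y` are hypothetical.

```r
x <- c(1, 1, 0, 0, 1, 0)
y <- c(1, 0, 0, 1, 1, 0)

both_present <- sum(x == 1 & y == 1)   # 1,1 pairs
both_absent  <- sum(x == 0 & y == 0)   # 0,0 pairs
mismatch     <- sum(x != y)            # 1,0 and 0,1 pairs

# Simple matching counts joint absences as agreement; Jaccard ignores them.
simple_matching <- (both_present + both_absent) / length(x)   # 4 / 6
jaccard         <- both_present / (both_present + mismatch)   # 2 / 4
```

The two coefficients differ only in how they treat the 0,0 pairs, which matters when joint absence carries little information.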
37Discriminating
If you have continuous measurements and you know which 2 groups you are looking for (e.g. male and female in the gulls data), linear discriminant analysis will find a function of the measurements which will help to allocate new subjects to the groups.
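A minimal sketch using `lda()` from the MASS package, which is assumed to be installed (it ships with standard R distributions). The `birds` data frame is simulated and hypothetical; it is not the gulls data.

```r
library(MASS)   # assumed to be installed; provides lda()

set.seed(3)
n <- 60
birds <- data.frame(
  sex  = factor(rep(c("female", "male"), each = n)),
  wing = c(rnorm(n, 410, 8), rnorm(n, 430, 8)),
  bill = c(rnorm(n, 17, 0.8), rnorm(n, 19, 0.8))
)

# Fit the discriminant function from birds whose group is known.
fit <- lda(sex ~ wing + bill, data = birds)
fit$scaling                      # coefficients of the discriminant function

# Allocate two new, unlabelled birds to a group.
new_birds <- data.frame(wing = c(405, 436), bill = c(16.5, 19.5))
predict(fit, new_birds)$class
```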
38Canonical Variate Analysis
For more than 2 groups, canonical variate analysis maximises the ratio of between-group to within-group variance. This is related to a multivariate analysis of variance (MANOVA).
40Clustering methods
hierarchical
  divisive: put everything together and split (monothetic / polythetic)
  agglomerative: keep everything separate and join the most similar points (classical cluster analysis)
non-hierarchical
  k-means clustering
41Agglomerative hierarchical
Single linkage or nearest neighbour
finds the minimum spanning tree: the shortest tree that connects all points
chaining can be a problem
42Agglomerative hierarchical
Complete linkage or furthest neighbour
compact clusters of approximately equal size
(makes compact groups even when none exist)
43Agglomerative hierarchical
Average linkage methods
between single and complete linkage
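The three linkage rules, and k-means as the non-hierarchical alternative, can be tried side by side in base R. The points below are simulated and hypothetical: two well-separated groups, so all methods should agree. On less clear-cut data they often will not.

```r
set.seed(4)
# Hypothetical data: two well-separated groups of 20 points in two dimensions.
pts <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(40, mean = 5, sd = 0.5), ncol = 2)
)
d <- dist(pts)

# Agglomerative hierarchical clustering with three linkage rules.
hc_single   <- hclust(d, method = "single")
hc_complete <- hclust(d, method = "complete")
hc_average  <- hclust(d, method = "average")

# Cut each tree into two groups.
groups_single   <- cutree(hc_single, k = 2)
groups_complete <- cutree(hc_complete, k = 2)
groups_average  <- cutree(hc_average, k = 2)

# Non-hierarchical: k-means with several random starts.
groups_kmeans <- kmeans(pts, centers = 2, nstart = 10)$cluster

table(groups_complete, groups_kmeans)   # cross-tabulate the two partitions
```

`plot(hc_single)` and its counterparts draw the dendrograms, which is where chaining under single linkage usually shows up.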
48Building and testing models
Basically you approach this in the same way as for multiple regression, so there are the same issues of variable selection, interactions between variables, etc.
However, the basis of any statistical test that relies on distributional assumptions is more problematic, so there is much greater use of randomisation tests and permutation procedures to evaluate the statistical significance of results.
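A minimal sketch of a permutation test, shown for a single response and a single predictor to keep it short. The data are simulated and hypothetical, and using R-squared as the test statistic is one choice among many. The same shuffling idea carries over to multivariate statistics.

```r
set.seed(5)
n <- 50
env <- rnorm(n)
response <- 0.8 * env + rnorm(n)

# Observed test statistic.
observed <- summary(lm(response ~ env))$r.squared

# Shuffling the response breaks any link with the predictor, so the
# permuted statistics show what to expect when there is no relationship.
n_perm <- 999
permuted <- replicate(n_perm, {
  summary(lm(sample(response) ~ env))$r.squared
})

# One-sided p-value, counting the observed value as one of the permutations.
p_value <- (sum(permuted >= observed) + 1) / (n_perm + 1)
p_value
```

No distributional assumption is needed for the p-value. The test does assume that observations are exchangeable when there is no relationship.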
73Models of Species Response
There are (at least) two models:
Linear - species increase or decrease along the environmental gradient
Unimodal - species rise to a peak somewhere along the environmental gradient and then fall again
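The two response shapes can be written as simple functions. The Gaussian curve used for the unimodal model is a common choice but an assumption here, as are all parameter names and values (`peak`, `optimum`, `tol`).

```r
gradient <- seq(0, 10, by = 0.1)

# Linear model: abundance changes at a constant rate along the gradient.
linear_response <- function(x, intercept = 1, slope = 0.5) {
  intercept + slope * x
}

# Unimodal (Gaussian) model: maximum `peak` at `optimum`, with width `tol`.
unimodal_response <- function(x, peak = 10, optimum = 5, tol = 1.5) {
  peak * exp(-(x - optimum)^2 / (2 * tol^2))
}

lin <- linear_response(gradient)
uni <- unimodal_response(gradient)

gradient[which.max(uni)]   # position of the optimum along the gradient
```

Over a short stretch of the gradient on one side of the optimum, the unimodal curve is close to monotonic, which is why the linear model can be adequate for short gradients.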
77Non-metric multidimensional scaling
NMDS maps the observed dissimilarities onto an ordination space by trying to preserve their rank order in a low number of dimensions (often 2), but the solution depends on the number of dimensions chosen.
It is like a non-linear version of PCO.
Define a stress function and look for the mapping with minimum stress (e.g. the sum of squared residuals in a monotonic regression of the distances in NMDS space on the original dissimilarities).
It needs an iterative process, so try many different starting points; convergence is not guaranteed.
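A minimal sketch using `isoMDS()` from the MASS package, which is assumed to be installed. The data matrix is simulated and hypothetical. Fitting two and three dimensions shows that each choice of dimension gives its own solution and its own stress.

```r
library(MASS)   # assumed to be installed; provides isoMDS()

set.seed(6)
# Hypothetical data: 30 samples described by 5 variables.
x <- matrix(rnorm(150), ncol = 5)
d <- dist(x)

# isoMDS() starts from a classical scaling (PCO) configuration by default
# and iterates to reduce stress; k is the number of dimensions.
nmds2 <- isoMDS(d, k = 2, trace = FALSE)
nmds3 <- isoMDS(d, k = 3, trace = FALSE)

dim(nmds2$points)                          # one row per sample, one column per dimension
c(k2 = nmds2$stress, k3 = nmds3$stress)    # stress (in percent) for each solution
```

A different starting configuration can be supplied through the `y` argument, which is one way to try several starting points and keep the solution with the lowest stress.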
78Procrustes rotation
Used to compare graphically two separate ordinations.
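A minimal sketch of the rotation step in base R, using the singular value decomposition. The helper `procrustes_rotate()` is hypothetical, written for this illustration. It centres both configurations and finds the best rotation or reflection, but applies no rescaling, which fuller implementations usually offer. The two "ordinations" are simulated, with the second being a rotated and shifted copy of the first.

```r
# Rotate configuration Y to match configuration X as closely as possible.
procrustes_rotate <- function(X, Y) {
  Xc <- scale(X, center = TRUE, scale = FALSE)
  Yc <- scale(Y, center = TRUE, scale = FALSE)
  s  <- svd(crossprod(Yc, Xc))          # SVD of t(Yc) %*% Xc
  R  <- s$u %*% t(s$v)                  # best orthogonal rotation/reflection
  Y_rot <- Yc %*% R
  list(rotation = R, Y_rot = Y_rot, ss = sum((Xc - Y_rot)^2))
}

set.seed(7)
X <- matrix(rnorm(40), ncol = 2)        # hypothetical first ordination
theta <- pi / 3
turn <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), 2, 2)
Y <- X %*% turn + 3                     # same configuration, rotated and shifted

fit <- procrustes_rotate(X, Y)
fit$ss                                  # residual sum of squares after rotation
```

Plotting the centred `X` and `fit$Y_rot` on the same axes gives the graphical comparison. The residual sum of squares summarises how well the two ordinations agree.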