
1
**An Introduction to Multivariate Analysis**

Lectures 14-15. Drs. Alan S.L. Leung and Kenneth M.Y. Leung

2
**Multivariate analysis**

An extension of univariate (single-variable) and bivariate (two-variable) analysis. It deals with a number of samples and species/environmental variables simultaneously.

3
**Multivariate Data Set**

- Morphological measurements of organisms (e.g. length)
- Physiological measurements of organisms (e.g. blood pressure)
- Physicochemical measurements of the environment (e.g. air temperature)
- Species abundance
- Species richness, etc.

Data are usually in the form of a data matrix.

7
**Similarity (S) between samples**

Ranges from 0 to 100% (or 0 to 1).

- S = 100% if two samples are totally similar (i.e. the entries in the two samples are identical)
- S = 0 if two samples are totally dissimilar (i.e. the two samples have no species in common)

8
**Bray-Curtis coefficient (Bray & Curtis, 1957)**

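First developed in terrestrial ecology.

$$S_{jk} = 100 \left\{ 1 - \frac{\sum_{i=1}^{n} |y_{ij} - y_{ik}|}{\sum_{i=1}^{n} (y_{ij} + y_{ik})} \right\}$$

where $y_{ij}$ is the abundance of species $i$ in sample $j$, $y_{ik}$ is the abundance of species $i$ in sample $k$, and $n$ is the total number of species.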
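The coefficient can be sketched in a few lines of Python. This is a minimal illustration, not the lecture's own code: the function name `bray_curtis_similarity` and the abundance values are hypothetical, and the sums run over species.

```python
from typing import Sequence


def bray_curtis_similarity(sample_j: Sequence[float], sample_k: Sequence[float]) -> float:
    """Bray-Curtis similarity (0 to 100) between two samples.

    Both sequences list the abundance of the same species in the same order:
    S = 100 * (1 - sum|y_ij - y_ik| / sum(y_ij + y_ik)), summed over species i.
    """
    if len(sample_j) != len(sample_k):
        raise ValueError("both samples must list the same species")
    numerator = sum(abs(yj - yk) for yj, yk in zip(sample_j, sample_k))
    denominator = sum(yj + yk for yj, yk in zip(sample_j, sample_k))
    if denominator == 0:
        raise ValueError("similarity is undefined when both samples are empty")
    return 100.0 * (1.0 - numerator / denominator)


# Hypothetical abundances of five species (not the data used in the lecture).
sample_a = [10, 0, 4, 6, 0]
sample_b = [10, 0, 4, 6, 0]  # identical to sample_a
sample_c = [0, 7, 0, 0, 3]   # shares no species with sample_a
sample_d = [5, 0, 4, 3, 8]   # partly overlapping

print(bray_curtis_similarity(sample_a, sample_b))  # identical samples: 100.0
print(bray_curtis_similarity(sample_a, sample_c))  # no species in common: 0.0
print(round(bray_curtis_similarity(sample_a, sample_d), 1))
```

The two extreme cases match the definition of similarity given earlier: identical entries give 100, and no shared species gives 0.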

9
**Please calculate the Bray-Curtis Similarity between samples:**

Here $y_{ij}$ is the abundance of species $i$ in sample $j$, $y_{ik}$ is the abundance of species $i$ in sample $k$, and $n$ is the total number of species.

- X2 and X3
- X3 and Y1

10
**Bray-Curtis similarity: answers**

$$S_{X2,X3} = 100 \{ 1 - \dots \} = 84$$

$$S_{X3,Y1} = 100 \{ 1 - \dots \} = 38$$

(Only one fragment of the working, 3+0+0+2+8, survives from the slide.)

11
**Species similarity matrix**

12
**Transformation**

Two distinct roles:

- To meet the statistical assumptions of parametric analysis (e.g. homogeneity of variance in ANOVA)
- To weight the contributions of common and rare species in non-parametric multivariate analysis

13
**Why transform the data?**

- To weight the contributions of common and rare species
- Transformed and untransformed data can give different results when computing dissimilarities between samples
- This affects the final outcome (solution) of nMDS

14
**Choice of transformation in multivariate analysis**

Transformations in order of increasing degree of severity:

- Square-root
- Fourth-root / Log(1 + y)
- Presence/Absence

Labels on the slide: intermediate-abundance species; rare species; not commonly used.
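To see how the choice of transformation shifts the weight between common and rare species, the sketch below applies each transformation to two hypothetical samples and recomputes the Bray-Curtis similarity. The sample values and all names (`TRANSFORMS`, `similarity_by_transform`) are assumptions made for illustration, and fourth-root and log(1 + y) are treated as roughly the same level of severity.

```python
import math
from typing import Callable, Dict, List


def bray_curtis_similarity(sample_j: List[float], sample_k: List[float]) -> float:
    """Bray-Curtis similarity, from 0 (nothing shared) to 100 (identical)."""
    numerator = sum(abs(a - b) for a, b in zip(sample_j, sample_k))
    denominator = sum(a + b for a, b in zip(sample_j, sample_k))
    return 100.0 * (1.0 - numerator / denominator)


# Transformations, roughly ordered from mildest to most severe.
TRANSFORMS: Dict[str, Callable[[float], float]] = {
    "none": lambda y: float(y),
    "square-root": lambda y: math.sqrt(y),
    "fourth-root": lambda y: y ** 0.25,
    "log(1+y)": lambda y: math.log(1.0 + y),
    "presence/absence": lambda y: 1.0 if y > 0 else 0.0,
}

# Hypothetical samples: one very abundant species differs, three rare species match.
sample_1 = [1000, 2, 1, 3]
sample_2 = [100, 2, 1, 3]

similarity_by_transform = {
    name: bray_curtis_similarity(
        [transform(y) for y in sample_1], [transform(y) for y in sample_2]
    )
    for name, transform in TRANSFORMS.items()
}

for name, value in similarity_by_transform.items():
    print(f"{name:>16}: {value:5.1f}")
```

With no transformation the single dominant species controls the result and the samples look very different. As the transformation becomes more severe, the shared rare species count for more and the similarity rises, reaching 100 under presence/absence.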

15
**Species similarity matrix – Fourth-root transformed**

Some patterns can be seen, but…

16
**Multivariate Techniques**

The most widely used multivariate techniques include:

- Cluster analysis
- Ordination
- Others, e.g. multiple discriminant analysis

17
**Cluster Analysis**

Puts samples (sites, species, or environmental variables) into groups based on their similarity. Samples within the same group are more similar to each other than to samples in different groups.
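The grouping idea can be sketched as a small agglomerative procedure. This sketch assumes group-average linkage, which the slides do not specify, and uses a hypothetical similarity matrix for four samples; the function name `group_average_clustering` is likewise an illustration only.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Cluster = Tuple[str, ...]


def group_average_clustering(
    labels: List[str], similarity: Dict[FrozenSet[str], float]
) -> List[Tuple[Cluster, Cluster, float]]:
    """Agglomerative clustering; returns the merges in the order they happen.

    Each merge is (members of group 1, members of group 2, similarity level),
    which is the information a dendrogram draws.
    """
    clusters: List[Cluster] = [(label,) for label in labels]
    merges: List[Tuple[Cluster, Cluster, float]] = []

    def between(c1: Cluster, c2: Cluster) -> float:
        # Group average: mean similarity over all cross-group sample pairs.
        pairs = [similarity[frozenset((a, b))] for a in c1 for b in c2]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        # Merge the two groups that are currently most similar.
        c1, c2 = max(combinations(clusters, 2), key=lambda pair: between(*pair))
        merges.append((c1, c2, between(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical Bray-Curtis similarities (%) between four samples.
similarity = {
    frozenset(("A1", "A2")): 90.0,
    frozenset(("B1", "B2")): 80.0,
    frozenset(("A1", "B1")): 30.0,
    frozenset(("A1", "B2")): 20.0,
    frozenset(("A2", "B1")): 40.0,
    frozenset(("A2", "B2")): 30.0,
}

merges = group_average_clustering(["A1", "A2", "B1", "B2"], similarity)
for group_1, group_2, level in merges:
    print(f"{group_1} + {group_2} joined at similarity {level:.1f}")
```

Reading the merges from first to last gives the dendrogram from its tips to its root: the two A samples join first, then the two B samples, and the two groups join last at a much lower similarity.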

18
Dendrogram of samples (figure). Statistical software: PRIMER 5 for Windows.

19
**Ordination**

- Graphical presentation technique
- Ordination map (usually two- or three-dimensional)
- The relative distances among points in the ordination map represent the similarity among samples (e.g. in species composition)

20
**Two Types of Ordination Techniques**

Indirect gradient analysis:

- Only includes biological data (species abundance by samples matrix)
- Environmental data can be correlated with the ordination axes subsequently

Direct gradient analysis:

- Includes both environmental and biological data

21
**Non-metric Multi-dimensional Scaling (nMDS)**

Indirect gradient analysis, including:

- Principal Component Analysis (PCA)
- Correspondence Analysis (CA)
- Detrended Correspondence Analysis (DCA)
- Non-metric Multi-dimensional Scaling (nMDS)

Direct gradient analysis, including:

- Redundancy Analysis (RDA)
- Canonical Correspondence Analysis (CCA)
- Detrended Canonical Correspondence Analysis (DCCA)

22
**PCA**

Uses the original data matrix. The first Principal Component axis (PC1) is the best-fit line through the data.

Source: Clarke, K. R. & Warwick, R. M. (1994) Change in Marine Communities: an Approach to Statistical Analysis and Interpretation. Plymouth Marine Laboratory, Plymouth: 144pp.

23
Rotation: the second principal component axis (PC2) is perpendicular to PC1 (i.e. orthogonal / uncorrelated).

24
**Third principal component axis (PC3)**

Theoretically, many more species can be added

26
**The variances extracted by the PCs**

Eigenvalues table, with columns: PC, Eigenvalue, % Variation, Cumulative % Variation.

Eigenvectors table (coefficients in the linear combinations of variables making up the PCs), with columns: Variable, PC1, PC2, PC3, PC4, PC5. The rows are species A, B, C, D and E.
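How the eigenvalue and % variation columns arise can be sketched for the simplest case of two species. This is an illustration under stated assumptions: the PCA is done on the covariance matrix (the slides do not say whether covariance or correlation was used), the 2 x 2 eigenvalues are obtained in closed form, and the abundances and the name `pca_two_variables` are hypothetical.

```python
import math
from typing import List, Tuple


def pca_two_variables(x: List[float], y: List[float]) -> Tuple[List[float], List[float]]:
    """Return (eigenvalues, % variation) for PC1 and PC2 of two variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sample variances and covariance (n - 1 in the denominator).
    var_x = sum((v - mean_x) ** 2 for v in x) / (n - 1)
    var_y = sum((v - mean_y) ** 2 for v in y) / (n - 1)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    # Eigenvalues of the symmetric 2 x 2 covariance matrix, in closed form.
    centre = (var_x + var_y) / 2.0
    spread = math.sqrt(((var_x - var_y) / 2.0) ** 2 + cov_xy ** 2)
    eigenvalues = [centre + spread, centre - spread]
    total = sum(eigenvalues)
    percent = [100.0 * ev / total for ev in eigenvalues]
    return eigenvalues, percent


# Hypothetical abundances of two species in five samples.
species_a = [2.0, 4.0, 5.0, 7.0, 9.0]
species_b = [1.0, 3.0, 4.0, 4.0, 8.0]

eigenvalues, percent = pca_two_variables(species_a, species_b)
cumulative = 0.0
for index, (ev, pct) in enumerate(zip(eigenvalues, percent), start=1):
    cumulative += pct
    print(f"PC{index}: eigenvalue={ev:.3f}  %variation={pct:.1f}  cum.%={cumulative:.1f}")
```

The eigenvalues sum to the total variance of the two species, so each PC's share of that total is its % variation. With more species the same idea applies, but the eigenvalues would normally come from a numerical library rather than a closed form.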

28
**PCA Assumptions**

- Linear relationships between variables
- Normality of the variables

Ecological data that fulfil these assumptions are rare.

29
**Multidimensional Scaling**

A technique for analyzing multivariate data. It visualizes the relationships between samples in a low-dimensional space to facilitate interpretation. There are two types of MDS:

- Metric
- Non-metric

30
**Metric MDS vs. non-metric MDS (nMDS)**

Metric MDS:

- Assumes the input data are measured on an interval or ratio scale
- Quantitative

Non-metric MDS (nMDS):

- The data should be in the form of ranks
- Quantitative and/or qualitative

31
**Major Advantages of nMDS**

- Ordination is based on the ranked similarities/dissimilarities between pairs of samples
- Ordinal data can be used, e.g. 1 = very low; 2 = low; 3 = mid; 4 = high; 5 = very high
- The actual values of the data are not used in the ordination, so few (no?) assumptions are made about the nature and quality of the data
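The point that only ranks matter can be illustrated with a short sketch. The dissimilarity values and the helper name `average_ranks` are hypothetical; tied values are given their average rank, which is a common convention assumed here rather than something the slides state.

```python
import math
from typing import List


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1 (smallest); tied values share their average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2.0
        for x in values
    ]


# Hypothetical dissimilarities between six pairs of samples.
dissimilarities = [16.0, 62.0, 40.0, 5.0, 62.0, 81.0]

# Any order-preserving change of scale leaves the ranks untouched.
rescaled = [math.sqrt(d) for d in dissimilarities]

print(average_ranks(dissimilarities))  # [2.0, 4.5, 3.0, 1.0, 4.5, 6.0]
print(average_ranks(rescaled) == average_ranks(dissimilarities))  # True
```

Because nMDS works from these ranks alone, two data sets whose dissimilarities differ in scale but agree in order would be given the same rank information to ordinate.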

32
**Bray-Curtis similarity**

Modified from Clarke & Warwick, 1994

33
**An Ecological Example**

Spatial and temporal variability in benthic macroinvertebrate communities in Hong Kong streams

34
**Macroinvertebrate communities**

35
**Macroinvertebrate communities**

36
**Macroinvertebrate communities**

37
Study Sites (HK map)

38
Spatial

39
Temporal

40
**Macroinvertebrate Sampling & Identification**

41
**Statistical Analysis: nested analysis of variance (ANOVA)**

Spatial factors:

- Regions (random, orthogonal)
- Sites (random, nested within Regions)
- Sections (random, nested within Sites)

Temporal factors:

- Years (random, orthogonal)
- Seasons (fixed, orthogonal)
- Days (random, nested within Years and Seasons)

Interactions between them

42
**Statistical Analysis: non-parametric multivariate analysis**

- Non-metric multidimensional scaling (nMDS): displays the stream community data in ordination diagrams intended to reveal underlying patterns in the community structure
- Analysis of similarities (ANOSIM): compares the community structure among spaces and times

43
**Species Abundance vs. Samples**

44
**Fourth-root transformed**

45
**A measure of goodness-of-fit**

50
**Multivariate analysis - Temporal**

- Years [All samples in all sites; Each Region; Each Site; Each Section in each Site]
- Seasons (all years & each year) [All samples in all sites; Each Region; Each Site; Each Section in each Site]
- Dates within Seasons in each year

54
| Sample | No. | R values (in the order they appear on the slide) |
|---|---|---|
| Day 1 A | 1 | – |
| Day 1 B | 2 | n.s. |
| Day 1 C | 3 | |
| Day 2 A | 4 | 0.281 |
| Day 2 B | 5 | |
| Day 2 C | 6 | |
| Day 3 A | 7 | 0.531 |
| Day 3 B | 8 | 0.500, 0.698 |
| Day 3 C | 9 | |
| Day 4 A | 10 | 0.771, 0.729, 0.813, 0.844, 0.833, 0.792 |
| Day 4 B | 11 | 0.688, 0.667, 0.521, 0.333, 0.469, 0.615 |
| Day 4 C | 12 | 0.406, 0.448, 0.417 |

ANOSIM R statistic:

- R = 1 only if all replicates within sites are more similar to each other than any replicates from different sites
- R is approximately zero if the similarities between and within sites are the same on average

Results of one-way ANOSIM between the Lam Tsuen site sampling sections within the dry season in … Pairs that are significantly different (at the 5% significance level) are shown with their R statistic values. The column positions of the values in the original pairwise matrix are not recoverable.
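The two properties of R can be checked with a small sketch. The slides do not give the formula; the code assumes the usual ANOSIM definition, R = (mean rank between groups − mean rank within groups) / (M / 2), where M = n(n − 1)/2 is the number of sample pairs and ranks are taken on the dissimilarities. The sample names, group labels and dissimilarity values are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1 (smallest); tied values share their average rank."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2.0
        for x in values
    ]


def anosim_r(groups: Dict[str, str], dissimilarity: Dict[FrozenSet[str], float]) -> float:
    """ANOSIM R from group labels and pairwise dissimilarities (assumed formula)."""
    pairs = list(combinations(sorted(groups), 2))
    ranks = average_ranks([dissimilarity[frozenset(pair)] for pair in pairs])
    within = [r for (a, b), r in zip(pairs, ranks) if groups[a] == groups[b]]
    between = [r for (a, b), r in zip(pairs, ranks) if groups[a] != groups[b]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2.0)


# Hypothetical design: two sites with two replicates each.
groups = {"S1a": "site1", "S1b": "site1", "S2a": "site2", "S2b": "site2"}

# Case 1: every within-site pair is more similar than any between-site pair.
separated = {
    frozenset(("S1a", "S1b")): 10.0,
    frozenset(("S2a", "S2b")): 15.0,
    frozenset(("S1a", "S2a")): 60.0,
    frozenset(("S1a", "S2b")): 70.0,
    frozenset(("S1b", "S2a")): 65.0,
    frozenset(("S1b", "S2b")): 80.0,
}

# Case 2: all pairs equally dissimilar, so sites make no difference.
no_structure = {pair: 50.0 for pair in separated}

print(anosim_r(groups, separated))     # 1.0
print(anosim_r(groups, no_structure))  # 0.0
```

In practice the significance of R is judged with a permutation test, which this sketch leaves out.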

55
(The ANOSIM table and R statistic notes from the previous slide are repeated here, with the added labels Day 2 and Day 3.)

56
**ANOSIM: LT 1997 Dry Season**

| Comparison | Pairs of sections significantly different (percentage) | Average R statistic of significantly different pairs |
|---|---|---|
| The same section between different days | 9/18 (50%) | 0.578 |
| Among all sections within the same day | 0/12 (0%) | – |
| Among all sections between different days | 20/36 (56%) | 0.602 |

57
**ANOSIM: LT 1997 Wet Season**

| Comparison | Pairs of sections significantly different (percentage) | Average R statistic of significantly different pairs |
|---|---|---|
| The same section between different days | 15/18 (83%) | 0.674 |
| Among all sections within the same day | 2/12 (17%) | 0.662 |
| Among all sections between different days | 29/36 (81%) | 0.647 |

58
**Implications**

The macroinvertebrate community structures are, on average:

- more similar within the same region (sites of the same region are more similar to each other)
- more similar within the same site (samples of the same site are more similar to each other)

These patterns are more obvious in the dry seasons.

59
**Implications**

- There is no obvious pattern in the community structure between sections within a site: the spatial scale "Sections" is not an important factor.
- The community structures of the study sites are, in general, similar between years. However, in some sites, variation between years can be high.
- Seasonality: there are STRONG seasonality patterns. However, within-season variation (days) is also noticeable.

60
**Implications**

- Patterns in the community structure are uncovered.
- Regions, Sites and Seasons are important factors for our understanding of the stream communities in Hong Kong.
- Although there is small-scale variability (within site), large-scale variability (among sites and between regions) plays a more important role in the macroinvertebrate communities.
