Phylogenetic Diversity Measures Based on Hill Numbers Anne Chao National Tsing Hua University Institute of Statistics Hsin-Chu, Taiwan 30043 Eco-Stats.


1 Phylogenetic Diversity Measures Based on Hill Numbers Anne Chao National Tsing Hua University Institute of Statistics Hsin-Chu, Taiwan 30043 Eco-Stats Symposium The University of New South Wales Sydney, Australia July 11-12, 2012

2 Collaborators of this work: Chun-Huo Chiu National Tsing Hua Univ Taiwan Lou Jost EcoMinga Foundation Ecuador

3 Outline: (1) Traditional diversity measures (do not consider species relatedness), with a focus on the doubling property and Hill numbers. (2) Phylogenetic diversity via Hill numbers (considers taxonomic or phylogenetic distance between species). (3) Simple illustrative examples. (4) Statistical estimation (brief).

4 Bird species diversity

5 Diversity for the class Crustacea (greatest diversity in the oceans)

6 Dazzling Orchid Diversity… Some are abundant, some are rare, some still undiscovered

7 Variance vs. Diversity. Variance: for numerical variables. Diversity: for categories (species).

8 Biodiversity Definition Variety and variability among living organisms and the ecological complexes in which they occur Variation of life at all levels of biological organization

9 Biodiversity Levels. Gene diversity: diversity of genes within a species. Species (or taxonomic/phylogenetic) diversity: diversity among species in an ecosystem. Ecosystem (or functional) diversity: diversity of different ecosystems on Earth.

10 Traditional Species Diversity: S species, indexed by 1, 2, …, S. Species absolute abundance/biomass: $(A_1, A_2, \dots, A_S)$. Species relative abundance/biomass: $(p_1, p_2, \dots, p_S)$, with $\sum_{i=1}^{S} p_i = 1$.

11 Traditional Biodiversity Measures/Indices “Diversity measures” is a diverse issue: Different indices/measures quantify different aspects Two components - species richness - evenness among abundances

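12 [Figure: a community with 4 species (more diverse) compared with a community with 3 species.]

13 [Figure: an uneven community with 3 species compared with an even community with 3 species (more diverse).]

14 4 species, uneven, versus 3 species, even: which one is more diverse?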

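15 Gini-Simpson Index (Gini 1912; Simpson 1949). Take two individuals at random: the probability that they belong to different species is the Gini-Simpson index, $1 - \sum_{i=1}^{S} p_i^2$. The complementary probability that they belong to the same species, $\sum_{i=1}^{S} p_i^2$, is the Simpson index (concentration, repeat rate).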
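As a minimal sketch of the two probabilities on this slide, assuming the standard definitions (Simpson concentration as the sum of squared relative abundances, Gini-Simpson as one minus that sum), the function names and the example community below are illustrative only:

```python
def simpson_concentration(p):
    """Probability that two randomly drawn individuals belong to the same species."""
    return sum(x * x for x in p)


def gini_simpson(p):
    """Probability that two randomly drawn individuals belong to different species."""
    return 1.0 - simpson_concentration(p)


# Hypothetical community: four equally common species.
p = [0.25, 0.25, 0.25, 0.25]
print(simpson_concentration(p), gini_simpson(p))
```

For four equally common species the Gini-Simpson index is 0.75, the value used in the doubling-property slides.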

16 Shannon (1948) entropy: $H = -\sum_{i=1}^{S} p_i \log p_i$. A measure of uncertainty: the uncertainty in the species identity of a randomly sampled individual.
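A short sketch of Shannon entropy for a vector of relative abundances, assuming natural logarithms (which is what reproduces the value 1.39 quoted later for four equally common species). The function name is illustrative.

```python
import math


def shannon_entropy(p):
    """Shannon entropy -sum(p_i * ln p_i); species with zero abundance contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)


# Hypothetical community: four equally common species, so H = ln 4.
print(round(shannon_entropy([0.25] * 4), 2))
```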

17 Doubling Property (MacArthur 1965; Hill 1973). Take two completely distinct communities (no shared species), each with diversity X. When the two are combined with equal weight, the diversity should become 2X. This is an essential minimum requirement that ecologists expect of a “diversity”. It corresponds to the “replication principle” in economics (Dalton 1920), which extends the idea to K communities/groups.

18 What kinds of measures satisfy the doubling property? Species richness? Yes!! Entropy? No!! Gini-Simpson index? No!!

19 Two completely distinct communities, each with 4 equally common species, pooled with equal weight. Species richness: 4 + 4 = 8 (doubles). Entropy: 1.39 + 1.39 = 2.78 > 2.08 (does not double). Gini-Simpson index: 0.75 + 0.75 = 1.5 > 0.875 (does not double).

20 Species richness: 4 + 4 = 8. Exp(entropy): 4 + 4 = 8. Inverse Simpson: 4 + 4 = 8. If a measure cannot satisfy the replication principle (RP) in this simple, completely distinct case, we would not expect it to work in more complicated cases.
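The arithmetic on the last two slides can be checked with a short sketch. It assumes two communities of four equally common species with no species in common, pooled with equal weight (so every pooled species has relative abundance 1/8). The helper names are illustrative.

```python
import math


def shannon_entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)


def gini_simpson(p):
    return 1.0 - sum(x * x for x in p)


def pool_distinct(p1, p2):
    """Pool two communities that share no species, giving each community weight 1/2."""
    return [0.5 * x for x in p1] + [0.5 * x for x in p2]


single = [0.25] * 4
pooled = pool_distinct(single, single)

for name, p in (("single", single), ("pooled", pooled)):
    h, gs = shannon_entropy(p), gini_simpson(p)
    # Raw indices (richness, entropy, Gini-Simpson), then their "species" transforms.
    print(name, len(p), round(h, 2), round(gs, 3),
          round(math.exp(h), 2), round(1.0 / (1.0 - gs), 2))
```

Entropy and the Gini-Simpson index move from 1.39 to 2.08 and from 0.75 to 0.875, so neither doubles, while exp(entropy) and 1/(1 − Gini-Simpson) both move from 4 to 8.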

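21 Hill’s (1973) family of diversity indices of order q: ${}^qD = \left(\sum_{i=1}^{S} p_i^q\right)^{1/(1-q)}$ for $q \ne 1$. q = 0: ${}^0D$ = species richness. q = 1: ${}^1D$ = exponential of entropy (the limit as q tends to 1). q = 2: ${}^2D$ = inverse of the Simpson index.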
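A minimal sketch of a Hill number function, assuming the standard form (the sum of $p_i^q$ raised to the power $1/(1-q)$, with the q = 1 case taken as the exponential of Shannon entropy). The function name and the example abundances are hypothetical.

```python
import math


def hill_number(p, q):
    """Hill number of order q for relative abundances p (assumed to sum to 1)."""
    p = [x for x in p if x > 0]
    if abs(q - 1.0) < 1e-12:
        # Limit as q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))


# Hypothetical uneven community with four species.
p = [0.5, 0.3, 0.15, 0.05]
for q in (0, 1, 2):
    print(q, round(hill_number(p, q), 3))
```

Order 0 returns the species count regardless of abundances, and the value shrinks as q grows because common species are weighted more heavily.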

22 “Order” q (Tsallis 2001; Keylock 2005) The order q determines the measure’s sensitivity to species frequencies q > 1, sensitive to common species q < 1, sensitive to rare species q = 1, weighs species by their frequencies, without favoring either common or rare species

23 Hill numbers: transform to units of “species” Entropy = 1.39 is equivalent to exp(1.39) = 4 “species” Gini-Simpson index = 0.75 is equivalent to 1/(1-0.75) = 4 “species”

24 Hill Numbers: “Species Equivalent”, “Effective number of species”. The number of equally common species that would be needed to give the same diversity as the community under study. For an equally common community, Hill numbers are equal to species richness for all orders q.

25 “Effective number of species”. Community: S species with relative abundances $\{p_1, p_2, \dots, p_S\}$ and Hill number D for an order q. Simple community: D species with equal relative abundances $\{1/D, \dots, 1/D\}$.

26 Hill numbers: an intuitive equivalence. [Figure: a complex community with relative abundances $p_1, p_2, \dots, p_S$ is equated with a simple community of D equally common species.]

27 Examples: four hypothetical communities There are 100 species, 500 individuals A: equally-common B: slightly uneven C: moderately uneven D: highly uneven

28 Quantifying species diversity by a profile of Hill numbers. [Figure: Hill-number profiles for the equally common, slightly uneven, moderately uneven and highly uneven communities.]

29 Diversity partitioning via Hill numbers. Partitioning gamma (regional) diversity into alpha (within-community) diversity and beta (between-community) diversity. There have been intense debates on whether the partition should be additive or multiplicative. Chao et al. (2012) proposed a resolution in which both approaches converge to the same classes of similarity measures: Jaccard and Sørensen (q = 0), Horn (q = 1) and Morisita-Horn (q = 2).
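A diversity profile can be sketched by evaluating Hill numbers over a range of orders. The slides do not give the abundances of communities A to D, so the two count vectors below are hypothetical stand-ins with 100 species and 500 individuals each: an equally common one and a highly uneven one.

```python
import math


def hill_number(p, q):
    p = [x for x in p if x > 0]
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum(x * math.log(x) for x in p))
    return sum(x ** q for x in p) ** (1.0 / (1.0 - q))


def profile(counts, orders):
    """Hill numbers of the given orders for a vector of species counts."""
    n = sum(counts)
    p = [c / n for c in counts]
    return [hill_number(p, q) for q in orders]


orders = [0, 0.5, 1, 2, 3]
even_counts = [5] * 100              # hypothetical: every species has 5 individuals
uneven_counts = [401] + [1] * 99     # hypothetical: one dominant species, 99 singletons

even_profile = profile(even_counts, orders)
uneven_profile = profile(uneven_counts, orders)
for q, a, d in zip(orders, even_profile, uneven_profile):
    print(q, round(a, 2), round(d, 2))
```

The even community stays at 100 for every order, while the uneven profile starts at 100 for q = 0 and drops steeply as q increases.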

30 Phylogenetic Diversity : Community 1 Community 2 All else being equal, which community is more diverse?

31 The species in community 2 are more phylogenetically diverse than those in community 1. Pielou (1975, p. 17) was the first to notice that the concept of diversity could be broadened to consider taxonomic differences between species. [Figure: Community 1 and Community 2.]

32 “I think”: Tree of Life. The first-known sketch by Charles Darwin of an evolutionary tree describing the relationships among groups of organisms. http://www.amnh.org/exhibitions/darwin/idea/treelg.php

33 Phylogenetic Diversity Measures: we consider not only the relative abundances of species but also the phylogenetic relationships among species, and we require the measures to satisfy the essential “replication principle”. [Figure: two phylogenetic trees, each with tip abundances $p_1, p_2, p_3$.]

34 Doubling Property for Phylogenetic Diversity Two completely phylogenetically distinct assemblages (no shared lineages), with the same phylogenetic diversity =X. Assemblages are pooled in equal proportions, then the pooled assemblage has phylogenetic diversity 2X. Similar extension to N assemblages

35 Doubling Property, phylogenetic version. Take two completely phylogenetically distinct assemblages (no overlapping tree branches), each with diversity X. When the two are combined, the diversity becomes 2X.

36 Pioneering work in phylogenetic diversity (1). Branch-length-based measure: Phylogenetic Diversity, PD (Faith 1992), the sum of the branch lengths of the phylogeny connecting all species from tips to root. Satisfies the “replication principle”.

37 Faith (2002) PD: total branch length. [Figure: trees labelled 12, 10, 9, 8; lineages completely distinct.]

38 Pioneering work (2) Weitzman (1992, 1993, 1998) from a perspective of economic theory of biodiversity preservation “Unfortunately, Noah’s Ark has a limited capacity …. and a (limited) budget available for biodiversity preservation…” What to preserve?

39 Noah’s Ark: the agony of choice. The woodpecker might have to go! Courtesy of Ramon Teja, http://www.livepencil.com/

40 Traditional measure and its phylogenetic counterpart:
Species richness: Faith PD (Faith 1992)
Entropy: Phylogenetic entropy (Allen et al. 2009)
Gini-Simpson: Quadratic entropy (Rao 1982)
Hill numbers: Chao, Chiu and Jost (2010)

41 Pioneering work (3). Quadratic entropy (Rao 1982): $Q = \sum_{i,j} d_{ij} p_i p_j$, where $d_{ij}$ is the phylogenetic distance between species i and j, and $p_i$ and $p_j$ are the relative abundances of species i and j. Q is the mean phylogenetic distance between any two randomly chosen individuals in a community. Phylogenetic entropy (Allen et al. 2009): $H_P = -\sum_i L_i a_i \log a_i$, where $L_i$ is the length of branch i and $a_i$ is the abundance descending from branch i. A parametric class based on Tsallis entropy (Pavoine et al. 2009): $I_0$ = Faith’s PD minus the tree height; $I_1$ = phylogenetic entropy $H_P$; $I_2$ = Rao’s Q.

42 Phylogenetic diversity measures. Except for Faith’s PD, none of the indices mentioned above satisfies the “replication principle” (transformations are needed). Chao et al. (2010) were therefore motivated to develop a unified class of phylogenetic diversity measures, based on Hill numbers, that satisfies the “replication principle”.

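43 [Figure: two completely distinct four-species trees, every branch of length 3, pooled into one eight-species tree.] Faith’s PD: 12 + 12 = 24. Phylogenetic entropy $H_P$: 4.16 + 4.16 > 6.24. Rao’s Q: 2.25 + 2.25 > 2.625.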
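The numbers on this slide can be reproduced with a small sketch under stated assumptions: each assemblage is a star tree of equally abundant species, every branch has length T = 3, phylogenetic entropy is $-\sum_i L_i a_i \log a_i$ with natural logarithms, and Rao's Q uses a distance of T between every pair of distinct species (including pairs from the two pooled assemblages). These assumptions are inferred from the slide's values, and the function name is illustrative.

```python
import math


def star_tree_measures(n_species, T):
    """Faith's PD, phylogenetic entropy H_P and Rao's Q for a star tree.

    Assumes n_species equally abundant species, each on its own branch of length T.
    """
    p = [1.0 / n_species] * n_species
    pd = n_species * T                                   # total branch length
    h_p = -sum(T * a * math.log(a) for a in p)           # branch-weighted entropy
    # Assumption: d_ij = T for distinct species and d_ii = 0.
    q = sum(T * p[i] * p[j]
            for i in range(n_species) for j in range(n_species) if i != j)
    return pd, h_p, q


single = star_tree_measures(4, 3.0)   # one assemblage
pooled = star_tree_measures(8, 3.0)   # two distinct assemblages pooled equally
print([round(v, 3) for v in single])
print([round(v, 3) for v in pooled])
```

Only PD doubles (12 to 24). Twice the single-assemblage $H_P$ or Q overshoots the pooled value, matching the inequalities on the slide.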

44 Phylogenetic Diversity Measures: two parameters. Order q, as in Hill numbers. Time parameter T: consider the phylogenetic diversity from the present time (t = 0) back to T years ago. [Figure: a four-species tree with tip abundances $p_1, \dots, p_4$, internal branches carrying the summed abundances of their descendants, three time slices (slice 1, slice 2, slice 3), and branch lengths $L_1, \dots, L_7$.]

45 Basic approach based on Hill numbers for shared lineages. At any given moment t, slice the tree to find the lineages (the branches cut, treated as “species”) and their relative abundances (a measure of their importance in the present-day community). Obtain the Hill number ${}^qD(t)$ at moment t. Average over the interval from the present time to T years ago. Call this average the “Mean Diversity of order q over T years”; it is in units of “lineages” (or “species”).

46 Conceptual framework for q = 0: connecting Faith’s PD to mean species richness. For a fixed T, the nodes divide the phylogenetic tree into Segments 1, 2 and 3, with durations (lengths) $T_1$, $T_2$ and $T_3$. At any moment in Segment 1 there are 4 lineages (i.e., 4 branches cut); in Segment 2 there are 3 lineages; in Segment 3 there are 2 lineages. The mean lineage (species) richness over the time interval [−T, 0] is $(T_1/T) \times 4 + (T_2/T) \times 3 + (T_3/T) \times 2$ = (total branch length in [−T, 0]) / T, the Mean Phylogenetic Diversity of order 0 over T years. If T is the height of the tree, the total branch length is Faith’s PD, so this mean equals PD / T.

47 Conceptual framework for q > 0. To incorporate abundance, use lineage abundance: the sum of the relative abundances descended from the branch. There are $T_1$ assemblages with abundance vector $\{p_1, p_2, p_3, p_4\}$, $T_2$ assemblages with abundance vector $\{p_1, p_2+p_3, p_4\}$ and $T_3$ assemblages with abundance vector $\{p_1+p_2+p_3, p_4\}$. There are a total of $T_1 + T_2 + T_3 = T$ assemblages, and each is given the same weight 1/T. The “Mean diversity of order q over T years” is the following average (for $q \ne 1$): $\left\{ \frac{T_1}{T}\left(p_1^q + p_2^q + p_3^q + p_4^q\right) + \frac{T_2}{T}\left(p_1^q + (p_2+p_3)^q + p_4^q\right) + \frac{T_3}{T}\left((p_1+p_2+p_3)^q + p_4^q\right) \right\}^{1/(1-q)}$.

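48 Mean Phylogenetic Diversity of order q over T years. General formula: ${}^q\bar{D}(T) = \left\{ \sum_{i \in B_T} \frac{L_i}{T} a_i^q \right\}^{1/(1-q)}$ for $q \ne 1$, and ${}^1\bar{D}(T) = \exp\left( -\sum_{i \in B_T} \frac{L_i}{T} a_i \log a_i \right)$. $B_T$: all branches in the time interval [−T, 0]. $L_i$: the length (duration) of branch i in the set $B_T$. $a_i$: the total relative abundance descended from branch i.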
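A sketch of the general formula, assuming the form from Chao, Chiu and Jost (2010): the branch-length-weighted sum of $a_i^q$ over the branches in $B_T$, raised to the power $1/(1-q)$, with an exponential-entropy form at q = 1. The tree below is hypothetical: four species whose lineages merge as in the three-segment framework described above, with made-up abundances and segment durations.

```python
import math


def mean_phylo_diversity(branches, T, q):
    """Mean phylogenetic diversity of order q over T years.

    branches: list of (L_i, a_i) pairs for the branches in [-T, 0], where
    L_i is the branch length and a_i the relative abundance descended from it.
    """
    if abs(q - 1.0) < 1e-12:
        return math.exp(-sum((L / T) * a * math.log(a) for L, a in branches))
    return sum((L / T) * a ** q for L, a in branches) ** (1.0 / (1.0 - q))


# Hypothetical abundances and segment durations (not taken from the slides).
p1, p2, p3, p4 = 0.1, 0.2, 0.3, 0.4
T1, T2, T3 = 1.0, 2.0, 3.0
T = T1 + T2 + T3
branches = [
    (T1 + T2, p1),           # species 1 until it joins the (2, 3) lineage
    (T1, p2),                # species 2
    (T1, p3),                # species 3
    (T, p4),                 # species 4 stays separate over the whole interval
    (T2, p2 + p3),           # ancestor of species 2 and 3
    (T3, p1 + p2 + p3),      # ancestor of species 1, 2 and 3
]


def segment_form(q):
    """Same quantity written as a weighted average over the three segments (q != 1)."""
    s1 = p1 ** q + p2 ** q + p3 ** q + p4 ** q
    s2 = p1 ** q + (p2 + p3) ** q + p4 ** q
    s3 = (p1 + p2 + p3) ** q + p4 ** q
    return ((T1 * s1 + T2 * s2 + T3 * s3) / T) ** (1.0 / (1.0 - q))


for q in (0, 1, 2):
    print(q, round(mean_phylo_diversity(branches, T, q), 4))
```

For q = 0 the result is the total branch length divided by T, here (4·1 + 3·2 + 2·3)/6. Multiplying any of these values by T would give a diversity in units of branch length rather than lineages.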

49 Interpretation of mean diversity: the mean effective number of completely distinct lineages (species) over T years. Link to traditional diversity: when all species are completely and equally distinct, with branch lengths T (including T = 0, i.e., ignoring phylogeny), the mean diversity reduces to the Hill number of order q.

50 “Effective number of lineages (species)”. Assemblage: S species with relative abundances $\{p_1, p_2, \dots, p_S\}$ and mean diversity ${}^q\bar{D}(T)$ for an order q and time T. Equivalent simple assemblage: ${}^q\bar{D}(T)$ lineages with equal relative abundances, completely distinct, all with branch length T.

51 Related measure: branch diversity, ${}^qPD(T) = T \times {}^q\bar{D}(T)$. For q = 0, branch diversity reduces to Faith’s PD. Branch diversity is the amount of evolutionary “work” done on the assemblage, or the effective lineage-years or lineage-length (or other units) contained in the tree in the time period [−T, 0].

52 Generalizes and unifies existing measures. Order q = 0: ${}^0\bar{D}(T)$ = (total branch length in [−T, 0]) / T = PD / T. Order q = 1: ${}^1\bar{D}(T) = \exp(H_P / T)$. Order q = 2: ${}^2\bar{D}(T) = 1 / (1 - Q/T)$.

53 [Figure: two completely distinct four-species trees, every branch of length 3, pooled into one eight-species tree.] PD/T: 4 + 4 = 8. Exp($H_P$/T): 4 + 4 = 8. 1/(1 − Q/T): 4 + 4 = 8.
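A quick check of the three transformations on this slide, assuming T = 3 and the raw values for the star-tree example (PD = 12 and 24, $H_P$ = 3 ln 4 ≈ 4.16 and 3 ln 8 ≈ 6.24, Q = 2.25 and 2.625). The dictionary layout is illustrative.

```python
import math

T = 3.0
# (Faith's PD, phylogenetic entropy H_P, Rao's Q) for one assemblage and for the pooled one.
cases = {
    "single": (12.0, 3.0 * math.log(4), 2.25),
    "pooled": (24.0, 3.0 * math.log(8), 2.625),
}

transformed = {}
for name, (pd, h_p, q) in cases.items():
    transformed[name] = (pd / T, math.exp(h_p / T), 1.0 / (1.0 - q / T))
    print(name, [round(v, 3) for v in transformed[name]])
```

All three transformed measures equal 4 for a single assemblage and 8 for the pooled one, so each satisfies the doubling property after transformation.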

54 Taxonomic Diversity of Level = 3 Phylogenetic tree based on the classical Linnaean taxonomic categories

55 CT: thinned site (gray/blue). CU: un-thinned site (black/red). Shimatani (2001). [Figure: four-level taxonomic tree, and phylogenetic tree by PHYLOMATIC (Webb & Donoghue 2004).]

56 Traditional Species diversity: Hill numbers for two sites Thinned Site Un-thinned Site

57 Order q | Site CT (thinned site) | Site CU (un-thinned site)
q = 0 | 5.402, 7.25, 10 | 5.338, 6.750, 9
q = 1 | 2.660, 3.951, 4.967 | 2.797, 3.904, 5.664
q = 2 | 1.940, 3.187, 3.809 | 2.054, 3.012, 4.548
(Three values per site; the sub-column labels are not recoverable from the transcript.)
Shimatani (2001) concluded that the traditional diversity indices and the taxonomic diversity give different conclusions about the effect of thinning. Our results based on “Mean Phylogenetic Diversity” are consistent with those based on the traditional species diversity for q = 0, 1 and 2.

58 Diversity profile Non-phylogenetic: Use a profile of Hill numbers (as a function of order q) to quantify diversity of a community Phylogenetic: Use three profiles (q = 0, 1, 2); each is a function of time T to quantify phylogenetic diversity All these measures satisfy “doubling property”

59 Based on species richness (q = 0), the diversity of the thinned site dominates that of the un-thinned site for all values of T. But for the common species (q = 1) and the very abundant species (q = 2), we reach the reverse conclusion. [Figure: three panels of mean phylogenetic diversity against T, each comparing the un-thinned and thinned sites.]

60 Extensions. The general case of non-ultrametric trees. Partitioning phylogenetic Hill numbers: phylogenetic alpha, beta and gamma diversity measures and related similarity measures (Chiu, Jost & Chao 2013). Extension to dendrogram-based functional diversity (Petchey and Gaston 2002). Extension to distance-based functional diversity.

61 Statistical estimation for traditional diversity measures depends on the order q. q = 0, “species richness estimation”: non-surprisingly non-trivial. q = 1, “Shannon entropy estimation” and its exponential: surprisingly non-trivial. q = 2, widely used in genetics (gene identity, or heterozygosity): non-surprisingly trivial; a nearly unbiased estimator exists.

62 q = 0, “species richness estimation”, since Fisher, Corbet and Williams (1943). Curve fitting (fitting a parametric curve to the species accumulation curve, SAC). Parametric models for species abundances. Non-parametric approach. Rarefaction/extrapolation of the species accumulation curve (by estimating the expected species richness for a finite sample size or sample completeness).

63 q = 1 “Entropy estimation” Since Shannon (1948) Traditional bias-reduction Jackknife for bias-reduction Bayesian approaches Coverage-adjusted estimator Estimation via Renyi’s entropies Polynomial representation

64 Other Related Estimation Issues Hill numbers: Estimation of gamma, alpha and beta diversity and related similarity/differentiation measures Their phylogenetic generalization

65 Main References:
Chao, A., Chiu, C.-H. and Jost, L. (2010). Phylogenetic diversity measures based on Hill numbers. Philosophical Transactions of the Royal Society B, 365, 3599-3609.
Chiu, C.-H., Jost, L. and Chao, A. (2013). Phylogenetic beta diversity, similarity, and differentiation measures based on Hill numbers. To appear in Ecological Monographs.
Chao, A., Gotelli, N. J., Hsieh, T. C., Sander, E. L., Ma, K. H., Colwell, R. K. and Ellison, A. M. (2013). Rarefaction and extrapolation with Hill numbers: a framework for sampling and estimation in species biodiversity studies. To appear in Ecological Monographs.

66 Nanney (2004) “We are all blind men (and women) trying to describe a monstrous elephant of ecological and evolutionary diversity...”

67 Heaven is under our feet as well as over our heads. Henry David Thoreau, Writer and Naturalist (1817-1862). THANK YOU VERY MUCH!!

