Speciation history inferred from gene trees L. Lacey Knowles Department of Ecology and Evolutionary Biology University of Michigan, Ann Arbor MI

1 Speciation history inferred from gene trees L. Lacey Knowles Department of Ecology and Evolutionary Biology University of Michigan, Ann Arbor MI knowlesl@umich.edu

2 Emphasis on multilocus data in phylogenetics and phylogeography: the good, the bad, the ugly. Utility of single-locus data for inferences about speciation history?

3 Estimating population genetic parameters relevant to the process of species divergence. [Figure: divergence model with parameters θ1, θ2, θA, T and m; time axis ending at Present]

4 Parameterized model for making inferences about the divergence process. Speciation time T: Was speciation promoted by displacements into glacial refugia or by recolonization of sky islands during interglacials? Was diversification inhibited or promoted during the Pleistocene? Accurate and precise estimates of T are essential to evaluating when, and therefore in what geographic setting, species diverged.

5 Divergence of M. oregonensis and M. montanus from the Rocky Mountains (Carstens & Knowles 2007, Mol. Ecol. 16:619-27). 36 M. oregonensis, 23 M. montanus. Data: 5 anonymous nuclear loci, 1 mitochondrial locus.

6 Coalescent framework and multilocus versus single-locus data set. [Figure: divergence model with parameters θ1, θ2, θA, T and m; time axis ending at Present] 4.9 × 10^5 to 2.0 × 10^6. Estimate from average mtDNA genetic distance: (*same mutation rate used in the different approaches). Divergence of gene lineages within the ancestral species.

7 Parameterized model for making inferences about the divergence process. Assumed species tree of Poephila finches (Jennings & Edwards 2005, Evolution); analysis of 30 anonymous nuclear loci. Identified role of geographic barriers in a Pleistocene divergence of the grass finches in Australia.
Taxa: hecki and acuticauda (Long-tailed Finch), cincta (Black-throated Finch). Parameters shown on the tree: t_ahc - t_ah, t_ah, θ_ah, θ_ahc.
Bayesian Markov chain Monte Carlo (MCMC) method (Yang and Rannala):
- multiple independent loci
- estimates ancestral θ (present θ also)
- estimates population divergence times
- uses branch length information
- accounts for uncertainty in gene trees
Assumptions:
- “know” the species tree
- random mating
- no gene flow after population divergence
- free recombination among loci (not within)

8 Jennings & Edwards (2005) Evolution. [Figure: species tree of hecki, acuticauda and cincta] Prior and posterior probability distributions for t_ahc - t_ah, t_ah, θ_ah and θ_ahc (grey and black lines refer to analyses based on two different priors). Increasing variance with decreasing number of loci.

9 Estimating population genetic parameters relevant to the process of species divergence: the good, the bad, the ugly. [Figure: divergence model with parameters θ1, θ2, θA, T and m; time axis ending at Present]

10 Estimating the history (order) of divergence events (i.e., the species tree) for recently derived taxa. Effects of sampling scheme: contrast between sequencing single representatives per species versus multiple individuals per species.

11 Gene trees will not always match the species tree: deep coalescence (Maddison 1997). [Figure: gene tree embedded in species tree]

12 While there is a distribution of possible gene trees for a given species tree, the probabilities of the individual gene trees differ, from low P(G tree | S tree) to high P(G tree | S tree) (Degnan & Salter 2005, Evolution). Example: 5 taxa, 105 possible gene tree topologies. * The shape of this distribution will differ depending on the shape of the species tree.
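The figure of 105 topologies for 5 taxa matches the standard count of rooted, bifurcating, labeled trees, (2n - 3)!!. A minimal sketch of that count follows; the function name is illustrative and does not come from the slides.

```python
def rooted_topologies(n_taxa: int) -> int:
    """Number of rooted, bifurcating, labeled tree topologies: (2n - 3)!!."""
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 3.
    for k in range(3, 2 * n_taxa - 2, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (3, 4, 5, 8):
        print(n, rooted_topologies(n))
```

For the 8-species trees used later in the talk, the same formula gives 135,135 possible topologies, which is why recovering even part of the true tree is informative.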

13 Inferred history of species divergence differs among loci. Gene trees from 30 anonymous markers with a single individual sequenced per species (Jennings & Edwards 2005, Evolution).

14 Estimating the history (order) of divergence events (i.e., the species tree) for recently derived taxa. Gene tree from one locus with 9 individuals sequenced in each of 8 different species.

15 Multilocus data: concatenation gives “THE history”. Arbitrary criteria: history of divergence based on a single nucleotide difference. What is the true species tree?

16 Recently developed approaches for estimating the species tree explicitly consider the process of gene lineage coalescence in the estimation of the history of species divergence (Maddison & Knowles 2006; Edwards et al. 2007; Liu & Pearl 2007). [Figure: gene tree from one locus with multiple individuals sequenced per species, showing discord with the species tree] Extract the historical signal of species divergence, despite discord between the gene tree and species tree.

17 Goal: estimate the species tree directly (as opposed to estimating a gene tree and equating that gene tree with the history of the species). [Figure: species tree, with label "species A"]

18 Can the history of species divergence be recovered from a single gene tree? [Figure: gene tree in discord with species tree]
Two simple criteria:
(1) minimize the number of deep coalescences
(2) shallowest divergence between species
These consider the process of lineage sorting, but the actual probabilities of incomplete lineage sorting are not quantified using a stochastic model.
STEM and BEST: likelihood and Bayesian approaches that incorporate stochastic models of both nucleotide substitution and lineage sorting processes.
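To make criterion (1) concrete, here is a minimal sketch of counting deep coalescences for one gene tree fitted into one candidate species tree, in the sense of counting extra gene lineages on each species-tree branch. The nested-tuple tree encoding, the function names, and labeling gene-tree tips by species name are all assumptions made for illustration; this is not the implementation used in the cited studies.

```python
def leaf_set(tree):
    """Species found at the tips of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def clades(tree):
    """All clades of a tree (tips included) as frozensets of species."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    found = [leaf_set(tree)]
    for child in tree:
        found.extend(clades(child))
    return found


def deep_coalescences(gene_tree, species_tree):
    """Sum of extra gene lineages over all non-root species-tree branches."""
    sp_clades = clades(species_tree)
    root = leaf_set(species_tree)

    def smallest_clade(species):
        # Species-tree clade where a gene-tree node is placed (its LCA).
        return min((c for c in sp_clades if species <= c), key=len)

    edges = []  # (placement of child node, placement of parent node)

    def walk(node):
        here = smallest_clade(leaf_set(node))
        if not isinstance(node, str):
            for child in node:
                edges.append((walk(child), here))
        return here

    walk(gene_tree)

    extra = 0
    for branch in sp_clades:
        if branch == root:
            continue
        # Gene lineages still uncoalesced at the top of this branch.
        lineages = sum(1 for child, parent in edges
                       if child <= branch and branch < parent)
        extra += max(lineages - 1, 0)
    return extra


if __name__ == "__main__":
    species = (("A", "B"), "C")
    print(deep_coalescences((("A", "B"), "C"), species))  # concordant tree
    print(deep_coalescences((("A", "C"), "B"), species))  # discordant tree
```

A species-tree search under this criterion would score each candidate species tree with such a count and keep the tree with the lowest total. Tips may repeat a species name when several gene copies are sampled per species.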

19 Simulation workflow (Maddison & Knowles 2006): simulated species trees (replicates 1 to 500), then simulated gene trees, then simulated sequences, then reconstructed gene trees, then reconstructed species trees. Species trees are inferred by either minimizing the number of deep coalescences or the shallowest divergence approach.

20 Accuracy assessment: simulated species trees are compared with inferred species trees. Accuracy is the number of partitions of the species in common between the original and inferred species trees (max = 5 for the 8-species trees).
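The accuracy measure can be sketched as a count of shared non-trivial bipartitions between two trees on the same species. The nested-tuple encoding and the function names below are assumptions for illustration. A fully bifurcating tree of 8 species has 5 such bipartitions, which agrees with the stated maximum.

```python
def leaf_set(tree):
    """Species found at the tips of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Non-trivial splits of the species set implied by the tree's branches."""
    all_species = leaf_set(tree)
    splits = set()

    def walk(node):
        if isinstance(node, str):
            return
        side = leaf_set(node)
        other = all_species - side
        # Skip the root and branches that cut off a single species.
        if len(side) >= 2 and len(other) >= 2:
            splits.add(frozenset([side, other]))
        for child in node:
            walk(child)

    walk(tree)
    return splits


def shared_partitions(tree_a, tree_b):
    """Number of bipartitions present in both trees."""
    return len(bipartitions(tree_a) & bipartitions(tree_b))


if __name__ == "__main__":
    true_tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))
    inferred = ((("A", "C"), ("B", "D")), (("E", "F"), ("G", "H")))
    print(shared_partitions(true_tree, true_tree))
    print(shared_partitions(true_tree, inferred))
```

In this example, swapping two species between neighboring pairs costs two of the five partitions, which illustrates how sensitive the measure is to small changes in tree structure.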

21 Simulated species trees (Maddison & Knowles 2006): 500 replicate species trees of 8 species each. 500 species trees were simulated, rather than choosing a single species tree and assessing how well it can be reconstructed with many simulation replicates.
Goals:
- Examine a reasonable spectrum of topologies and branch lengths.
- Determine how the extent of incomplete lineage sorting affects the ability to reconstruct species histories.
Tree depths:
- t = 100,000 (i.e., 1N_e); 500 replicate species trees
- t = 1,000,000 (i.e., 10N_e); 500 replicate species trees
(*topologies of the two sets of trees are identical)
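As a rough guide to why the two depths differ so much in incomplete lineage sorting, the sketch below computes the probability that two gene lineages fail to coalesce within t generations. It assumes a standard diploid Wright-Fisher coalescent, where that probability is exp(-t / (2 N_e)); the slides do not state ploidy, so the `ploidy` parameter and the function name are assumptions. It also treats the whole tree depth as one branch, whereas real internal branches are shorter.

```python
import math


def prob_no_coalescence(t_generations, n_e, ploidy=2):
    """P(two lineages have not coalesced after t generations).

    Assumes a constant-size Wright-Fisher population in which a pair of
    lineages coalesces at rate 1 / (ploidy * n_e) per generation.
    """
    return math.exp(-t_generations / (ploidy * n_e))


if __name__ == "__main__":
    n_e = 100_000
    for t in (100_000, 1_000_000):  # depths of 1 N_e and 10 N_e
        print(t, round(prob_no_coalescence(t, n_e), 4))
```

Under these assumptions, a pair of lineages remains uncoalesced over a span of 1N_e with probability of about 0.61, against about 0.007 over 10N_e, so the shallower trees should show far more discord.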

22 Simulated gene trees within the simulated species trees (Maddison & Knowles 2006): neutral coalescence (N_e = 100,000), with 1, 3, 9 or 27 gene trees representing unlinked loci simulated independently, and either 1, 3, 9 or 27 gene sequences simulated for each locus per species.
Accuracy affected by:
- Increasing total sampling effort per species (either 1, 3, 9 or 27 sequences per species)
- Increasing the number of individuals per locus versus the number of loci per species for a given sampling effort

23 Number of deep coalescences (Maddison & Knowles 2006)

gene copies per locus     1N_e    10N_e
 1                         7.6      1.8
 3                        28.7      6.9
 9                        63.2     14.7
27                       114.4     25.7

Lots of discord (i.e., our simulated data should reflect well the challenges of reconstructing evolutionary relationships near the species/population level).

24 Average proportion of correct partitions (those in the inferred tree matching the true tree) for 1 locus with 1, 3, 9 or 27 gene copies, comparing the Deep Coalescents and Shallowest Divergence methods (Maddison & Knowles 2006). [Figure: panel a, total tree depth of 1N_e; panel b, total tree depth of 10N_e. Bar values in extraction order: 0.26, 0.27, 0.47, 0.53, 0.59, 0.60, 0.64, 0.56, 0.76, 0.73, 0.79, 0.78, 0.80, 0.79, 0.82, 0.84]
Gene trees retain some signal of phylogenetic history despite significant discord with the species tree.
* Average accuracy greater, as expected.
0.60 is reasonably successful, given that the shared partition measure is sensitive to minor changes in tree structure (approximately equivalent to a single terminal taxon being out of place).

25 Estimating the history (order) of divergence events (i.e., the species tree) for recently derived taxa: the good, the bad, the ugly. Gene tree from one locus with multiple individuals sequenced per species, and a very simple approach. [Figure: gene tree and species tree] What would happen if more loci were considered?

26 [Figure, two panels: frequency distribution of species tree accuracy with increasing number of individuals (curves: random, 1, 3, 9, 27 individuals) and with increasing number of loci (curves: random, 1, 3, 9, 27 loci). x-axis: tree accuracy (number of shared partitions with ‘true’ tree, 0 to 5); y-axis: proportion of trees]
Similar accuracy for a given sampling effort whether multiple individuals or multiple loci are sampled, for recent divergence (t = 1N_e).
The curve marked “random” shows the expected distribution of the accuracy measure in comparing two randomly simulated trees.

27 Acknowledgements: Wayne Maddison; Bryan Carstens (former postdoc). Support: NSF (DEB 04-47224) & the University of Michigan. knowlesl@umich.edu

