The Tree of Life (TOL) in the age of Genomics or a journey through the Phylogenetic Forest Eugene Koonin, NCBI / NLM / NIH RECOM BE, San Diego, May 23,


1 The Tree of Life (TOL) in the age of Genomics or a journey through the Phylogenetic Forest Eugene Koonin, NCBI / NLM / NIH RECOM BE, San Diego, May 23, 2010

2 A brief history of TOL. Thinking of the history of life in terms of phylogenetic trees is as old as scientific biology (if not older). Charles Darwin (1859), Origin of Species [one and only illustration]: "descent with modification". Ernst Haeckel (1879), The Evolution of Man.

3 A brief history of TOL: ribosomal RNA phylogeny. Advent of molecular phylogenetics: trees can be made from alignments of homologous genes, with expectations of an objectively reconstructed, complete Tree of Life. Woese et al. (1990) Towards a natural system of organisms: proposal for the domains Archaea, Bacteria, and Eucarya. PNAS 87, 4576-4579 [Figure 1, modified].

4 Enter genomics…and things get complicated Fleischmann et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science. 1995 Jul 28;269(5223):496-512.

5 Massive horizontal gene transfer (HGT) from archaea to a hyperthermophilic bacterium

6 Significant HGT from eukaryotes to a (facultative) symbiotic bacterium

7 Phylogenetic trees for aaRSs (aminoacyl-tRNA synthetases), 20 key enzymes of protein biosynthesis: diversity of HGT signals. Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press

8 Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press Phylogenetic trees for aaRS: diversity of HGT signals

9 Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press

10 Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press Phylogenetic trees for aaRS: diversity of HGT signals

11 Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press Phylogenetic trees for aaRS: diversity of HGT signals

12 Wolf Y I et al. Genome Res. 1999;9:689-710 ©1999 by Cold Spring Harbor Laboratory Press Phylogenetic trees for aaRS: diversity of HGT signals

13 History of TOL. Troubled times: "uprooting" of the TOL for prokaryotes. Doolittle WF. (2000) Uprooting the tree of life. Sci. Am. 282, 90-95 [modified]. [Figure: two panels, "Standard Model" and "Net of Life", each with Bacteria, Archaea and Eukaryotes.] Horizontal gene transfer is rampant; no gene is exempt. Histories of individual genes are non-coherent with each other. The tree-like signal is completely lost (or never existed at all). There are no species (or other taxa) in prokaryotes. The consistent tree-like signal we observe is created by biases in HGT.

14 So – in the light of HGT - is there a Tree of Life? Or was the history of life more like a net? To address this question, explore the Forest of Life – trees for ALL conserved genes

15 Exploration of the Forest of Life (FOL): bioinformatic pipeline. RECONSTRUCTION OF THE FOL: 1. SELECTION OF ORTHOLOGOUS GENES; 2. MULTIPLE ALIGNMENT; 3. CONSTRUCTION OF ML PHYLOGENETIC TREES. ANALYSIS OF THE FOL: 4. TREE COMPARISON METHODS, giving a matrix of distances between trees, then tree comparison networks and cluster analysis. The pipeline is applied to the FOL and to the NUTs (> 90% species).
Matrix of distances between trees (values as shown on the slide):
          Tree 1   Tree 2   Tree 3   ...   Tree N
Tree 1    1
Tree 2    0.492    1
Tree 3    0.591    0.487    1
...
Tree N    0.325    0.485    0.112    ...   1

16 FOL and NUTs. Forest of Life (FOL) and Nearly Universal Trees (NUTs): 6901 trees in the FOL; most of the trees contain relatively few species; 102 "nearly universal" trees (90+ species) are the NUTs.

17 Archaea and Bacteria in NUTs. How well are archaea and bacteria separated? 56% of NUTs show perfect separation between archaea and bacteria; 92% of NUTs show partial, non-random separation. [Figure categories: archaea and bacteria separated; archaea and bacteria mixed; archaea invade bacterial subtree; bacteria invade archaeal subtree.]

18 NUTs vs Random Trees. The Inconsistency Score (IS) compares a tree to a set of trees. The score is based on the frequency of bipartitions derived from the given tree among the whole set of trees. Range ~[0..1]. NUTs are much more consistent with one another than random trees are. [Figure labels: random trees; NUTs.]
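The idea of scoring a tree by how often its bipartitions occur in a set of trees can be sketched as follows. This is a minimal illustration, not the IS formula used in the talk: the slide does not give the normalisation, so the score here is simply one minus the mean frequency of the tree's bipartitions. The nested-tuple tree encoding and the function names `leaves`, `bipartitions` and `inconsistency` are assumptions made for the example, and all trees are assumed to share one species set.

```python
from itertools import chain


def leaves(tree):
    """Return the set of leaf names below a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))


def bipartitions(tree):
    """Return the non-trivial bipartitions (splits) of a tree.

    Each split is stored as the side that does not contain the
    alphabetically smallest leaf, so identical splits compare equal.
    """
    all_leaves = leaves(tree)
    anchor = min(all_leaves)
    splits = set()

    def walk(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        # Skip trivial splits: a single leaf vs. the rest, or the whole tree.
        if 1 < len(side) < len(all_leaves) - 1:
            splits.add(all_leaves - side if anchor in side else side)
        for child in node:
            walk(child)

    walk(tree)
    return splits


def inconsistency(tree, forest):
    """Simplified score (assumption): 1 - mean frequency of the tree's
    bipartitions among the trees of `forest`. 0 means every split is
    shared by every tree; 1 means none is found anywhere."""
    splits = bipartitions(tree)
    if not splits:
        return 0.0
    forest_splits = [bipartitions(t) for t in forest]
    freqs = [sum(s in fs for fs in forest_splits) / len(forest_splits)
             for s in splits]
    return 1.0 - sum(freqs) / len(freqs)


# Toy example with five hypothetical species A..E.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(inconsistency(t1, [t1, t1, t2]))
print(inconsistency(t2, [t1, t1, t2]))
```

In this toy forest the tree that agrees with the majority gets the lower score, which is the behaviour the slide describes for NUTs relative to random trees.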

19 NUTs Pattern of Similarity. An edge connects two vertices (trees) if the distance between them is below the threshold (i.e. their similarity is above the cut-off). A single connected cluster appears and gradually grows to encompass all NUTs as the similarity cut-off is lowered.
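A threshold network of this kind can be sketched with a small union-find routine: trees are vertices, an edge is drawn when the distance is below the threshold, and the number of connected components is counted as the threshold is relaxed (larger distances allowed, i.e. lower similarity required). The 4x4 distance matrix and the name `count_components` are hypothetical and only serve the illustration.

```python
def count_components(dist, threshold):
    """Count connected components of the graph whose edges join
    pairs (i, j) with dist[i][j] < threshold."""
    n = len(dist)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if dist[i][j] < threshold:
                parent[find(i)] = find(j)

    return len({find(i) for i in range(n)})


# Hypothetical distances between four trees (symmetric, zero diagonal).
toy = [
    [0.0, 0.1, 0.5, 0.9],
    [0.1, 0.0, 0.4, 0.8],
    [0.5, 0.4, 0.0, 0.3],
    [0.9, 0.8, 0.3, 0.0],
]

for threshold in (0.2, 0.35, 0.45):
    print(threshold, count_components(toy, threshold))
```

As the distance threshold is relaxed, the components merge until a single cluster contains every tree.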

20 Are NUTs Clustered? The 102x102 matrix of distances between NUTs is projected into a lower-dimensional space using classical multidimensional scaling (CMDS). Analysis using the gap function approach (Tibshirani et al. 2001) shows a lack of distinct separate clusters in the tree topology space.
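The projection step can be illustrated with a small pure-Python sketch of classical MDS: square the distances, double-centre them, and take the leading eigenvectors scaled by the square roots of their eigenvalues. The eigenvectors are found here by power iteration with deflation to keep the example dependency-free; in practice a linear-algebra library would normally be used. The function name `classical_mds`, its parameters and the toy points are assumptions, and the sketch assumes the distances are close to Euclidean (negative eigenvalues are clipped to zero).

```python
import math
import random


def classical_mds(dist, k=2, iters=500, seed=0):
    """Embed n items in k dimensions from an n x n distance matrix."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]
    total = sum(row) / n
    b = [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
         for i in range(n)]

    rng = random.Random(seed)
    coords = [[0.0] * k for _ in range(n)]
    for axis in range(k):
        # Power iteration for the dominant eigenvector of the current matrix.
        v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient: eigenvalue estimate for the vector v.
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][axis] = v[i] * scale
        # Deflate: subtract the component that was just extracted.
        for i in range(n):
            for j in range(n):
                b[i][j] -= lam * v[i] * v[j]
    return coords


# Toy check: four corners of a 3 x 4 rectangle, given only their distances.
points = [(0, 0), (3, 0), (0, 4), (3, 4)]
toy = [[math.dist(p, q) for q in points] for p in points]
embedded = classical_mds(toy, k=2)
print(math.dist(embedded[0], embedded[3]))
```

The embedding is only defined up to rotation and reflection, so it is the pairwise distances between the embedded points, not the coordinates themselves, that should match the input.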

21 NUTs vs FOL Similarity between NUTs and the rest of the FOL. The NUTs are connected to 2505 trees (36%) from the FOL at a 0.8 similarity cut-off. The mean similarity between the NUTs and the FOL is ~0.5 (only ~0.1 for random trees).

22 NUTs vs FOL. The matrix of distances between all the trees in the FOL is projected into a lower-dimensional space using CMDS. FOL/COG trees form 7 distinct clusters in tree topology space. Clusters differ largely by phyletic patterns. NUTs form a tight group within one of the clusters and are approximately equidistant to all clusters. [Figure: NUTs vs FOL clusters, with the NUTs labeled.]

23 Interim conclusions. Analysis of the full Forest of Life in comparison to NUTs shows that: a considerable fraction of FOL trees are very similar to NUTs, and the average FOL-NUTs similarity is dramatically above the random level; unlike NUTs, topologies of the FOL trees show distinct clustering, largely determined by the phyletic patterns (i.e. the set of species present); in the tree topology space, NUTs form a comparatively tight, centrally located group; compared to NUTs, FOL trees show a much wider diversity of topologies, but the "central" tree-like signal still exists for a large part of the FOL; a "consensus" tree makes sense at least as a crude representation of the common trend in the FOL (especially so for the NUTs).

24 Distinguishing Tree and Net signals in the FOL using Quartets of species. The FOL shows a great diversity of phyletic and phylogenetic patterns. We employ quartet analysis to measure the vertical and horizontal evolutionary signal in different areas of the Forest. There are $\binom{100}{4} \approx 4\times10^{6}$ species quartets from the set of 100 species. Each quartet of species can resolve into 3 different tree topologies ($\sim 12\times10^{6}$ combinations in total). Any tree containing all 4 species from a given quartet resolves them into one of these 3 topologies. For any quartet one can compute the support for all 3 topologies within a set of trees, i.e. the relative frequencies of the topologies, with $f_1 + f_2 + f_3 = 1$; this gives $\sim 8\times10^{10}$ comparisons for the whole FOL (or any subset of trees).
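The quartet arithmetic and the notion of topology support can be checked with a short sketch. The counts use the figures from the slides (100 species, 6901 trees); multiplying by the number of trees assumes every tree contains every quartet, so it is an upper bound. The nested-tuple tree encoding and the names `leaves`, `clades`, `quartet_topology` and `topology_support` are assumptions made for this illustration.

```python
from collections import Counter
from itertools import chain
from math import comb

n_quartets = comb(100, 4)        # species quartets from 100 species
n_resolved = 3 * n_quartets      # three possible topologies per quartet
n_total = 6901 * n_resolved      # upper bound over all trees of the FOL
print(n_quartets, n_resolved, n_total)


def leaves(tree):
    """Return the set of leaf names below a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))


def clades(tree):
    """Yield the leaf set below every internal node of the tree."""
    if isinstance(tree, str):
        return
    yield leaves(tree)
    for child in tree:
        yield from clades(child)


def quartet_topology(tree, quartet):
    """Return the topology of four species as a pair of pairs,
    e.g. {{A, B}, {C, D}}, or None if the tree cannot resolve it."""
    q = frozenset(quartet)
    if len(q) != 4 or not q <= leaves(tree):
        return None  # the tree lacks at least one of the four species
    for clade in clades(tree):
        pair = clade & q
        if len(pair) == 2:
            return frozenset([pair, q - pair])
    return None  # unresolved (multifurcating) quartet


def topology_support(trees, quartet):
    """Relative frequency of each observed topology among the trees
    that resolve the quartet; the frequencies sum to 1."""
    counts = Counter()
    for tree in trees:
        topology = quartet_topology(tree, quartet)
        if topology is not None:
            counts[topology] += 1
    total = sum(counts.values())
    return {topo: c / total for topo, c in counts.items()} if total else {}


# Toy forest of three trees over hypothetical species A..E.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
for topo, f in topology_support([t1, t1, t2], "ABCD").items():
    print(sorted(sorted(side) for side in topo), f)
```

A quartet whose support is concentrated on one topology reflects a tree-like signal, while support spread over two or three topologies points to conflicting gene histories.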

25 Looking for tree-like and net-like evolution in the FOL. Species Matrix – NUTs – mostly tree-like signal. [Figure: species matrix for the NUTs; groups labeled Archaea (Crenarchaeota, Euryarchaeota) and Bacteria (Cyanobacteria, Proteobacteria); scale from "often together" to "rarely together".]

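26 Species Matrix – Rest of the FOL – much stronger net-like (horizontal) signal. [Figure: species matrix for the FOL without the NUTs; groups labeled Archaea (Crenarchaeota, Euryarchaeota) and Bacteria (Cyanobacteria, Proteobacteria); scale from "often together" to "rarely together".]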

27 Species Matrix vs Function. [Figure: panels labeled by functional category letter: J, U, K, L, F, H, I, N, O, S, D, M, E, C, G, R, Q, P, T, V.]

28 Conclusions – 3. Quartet-based analysis of species position in the Forest of Life shows that: relationships between species in NUTs roughly follow the conventional microbial taxonomy, probably as a consequence of a significant contribution of tree-like evolution; the “TOL” signal declines with a decreasing number of species in the tree; different functional classes of genes show a substantially different balance between tree-like and net-like modes of gene transfer, and possibly different preferred routes of HGT.

29 The Take-Home Message. There is no single "Tree of Life" describing the evolutionary history of all or even the majority of the prokaryote genomes. Yet, there is a central tree-like trend of evolution compatible with a common history of descent of prokaryotic groups. This trend is more evident at shallow phylogenetic depths, in more ubiquitous genes and among some functional categories of genes (e.g., translation). Observations are compatible with the ancient divergence between the bacterial and archaeal clades, followed by rapid diversification of major phyla, followed by HGT that distorted but did not destroy the tree-like signal. Altogether, HGT might dominate evolution, but the tree-like signal is stronger than the signal from any particular route of HGT.
Puigbo P, Wolf YI, Koonin EV. (2009) Search for a 'Tree of Life' in the thicket of the phylogenetic forest. J. Biol. 8, 59
Koonin EV, Wolf YI, Puigbo P. (2009) The Phylogenetic Forest and the Quest for the Elusive Tree of Life. Cold Spring Harb Symp Quant Biol. [advanced epub]
Koonin EV, Wolf YI. (2009) The fundamental units, processes and patterns of evolution, and the tree of life conundrum. Biol Direct. 4, 33
Puigbo P, Wolf YI, Koonin EV. The tree and net signals in the evolution of prokaryotes. In preparation

30 The Take-Home Message To see the forest for the trees… …but also to see trees for the forest.

31 A quick redux on bioinformatics. Bioinformatics is a toolbox for computational analysis of biological problems. The toolbox is already quite large and versatile, so in many cases, combining available tools into pipelines allows researchers to address new questions. Of course, new tools often have to be developed and added to the toolbox. Biologists must understand the capacities but also the limitations of the tools. Can bioinformatics – using the vast amounts of data produced by genomics and systems biology – help transform biology “from stamp collection to physics”?

32 Acknowledgments. Fellow explorers of the Forest of Life: Pere Puigbo, Yuri Wolf.

