In brief Vertical vs. Horizontal Homologous vs. Unequal

Presentation transcript:

In brief
Vertical vs. Horizontal
Homologous vs. Unequal
Prokaryotes vs. Eukaryotes
Mechanisms and Vectors
Impact on Tree of Life
Implications for prokaryotic species

Possible mechanisms for HT in Drosophila From Heredity (2008) 100, 545–554

EVOLUTION: Genome Data Shake Tree of Life. E Pennisi - Science, 1998
The ring of life provides evidence for a genome fusion origin of eukaryotes. MC Rivera, JA Lake - Nature, 2004
The net of life: reconstructing the microbial phylogenetic network. V Kunin, L Goldovsky, N Darzentas, CA … - Genome Research, 2005
The tree of one percent. T Dagan, W Martin - Genome Biology, 2006
Uprooting the tree of life. WF Doolittle - Evolution: a Scientific American reader, 2006

Clusters of Orthologous Groups (COGs)

Puigbò et al.: 6,901 ML trees, 100 taxa total. Objective: compare topological distance between trees. New metric called IS (inconsistency score) = fraction of the time the splits in a tree are found in all trees.
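
The idea of comparing trees by their shared splits can be sketched in a few lines of Python. This is a simplified illustration, not the exact IS normalization used by Puigbò et al.: it assumes every tree covers the same taxon set, and the names `canonical_split` and `shared_split_fraction` and the five-taxon toy forest are hypothetical.

```python
from typing import FrozenSet, Iterable, List, Set

Split = FrozenSet[str]


def canonical_split(side: Iterable[str], all_taxa: Set[str]) -> Split:
    """Represent a bipartition by the side that excludes a fixed reference taxon."""
    side = frozenset(side)
    reference = min(all_taxa)  # arbitrary but consistent choice
    return frozenset(all_taxa) - side if reference in side else side


def shared_split_fraction(tree: Set[Split], forest: List[Set[Split]]) -> float:
    """Average fraction of forest trees that contain each split of `tree`."""
    if not tree or not forest:
        return 0.0
    hits = sum(1 for split in tree for other in forest if split in other)
    return hits / (len(tree) * len(forest))


# Toy forest on five taxa (assumed data, for illustration only).
taxa = {"A", "B", "C", "D", "E"}
tree1 = {canonical_split("AB", taxa), canonical_split("DE", taxa)}
tree2 = {canonical_split("AB", taxa), canonical_split("CD", taxa)}
tree3 = {canonical_split("AC", taxa), canonical_split("DE", taxa)}
forest = [tree1, tree2, tree3]

score = shared_split_fraction(tree1, forest)
```

A tree whose splits recur throughout the forest scores close to 1 here; a tree with mostly unique splits scores close to `1 / len(forest)`, since it only matches itself.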

Many genes are not found in all taxa. Define 102 NUTs, or “nearly universal trees”, that include 90% of the prokaryotes under comparison. Mostly translation and core transcription related.

J Biol. 2009;8(6):59.

The big divide? Look for evidence of HGT between bacteria and archaea. 56% of NUTs separated the groups perfectly; 44% show at least one HGT: 13% from archaea to bacteria, 23% from bacteria to archaea, 8% in both directions.

The network of similarities among the nearly universal trees (NUTs). (a) Each node (green dot) denotes a NUT, and nodes are connected by edges if the similarity between the respective trees exceeds the indicated threshold. (b) The connectivity of 102 NUTs and the 14 1:1 NUTs depending on the topological similarity threshold.
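
The network construction described in the caption, where two trees are linked when their similarity exceeds a threshold, can be sketched as follows. The function name `threshold_edges`, the tree labels, and the similarity values are assumptions for illustration; the caption does not say how the similarity itself was computed.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def threshold_edges(
    names: Sequence[str],
    similarity: Sequence[Sequence[float]],
    threshold: float,
) -> List[Tuple[str, str]]:
    """Return one edge per pair of trees whose similarity exceeds `threshold`."""
    return [
        (names[i], names[j])
        for i, j in combinations(range(len(names)), 2)
        if similarity[i][j] > threshold
    ]


# Hypothetical symmetric similarity matrix for three trees.
names = ["nut1", "nut2", "nut3"]
similarity = [
    [1.0, 0.8, 0.4],
    [0.8, 1.0, 0.6],
    [0.4, 0.6, 1.0],
]

loose = threshold_edges(names, similarity, 0.5)
strict = threshold_edges(names, similarity, 0.7)
```

Raising the threshold removes edges, which is the effect panel (b) tracks as connectivity versus threshold.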

The supernetwork of the NUTs. For species abbreviations see Additional File 1. Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159

Network representation of the 6,901 trees of the forest of life. The 102 NUTs are shown as red circles in the middle. The NUTs are connected to trees with similar topologies: trees with at least 50% of similarity with at least one NUT (P-value < 0.05) are shown as purple circles and connected to the NUTs. The rest of the trees are shown as green circles. Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159

Similarity of the trees in the forest of life to the NUTs. (a) For each of the 102 NUTs, the breakdown of the rest of the trees in the forest by percent similarity is shown. (b) The same breakdown for 102 random trees generated from the NUTs. Puigbò et al. Journal of Biology 2009 8:59 doi:10.1186/jbiol159

Proc Natl Acad Sci U S A. 2005 Oct 4;102(40):14332-7.

Highways of obligate gene transfer within and among phyla and divisions of prokaryotes, based on analysis of the 22,348 protein trees for which a minimal edit path could be resolved. Each oval represents a prokaryotic group, with the name and number of taxa in that group indicated on the first line. Numbers below taxon names report inferred transfers within that taxon, whereas numbers on the linking edges report inferred transfers between taxa. Ovals representing groups with one or more thermophilic organisms are drawn with dashed lines. The type of line shown between each pair of taxa indicates the number of obligate edits in this analysis: >100 with thick solid lines, 10-99 with thin solid lines, and 5-9 with dashed lines. Relationships between taxonomic groups with fewer than five obligate edits are not shown. Note that transfers cannot be identified within phyla with one (e.g., Nanoarchaeota) or two (e.g., Bacteroidetes) genomes in our data set. Beiko R G et al. PNAS 2005;102:14332-14337 ©2005 by National Academy of Sciences

Ratio of observed to expected discordant bipartitions among proteins in major TIGR role category groupings. The expected number of discordant bipartitions in each category is equal to the total number of strongly supported bipartitions (concordant and discordant) within that category, multiplied by 0.134, the proportion of bipartitions that are discordant across all categories at PP ≥0.95. Observed numbers of discordant bipartitions range from 78 for “mobile elements” to 3,015 for “hypothetical proteins.” Beiko R G et al. PNAS 2005;102:14332-14337 ©2005 by National Academy of Sciences
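
The observed-to-expected ratio in this caption is simple arithmetic, sketched below. The 0.134 proportion comes from the caption; the function names and the per-category counts are hypothetical, since the slide gives no per-category totals.

```python
# Proportion of strongly supported bipartitions that are discordant
# across all categories at PP >= 0.95 (value taken from the caption).
DISCORDANT_PROPORTION = 0.134


def expected_discordant(total_supported: int) -> float:
    """Expected discordant bipartitions for a category of the given size."""
    return total_supported * DISCORDANT_PROPORTION


def observed_to_expected(observed: int, total_supported: int) -> float:
    """Ratio above 1 means more discordance than the all-category average."""
    return observed / expected_discordant(total_supported)


# Hypothetical category: 1,000 strongly supported bipartitions, 268 discordant.
ratio = observed_to_expected(observed=268, total_supported=1000)
```

With these made-up counts the category shows about twice the discordance expected from the overall proportion.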

Fig. 1. Two methods for assessing LGT in bacterial genomes, applied to available quartets of closely related, fully sequenced bacterial taxa. The reference topology, based on SSU rRNA, is shown in the upper left, with taxon names listed in the rows below. The yellow box contains the numbers of gene acquisitions in genomes A and B, as determined by parsimony in comparisons of complete genome contents. The blue box contains the numbers of orthologous genes supporting a topology that conflicts with the reference topology. "Interspecies" and "Intraspecies" comparisons represent quartets of taxa in which phylogenetic incongruence can be explained, respectively, by a transfer from another species or from another strain of the same species. For intraspecies comparisons, numbers of acquired and lost genes were not calculated because of uncertainty about the actual tree topology (nd, not determined). (B. aphidicola strains are entirely isolated in different hosts and were thus considered as different species despite having a single name. In B. aphidicola, amounts of gene loss and gene gain are similar, suggesting that LGT is overestimated due to independent losses of genes.)

Fig. 2. Relative frequencies of the three categories of alignments, i.e., those supporting the reference phylogeny (SSU rRNA), those supporting an alternate phylogeny (LGT), and those with no statistical support for any phylogeny. Points represent quartets of genomes for which orthologous genes have been inferred, aligned, and evaluated at the nucleic acid sequences level based on the SH test implemented in Puzzle 5.1 (19). The left part of the plot (in blue) represents the area where LGT predominates.

“THE” E. coli genome Blattner et al., Science 5 September 1997 277: 1453-1462 Figure 1. The overall structure of the E. coli genome. The origin and terminus of replication are shown as green lines, with blue arrows indicating replichores 1 and 2. A scale indicates the coordinates both in base pairs and in minutes (actually centisomes, or 100 equal intervals of the DNA). The distribution of genes is depicted on two outer rings: The orange boxes are genes located on the presented strand, and the yellow boxes are genes on the opposite strand. Red arrows show the location and direction of transcription of rRNA genes, and tRNA genes are shown as green arrows. The next circle illustrates the positions of REP sequences around the genome as radial tick marks. The central orange sunburst is a histogram of inverse CAI (1 - CAI), in which long yellow rays represent clusters of low (<0.25) CAI. The CAI plot is enclosed by a ring indicating similarities between previously described bacteriophage proteins and the proteins encoded by the complete E. coli genome; the similarity is plotted as described in Fig. 3 for the complete genome comparisons.

Perna et al., Nature 409, 529-533(25 January 2001) Outer circle shows the distribution of islands: shared co-linear backbone (blue); position of EDL933-specific sequences (O-islands) (red); MG1655-specific sequences (K-islands) (green); O-islands and K-islands at the same locations in the backbone (tan); hypervariable (purple). Second circle shows the G+C content calculated for each gene longer than 100 amino acids, plotted around the mean value for the whole genome, colour-coded like outer circle. Third circle shows the GC skew for third-codon position, calculated for each gene longer than 100 amino acids: positive values, lime; negative values, dark green. Fourth circle gives the scale in base pairs. Fifth circle shows the distribution of the highly skewed octamer Chi (GCTGGTGG), where bright blue and purple indicate the two DNA strands. The origin and terminus of replication, the chromosomal inversion and the locations of the sequence gaps are indicated. Figure created by Genvision from DNASTAR.

Shared E. coli proteins. Comparison of the predicted proteins of the three E. coli strains shows the number of orthologs in each shared category and numbers of strain-specific proteins. Hypervariable proteins and proteins spanning island–backbone junctions were excluded from the analysis. Number of proteins counted: K-12, 4,288; CFT073, 5,016; EDL933, 5,063. In the totals for the three strains, orthologous proteins are counted only once. Orthologous proteins meet the same match criteria used for designation of backbone (see Materials and Methods). Welch R A et al. PNAS 2002;99:17020-17024 ©2002 by National Academy of Sciences
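
The three-way comparison behind this kind of figure amounts to partitioning ortholog groups by which strains contain them. A minimal sketch follows, assuming each strain has already been reduced to a set of ortholog-group identifiers; the function name `partition_three` and the toy identifiers are hypothetical and are not the counts from Welch et al.

```python
from typing import Dict, Hashable, Iterable, Set


def partition_three(
    a: Iterable[Hashable], b: Iterable[Hashable], c: Iterable[Hashable]
) -> Dict[str, Set[Hashable]]:
    """Split ortholog groups into the seven regions of a three-set Venn diagram."""
    a, b, c = set(a), set(b), set(c)
    return {
        "all_three": a & b & c,
        "a_and_b_only": (a & b) - c,
        "a_and_c_only": (a & c) - b,
        "b_and_c_only": (b & c) - a,
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
    }


# Toy ortholog-group identifiers per strain (assumed data).
k12 = {1, 2, 3, 4}
cft073 = {1, 2, 5}
edl933 = {1, 3, 5, 6}

regions = partition_three(k12, cft073, edl933)
total_distinct = sum(len(groups) for groups in regions.values())
```

Because the seven regions are disjoint, their sizes add up to the number of distinct groups, which matches the caption's note that orthologous proteins are counted only once in the totals.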