In brief Vertical vs. Horizontal Homologous vs. Unequal


1 In brief
Vertical vs. Horizontal
Homologous vs. Unequal
Prokaryotes vs. Eukaryotes
Mechanisms and Vectors
Impact on Tree of Life
Implications for prokaryotic species

2 Possible mechanisms for HT in Drosophila
From Heredity (2008) 100, 545–554

3

4

5 EVOLUTION: Genome Data Shake Tree of Life
E Pennisi - Science, sciencemag.org
The ring of life provides evidence for a genome fusion origin of eukaryotes. MC Rivera, JA Lake - Nature, 2004
The net of life: reconstructing the microbial phylogenetic network. V Kunin, L Goldovsky, N Darzentas, CA … - Genome Research, 2005
The tree of one percent. T Dagan, W Martin - Genome Biology, 2006
Uprooting the tree of life. WF Doolittle - Evolution: a Scientific American reader, 2006

6

7 Clusters of Orthologous Groups (COGs)

8 Puigbò et al.: 6,901 ML trees, 100 taxa total
Objective: compare the topological distance between trees
New metric called IS (inconsistency score) = fraction of the time the splits in a tree are found in all trees
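
To make the idea of comparing trees by their splits concrete, here is a minimal sketch. It assumes every tree is a nested tuple of taxon names over the same taxon set, and it computes only the plain average frequency with which a tree's splits recur in a forest. The function names are hypothetical, and this is not the normalized IS formula from the paper.

```python
from itertools import chain

def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset(chain.from_iterable(leaves(child) for child in tree))

def splits(tree):
    """Return the non-trivial splits (bipartitions) of a nested-tuple tree."""
    all_taxa = leaves(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return
        side = leaves(node)
        other = all_taxa - side
        if len(side) > 1 and len(other) > 1:
            # Store each split in a canonical, orientation-free form.
            found.add(frozenset([side, other]))
        for child in node:
            visit(child)

    visit(tree)
    return found

def mean_split_frequency(tree, forest):
    """Average fraction of forest trees that contain each split of `tree`.

    Simplified illustration only (assumption: all trees share one taxon set).
    """
    own = splits(tree)
    if not own or not forest:
        return 0.0
    forest_splits = [splits(t) for t in forest]
    hits = sum(s in fs for s in own for fs in forest_splits)
    return hits / (len(own) * len(forest))

# Two five-taxon toy trees that agree on one split and disagree on the other.
t1 = ((("A", "B"), "C"), ("D", "E"))
t2 = ((("A", "C"), "B"), ("D", "E"))
print(mean_split_frequency(t1, [t1, t2]))  # 0.75
```

A tree whose splits recur in most of the forest scores near 1; a tree with many unusual splits scores low, which is the intuition behind an inconsistency measure.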

9 Many genes are not found in all taxa
Define 102 NUTs ("nearly universal trees"): trees that include more than 90% of the prokaryotes under comparison. Mostly translation-related and core transcription-related genes.

10 J Biol. 2009;8(6):59.

11 The big divide? Look for evidence of HGT between bacteria and archaea
56% of NUTs separated the groups perfectly
44% show at least one HGT:
13% from archaea to bacteria
23% from bacteria to archaea
8% in both directions

12 The network of similarities among the nearly universal trees (NUTs)
(a) Each node (green dot) denotes a NUT, and nodes are connected by edges if the similarity between the respective edges exceeds the indicated threshold. (b) The connectivity of 102 NUTs and the 14 1:1 NUTs depending on the topological similarity threshold.
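
The thresholded network in this figure can be sketched as follows. This is an illustration under stated assumptions: each tree is reduced to a set of split labels, and similarity is taken to be the Jaccard index of those sets, which stands in for the topological similarity measure actually used in the paper. All names and the toy data are hypothetical.

```python
from itertools import combinations

def similarity(splits_a, splits_b):
    """Jaccard similarity between two sets of splits (assumed stand-in metric)."""
    union = splits_a | splits_b
    if not union:
        return 1.0
    return len(splits_a & splits_b) / len(union)

def build_network(tree_splits, threshold):
    """Return edges (name_a, name_b) for tree pairs whose similarity exceeds the threshold."""
    names = sorted(tree_splits)
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if similarity(tree_splits[a], tree_splits[b]) > threshold
    ]

# Toy data: split labels are arbitrary strings, one set per tree.
toy = {
    "t1": {"AB|CDE", "ABC|DE"},
    "t2": {"AC|BDE", "ABC|DE"},
    "t3": {"AD|BCE", "ADE|BC"},
}
print(build_network(toy, 0.3))  # [('t1', 't2')]
print(build_network(toy, 0.5))  # []
```

Raising the threshold removes edges, so the network fragments, which is what panel (b) tracks as connectivity versus threshold.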

13

14 The supernetwork of the NUTs
For species abbreviations see Additional File 1. Puigbò et al. Journal of Biology 2009, 8(6):59   doi: /jbiol159

15

16 Network representation of the 6,901 trees of the forest of life
The 102 NUTs are shown as red circles in the middle. The NUTs are connected to trees with similar topologies: trees with at least 50% similarity to at least one NUT (P-value < 0.05) are shown as purple circles and connected to the NUTs. The rest of the trees are shown as green circles. Puigbò et al. Journal of Biology 2009, 8(6):59   doi: /jbiol159

17 Similarity of the trees in the forest of life to the NUTs
(a) For each of the 102 NUTs, the breakdown of the rest of the trees in the forest by percent similarity is shown. (b) The same breakdown for 102 random trees generated from the NUTs. Puigbò et al. Journal of Biology 2009, 8(6):59   doi: /jbiol159

18

19 Proc Natl Acad Sci U S A. 2005 Oct 4;102(40):14332-7.

20 Highways of obligate gene transfer within and among phyla and divisions of prokaryotes, based on analysis of the 22,348 protein trees for which a minimal edit path could be resolved
Each oval represents a prokaryotic group, with the name and number of taxa in that group indicated on the first line. Numbers below taxon names report inferred transfers within that taxon, whereas numbers on the linking edges report inferred transfers between taxa. Ovals representing groups with one or more thermophilic organisms are drawn with dashed lines. The type of line shown between each pair of taxa indicates the number of obligate edits in this analysis: >100 with thick solid lines, 10-100 with thin solid lines, and 5-9 with dashed lines. Relationships between taxonomic groups with fewer than five obligate edits are not shown. Note that transfers cannot be identified within phyla with one (e.g., Nanoarchaeota) or two (e.g., Bacteroidetes) genomes in our data set. Beiko R G et al. PNAS 2005;102:14332-14337 ©2005 by National Academy of Sciences

21 Ratio of observed to expected discordant bipartitions among proteins in major TIGR role category groupings
The expected number of discordant bipartitions in each category is equal to the total number of strongly supported bipartitions (concordant and discordant) within that category, multiplied by 0.134, the proportion of bipartitions that are discordant across all categories at PP ≥0.95. Observed numbers of discordant bipartitions range from 78 for “mobile elements” to 3,015 for “hypothetical proteins.” Beiko R G et al. PNAS 2005;102:14332-14337 ©2005 by National Academy of Sciences
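
The observed-to-expected calculation described in this caption amounts to a single multiplication and a division. A small sketch follows; the function name is hypothetical and the counts in the example are made up for illustration, not taken from the paper. Only the 0.134 rate comes from the caption.

```python
def observed_to_expected(observed_discordant, total_supported, discordant_rate=0.134):
    """Ratio of observed to expected discordant bipartitions for one category.

    expected = total strongly supported bipartitions * overall discordant rate
    """
    if total_supported <= 0:
        raise ValueError("total_supported must be positive")
    expected = total_supported * discordant_rate
    return observed_discordant / expected

# Hypothetical category: 1,000 strongly supported bipartitions, 201 discordant.
ratio = observed_to_expected(201, 1000)
print(round(ratio, 2))  # 1.5
```

A ratio above 1 means the category shows more discordance than the all-category average would predict; a ratio below 1 means less.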

22

23 Fig. 1. Two methods for assessing LGT in bacterial genomes, applied to available quartets of closely related, fully sequenced bacterial taxa. The reference topology, based on SSU rRNA, is shown in the upper left, with taxon names listed in the rows below. The yellow box contains the numbers of gene acquisitions in genomes A and B, as determined by parsimony in comparisons of complete genome contents. The blue box contains the numbers of orthologous genes supporting a topology that conflicts with the reference topology. "Interspecies" and "Intraspecies" comparisons represent quartets of taxa in which phylogenetic incongruence can be explained, respectively, by a transfer from another species or from another strain of the same species. For intraspecies comparisons, numbers of acquired and lost genes were not calculated because of uncertainty about the actual tree topology (nd, not determined). (B. aphidicola strains are entirely isolated in different hosts and were thus considered as different species despite having a single name. In B. aphidicola, amounts of gene loss and gene gain are similar, suggesting that LGT is overestimated due to independent losses of genes.)

24 Fig. 2. Relative frequencies of the three categories of alignments
The three categories are alignments supporting the reference phylogeny (SSU rRNA), those supporting an alternate phylogeny (LGT), and those with no statistical support for any phylogeny. Points represent quartets of genomes for which orthologous genes have been inferred, aligned, and evaluated at the nucleic acid sequence level based on the SH test implemented in Puzzle 5.1 (19). The left part of the plot (in blue) represents the area where LGT predominates.

25 “THE” E. coli genome
Blattner et al., Science 5 September : Figure 1. The overall structure of the E. coli genome. The origin and terminus of replication are shown as green lines, with blue arrows indicating replichores 1 and 2. A scale indicates the coordinates both in base pairs and in minutes (actually centisomes, or 100 equal intervals of the DNA). The distribution of genes is depicted on two outer rings: The orange boxes are genes located on the presented strand, and the yellow boxes are genes on the opposite strand. Red arrows show the location and direction of transcription of rRNA genes, and tRNA genes are shown as green arrows. The next circle illustrates the positions of REP sequences around the genome as radial tick marks. The central orange sunburst is a histogram of inverse CAI (1 - CAI), in which long yellow rays represent clusters of low (<0.25) CAI. The CAI plot is enclosed by a ring indicating similarities between previously described bacteriophage proteins and the proteins encoded by the complete E. coli genome; the similarity is plotted as described in Fig. 3 for the complete genome comparisons.

26 Perna et al., Nature 409, 529-533(25 January 2001)
Outer circle shows the distribution of islands: shared co-linear backbone (blue); position of EDL933-specific sequences (O-islands) (red); MG1655-specific sequences (K-islands) (green); O-islands and K-islands at the same locations in the backbone (tan); hypervariable (purple). Second circle shows the G+C content calculated for each gene longer than 100 amino acids, plotted around the mean value for the whole genome, colour-coded like outer circle. Third circle shows the GC skew for third-codon position, calculated for each gene longer than 100 amino acids: positive values, lime; negative values, dark green. Fourth circle gives the scale in base pairs. Fifth circle shows the distribution of the highly skewed octamer Chi (GCTGGTGG), where bright blue and purple indicate the two DNA strands. The origin and terminus of replication, the chromosomal inversion and the locations of the sequence gaps are indicated. Figure created by Genvision from DNASTAR.
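
The per-gene quantities plotted in the second and third circles can be approximated as below. This is a sketch under assumptions: the caption does not give the formulas, so G+C content is taken as the fraction of G and C bases in the coding sequence, and GC skew as (G - C) / (G + C) over third codon positions, which is the usual definition. Function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def gc3_skew(cds):
    """GC skew, (G - C) / (G + C), over third codon positions of an in-frame CDS."""
    third = cds.upper()[2::3]  # every third base, starting at codon position 3
    g, c = third.count("G"), third.count("C")
    if g + c == 0:
        return 0.0
    return (g - c) / (g + c)

# Toy coding sequence: ATG GCC AAG TAA
cds = "ATGGCCAAGTAA"
print(round(gc_content(cds), 3))  # 0.417
print(round(gc3_skew(cds), 3))    # 0.333
```

In the figure these values are computed only for genes longer than 100 amino acids, so a real analysis would filter on `len(cds) // 3` before calling either function.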

27 Shared E. coli proteins
Comparison of the predicted proteins of the three E. coli strains shows the number of orthologs in each shared category and numbers of strain-specific proteins. Hypervariable proteins and proteins spanning island–backbone junctions were excluded from the analysis. Number of proteins counted: K-12, 4,288; CFT073, 5,016; EDL933, 5,063. In the totals for the three strains, orthologous proteins are counted only once. Orthologous proteins meet the same match criteria used for designation of backbone (see Materials and Methods). Welch R A et al. PNAS 2002;99: ©2002 by National Academy of Sciences

