1 Evolution of eukaryotic genomes: remarkable conservation and massive loss of genes and introns. Eugene V. Koonin, National Center for Biotechnology Information, NIH, Bethesda, MD

2 "In my own subjects, genetics and molecular biology, research has become so directed toward medical problems and the needs of the pharmaceutical companies that most people do not recognize that the most challenging intellectual problem of all time, the reconstruction of our biological past, can now be tackled with some hope of success." Sydney Brenner, Science 282, 1411-1412 (20 Nov 1998)

3 Comprehensive evolutionary classification of genes from sequenced genomes

4 Ancient conserved eukaryotic genes

5 Current status of evolutionary classification of proteins from 7 complete eukaryotic genomes: 112920 proteins = 65170 in KOGs + 23436 in LSEs (lineage-specific expansions) + 24314 singletons. Tatusov et al., BMC Bioinformatics, 2003 Sep 11;4(1):41.
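The breakdown on this slide can be checked and turned into percentages with a few lines of Python. This is only an arithmetic sketch; the three counts come from the slide, and the variable names are illustrative.

```python
# Counts taken from the slide: proteins in KOGs, in LSEs, and singletons.
counts = {"KOGs": 65170, "LSEs": 23436, "singletons": 24314}

total = sum(counts.values())  # should reproduce the 112920 proteins quoted on the slide

# Share of each class, in percent, rounded to one decimal place.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}

for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

Roughly 58% of the proteins fall into KOGs, with the remainder split almost evenly between LSEs and singletons.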

6 Breakdown of eukaryotic proteins into KOGs, LSEs and singletons. Current status of evolutionary classification of proteins from 7 complete genomes

7 Define a phyletic pattern
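The slide text does not spell out the definition. A phyletic pattern is usually taken to be the presence/absence profile of a gene family (here, a KOG) across the analyzed genomes. A minimal sketch under that assumption follows; the species codes are borrowed from the tree slides later in the deck, and the example membership list is hypothetical.

```python
# Species codes as they appear on the later tree slides; the order is an arbitrary choice.
SPECIES = ["At", "Ce", "Dm", "Hs", "Sc", "Sp", "Ec"]

def phyletic_pattern(member_species, species=SPECIES):
    """Return a 0/1 string: '1' where the KOG has at least one member in that species."""
    present = set(member_species)  # paralogs from the same species count once
    return "".join("1" if sp in present else "0" for sp in species)

# Hypothetical KOG with two human paralogs plus fly and worm members.
example = phyletic_pattern(["Hs", "Dm", "Ce", "Hs"])
print(example)
```

A KOG represented in every genome would give an all-ones string, which corresponds to the "All" category in the phyletic pattern chart.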

12 Phyletic patterns of eukaryotic KOGs. [Chart; categories in slide order: All, All-Ec, Animals-Fungi, Plant+fungi, Plant+animals, All animals, All fungi, Other patterns; counts in slide order: 858, 921, 186, 188, 142, 1109, 271, 1947]

13 Phyletic patterns of KOGs and phenotypic effect of knockouts: S. cerevisiae. Essential genes tend not to be lost during evolution. [Chart; axis: 0% to 100%; series label: non-essential; categories: 1, 2-5, 6, 7; values in slide order: 717, 497, 1004, 273, 1120, 115, 1463, 221]

14 Phyletic patterns of KOGs and phenotypic effect of knockouts: C. elegans. Essential genes tend not to be lost during evolution. [Chart; axis: 0% to 100%; series label: non-essential; categories: 1, 2-5, 6, 7; values in slide order: 736, 312, 917, 154, 3602, 181, 7282, 163]

15 The traditional application of the evolutionary parsimony principle: given the distribution of a set of binary characters in a set of species, construct the shortest tree (maximum parsimony tree). Characters: A = 10111100, B = 00110111, C = 00010111, D = 10111010. Tree leaf order in the figure: A D B C
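For four species there are only three unrooted topologies, so the shortest tree for the character matrix on this slide can be found by scoring each one. The sketch below uses Fitch's small-parsimony count, which is one standard way to score a tree; the slide does not say which algorithm was used, and the function names are illustrative.

```python
# Binary character strings from the slide.
CHARS = {"A": "10111100", "B": "00110111", "C": "00010111", "D": "10111010"}

def fitch_site(node, site):
    """Return (possible states, minimum changes) for one character on a rooted binary tree."""
    if isinstance(node, str):  # leaf: its observed state, no changes below it
        return {site[node]}, 0
    left_states, left_changes = fitch_site(node[0], site)
    right_states, right_changes = fitch_site(node[1], site)
    common = left_states & right_states
    if common:  # children agree on at least one state: no extra change
        return common, left_changes + right_changes
    return left_states | right_states, left_changes + right_changes + 1

def tree_length(tree, chars):
    """Total number of changes over all characters (the parsimony length)."""
    n_sites = len(next(iter(chars.values())))
    return sum(
        fitch_site(tree, {taxon: seq[i] for taxon, seq in chars.items()})[1]
        for i in range(n_sites)
    )

# The three possible groupings of four taxa; rooting does not affect the length.
TOPOLOGIES = {
    "AB|CD": (("A", "B"), ("C", "D")),
    "AC|BD": (("A", "C"), ("B", "D")),
    "AD|BC": (("A", "D"), ("B", "C")),
}

lengths = {name: tree_length(tree, CHARS) for name, tree in TOPOLOGIES.items()}
best = min(lengths, key=lengths.get)
print(lengths, best)
```

The grouping of A with D and B with C needs the fewest changes, which agrees with the leaf order A D B C shown on the slide.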

16 However, parsimony can be used with equal ease to address the reverse task: given the distribution of a set of binary characters in a set of species AND the *true* tree topology, construct the most parsimonious scenario of evolution (which, of course, might include many more events than the overall most economical scenario). Characters: A = 10111100, B = 00110111, C = 00010111, D = 10111010. Tree leaf order in the figure: A B C D. [Figure labels: 21 3 2 22 1011101000010111]

17 Maximum parsimony (Dollo) tree for eukaryotes based on the phyletic patterns of KOGs. [Tree figure; leaves: Ec, Sc, Sp, Ce, Dm, Hs, At; label: 100%]

18 The phylogenetic parsimony tree built on the basis of KOG phyletic patterns did not follow the species tree. However, the parsimony principle can be applied in the opposite direction: given a species tree topology, construct the most parsimonious scenario for the evolution of the eukaryotic gene repertoire, that is, map gene (KOG) gain and loss events onto the tree branches. [Figure legend: 1/0, 0/1; gain, loss]
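The mapping step described on this slide can be sketched with Dollo parsimony, the criterion named on the preceding tree slide: a gene family is assumed to arise once and may then be lost independently on several branches. The species tree and the example pattern below are hypothetical placeholders, not the topology or data used in the talk, and the function names are illustrative.

```python
# Hypothetical species tree as nested tuples; leaves are species codes.
TREE = ("At", (("Sc", "Sp"), ("Ce", ("Dm", "Hs"))))

def leaves(node):
    """Set of leaf names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaves(child) for child in node))

def dollo_events(tree, present):
    """Place one gain and the implied losses for a family present in the given species.

    Branches are identified by the leaf set below them.
    Returns (branch of the single gain, list of branches with a loss).
    """
    present = frozenset(present)
    if not present or not present <= leaves(tree):
        raise ValueError("pattern must name at least one species of the tree")
    # Walk down to the last common ancestor of all species that carry the family.
    node = tree
    while not isinstance(node, str):
        holders = [child for child in node if leaves(child) & present]
        if len(holders) != 1:
            break
        node = holders[0]
    gain = leaves(node)
    losses = []

    def walk(n):
        # Below the gain, every subtree without a carrier implies one loss on its branch.
        if isinstance(n, str):
            return
        for child in n:
            if leaves(child) & present:
                walk(child)
            else:
                losses.append(leaves(child))

    walk(node)
    return gain, losses

# Family present in plant, fly and human, absent from worm and both yeasts.
gain, losses = dollo_events(TREE, {"At", "Dm", "Hs"})
print(sorted(gain), [sorted(branch) for branch in losses])
```

Summing these per-family events over all KOGs gives gain and loss counts for each branch, which is the kind of summary shown on the following slide.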

19 The most parsimonious scenario of gene loss and birth in eukaryotic evolution and ancestral gene sets. Koonin et al. 2004. Genome Biol. 5: R7. [Tree figure with per-branch counts of gene gain and gene loss and ancestral gene set sizes; leaves: Dm, Hs, Ce, Sc, Sp, At, Ec]

20 Exon/intron structure of eukaryotic genes. Eukaryotic nuclear protein-coding genes usually contain multiple spliceosomal introns that are spliced out of pre-mRNAs by an RNA-protein complex, the spliceosome. [Diagram: exon1, intron (GU...AG), exon2]
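The GU...AG label in the diagram refers to the dinucleotides at the two ends of a typical spliceosomal intron (GT...AG when written as DNA). A minimal sketch of a check for these boundaries follows; the function name and the example sequences are made up for illustration.

```python
def has_canonical_splice_sites(intron):
    """True if the intron starts with GT (GU in RNA) and ends with AG."""
    seq = intron.upper().replace("U", "T")  # accept RNA or DNA spelling
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")

# Made-up intron sequences, one in RNA spelling and one in DNA spelling.
print(has_canonical_splice_sites("GUAAGUCUUUUCAG"))
print(has_canonical_splice_sites("gtaagtctttccag"))
print(has_canonical_splice_sites("ATAAGTCTTTCCAC"))
```

Not every intron follows this rule, so a check like this is only a first filter when intron positions are extracted from genome annotation.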

21 Evolution of introns and the exonic structure of eukaryotic genes. Tempo and mode of intron evolution remain poorly understood. When did introns invade eukaryotic genes: prior to the origin of eukaryotes (introns early), early in eukaryotic evolution, or late? The common ancestor of animals, plants and fungi: intron-rich or intron-poor? What fraction of introns is conserved over long evolutionary spans?

22 Origin of introns. The "intron-early" hypothesis suggests that introns existed before the divergence of prokaryotes and eukaryotes (W. Gilbert). The "intron-late" hypothesis posits that introns were inserted into eukaryotic genes after this divergence (T. Cavalier-Smith, Doolittles, J. Palmer). [Figure labels: loss and sliding; gain and loss]

23 Mechanisms of intron evolution. Three mechanisms of intron evolution have been invoked by proponents of both theories: intron loss, intron gain, and intron sliding.

24 Mechanisms of intron evolution: intron loss. Complete loss of introns: re-integration of reverse-transcribed mRNAs into the genome. Loss of one or a few introns: recombination/gene conversion between cDNAs and genomic sequences (Feiber et al. 2002).

25 Mechanisms of intron evolution: intron gain. ? A common event

26 Mechanisms of intron evolution. Why is our understanding of intron evolution so limited? Lack of information on exon/intron structure of orthologous genes. Can we use completely sequenced genomes? They are a great source of information, but they are not necessarily easy to work with.

27 Analysis of introns in completely sequenced genomes. We used sets of orthologous genes which contained a member from each of 8 eukaryotic genomes: human (HS), fly (DM), mosquito (AG), worm (CE), plant, Arabidopsis (AT), baker's yeast (SC), fission yeast (SP), malaria Plasmodium (PF). Source: KOG database.

28 Pipeline for analysis of evolution of intron-exon structure. Steps as listed on the slide: KOG analysis (8 species); multiple alignment (MAP); identification of conserved blocks; projection of introns on alignment; extraction of intron positions from genomes.

29 All positions with gaps were deleted to ensure robustness of the analysis, but we also analyzed the complete alignments. Conserved introns: found in two or more species. Non-conserved introns: found in one species only.
HS …ATGTCGATCGTGCTCGTCGTACTCTCGTAC…
DM …ATGTGGATCGTGCTCGTCGTACTCTCGTAC…
CE …ATGTGGATTGTGCTCGTCGTACTCTCGTAC…
AT …ATGTTGATGGTGCTCGTCGTACTCTCGTAC…
SC …ATGTTGATTGTGCTCGTCGTACTCTCGTAC…
SP …ATGTTGATT---CTCGTCGTACTCTCGTAC…
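The two rules on this slide (drop gapped alignment columns, then call an intron position conserved if two or more species share it) can be sketched as follows. Three of the slide's sequences are reused; the intron column indices are hypothetical, since the slide text does not preserve where the introns were drawn, and the function names are illustrative.

```python
# Three of the aligned sequences from the slide (the flanking ellipses are omitted).
ALIGNMENT = {
    "HS": "ATGTCGATCGTGCTCGTCGTACTCTCGTAC",
    "DM": "ATGTGGATCGTGCTCGTCGTACTCTCGTAC",
    "SP": "ATGTTGATT---CTCGTCGTACTCTCGTAC",
}

# Hypothetical intron positions, given as 0-based alignment columns.
INTRONS = {"HS": {3, 10, 20}, "DM": {3, 20}, "SP": {3, 25}}

def gap_free_columns(alignment):
    """Indices of alignment columns in which no sequence has a gap."""
    length = len(next(iter(alignment.values())))
    return [i for i in range(length) if all(seq[i] != "-" for seq in alignment.values())]

def classify_introns(alignment, introns):
    """Split intron positions in gap-free columns into conserved and species-specific."""
    keep = set(gap_free_columns(alignment))
    carriers = {}
    for species, columns in introns.items():
        for col in columns:
            if col in keep:  # introns in gapped columns are discarded
                carriers.setdefault(col, set()).add(species)
    conserved = {c: s for c, s in carriers.items() if len(s) >= 2}
    unique = {c: s for c, s in carriers.items() if len(s) == 1}
    return conserved, unique

conserved, unique = classify_introns(ALIGNMENT, INTRONS)
print(sorted(conserved), sorted(unique))
```

In this toy input the intron at column 10 is dropped because that column is gapped in SP, which is the conservative choice the slide describes.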

30 Statistical analysis: shuffling of intron positions, Monte Carlo simulation.
HS …ATGTCGATCGTGCTCGTCGTACTCTCGTAC…
DM …ATGTGGATCGTGCTCGTCGTACTCTCGTAC…
CE …ATGTGGATTGTGCTCGTCGTACTCTCGTAC…
AT …ATGTTGATGGTGCTCGTCGTACTCTCGTAC…
SC …ATGTTGATTGTGCTCGTCGTACTCTCGTAC…
SP …ATGTTGATTGTCCTCGTCGTACTCTCGTAC…
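A shuffling test of this kind asks how often randomly placed introns would coincide as often as the observed ones. The sketch below is one simple way to set that up, assuming each species keeps its intron count and positions are redrawn uniformly within the gene; the exact shuffling scheme used in the study is not given on the slide, and the input data and function names are hypothetical.

```python
import random

def shared_positions(introns):
    """Number of positions that carry an intron in two or more species."""
    seen = {}
    for columns in introns.values():
        for col in set(columns):
            seen[col] = seen.get(col, 0) + 1
    return sum(1 for n in seen.values() if n >= 2)

def shuffle_test(introns, n_columns, n_trials=1000, seed=0):
    """Monte Carlo estimate of how often random placement matches the observed sharing.

    Assumption: each species keeps its number of introns, and positions are drawn
    uniformly without replacement from the n_columns available columns.
    """
    rng = random.Random(seed)
    observed = shared_positions(introns)
    hits = 0
    for _ in range(n_trials):
        shuffled = {
            species: rng.sample(range(n_columns), len(columns))
            for species, columns in introns.items()
        }
        if shared_positions(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return observed, (hits + 1) / (n_trials + 1)

# Hypothetical intron positions in a 27-column gap-free alignment.
observed, p_value = shuffle_test({"HS": [3, 20], "DM": [3, 20], "SP": [3, 25]}, n_columns=27)
print(observed, p_value)
```

A small p-value would indicate that the shared positions are unlikely to arise by chance placement alone.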

31 Conservation of intron positions in 8 eukaryotic species

32 Conservation of intron positions among eukaryotes. Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV. Curr Biol. 2003 Sep 2;13(17):1512-7. [Figure with counts of shared intron positions]

33 Example: KOG0473, ribosomal protein L37. The alignment with mapped intron positions is converted to a matrix of intron presence/absence.

34 Conserved intron positions: phylogenetic signal. Example: KOG0419, TCP-1a subunit of chaperonin complex, with the only intron among 684 genes conserved in 7 species. Matrices for all analyzed genes were concatenated and employed to build a single tree: 684 KOGs, 7236 intron positions.
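The conversion and concatenation described on these two slides can be sketched as below: each gene gives one 0/1 column per intron position, and the per-gene matrices are joined species by species. The two toy genes and their intron positions are hypothetical, and the function names are illustrative.

```python
def presence_matrix(species, introns):
    """One row per species, one column per intron position found in any species."""
    columns = sorted(set().union(*introns.values()))
    return {
        sp: "".join("1" if col in introns.get(sp, ()) else "0" for col in columns)
        for sp in species
    }

def concatenate(matrices, species):
    """Join per-gene matrices into a single row per species."""
    return {sp: "".join(matrix[sp] for matrix in matrices) for sp in species}

SPECIES = ["HS", "DM", "SP"]

# Hypothetical intron positions (alignment columns) for two genes.
gene1 = {"HS": {3, 20}, "DM": {3, 20}, "SP": {3, 25}}
gene2 = {"HS": {7}, "SP": set()}  # DM has no intron in this gene

combined = concatenate(
    [presence_matrix(SPECIES, gene1), presence_matrix(SPECIES, gene2)], SPECIES
)
print(combined)
```

The concatenated rows are binary character strings of the same kind used in the parsimony slides, so they can be fed to a tree-building or event-mapping step directly.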

35 Phylogenetic tree of crown group eukaryotes based on conservation of intron positions: parsimony. The topology of this tree is a bit unexpected...

36 The phylogenetic parsimony tree built on the basis of the pattern of intron conservation did not follow the species tree. However, the parsimony principle can be applied in the opposite direction: given a species tree topology, construct the most parsimonious scenario for the evolution of eukaryotic gene structure, that is, the distribution of intron gain and loss events over the tree branches. [Figure legend: 1/0, 0/1; gain, loss]

37 Parsimonious evolutionary scenario for the most realistic topology of the eukaryotic tree. Rogozin IB, Wolf YI, Sorokin AV, Mirkin BG, Koonin EV. Curr Biol. 2003 Sep 2;13(17):1512-7. [Tree figure with per-branch counts of intron loss and intron gain; leaves: Dm, Ag, Hs, Ce, Sc, Sp, At, Pf]

38 There seems to have been virtually no intron gain and limited intron loss during mammalian evolution. Roy SW, Fedorov A, Gilbert W. Large-scale comparison of intron positions in mammalian genes shows intron loss but no gain. Proc Natl Acad Sci U S A. 2003 Jun 10;100(12):7158-62. A. S. Kondrashov, personal communication. [Figure labels: human, mouse, rat, fish; ~100 introns lost; ~0 introns gained; ~100 Mya]

39 A conundrum of intron evolution: practically no intron gain during (at least) ~100 million years of mammalian evolution, yet apparent massive gain during the evolution of animal phyla (e.g., chordates) on a ~500-700 million year scale. Are major transitions in eukaryotic evolution associated with bursts of intron insertion?

40 Koonin, 2004, Cell Cycle 3, 280

41 Gain/loss of genes and gain/loss of introns in conserved genes occur in parallel in eukaryotic evolution, probably a manifestation of the same general lineage-specific trends. "…by magnifying the power of random genetic drift, reduced population size provides a permissive environment for the proliferation of various genomic features that would otherwise be eliminated by purifying selection." Lynch, M., Conery, J.S. (2003) The Origins of Genome Complexity. Science 302, 1401-4.

42 Comparing old and new introns: gaining insight into the origin of introns. Sverdlov, Babenko, Rogozin, Koonin. Curr. Biol. (2003); Gene (2004, in press)

43 Distribution of old and new introns along the gene length. All genomes pooled.

44 Distribution of old and new introns along the gene length. S. pombe, an intron-poor genome: nearly identical distributions of old and new introns.

45 Distribution of old and new introns along the gene length. H. sapiens, an intron-rich genome: enrichment for new introns in the 3'-region.

46 A reverse-transcription-based model of intron insertion: almost the same as for intron loss (Fink, 1987) but includes an error of reverse transcription. Introns seem to be preferentially lost AND inserted near the 3'-end of the coding region; could there be similar mechanisms for intron loss AND insertion? The role of duplication in the origin of alternative exons has been demonstrated (Kondrashov, F.A., Koonin, E.V., Hum. Molec. Genet., 2001; Letunic, I. et al., Hum. Molec. Genet., 2002). [Diagram labels: reverse transcription; duplication; genomic DNA; homologous recombination; new intron (GT...AG)]

47 Conclusions. Evolutionary classification of genes from sequenced genomes (orthologs and paralogs) allows us to address genome-wide evolutionary trends by applying rather straightforward adaptations of known phylogenetic approaches. Introns invaded protein-coding genes very early in the evolution of eukaryotes, prior to the origin of multicellular forms, and many of these ancient introns survive to this day. Ancestral introns are remarkably conserved in some eukaryotic lineages: as many as 25-30% of the introns in humans and Arabidopsis were apparently inherited from the common ancestor of animals, fungi and plants, and ~30% of Plasmodium introns are conserved in the crown group. Even the earliest ancestral eukaryotes seem to have had many genes and introns.

48 Conclusions (continued). Massive gene and intron loss occurred on multiple, independent occasions during eukaryotic evolution, especially in fungi, but also in arthropods and nematodes (and probably many more lineages). Classification of introns by age allows one to follow the evolution of splice signals and of intron sequences themselves, and might even suggest mechanisms of intron insertion. Lineage-specific expansion of paralogous gene families is accompanied by substantial loss and even more extensive acquisition of introns. Loss and gain of introns and genes occur in parallel, reflecting the same lineage-specific trends in genome evolution, perhaps driven largely by dramatic changes in characteristic population sizes, which entail changes in selection strength.

49 Acknowledgments. Igor Rogozin (NCBI), Yuri Wolf (NCBI), Boris Mirkin (Birkbeck College, London), Alexander Sorokin (NCBI), Alexander Sverdlov (NCBI, now Columbia U), Vladimir Babenko (NCBI), Fyodor Kondrashov (NCBI, now UC Davis), Alexei Kondrashov (NCBI). The COG group (NCBI): Natalie D. Fedorova, John D. Jackson, Aviva R. Jacobs, Dmitri M. Krylov, Kira S. Makarova, Raja Mazumder (1), Sergei L. Mekhedov, Anastasia N. Nikolskaya (1), B. Sridhar Rao, Sergei Smirnov, Alexander V. Sverdlov, Roman L. Tatusov, Sona Vasudevan, Jodie J. Yin, Darren A. Natale (1). (1) Currently PIR, Georgetown University.

