Alternative splicing: A playground of evolution Mikhail Gelfand Research and Training Center for Bioinformatics Institute for Information Transmission.

Presentation transcript:

1 Alternative splicing: A playground of evolution Mikhail Gelfand Research and Training Center for Bioinformatics Institute for Information Transmission Problems

2 Alternative splicing of human (and mouse) genes

3 Evolution of alternative exon-intron structure –human-mouse –Drosophila and Anopheles Evolution of alternative splicing sites: MAGE-A family of CT antigens Evolutionary rate in constitutive and alternative regions –human-mouse –human SNPs Alternative splicing and protein structure

4 Data and Methods (routine) known alternative splicing –HASDB (human, ESTs+mRNAs) –ASMamDB (mouse, mRNAs+genes) additional variants –UniGene (human and mouse EST clusters) complete genes and genomic DNA –GenBank (full-length mouse genes) –human genome TBLASTN (initial identification of orthologs: mRNAs against genomic DNA) BLASTN (human mRNAs against genome) Pro-EST (spliced alignment, ESTs and mRNA against genomic DNA)

5 Pro-Frame (spliced alignment of proteins against genomic DNA) –confirmation of orthology: same exon-intron structure for at least one isoform; >70% identity over the entire protein length –analysis of conservation of human alternative splicing in the mouse genome: align human protein to mouse genomic DNA; the isoform is conserved if all exons or parts of exons are conserved and all sites are conserved –same procedure for mouse proteins and human DNA We do not require that the isoform is actually observed as mRNA or ESTs
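The decision rules on this slide can be sketched in a few lines of Python. This is only an illustration of the stated criteria, not the Pro-Frame implementation: the `AlignedExon` record, its field names, and both function names are hypothetical, and treating every exon as having both a donor and an acceptor site is a simplification (terminal exons do not).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AlignedExon:
    """Hypothetical record: one exon (or exon part) of an isoform
    aligned against the genomic DNA of the other species."""
    aligned: bool       # the exon sequence itself is conserved
    donor_ok: bool      # the donor site is conserved
    acceptor_ok: bool   # the acceptor site is conserved


def orthology_confirmed(same_structure: bool, identity: float,
                        min_identity: float = 0.70) -> bool:
    """Slide criteria: same exon-intron structure for at least one
    isoform and >70% identity over the entire protein length."""
    return same_structure and identity > min_identity


def isoform_conserved(exons: List[AlignedExon]) -> bool:
    """An isoform counts as conserved only if every exon and every
    splicing site is conserved; an empty isoform is not conserved."""
    return bool(exons) and all(
        e.aligned and e.donor_ok and e.acceptor_ok for e in exons
    )
```

Note that, as the slide says, this test never asks whether the isoform is observed as mRNA or ESTs in the second species; it only asks whether the genome could encode it.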

6 166 gene pairs. Known alternative splicing: 126 human genes, 124 mouse genes [Figure: human/mouse diagram with counts 42, 84, 40]

7 Elementary alternatives Cassette exon Alternative donor site Alternative acceptor site Retained intron

8 Human genes
                      mRNA                EST
                      cons.  non-cons.    cons.  non-cons.
  Cassette exons       56      25          74      26
  Alt. donors          18       7          16      10
  Alt. acceptors       13       5          19      15
  Retained introns      4       3           5       0
  Total                96      30         114      51
  Total genes          45      28          41      44
Conserved elementary alternatives: 69% (EST) - 76% (mRNA)
Genes with all isoforms conserved: 57 (45%)

9 Mouse genes
                      mRNA                EST
                      cons.  non-cons.    cons.  non-cons.
  Cassette exons       70       5          39       9
  Alt. donors          24       6          17       6
  Alt. acceptors       15       6          16       9
  Retained introns      8       7          10       4
  Total               117      24          82      28
  Total genes          68      22          30      26
Conserved elementary alternatives: 75% (EST) - 83% (mRNA)
Genes with all isoforms conserved: 79 (64%)
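The percentages quoted on the two table slides follow from the "Total" rows. A minimal sketch of the arithmetic, using only the totals given on the slides (the function and variable names are illustrative):

```python
def conserved_fraction(conserved: int, non_conserved: int) -> float:
    """Share of elementary alternatives that are conserved."""
    return conserved / (conserved + non_conserved)


# (conserved, non-conserved) totals copied from the "Total" rows.
totals = {
    ("human", "mRNA"): (96, 30),
    ("human", "EST"): (114, 51),
    ("mouse", "mRNA"): (117, 24),
    ("mouse", "EST"): (82, 28),
}

for (species, source), (cons, non_cons) in totals.items():
    share = conserved_fraction(cons, non_cons)
    print(f"{species} {source}: {share:.1%} conserved")
```

Rounded to whole percent, these reproduce the 69-76% (human) and 75-83% (mouse) ranges stated on the slides.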

10 Real or aberrant non-conserved AS? 24-31% human vs. 17-25% mouse elementary alternatives are not conserved 55% human vs. 36% mouse genes have at least one non-conserved variant denser coverage of human genes by ESTs: –pick up rare (tissue- and stage-specific) => younger variants –pick up aberrant (non-functional) variants 17-24% mRNA-derived elementary alternatives are non-conserved (compared to 25-32% EST-derived ones)

11 Comparison to other studies. Modrek and Lee, 2003: skipped exons inclusion level is a good predictor of conservation –98% constitutive exons are conserved –98% major form exons are conserved –28% minor form exons are conserved inclusion level of conserved exons in human and mouse is highly correlated Minor non-conserved form exons are errors? No: –minor form exons are supported by multiple ESTs –28% of minor form exons are upregulated in one specific tissue –70% of tissue-specific exons are not conserved –splicing signals of conserved and non-conserved exons are similar

12 Evolution of alternative exon-intron structure –human-mouse –Drosophila and Anopheles Evolution of alternative splicing sites: MAGE-A family of CT antigens Evolutionary rate in constitutive and alternative regions –human-mouse –human SNPs Alternative splicing and protein structure

13 Fruit fly and mosquito Technically more difficult than human-mouse: –incomplete genomes –difficulties in alignment, especially at gene termini –changes in exon-intron structure irrespective of alternative splicing (~4.7 introns per gene in Drosophila vs. ~3.5 introns per gene in Anopheles)

14 Methods Pro-Frame: Align Dme protein isoforms to Dps and Aga genes coding segments: regions in Dme genes between Dme intron shadows We follow the fate of Dme exons and coding segments in Dps and Aga genomes slices: regions between all exon-exon junctions (intron shadows) from all three genomes (Dme, Dps, Aga) mapped to Dme isoforms slice is conserved if it aligns with ≥35% identity
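The slice-conservation criterion can be illustrated with a small identity check on a pairwise alignment. This is a sketch under stated assumptions, not the authors' procedure: the slide does not say how identity is normalized, so identity is computed here over the columns where the query (Dme) has a residue, and the threshold is taken as "at least 35%". The function names are hypothetical.

```python
def percent_identity(query: str, target: str, gap: str = "-") -> float:
    """Fraction of query residues matched by an identical target residue.

    Both strings are rows of one pairwise alignment and must have equal
    length. Columns where the query has a gap are ignored (assumption).
    """
    if len(query) != len(target):
        raise ValueError("aligned rows must have the same length")
    columns = [(q, t) for q, t in zip(query, target) if q != gap]
    if not columns:
        return 0.0
    matches = sum(q == t for q, t in columns)
    return matches / len(columns)


def slice_conserved(query: str, target: str, threshold: float = 0.35) -> bool:
    """A slice counts as conserved if identity reaches the threshold."""
    return percent_identity(query, target) >= threshold
```

For example, `slice_conserved("MKTAYI", "M-----")` is false because only one of six query residues is matched.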

15 Conservation of coding segments
                                            constitutive segments   alternative segments
  D. melanogaster – D. pseudoobscura              97%                  75-80%
  D. melanogaster – Anopheles gambiae             77%                  ~45%

16 Conservation of D. melanogaster elementary alternatives in D. pseudoobscura genes blue – exact green – divided exons yellow – joined exons orange – mixed red – non-conserved retained introns are the least conserved mutually exclusive exons are as conserved as constitutive exons

17 Conservation of D. melanogaster elementary alternatives in Anopheles gambiae genes blue – exact green – divided exons yellow – joined exons orange – mixed red – non-conserved ~30% joined, ~10% divided exons (fewer introns in Aga) mutually exclusive exons are conserved exactly cassette exons are the least conserved

18 CG1517: cassette exon in Drosophila, alternative acceptor site in Anopheles Dme, Dps Aga

19 CG31536: cassette exon in Drosophila, shorter cassette exon and alternative donor site in Anopheles Dme, Dps Aga

20 CG1587: alternative acceptor site in Drosophila, candidate retained intron in intronless gene of Anopheles Dme Aga Dps

21 Evolution of alternative exon-intron structure –human-mouse –Drosophila and Anopheles Evolution of alternative splicing sites: MAGE-A family of CT antigens Evolutionary rate in constitutive and alternative regions –human-mouse –human SNPs Alternative splicing and protein structure

22 Alternative splicing in a multigene family: the MAGEA family of cancer/testis specific antigens A locus at the X chromosome containing eleven recently duplicated genes: two subfamilies of four genes each and three single genes Retrogene: one protein-coding exon, multiple different 5’-UTR exons Mutations create new splicing sites or disrupt existing sites

23 Birth of donor sites (new GT in alternative initial exon 5)

24 Birth of an acceptor site (new AG and polyY tract in MAGEA8-specific cassette exon 3)

25 Birth of an alternative donor site (enhanced match to the consensus (AG) in cassette exon 2)

26 Birth of an alternative acceptor site (enhanced polyY tract in cassette exon 4)

27 Evolution of alternative exon-intron structure –human-mouse –Drosophila and Anopheles Evolution of alternative splicing sites: MAGE-A family of CT antigens Evolutionary rate in constitutive and alternative regions –human-mouse –human SNPs Alternative splicing and protein structure

28 Concatenates of constitutive and alternative regions in all genes: different evolutionary rates Columns (left-to-right) – (1) constitutive regions; (2–4) alternative regions: N-end, internal, C-end Relatively more non-synonymous substitutions in alternative regions (higher dN/dS ratio) Less amino acid identity in alternative regions

29 Individual genes: the ratio of non-synonymous to synonymous substitution rates dN/dS tends to be larger in alternative regions (vertical axis) than in constitutive regions (horizontal axis)
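To make the dN/dS comparison concrete, here is a minimal sketch of how the ratio can be computed from difference and site counts with a Jukes-Cantor correction, in the style of the Nei-Gojobori method. The slides do not state which estimator was used, so this choice is an assumption, and the counts in the usage lines are invented purely for illustration.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for a proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def dn_ds(nonsyn_diffs: float, nonsyn_sites: float,
          syn_diffs: float, syn_sites: float) -> float:
    """dN/dS from counts of differences and of potential sites."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0.0:
        raise ValueError("dS is zero; the ratio is undefined")
    return dn / ds


# Hypothetical counts, not data from the presentation.
constitutive = dn_ds(nonsyn_diffs=30, nonsyn_sites=700, syn_diffs=60, syn_sites=300)
alternative = dn_ds(nonsyn_diffs=20, nonsyn_sites=230, syn_diffs=20, syn_sites=100)
print(f"constitutive dN/dS = {constitutive:.2f}, alternative dN/dS = {alternative:.2f}")
```

A gene would fall above the diagonal of the scatter plot described on this slide when its alternative-region ratio exceeds its constitutive-region ratio, as in these made-up numbers.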

30 [Figure: distributions of dN/dS (con) – dN/dS (alt); panels: N-terminal regions, complete genes, internal regions, C-terminal regions]

31 Evolution of alternative exon-intron structure –human-mouse –Drosophila and Anopheles Evolution of alternative splicing sites: MAGE-A family of CT antigens Evolutionary rate in constitutive and alternative regions –human-mouse –human SNPs Alternative splicing and protein structure

32 Na/Ns (alternative) > Na/Ns (constitutive) for all evidence levels

33 Evolution of alternative exon-intron structure –human-mouse –Drosophila and Anopheles Evolution of alternative splicing sites: MAGE-A family of CT antigens Evolutionary rate in constitutive and alternative regions –human-mouse –human SNPs Alternative splicing and protein structure

34 Alternative splicing avoids disrupting domains (and non-domain units) Control: fix the domain structure; randomly place alternative regions
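The control described here (fix the domain structure, place alternative regions at random) can be sketched as a small Monte Carlo simulation. This is an illustration under assumptions, not the authors' code: a region is taken to "disrupt" a domain when it overlaps the domain without containing it entirely, placement is uniform along the protein, and all names and coordinates are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in residue coordinates


def disrupts(region: Interval, domains: Sequence[Interval]) -> bool:
    """True if the region overlaps some domain without covering it whole."""
    r_start, r_end = region
    for d_start, d_end in domains:
        overlaps = r_start < d_end and d_start < r_end
        contains = r_start <= d_start and d_end <= r_end
        if overlaps and not contains:
            return True
    return False


def expected_disruption_rate(protein_len: int, domains: Sequence[Interval],
                             region_lengths: List[int],
                             trials: int = 10_000, seed: int = 0) -> float:
    """Share of randomly placed regions that disrupt a domain."""
    if any(length > protein_len for length in region_lengths):
        raise ValueError("region longer than the protein")
    rng = random.Random(seed)
    hits = 0
    total = 0
    for _ in range(trials):
        for length in region_lengths:
            start = rng.randint(0, protein_len - length)
            hits += disrupts((start, start + length), domains)
            total += 1
    return hits / total
```

The observed share of real alternative regions that disrupt domains would then be compared against this expected rate; the slide's claim corresponds to the observed share being lower.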

35 … and this is not simply a consequence of the (disputed) exon-domain correlation

36 Positive selection towards domain shuffling (not simply avoidance of disrupting domains)

37 Short (<50 aa) alternative splicing events within domains target protein functional sites [Figure: expected vs. observed; categories: Prosite patterns unaffected, Prosite patterns affected, FT positions unaffected, FT positions affected]

38 An attempt at integration AS is often young (as opposed to degenerating) young AS isoforms are often minor and tissue-specific … but still functional –although unique isoforms may be the result of aberrant splicing AS often arises from duplication of exons … or point mutations creating splicing sites … or intron insertions AS regions show evidence for positive selection –excess non-synonymous and damaging SNPs –excess non-synonymous codon substitutions AS tends to shuffle exons and target functional sites in proteins Thus AS may serve as a testing ground for new functions without sacrificing old ones

39 Acknowledgements Discussions –Vsevolod Makeev (GosNIIGenetika) –Eugene Koonin (NCBI) –Igor Rogozin (NCBI) –Dmitry Petrov (Stanford) –Dmitry Frishman (GSF, TUM) Data –King Jordan (NCBI) Support –Ludwig Institute of Cancer Research –Howard Hughes Medical Institute –Russian Academy of Sciences (program “Molecular and Cellular Biology”) –Russian Fund of Basic Research

40 Authors Andrei Mironov (Moscow State University) – spliced alignment Ramil Nurtdinov (Moscow State University) – human/mouse, data Irena Artamonova (GSF/MIPS) – human/mouse, MAGE-A Dmitry Malko (GosNIIGenetika, Moscow) – mosquito/drosophila Ekaterina Ermakova (Moscow State University) – evolution of alternative/constitutive regions Vasily Ramensky (Institute of Molecular Biology, Moscow) – SNPs Shamil Sunyaev (EMBL, now Harvard University Medical School) – protein structure Eugenia Kriventseva (EBI, now EMBL) – protein structure

