Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06.

1 Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06

2 What is alternative splicing? The first result of transcription is “pre-mRNA”. This undergoes “splicing”: introns are excised and exons remain, forming mRNA. The splicing process may join different combinations of exons, leading to different mRNAs and different proteins. This is alternative splicing.
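A toy illustration of the idea above, assuming a made-up pre-mRNA string in which exons are upper case and introns lower case. DNA letters are used for simplicity; the sequence, the coordinates and the `splice` helper are all hypothetical.

```python
def splice(pre_mrna, exons, include=None):
    """Join the chosen exons (half-open [start, end) coordinates) into an mRNA.

    `include` lists the indices of the exons to keep; None keeps all of them.
    """
    chosen = exons if include is None else [exons[i] for i in include]
    return "".join(pre_mrna[start:end] for start, end in chosen)


# Hypothetical pre-mRNA: three exons separated by two introns (gt...ag).
pre_mrna = "AAAgtcccagCCCgtttagGGG"
exons = [(0, 3), (10, 13), (19, 22)]

full_length = splice(pre_mrna, exons)              # all three exons
exon_skipped = splice(pre_mrna, exons, [0, 2])     # middle exon left out
```

The two results are different mRNAs from the same gene, which is the sense in which one gene can yield several proteins.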

3 Alternative splicing: An important regulatory mechanism for modulating gene and protein content in the cell. Large-scale genomic data today suggest that as many as 60% of human genes undergo alternative splicing.

4 Significance: The number of human genes has recently been estimated at about 20-25 K, not significantly greater than in much less complex organisms. Alternative splicing is a potential explanation of how a large variety of proteins can be achieved with a small number of genes. Errors in the splicing mechanism are implicated in diseases such as cancers.

5 What happens in alternative splicing? Different combinations of exons within a gene are spliced from the RNA precursor and included in the mRNA. The combination depends on tissue type, developmental stage, disease, etc., so different proteins are produced in these different conditions. The different types of alternative splicing are on the next slide.

6 [Figure: http://bib.oxfordjournals.org/cgi/content/full/7/1/55/F1] Types of alternative splicing shown: exon inclusion/exclusion; alternative 5’ exon; alternative 3’ exon; intron retention; 5’ alternative UTR; 3’ alternative UTR.

7 Bioinformatics of alt. splicing. Two main goals: – Find cases of alt. splicing: what are the different forms (“isoforms”) of a gene? – Find out how alt. splicing is regulated: what are the sequence motifs that control alt. splicing and decide which isoform will be produced?

8 Identification of splice variants: All cells have the same genome, but not all cells have the same “transcriptome” (i.e., set of transcripts). – Different cells may express different (alternative) transcripts of the same gene. The goal of bioinformatics is to find “splice forms”, i.e., which alternative splicing events occur.

9 Identification of splice variants: Direct comparison between sequences of different cDNA isoforms. – Q: What is cDNA? How is it different from a gene’s DNA? – cDNA is “complementary DNA”, obtained by reverse transcription from mRNA. It has no introns. Direct comparison reveals differences between the isoforms, but a difference could be part of an exon, a whole exon, or a set of exons.
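A minimal sketch of direct comparison, assuming two error-free toy cDNA strings and using Python's general-purpose `difflib.SequenceMatcher` as a stand-in for a real sequence aligner. The sequences and the `isoform_differences` helper are hypothetical.

```python
from difflib import SequenceMatcher


def isoform_differences(cdna_a, cdna_b):
    """Return (tag, segment_a, segment_b) for each region where the cDNAs differ."""
    matcher = SequenceMatcher(None, cdna_a, cdna_b, autojunk=False)
    return [
        (tag, cdna_a[i1:i2], cdna_b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


# Toy isoforms: the second one lacks the middle block of the first.
long_isoform = "ATGGCA" + "TTTCTTTCT" + "GAGGTGA"
short_isoform = "ATGGCA" + "GAGGTGA"
differences = isoform_differences(long_isoform, short_isoform)
```

The comparison reports only that a block is missing from one isoform. Without the genomic sequence it cannot tell whether that block is part of an exon, a whole exon, or several exons.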

10 [Figure from Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005] Bioinformatics methods for identifying alternative splicing: direct comparison.

11 Identification of splice variants: Comparison of exon-intron structures (the gene’s architecture). Where do the exon-intron structures come from? – Align cDNA (no introns) with the genomic sequence (with introns). – This gives the intron and exon structure.
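Once a cDNA has been aligned to the genome, the aligned blocks are the exons and the gaps between them are the introns. The sketch below assumes the alignment is already available as a list of half-open genomic intervals; the coordinates and both helper functions are hypothetical.

```python
def introns_from_exons(exon_blocks):
    """Introns are the genomic gaps between consecutive aligned exon blocks."""
    blocks = sorted(exon_blocks)
    return [
        (left_end, right_start)
        for (_, left_end), (right_start, _) in zip(blocks, blocks[1:])
    ]


def structure_differences(exons_a, exons_b):
    """Exons present in only one of the two transcripts (exact coordinate match)."""
    a, b = set(exons_a), set(exons_b)
    return sorted(a - b), sorted(b - a)


# Two toy transcripts of the same gene, as genomic exon coordinates.
transcript_a = [(100, 200), (300, 400), (500, 600)]
transcript_b = [(100, 200), (500, 600)]
only_in_a, only_in_b = structure_differences(transcript_a, transcript_b)
```

Because the comparison is done in genomic coordinates, the difference can be attributed to a specific exon, here the skipped exon at (300, 400).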

12 [Figure from Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005] Bioinformatics methods for identifying alternative splicing: comparison of exon-intron structures.

13 Identification of splice variants: Alignment tools. Align the cDNA sequence to the genomic sequence. Why shouldn’t this be a perfect match with gaps (introns)? – Sequencing errors, polymorphisms, etc. Special-purpose alignment programs exist for this task.

14 Identifying full-length alt. spliced transcripts: The previous methods identify parts of an alt. spliced transcript. It is much more difficult to identify full-length alternatively spliced transcripts. Methods for this include “gene indices”.

15 Gene indices: Compare all EST (expressed sequence tag) sequences against one another. Identify significant overlaps. Group and assemble sequences with compatible overlaps into clusters.
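The three steps can be sketched as follows. This is a simplified illustration, not how any particular gene index is built: it assumes error-free toy reads, treats an exact suffix-prefix match of at least `min_len` bases as a "significant overlap", and groups reads with a union-find structure. All names and sequences are hypothetical.

```python
from itertools import combinations


def overlap_length(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (0 if below min_len)."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def cluster_ests(ests, min_len=5):
    """Group ESTs that are linked by overlaps; returns lists of EST indices."""
    parent = list(range(len(ests)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # All-against-all comparison: quadratic in the number of ESTs.
    for i, j in combinations(range(len(ests)), 2):
        if overlap_length(ests[i], ests[j], min_len) or overlap_length(ests[j], ests[i], min_len):
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(ests)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


toy_ests = ["ACGTACGGA", "ACGGATTCA", "TTTTTGGGGG", "GATTCAGGC"]
clusters = cluster_ests(toy_ests)
```

Real EST data would need inexact overlap detection and handling of reads contained in other reads, which this sketch leaves out.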

16 Gene indices

17 Problems with gene indices. Overclustering: paralogs may get clustered together. – What are paralogs? – Related but distinct genes in the same species. Underclustering: occurs if the number of ESTs is not sufficient. Computationally expensive: – Quadratic time complexity.

18 Splice graphs. Nodes: exons. Edges: introns. Gene: directed acyclic graph (DAG). Each path in this DAG is an alternative transcript.
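A small sketch of a splice graph as an adjacency list, with a depth-first enumeration of its paths. The exon names, the toy graph, and the choice of which exons may start or end a transcript are assumptions made for illustration.

```python
def enumerate_transcripts(graph, sources, sinks):
    """List every path from a source exon to a sink exon in a splice-graph DAG."""
    transcripts = []

    def extend(path):
        node = path[-1]
        if node in sinks:
            transcripts.append(tuple(path))
        for next_exon in graph.get(node, ()):
            extend(path + [next_exon])

    for source in sources:
        extend([source])
    return transcripts


# Toy gene: exon E2 can be included or skipped.
splice_graph = {"E1": ["E2", "E3"], "E2": ["E3"], "E3": []}
transcripts = enumerate_transcripts(splice_graph, sources=["E1"], sinks={"E3"})
```

Each returned tuple is one candidate transcript. The recursion terminates because the graph is acyclic.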

19 Splice graph

20 Splice graphs: Combinatorially generate all possible alt. transcripts. But not all such transcripts will actually be present. Scores for candidate transcripts are needed to differentiate the biologically relevant ones from the artifactual ones.
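The slide does not say how candidate transcripts are scored. One hypothetical choice, shown here only to make the idea concrete, is to score a candidate by its weakest exon-exon junction, where each junction's support is the number of ESTs observed to span it. The counts and function names below are invented.

```python
def junction_support_score(transcript, junction_counts):
    """Score a path by its least-supported junction (0 if a junction was never seen)."""
    junctions = list(zip(transcript, transcript[1:]))
    if not junctions:
        return 0
    return min(junction_counts.get(junction, 0) for junction in junctions)


def rank_transcripts(transcripts, junction_counts, min_score=1):
    """Keep candidates scoring at least min_score, best first."""
    scored = [(junction_support_score(t, junction_counts), t) for t in transcripts]
    return sorted((item for item in scored if item[0] >= min_score), reverse=True)


candidates = [("E1", "E2", "E3"), ("E1", "E3"), ("E1", "E2", "E4")]
est_support = {("E1", "E2"): 12, ("E2", "E3"): 9, ("E1", "E3"): 2}
ranked = rank_transcripts(candidates, est_support)
```

Under this assumed score, a path that uses a junction with no supporting EST is discarded as a likely artifact of combinatorial enumeration.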

21 Splice variants from microarray data: Affymetrix GeneChip technology uses 22 probes collected from exons or straddling exon boundaries. When an exon is alternatively spliced, the expression level of its probes will differ between experiments.
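One way to make "different in different experiments" concrete is to compare each exon's share of the gene-level signal between two conditions, so that a change in overall gene expression is not mistaken for a splicing change. The sketch below assumes one positive summarized intensity per exon per condition; the values, the threshold and the function names are hypothetical, and this is not presented as the method used on the slide.

```python
import math


def relative_exon_change(cond_a, cond_b):
    """log2 change in each exon's share of the gene signal, condition B versus A."""
    gene_a = sum(cond_a.values()) / len(cond_a)
    gene_b = sum(cond_b.values()) / len(cond_b)
    return {
        exon: math.log2((cond_b[exon] / gene_b) / (cond_a[exon] / gene_a))
        for exon in cond_a
    }


def candidate_exons(cond_a, cond_b, threshold=1.0):
    """Exons whose relative level changes by more than `threshold` log2 units."""
    change = relative_exon_change(cond_a, cond_b)
    return sorted(exon for exon, value in change.items() if abs(value) > threshold)


tissue_a = {"E1": 100.0, "E2": 100.0, "E3": 100.0}
tissue_b = {"E1": 200.0, "E2": 20.0, "E3": 200.0}
flagged = candidate_exons(tissue_a, tissue_b)
```

In the toy data the gene is more highly expressed in the second tissue, but only E2 drops relative to the rest of the gene, so only E2 is flagged.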

22 [Figure from Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005] Bioinformatics methods for identifying alternative splicing: splice variants from microarray data.

23 Part 2: Regulation of alternative splicing

24 Biological mechanism: Splicing of pre-mRNA is a complex cellular process. The “spliceosome” is a complex of several molecules that assembles onto each intron and catalyzes its excision. Splice sites (the 5’ or donor splice site and the 3’ or acceptor splice site) play a major role in splicing. Additional sites in introns and exons, apart from the splice signals, also contribute to splicing.

25 Biological mechanism: Cis-regulatory elements (again!). They promote (“splicing enhancers”) or repress (“splicing silencers”) the inclusion of an exon in the mRNA. They can be located in exons or introns.

26 Bioinformatics methods. Goal: find the cis-regulatory elements that mediate splicing (alternative splicing). Early work: find consensus sequences (motifs) of splicing enhancers. More advanced work: position weight matrices (PWMs).

27 [Figure from Florea, L. Brief Bioinform 2006 7:55-69; doi:10.1093/bib/bbk005] Bioinformatics representations of splicing regulatory motifs: (a) consensus sequence and (b) position weight matrix (PWM).
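To show how the two representations relate, the sketch below builds both from the same set of aligned sites. It assumes a log-odds PWM against a uniform background with a pseudocount of 1, which is one common convention rather than the one used in the figure; the sites and the scanned sequence are made up.

```python
import math


def build_pwm(sites, pseudocount=1.0, alphabet="ACGT"):
    """Log2-odds PWM (one dict per position) from equal-length aligned sites."""
    total = len(sites) + pseudocount * len(alphabet)
    background = 1.0 / len(alphabet)
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in alphabet
        })
    return pwm


def consensus(pwm):
    """Consensus sequence: the highest-scoring base at each position."""
    return "".join(max(column, key=column.get) for column in pwm)


def best_hit(pwm, sequence):
    """(score, start) of the best-scoring window of the sequence."""
    width = len(pwm)
    return max(
        (sum(pwm[k][sequence[i + k]] for k in range(width)), i)
        for i in range(len(sequence) - width + 1)
    )


toy_sites = ["GAAGAA", "GAAGAA", "GACGAA", "GAAGAT"]
pwm = build_pwm(toy_sites)
score, start = best_hit(pwm, "TTTGAAGAATTT")
```

The consensus keeps only the most frequent base per position, while the PWM also records how strongly each position prefers it, so near-matches still receive a graded score.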

28 Motif finding (again!): Statistical overrepresentation. Find k-mers that occur more often in one class of sequences than in another; the difference should be statistically significant. Exonic splicing enhancers (ESEs) are more likely to occur in exons than in introns; hence find 6-mers (k=6) statistically overrepresented in exons compared to introns. Calculate the z-score of the count: – (count - mean)/(standard deviation) – Homework 1.
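The slide gives the z-score formula but not the null model behind the mean and standard deviation. The sketch below assumes a binomial null: each k-mer position in the exons is an independent trial whose success probability is the k-mer's (pseudocount-smoothed) frequency in introns. That model, the pseudocount and the toy sequences are assumptions.

```python
import math
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping k-mer in a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def overrepresentation_z(exons, introns, k=6, pseudocount=1.0):
    """z = (count - mean) / sd for each exonic k-mer under a binomial null."""
    exon_counts = kmer_counts(exons, k)
    intron_counts = kmer_counts(introns, k)
    n_exon = sum(exon_counts.values())
    n_intron = sum(intron_counts.values())
    z_scores = {}
    for kmer, count in exon_counts.items():
        # Expected frequency estimated from introns, smoothed over all 4**k k-mers.
        p = (intron_counts[kmer] + pseudocount) / (n_intron + pseudocount * 4 ** k)
        mean = n_exon * p
        sd = math.sqrt(n_exon * p * (1 - p))
        z_scores[kmer] = (count - mean) / sd
    return z_scores


# Toy data with k=3 to keep the example small.
z = overrepresentation_z(["GAAGAAGAAGAA"], ["TTTTTTTTTTTTTT"], k=3)
```

k-mers with large positive z-scores are the candidates for exonic splicing enhancers. With real data, overlapping occurrences are not independent, so the binomial standard deviation is only an approximation.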

29 Motif finding: Other standard motif-finding approaches have also been adopted: – MEME and Gibbs sampling. Comparative genomics: – Find conserved sites in introns. – Find conserved sites in exons. This has to be done carefully, because exons are already under selective pressure.

30 Summary: Alternative splicing is very important. Bioinformatics for finding alternatively spliced forms. Bioinformatics for finding regulatory mechanisms.

