Alternative Splicing (a review by Liliana Florea, 2005) CS 498 SS Saurabh Sinha 11/30/06.


What is alternative splicing?
The first product of transcription is “pre-mRNA”.
Pre-mRNA undergoes “splicing”: introns are excised and the remaining exons are joined to form mRNA.
Splicing may join different combinations of exons, producing different mRNAs and therefore different proteins.
This is alternative splicing.

Alternative splicing
An important regulatory mechanism for modulating gene and protein content in the cell.
Large-scale genomic data suggest that as many as 60% of human genes undergo alternative splicing.

Significance
The number of human genes has recently been estimated to be about K.
This is not significantly greater than in much less complex organisms.
Alternative splicing is a potential explanation of how a large variety of proteins can be achieved with a small number of genes.
Errors in the splicing mechanism are implicated in diseases such as cancers.

What happens in alternative splicing?
Different combinations of exons within a gene are spliced from the RNA precursor and included in the mRNA.
The combination depends on tissue type, developmental stage, disease state, etc.
Thus different proteins are produced under these different conditions.
The different types of alternative splicing are shown on the next slide.
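The idea that different exon combinations yield different mRNAs can be sketched in a few lines. This is a toy illustration only: the sequence, the exon coordinates, and the function name `splice` are all made up for the example and are not taken from the slides.

```python
def splice(pre_mrna, exons):
    """Join the chosen exons (half-open [start, end) intervals) into an mRNA."""
    return "".join(pre_mrna[start:end] for start, end in exons)

# Hypothetical pre-mRNA: exons in upper case, introns in lower case.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUAAA" + "guaagcccacag" + "GGGUGA"
exon1, exon2, exon3 = (0, 6), (18, 24), (36, 42)

full_isoform = splice(pre_mrna, [exon1, exon2, exon3])  # all exons included
skip_isoform = splice(pre_mrna, [exon1, exon3])         # exon 2 excluded

print(full_isoform)  # AUGGCCUUUAAAGGGUGA
print(skip_isoform)  # AUGGCCGGGUGA
```

The same precursor gives two different transcripts depending on which exons are kept, which is the exon inclusion/exclusion case.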

Types of alternative splicing: exon inclusion/exclusion, alternative 5’ exon, alternative 3’ exon, intron retention, 5’ alternative UTR, 3’ alternative UTR

Bioinformatics of alt. splicing
Two main goals:
– Find cases of alt. splicing. What are the different forms (“isoforms”) of a gene?
– Find out how alt. splicing is regulated. What are the sequence motifs that control alt. splicing and decide which isoform will be produced?

Identification of splice variants
All cells have the same genome, but not all cells have the same “transcriptome” (i.e., set of transcripts).
– Different cells may express different (alternative) transcripts of the same gene.
The goal of bioinformatics here is to find “splice forms”, i.e., which alternative splicing events occur.

Identification of splice variants
Direct comparison between sequences of different cDNA isoforms.
– Q: What is cDNA? How is it different from a gene’s DNA?
– cDNA is “complementary DNA”, obtained by reverse transcription from mRNA. It has no introns.
Direct comparison reveals differences between the isoforms.
But a difference could be part of an exon, a whole exon, or a set of exons.
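A minimal sketch of direct comparison, assuming the two cDNA isoforms differ by one contiguous segment: strip the shared prefix and shared suffix, and what remains in the longer sequence is the difference. The function name and the sequences are hypothetical; real tools use full alignment rather than this shortcut.

```python
def differing_region(longer, shorter):
    """Return (start, end) of the segment of `longer` absent from `shorter`.

    Assumes `shorter` equals `longer` with one contiguous segment removed.
    """
    if len(longer) < len(shorter):
        longer, shorter = shorter, longer
    prefix = 0
    while prefix < len(shorter) and longer[prefix] == shorter[prefix]:
        prefix += 1
    # Cap the suffix so that prefix and suffix never overlap in `shorter`.
    max_suffix = len(shorter) - prefix
    suffix = 0
    while suffix < max_suffix and longer[-1 - suffix] == shorter[-1 - suffix]:
        suffix += 1
    return prefix, len(longer) - suffix

# Hypothetical isoforms: the second lacks the middle segment.
isoform_a = "ATGGCC" + "TTTAAACCC" + "GGGTGA"
isoform_b = "ATGGCC" + "GGGTGA"

start, end = differing_region(isoform_a, isoform_b)
print(start, end, isoform_a[start:end])  # 6 15 TTTAAACCC
```

The result is only a sequence difference. Deciding whether it is part of an exon, a whole exon, or several exons needs the exon-intron structure, which cDNA alone does not carry.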

Figure: Bioinformatics methods for identifying alternative splicing: direct comparison. Source: Florea, L. Brief Bioinform :55-69; doi: /bib/bbk005. Copyright restrictions may apply.

Identification of splice variants
Comparison of exon-intron structures (the gene’s architecture).
Where do the exon-intron structures come from?
– Align cDNA (no introns) with the genomic sequence (with introns).
– The alignment gives the intron and exon structure.
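Once a cDNA has been aligned to the genome, its exons are the aligned blocks and its introns are the gaps between them. The sketch below assumes the alignment step is already done and works only on block coordinates; the coordinates and function names are invented for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open genomic interval [start, end)

def introns_from_exons(exons: List[Interval]) -> List[Interval]:
    """Introns are the genomic gaps between consecutive aligned exon blocks."""
    ordered = sorted(exons)
    return [(a_end, b_start)
            for (_, a_end), (b_start, _) in zip(ordered, ordered[1:])]

def skipped_exons(isoform_a: List[Interval],
                  isoform_b: List[Interval]) -> List[Interval]:
    """Exons of isoform_a that lie entirely inside an intron of isoform_b."""
    introns_b = introns_from_exons(isoform_b)
    return [(s, e) for (s, e) in isoform_a
            if any(i_start <= s and e <= i_end for i_start, i_end in introns_b)]

# Hypothetical exon coordinates for two transcripts of the same gene.
full_form = [(100, 200), (300, 380), (500, 650)]
short_form = [(100, 200), (500, 650)]

print(introns_from_exons(full_form))         # [(200, 300), (380, 500)]
print(skipped_exons(full_form, short_form))  # [(300, 380)]
```

Working in genomic coordinates makes the event type explicit: here the difference is a whole exon that one isoform includes and the other skips.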

Figure: Bioinformatics methods for identifying alternative splicing: comparison of exon-intron structures. Source: Florea, L. Brief Bioinform :55-69; doi: /bib/bbk005.

Identification of splice variants
Alignment tools: align the cDNA sequence to the genomic sequence.
Why isn’t this a perfect match with gaps (introns)?
– Sequencing errors, polymorphisms, etc.
Special-purpose alignment programs exist for this task.

Identifying full-length alt. spliced transcripts
The previous methods identify only parts of an alt. spliced transcript.
It is much more difficult to identify full-length alternatively spliced transcripts.
Methods for this include “gene indices”.

Gene indices
Compare all EST sequences against one another.
Identify significant overlaps.
Group and assemble sequences with compatible overlaps into clusters.
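The three steps can be sketched as an all-against-all comparison followed by union-find grouping. This is a simplified illustration, not how any particular gene index is built: it uses exact suffix-prefix overlaps instead of error-tolerant alignment, and the sequences, the threshold `min_overlap`, and the function names are assumptions.

```python
from itertools import combinations

def overlap_len(a, b):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def cluster_ests(ests, min_overlap=5):
    """Group ESTs (by index) that are linked by sufficiently long overlaps."""
    parent = list(range(len(ests)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    # All-against-all comparison: quadratic in the number of ESTs.
    for i, j in combinations(range(len(ests)), 2):
        best = max(overlap_len(ests[i], ests[j]), overlap_len(ests[j], ests[i]))
        if best >= min_overlap:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(ests)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# Hypothetical ESTs: the first three chain together, the fourth is unrelated.
ests = ["ACGTACGGAT", "CGGATTTGCA", "TTGCAGGCTA", "GGGGGCCCCC"]
print(cluster_ests(ests))  # [[0, 1, 2], [3]]
```

The nested pairwise loop is where the quadratic cost comes from.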

Gene indices

Problems with gene indices
Overclustering: paralogs may get clustered together.
– What are paralogs?
– Related but distinct genes in the same species.
Underclustering: occurs if the number of ESTs is not sufficient.
Computationally expensive:
– Quadratic time complexity.

Splice graphs
Nodes: exons
Edges: introns
Gene: directed acyclic graph (DAG)
Each path in this DAG is an alternative transcript.

Splice graph

Splice graphs
Combinatorially generate all possible alt. transcripts.
But not all such transcripts will actually be present.
Candidate transcripts need scores to separate the biologically relevant ones from the artifactual ones.
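A small sketch of a splice graph and of enumerating its paths. The graph, the per-intron support counts, and the scoring rule (the weakest edge on the path) are all hypothetical choices made for illustration; the slides do not specify a scoring scheme.

```python
def enumerate_transcripts(graph, start, end):
    """List every path from `start` to `end` in a DAG given as adjacency lists."""
    paths = []

    def walk(node, path):
        if node == end:
            paths.append(path)
            return
        for nxt in graph.get(node, []):
            walk(nxt, path + [nxt])

    walk(start, [start])
    return paths

def path_score(path, edge_support):
    """Hypothetical score: the support of the least-supported intron on the path."""
    return min(edge_support[(a, b)] for a, b in zip(path, path[1:]))

# Hypothetical gene with four exons; edges are observed introns.
splice_graph = {"E1": ["E2", "E3"], "E2": ["E3", "E4"], "E3": ["E4"], "E4": []}
# Hypothetical number of ESTs supporting each intron.
edge_support = {("E1", "E2"): 12, ("E2", "E3"): 9, ("E3", "E4"): 15,
                ("E1", "E3"): 4, ("E2", "E4"): 1}

candidates = enumerate_transcripts(splice_graph, "E1", "E4")
ranked = sorted(candidates, key=lambda p: path_score(p, edge_support), reverse=True)
for path in ranked:
    print(path_score(path, edge_support), "-".join(path))
```

The number of paths can grow exponentially with the number of alternative events, which is why some form of scoring or filtering is needed.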

Splice variants from microarray data
Affymetrix GeneChip technology uses 22 probes collected from exons or straddling exon boundaries.
When an exon is alternatively spliced, the expression level of its probes will differ between experiments.
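One way to read this: if the whole gene changes expression, all of its exon probes shift together, while an alternatively spliced exon shifts differently from the rest. The sketch below flags exons whose log-ratio between two experiments deviates from the gene's median log-ratio. The intensities, the threshold, and the function name are assumptions for illustration and not an Affymetrix analysis procedure.

```python
from math import log2
from statistics import median

def flag_alternative_exons(probe_signal, threshold=1.0):
    """Flag exons whose log2 ratio deviates from the gene-level median.

    probe_signal maps exon name -> (intensity in experiment A, intensity in B).
    """
    ratios = {exon: log2(b / a) for exon, (a, b) in probe_signal.items()}
    gene_level = median(ratios.values())
    return sorted(exon for exon, r in ratios.items()
                  if abs(r - gene_level) > threshold)

# Hypothetical probe intensities for one gene in two tissues.
probe_signal = {
    "E1": (1000.0, 1100.0),
    "E2": (900.0, 150.0),   # drops sharply in the second tissue
    "E3": (1200.0, 1250.0),
    "E4": (800.0, 820.0),
}
print(flag_alternative_exons(probe_signal))  # ['E2']
```

Comparing each exon against the gene-level median, rather than against zero, keeps a uniform change in gene expression from being mistaken for a splicing event.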

Figure: Bioinformatics methods for identifying alternative splicing: splice variants from microarray data. Source: Florea, L. Brief Bioinform :55-69; doi: /bib/bbk005.

Part 2: Regulation of alternative splicing

Biological mechanism
Splicing of pre-mRNA is a complex cellular process.
The “spliceosome” is a complex of several molecules that assembles onto each intron and catalyzes its excision.
Splice sites (the 5’ or donor splice site and the 3’ or acceptor splice site) play a major role in splicing.
Additional sites in introns and exons, apart from the splice signals, also contribute to splicing.

Biological mechanism
Cis-regulatory elements (again!)
They promote (“splicing enhancers”) or repress (“splicing silencers”) the inclusion of an exon in the mRNA.
They can be located in exons or introns.

Bioinformatics methods
Goal: find the cis-regulatory elements that mediate splicing (alternative splicing).
Early work: find consensus sequences (motifs) of splicing enhancers.
More advanced work: position weight matrices (PWMs).
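The two representations can be contrasted on a small example. The sketch below builds both a consensus string and a log-odds PWM from a handful of aligned sites, then scans a sequence for the best PWM hit. The sites, the pseudocount, and the uniform background of 0.25 are assumptions chosen for illustration.

```python
from math import log2

BASES = "ACGT"

def consensus(sites):
    """Most frequent base in each column of the aligned sites."""
    return "".join(max(BASES, key=col.count) for col in zip(*sites))

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds PWM: one dict (base -> score) per column."""
    n = len(sites)
    return [{b: log2(((col.count(b) + pseudocount) / (n + 4 * pseudocount))
                     / background)
             for b in BASES}
            for col in zip(*sites)]

def score(pwm, window):
    """Sum of per-position log-odds scores for a window of the PWM's width."""
    return sum(column[base] for column, base in zip(pwm, window))

def best_hit(pwm, seq):
    """(position, score) of the highest-scoring window in `seq`."""
    width = len(pwm)
    hits = [(i, score(pwm, seq[i:i + width]))
            for i in range(len(seq) - width + 1)]
    return max(hits, key=lambda hit: hit[1])

# Hypothetical aligned enhancer sites.
sites = ["GAAGAA", "GAAGAT", "GACGAA", "GAAGAA"]
pwm = build_pwm(sites)
print(consensus(sites))               # GAAGAA
print(best_hit(pwm, "TTCGAAGAACTT"))  # best window starts at position 3
```

A consensus keeps only the majority base per column, while the PWM retains how strongly each position prefers each base, so near-matches receive a graded score instead of a simple mismatch count.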

Figure: Bioinformatics representations of splicing regulatory motifs: (a) consensus sequence and (b) position weight matrix (PWM). Source: Florea, L. Brief Bioinform :55-69; doi: /bib/bbk005.

Motif finding (again!)
Statistical overrepresentation: find k-mers that occur more often in one class of sequences than in another; the difference should be statistically significant.
Exonic splicing enhancers (ESEs) are more likely to occur in exons than in introns; hence find 6-mers (k = 6) that are statistically overrepresented in exons compared to introns.
Calculate the z-score of the count:
– (count - mean) / (standard deviation)
– Homework 1
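A sketch of the z-score calculation under one possible null model: each k-mer's background frequency is estimated from introns, and its exon count is treated as binomial, which gives the mean and standard deviation. The binomial model, the pseudocount smoothing, the toy sequences, and the use of k = 3 instead of 6 (to keep the example small) are assumptions and not necessarily what the homework specifies.

```python
from collections import Counter
from math import sqrt

def kmer_counts(seqs, k):
    """Count every overlapping k-mer across a list of sequences."""
    counts = Counter()
    for s in seqs:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts

def kmer_zscores(exons, introns, k=6, pseudocount=1.0):
    """z = (count - mean) / sd for each k-mer seen in exons."""
    fg = kmer_counts(exons, k)
    bg = kmer_counts(introns, k)
    n_fg = sum(fg.values())
    n_bg = sum(bg.values())
    z = {}
    for kmer, count in fg.items():
        # Smoothed background frequency of this k-mer in introns.
        p = (bg[kmer] + pseudocount) / (n_bg + pseudocount * 4 ** k)
        mean = n_fg * p
        sd = sqrt(n_fg * p * (1 - p))
        z[kmer] = (count - mean) / sd
    return z

# Hypothetical sequences, for illustration only.
exons = ["GAAGAAGAAGAA", "TTGAAGAACC"]
introns = ["TTTTTTTTTTTT", "CTCTCTCTCTCT"]

z = kmer_zscores(exons, introns, k=3)
top = max(z, key=z.get)
print(top, round(z[top], 1))
```

A large positive z-score marks a k-mer that is more frequent in exons than the intron background predicts. With all 4^6 possible 6-mers tested at once, a multiple-testing correction would be needed before calling any of them significant.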

Motif finding
Other standard motif-finding approaches have also been adopted:
– MEME and Gibbs sampling
Comparative genomics:
– Find conserved sites in introns.
– Find conserved sites in exons. This has to be done carefully, because exons are already under selective pressure.
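The comparative-genomics idea can be sketched as a search for stretches that are identical across aligned orthologous sequences. The example assumes the sequences are already aligned to equal length (with "-" for gaps); the species labels, the sequences, and the minimum block length are hypothetical.

```python
def conserved_blocks(aligned, min_len=6):
    """Return half-open (start, end) runs where all aligned sequences agree."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("aligned sequences must have equal length")
    blocks = []
    start = None
    for i in range(length + 1):  # one extra step closes a run at the end
        conserved = (i < length
                     and aligned[0][i] != "-"
                     and all(s[i] == aligned[0][i] for s in aligned))
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                blocks.append((start, i))
            start = None
    return blocks

# Hypothetical aligned intron fragments from three species.
human = "ACGTTGCATGACCA"
mouse = "ACCTTGCATGTCCA"
dog   = "AGGTTGCATGACGA"

blocks = conserved_blocks([human, mouse, dog])
print(blocks, [human[s:e] for s, e in blocks])  # [(3, 10)] ['TTGCATG']
```

In introns, where most positions drift freely, such a block suggests functional constraint. In exons, conservation is expected anyway because of the protein-coding constraint, so the same test is far less informative there.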

Summary
Alternative splicing is very important.
Bioinformatics methods are used to find alternatively spliced forms.
Bioinformatics methods are also used to find the regulatory mechanisms behind them.