Predicting RNA Structure and Function. 1989 2006 2009.


The Ribosome: the protein factory of the cell, made mainly of RNA

Non-coding DNA (98.5% of the human genome): intergenic regions, repetitive elements, promoters, introns, untranslated regions (UTRs)

Some biological functions of ncRNA
–Control of mRNA stability (UTR)
–Control of splicing (snRNP)
–Control of translation (microRNA)
The function of the RNA molecule depends on its folded structure.

Example: control of iron levels by mRNA structure
The Iron Responsive Element (IRE) is a conserved stem-loop in the mRNA, recognized by IRP1 and IRP2.
[Figure: IRE hairpin drawn 5’ to 3’, with conserved loop bases and N–N’ stem pairs]

F: ferritin = iron storage. TR: transferrin receptor = iron uptake.
Low iron: IRP1/2 binds the IRE. On ferritin mRNA, IRE-IRP binding inhibits translation; on TR mRNA, IRE-IRP binding inhibits degradation.
High iron: IRE-IRP binding is off, so ferritin is translated and the transferrin receptor mRNA is degraded.
[Figure: IRP1/2 bound to the IRE near the 5’ end of F mRNA and near the 3’ end of TR mRNA]

RNA structural levels (example: tRNA): secondary structure, tertiary structure

RNA Secondary Structure
The RNA molecule folds on itself, forming a STEM and a LOOP. Example: 5’ GAUCUUGAUC 3’ folds into a hairpin with a four-pair stem and a UU loop.
The base pairing is as follows: G-C, A-U, G-U. Each pair is held together by hydrogen bonds.
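A minimal sketch of the pairing rules above, applied to the example hairpin on this slide. The function name `hairpin_stem` and the idea of passing the loop length explicitly are illustrative assumptions, not part of the lecture; the code simply walks inward from both ends of the sequence and stops at the first pair that is not G-C, A-U, or G-U.

```python
# Allowed base pairs from the slide: G-C, A-U and the G-U wobble pair.
CAN_PAIR = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}

def hairpin_stem(seq, loop_len):
    """Return the stem pairs of a simple hairpin, read from the outside in.

    Assumes the sequence folds back on itself symmetrically around a
    loop of `loop_len` unpaired bases (a simplification for illustration).
    """
    stem_len = (len(seq) - loop_len) // 2
    pairs = []
    for k in range(stem_len):
        five, three = seq[k], seq[len(seq) - 1 - k]
        if (five, three) not in CAN_PAIR:
            break  # the stem ends at the first non-pairing position
        pairs.append((five, three))
    return pairs

if __name__ == "__main__":
    # Example sequence from the slide: 5' GAUCUUGAUC 3' with a UU loop.
    print(hairpin_stem("GAUCUUGAUC", loop_len=2))
```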

RNA secondary structure: short-range interactions
Structural elements: STEM, HAIRPIN LOOP, INTERNAL LOOP, BULGE, DANGLING ENDS
[Figure: example structure drawn 5’ to 3’ with each element labelled]

Long-range interactions of RNA secondary structural elements
–Pseudo-knot
–Kissing hairpins
–Hairpin-bulge contact
These patterns are excluded from the prediction schemes because their computation is too intensive.

Predicting RNA secondary structure: searching for a structure with Minimal Free Energy (MFE)

Free energy model
Free energy of a structure (at fixed temperature and ionic concentration) = sum of loop energies.
The standard model uses experimentally determined thermodynamic parameters.

Why is MFE secondary structure prediction hard?
The MFE structure can be found by calculating the free energy of all possible structures, BUT the number of potential structures grows exponentially with the number, n, of bases.
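To get a feel for that growth, the sketch below counts nested (knot-free) structures on n positions with a standard counting recurrence: the last position is either unpaired or paired with some earlier position k, which splits the problem in two. This is an illustration rather than anything from the lecture. It ignores which bases can actually pair, so it is an upper bound for any real sequence, and the parameter `min_loop` (minimum number of unpaired bases inside a hairpin) is an assumption.

```python
def count_structures(n, min_loop=3):
    """Count nested secondary structures on n positions.

    N(L) = N(L-1) + sum over k of N(k-1) * N(L-k-1),
    where position L pairs with position k and at least `min_loop`
    positions lie between them. Base identities are ignored.
    """
    counts = [1] * (n + 1)  # N(0) = 1: the empty structure
    for length in range(1, n + 1):
        total = counts[length - 1]  # last position unpaired
        for k in range(1, length - min_loop):  # last position pairs with k
            total += counts[k - 1] * counts[length - k - 1]
        counts[length] = total
    return counts[n]

if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, count_structures(n))
```

Enumerating every structure is therefore impractical for anything but short sequences, which is what motivates the dynamic programming approach.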

RNA folding with dynamic programming (Zuker and Stiegler)
W(i,j): MFE structure of the subsequence from i to j

RNA folding with dynamic programming
Assume a function W(i,j) that gives the MFE for the subsequence starting at i and ending at j (i<j).
Define scores; for example, a base pair scores lower (more favorable) than an unpaired position.
Consider 4 possibilities:
–i,j are a base pair, added to the structure for i+1..j-1
–i is unpaired, added to the structure for i+1..j
–j is unpaired, added to the structure for i..j-1
–i,j are paired, but not to each other: the structure splits into two independent parts, i..k and k+1..j
Choose the minimal-energy possibility.
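The four cases can be sketched as a small dynamic program. This is a toy scoring scheme, not the Zuker and Stiegler energy model: every allowed base pair contributes a fixed score of -1 and unpaired bases contribute 0. The names `min_energy`, `pair_score`, and `min_loop` are assumptions made for the illustration.

```python
# Allowed pairs: Watson-Crick plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def min_energy(seq, pair_score=-1.0, min_loop=3):
    """Fill W(i,j) for all subsequences and return W for the whole sequence.

    Toy model: each base pair scores `pair_score`; a hairpin must
    enclose at least `min_loop` unpaired bases.
    """
    n = len(seq)
    if n == 0:
        return 0.0
    W = [[0.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):       # shorter spans cannot pair
        for i in range(n - span):
            j = i + span
            best = min(W[i + 1][j],           # i is unpaired
                       W[i][j - 1])           # j is unpaired
            if (seq[i], seq[j]) in PAIRS:     # i and j pair with each other
                best = min(best, W[i + 1][j - 1] + pair_score)
            for k in range(i + 1, j):         # i and j pair, but not together
                best = min(best, W[i][k] + W[k + 1][j])
            W[i][j] = best
    return W[0][n - 1]

if __name__ == "__main__":
    print(min_energy("GGGAAAUCC"))              # three nested pairs
    print(min_energy("GAUCUUGAUC", min_loop=2)) # hairpin with a UU loop
```

The table has O(n^2) entries and the split case costs O(n) per entry, so the whole computation is O(n^3) instead of exponential.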

Simplifying assumptions for structure prediction
–RNA folds into one minimum free-energy structure.
–There are no knots (base pairs never cross).
–The energy of a particular base pair in a double-stranded region is calculated independently: neighbors do not influence the energy.

Sequence-dependent free energy: nearest-neighbor model
Assign negative energies to interactions between base-pair regions. The energy is influenced by the previous base pair (not by the base pairs further down).
[Figure: two hairpins with a UU loop that differ in one stem base pair, drawn 5’ to 3’]

Sequence-dependent free-energy values of the base pairs (nearest-neighbor model)
[Figure: the two example hairpins, with example stacking values for neighboring base pairs such as GC, AU, CG, UA]
These energies are estimated experimentally from small synthetic RNAs.

Mfold: adding complexity to energy calculations
–Positive energy is added for destabilizing regions such as bulges, loops, etc.
–More than one structure can be predicted.

Free energy computation
[Figure: hairpin with a bulge, drawn 5’ to 3’, annotated with the contribution of each element]
Contributions labelled in the figure: 5’ dangling end, six stacking terms (-2.9, -2.9, -1.8, -0.9, -1.8, -2.1), one bulge, mismatch of hairpin, and the hairpin loop.
Total: ΔG = -4.6 kcal/mol
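As a quick arithmetic check on this example, the snippet below adds up the six stacking values that survive in the transcript and back-computes what the remaining terms must sum to. The values for the dangling end, bulge, hairpin mismatch, and loop are not preserved here, so only their net contribution can be recovered; reading the six numbers as the stacking terms is itself an assumption.

```python
# Stacking terms as listed on the slide (kcal/mol).
stacking_terms = [-2.9, -2.9, -1.8, -0.9, -1.8, -2.1]

# Total free energy reported on the slide (kcal/mol).
total_dg = -4.6

# Sum of the stabilizing stacking contributions.
stacking_sum = round(sum(stacking_terms), 1)

# Net contribution of everything else (dangling end, bulge,
# hairpin mismatch, loop), inferred by subtraction.
other_terms_net = round(total_dg - stacking_sum, 1)

if __name__ == "__main__":
    print("stacking:", stacking_sum)
    print("other terms (net):", other_terms_net)
```

The stacking terms alone give -12.4 kcal/mol, so the loop, bulge, and end effects together must be destabilizing by a net +7.8 kcal/mol to arrive at -4.6.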

Prediction tools based on energy calculation
Fold, Mfold
–Zuker & Stiegler (1981) Nuc. Acids Res. 9:
–Zuker (1989) Science 244:48-52
RNAfold (Vienna RNA secondary structure server)
–Hofacker (2003) Nuc. Acids Res. 31:

Insight from Multiple Alignment
Information from a multiple sequence alignment (MSA) can help to predict the probability that positions i,j are base-paired.
GCCUUCGGGC
GACUUCGGUC
GGCUUCGGCC

Compensatory substitutions: mutations that maintain the secondary structure
[Figure: hairpin drawn 5’ to 3’ in which one stem base pair is replaced by a G-C pair]

RNA secondary structure can be revealed by identification of compensatory mutations
GCCUUCGGGC
GACUUCGGUC
GGCUUCGGCC
[Figure: consensus hairpin with a UUCG loop, a C-G pair, a variable N-N’ pair, and a closing G-C pair]

Insight from Multiple Alignment
Information from a multiple sequence alignment (MSA) can help to predict the probability that positions i,j are base-paired.
–Conservation: no additional information
–Consistent mutations (GC → GU): support the stem
–Inconsistent mutations: do not support the stem
–Compensatory mutations: support the stem
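The four categories can be turned into a small classifier for a pair of alignment columns. This is a sketch under stated assumptions: the function name `classify_columns` is hypothetical, "consistent" is taken to mean that only one side of the pair changes while pairing is kept (as in GC → GU), and "compensatory" that both sides change. The alignment is the three-sequence example from the earlier slide.

```python
# Allowed pairs: Watson-Crick plus the G-U wobble pair.
PAIRS = {"AU", "UA", "GC", "CG", "GU", "UG"}

def classify_columns(alignment, i, j):
    """Classify columns i and j (0-based) of an alignment as a candidate pair."""
    observed = {seq[i] + seq[j] for seq in alignment}
    if not observed <= PAIRS:
        return "inconsistent"      # some sequence cannot pair here
    if len(observed) == 1:
        return "conserved"         # same pair everywhere: no extra information
    lefts = {pair[0] for pair in observed}
    rights = {pair[1] for pair in observed}
    if len(lefts) > 1 and len(rights) > 1:
        return "compensatory"      # both sides changed, pairing kept
    return "consistent"            # one side changed, pairing kept

if __name__ == "__main__":
    msa = ["GCCUUCGGGC", "GACUUCGGUC", "GGCUUCGGCC"]
    print(classify_columns(msa, 0, 9))  # G-C in every sequence
    print(classify_columns(msa, 1, 8))  # C-G, A-U, G-C
    print(classify_columns(msa, 4, 5))  # U and C cannot pair
```

In this alignment the second and ninth columns covary (C-G, A-U, G-C), which is the kind of evidence that supports a stem.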

RNAalifold (Hofacker 2002)
From the Vienna RNA package. Predicts the consensus secondary structure for a set of aligned RNA sequences, using a modified dynamic programming algorithm that adds alignment information to the standard energy model.
Improvement in prediction accuracy.

Other related programs (Sean Eddy’s Lab, WU)
–COVE: RNA structure analysis using the covariance model (an implementation of the stochastic context-free grammar method)
–QRNA (Rivas and Eddy 2001): searching for conserved RNA structures
–tRNAscan-SE: tRNA detection in genome sequences

RNA families
Rfam: a general non-coding RNA database (most of the data is taken from specific databases). It includes many families of non-coding RNAs and functional motifs, as well as their alignments and secondary structures.

Rfam / Pfam
–Pfam uses HMMER (based on Hidden Markov Models)
–Rfam uses INFERNAL (based on Covariance Models)

Rfam (currently version 7.0)
Different RNA families or functional motifs from mRNA, UTRs, etc.
–View and download multiple sequence alignments
–Read family annotation
–Examine species distribution of family members
–Follow links to other databases

An example of an RNA family: the mir-1 microRNA precursor family
This family represents the microRNA (miRNA) mir-1 family. miRNAs are transcribed as ~70 nt precursors (modelled here) and subsequently processed by the Dicer enzyme to give a ~22 nt product. The products are thought to have regulatory roles through complementarity to mRNA.

Seed alignment (based on 7 sequences)

Predicting microRNA target

Predicting microRNA target genes
Why is it hard?
–Lots of known miRNAs with similar seeds
–Base pairing is required only for the seed (7 nt), so there is a very high probability of finding a match by chance
Initial methods
–Look at conserved miRNAs
–Look for conserved target sites
–Consider the RNA fold
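A back-of-the-envelope sketch of why chance matches are a problem. It assumes a random UTR with four equally likely, independent bases, so a given 7 nt site matches at any position with probability (1/4)^7. The UTR length of 1000 nt and the figure of 20,000 genes are illustrative assumptions, not values from the lecture.

```python
def expected_chance_sites(utr_length, site_length=7):
    """Expected number of exact matches to one site in a random sequence.

    Assumes uniform, independent bases: each of the
    (utr_length - site_length + 1) windows matches with probability 0.25 ** site_length.
    """
    positions = max(utr_length - site_length + 1, 0)
    return positions * 0.25 ** site_length

if __name__ == "__main__":
    per_utr = expected_chance_sites(1000)   # hypothetical 1000 nt UTR
    print("expected chance sites per UTR:", per_utr)
    print("across 20,000 such UTRs:", per_utr * 20000)
```

Under these assumptions a single miRNA seed finds roughly 0.06 chance sites per UTR, which adds up to on the order of a thousand spurious candidates genome-wide. That is why the methods listed above add conservation and folding information.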

TargetScan Algorithm by Lewis et al. 2003
The goal: find candidate target genes of a given miRNA
Stage 1: Select only the 3’UTRs of all genes
–Search the 3’UTRs for 7 nt that are complementary to bases 2-8 of the miRNA (the “miRNA seed”)
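Stage 1 can be sketched as a plain string search: take bases 2-8 of the miRNA, build the reverse complement (the site as it would appear in the mRNA, read 5’ to 3’), and scan the UTR for exact matches. This is not the TargetScan implementation. The function names are hypothetical, and the miRNA and UTR sequences below are made-up toy strings.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def seed_site(mirna):
    """Return the 7 nt mRNA site complementary to miRNA bases 2-8."""
    seed = mirna[1:8]  # bases 2-8 in 1-based numbering
    return "".join(COMPLEMENT[base] for base in reversed(seed))

def find_seed_matches(mirna, utr):
    """Return 0-based start positions of exact seed-site matches in the UTR."""
    site = seed_site(mirna)
    return [pos for pos in range(len(utr) - len(site) + 1)
            if utr[pos:pos + len(site)] == site]

if __name__ == "__main__":
    toy_mirna = "UACGGAUCCAGUUAGCAUGGCA"   # made-up 22 nt miRNA
    toy_utr = "AAGAUCCGUUUCCGAUCCGUAA"     # made-up 3'UTR fragment
    print(seed_site(toy_mirna))
    print(find_seed_matches(toy_mirna, toy_utr))
```

Only Watson-Crick complements are used for the seed here; G-U wobble pairs come into play in the later extension stage.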

TargetScan Algorithm Stage 2: Extend seed matches in both directions –Allow G-U (wobble) pairs

TargetScan Algorithm Stage 3: Optimize base-pairing in remaining 3’ region of miRNA (not applied in the later versions)

TargetScan Algorithm
Stage 4: Calculate the folding free energy (ΔG) assigned to each putative miRNA:target interaction using RNAfold. A low energy gets a high score.
Stage 5: Calculate a final score for a UTR to be a target, adding evolutionary conservation (by repeating the same steps on UTRs from other species).

Predicting targets of RNA-binding protein (RBP)

Predicting RBP targets
Different types of RBPs
–Proteins that regulate RNA stability (usually bind at the 3’UTR)
–Splicing factors (bind exonic and intronic regions)
–……
Why is it hard?
–Different proteins bind different sequences
–Most RBP sites are short and degenerate (e.g. CTCTCT)

Predicting Exon Splicing Enhancers: ESE-finder (Krainer)
1. Build a PSSM for ESEs, based on experimental data (SELEX)

ESE-finder
2. A given sequence is tested against 5 PSSMs in overlapping windows
3. Each position in the sequence is given a score
4. Positions that fit a PSSM (score above a cutoff) are predicted as ESEs
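Steps 2 to 4 amount to sliding a position-specific scoring matrix along the sequence. The sketch below shows the idea with a single made-up matrix. The three-column PSSM, its values, the cutoff, and the function names are all hypothetical and are not taken from ESE-finder.

```python
def score_windows(seq, pssm):
    """Score every overlapping window of the sequence against one PSSM.

    `pssm` is a list of per-position dictionaries mapping base -> score.
    """
    width = len(pssm)
    return [sum(pssm[k][seq[start + k]] for k in range(width))
            for start in range(len(seq) - width + 1)]

def predict_sites(seq, pssm, cutoff):
    """Return 0-based window starts whose score is above the cutoff."""
    return [start for start, score in enumerate(score_windows(seq, pssm))
            if score > cutoff]

if __name__ == "__main__":
    # Made-up matrix that favors the motif ACG (illustration only).
    toy_pssm = [
        {"A": 2, "C": -1, "G": 0, "U": -2},
        {"A": -1, "C": 2, "G": -1, "U": 0},
        {"A": 0, "C": -2, "G": 2, "U": -1},
    ]
    toy_seq = "UACGAACGU"
    print(score_windows(toy_seq, toy_pssm))
    print(predict_sites(toy_seq, toy_pssm, cutoff=4))
```

With several matrices, as in the slide, the same scan would be repeated once per PSSM, each with its own cutoff.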

Predicting RBP targets
RNA binding sites can be predicted by general motif finders:
–MEME
–DRIM