Predicting RNA Structure and Function


1 Predicting RNA Structure and Function

2 Ribozyme

3 The Ribosome: the protein factory of the cell, made mainly of RNA

4 Non-coding DNA (98.5% of the human genome)
Intergenic regions; repetitive elements; promoters; introns; untranslated regions (UTRs)

5 Some biological functions of ncRNA
Control of mRNA stability (UTR); control of splicing (snRNP); control of translation (microRNA).
The function of the RNA molecule depends on its folded structure.

6 Example: Control of Iron levels by mRNA structure
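Iron Responsive Element (IRE): a conserved stem-loop in the mRNA, recognized by IRP1 and IRP2.
[Figure: IRE hairpin drawn 5' to 3', with the conserved bases marked and stem pairs shown as N-N'.]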

7 F: Ferritin = iron storage; TR: Transferrin receptor = iron uptake
[Figure: IRP1/2 bound to the IRE on ferritin (F) mRNA and on transferrin receptor (TR) mRNA.]
Low iron: IRE-IRP binding inhibits translation of ferritin and inhibits degradation of TR mRNA.
High iron: IRE-IRP binding is off, so ferritin is translated and the transferrin receptor mRNA is degraded.

8 RNA structural levels: secondary structure, tertiary structure (example: tRNA)

9 RNA Secondary Structure
The RNA molecule folds on itself. Base pairs are held together by hydrogen bonds; the allowed pairs are G-C, A-U and G-U.
[Figure: the sequence 5' GAUCUUGAUC 3' folded into a hairpin: a U U LOOP closed by a STEM of pairs C-G, U-A, A-U, G-C.]
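The pairing rule above can be sketched in a few lines of Python. This is only an illustration: the names `ALLOWED_PAIRS`, `can_pair` and `hairpin_pairs` are made up for this example, and the helper simply zips the two ends of a sequence together around a loop of a given size, which is enough to reproduce the hairpin drawn on the slide.

```python
# Allowed RNA base pairs from the slide: G-C, A-U and G-U (in either orientation).
ALLOWED_PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """Return True if bases a and b can form a base pair."""
    return (a.upper(), b.upper()) in ALLOWED_PAIRS


def hairpin_pairs(seq, loop_len):
    """Pair the two ends of seq inward, leaving at least loop_len unpaired bases.

    Returns 0-based (i, j) index pairs; stops at the first pair that cannot form.
    """
    i, j = 0, len(seq) - 1
    pairs = []
    while j - i - 1 >= loop_len and can_pair(seq[i], seq[j]):
        pairs.append((i, j))
        i += 1
        j -= 1
    return pairs


if __name__ == "__main__":
    # The hairpin from the slide: four stem pairs around a U U loop.
    print(hairpin_pairs("GAUCUUGAUC", loop_len=2))
```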

10 RNA Secondary structure Short Range Interactions
[Figure: example structure drawn 5' to 3' with its elements labelled: STEM, HAIRPIN LOOP, BULGE, INTERNAL LOOP, DANGLING ENDS.]

11 Long-range interactions of RNA secondary structural elements
Pseudoknot; kissing hairpins; hairpin-bulge contact.
These patterns are excluded from the prediction schemes because their computation is too intensive.
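A pseudoknot shows up in a list of base pairs as two pairs that cross each other. The sketch below checks for that; the function name and the representation of a structure as 0-based `(i, j)` tuples are assumptions made for this example.

```python
from itertools import combinations


def has_crossing_pairs(pairs):
    """Return True if any two base pairs (i, j) and (k, l) cross: i < k < j < l."""
    ordered = sorted((min(p), max(p)) for p in pairs)
    for (i, j), (k, l) in combinations(ordered, 2):
        # After sorting, i <= k; the pairs cross when k lies inside (i, j) and l outside.
        if i < k < j < l:
            return True
    return False


if __name__ == "__main__":
    print(has_crossing_pairs([(0, 9), (1, 8), (2, 7)]))  # nested stem
    print(has_crossing_pairs([(0, 5), (3, 8)]))          # crossing pairs
```

Nested and side-by-side pairs pass this check; only crossing pairs are reported, which is the pattern the prediction schemes above leave out.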

12 Predicting RNA secondary Structure
Searching for a structure with Minimal Free Energy (MFE)

13 Free energy model
Free energy of a structure (at fixed temperature and ionic concentration) = sum of loop energies.
The standard model uses experimentally determined thermodynamic parameters.
It excludes coaxial stacking, metal ions, nonstandard bonds, folding pathway, etc.
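Written as a formula, the loop decomposition reads as follows. The symbols are chosen for this note and do not come from the slides: S is a secondary structure, loops(S) is the set of its loops (hairpin loops, bulges, internal loops, stacked pairs, and so on), and ΔG(ℓ) is the tabulated energy of one loop.

```latex
\Delta G(S) \;=\; \sum_{\ell \,\in\, \mathrm{loops}(S)} \Delta G(\ell)
```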

14 Why is MFE secondary structure prediction hard?
In principle, the MFE structure can be found by calculating the free energy of all possible structures, but:
the number of potential structures grows exponentially with the number, n, of bases;
structures can be arbitrarily complex.
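To get a feel for the growth, the sketch below counts nested (knot-free) secondary structures on n bases. It makes two simplifying assumptions that are not from the slides: any two bases are allowed to pair, and a hairpin loop must contain at least `min_loop` unpaired bases. The count is therefore an upper bound for a real sequence; `count_structures` is a name invented for this example.

```python
def count_structures(n, min_loop=1):
    """Count knot-free secondary structures on n bases (any two bases may pair).

    Recurrence: the last base is either unpaired, or paired with base k,
    which splits the chain into a left part (k bases) and an enclosed part.
    """
    counts = [0] * (n + 1)
    counts[0] = 1  # the empty chain has exactly one (empty) structure
    for length in range(1, n + 1):
        total = counts[length - 1]  # last base unpaired
        # Last base pairs with position k; the enclosed part has length - k - 2 bases
        # and must hold at least min_loop of them.
        for k in range(0, length - 1 - min_loop):
            total += counts[k] * counts[length - k - 2]
        counts[length] = total
    return counts[n]


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, count_structures(n))
```

Printing the counts for a few lengths shows how quickly exhaustive enumeration becomes impractical.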

15 RNA folding with dynamic programming (Zuker and Stiegler)
W(i,j): the MFE of the structure of the substrand from i to j.
[Figure: arc labelled W(i,j) spanning positions i to j.]

16 RNA folding with dynamic programming
Assume a function W(i,j), which is the MFE for the sequence starting at i and ending at j (i<j).
Define scores; for example, a base pair's score is lower than a non-pair's.
Consider 4 recursion possibilities:
(1) i,j are a base pair, added to the structure for i+1..j-1. Define this as V(i,j).
(2) i is unpaired, added to the structure for i+1..j.
(3) j is unpaired, added to the structure for i..j-1.
(4) i,j are paired, but not to each other; the structure for i..j adds together sub-structures for 2 sub-sequences, i..k and k+1..j: a bifurcation (i<k<j).
Choose the minimal-energy possibility.
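A minimal sketch of this recursion is shown below. It is not the Zuker-Stiegler energy model: for illustration it assumes a flat score of -1 per base pair (`pair_energy`) and a minimum hairpin loop of 3 unpaired bases (`min_loop`). Both values and all names are assumptions made for the example.

```python
# Allowed pairs: G-C, A-U and G-U, in either orientation.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def fold_energy(seq, pair_energy=-1.0, min_loop=3):
    """Return W(0, n-1): the minimal score over knot-free structures of seq."""
    n = len(seq)
    if n == 0:
        return 0.0
    # W[i][j] stays 0.0 for spans too short to hold a hairpin loop.
    W = [[0.0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            # Cases 2 and 3: i unpaired, or j unpaired.
            best = min(W[i + 1][j], W[i][j - 1])
            # Case 1: i and j pair with each other, closing the structure on i+1..j-1.
            if (seq[i], seq[j]) in PAIRS:
                best = min(best, pair_energy + W[i + 1][j - 1])
            # Case 4: bifurcation into i..k and k+1..j with i < k < j.
            for k in range(i + 1, j):
                best = min(best, W[i][k] + W[k + 1][j])
            W[i][j] = best
    return W[0][n - 1]


if __name__ == "__main__":
    print(fold_energy("GGGAAAUCC"))
```

The table is filled from short spans to long ones, so every sub-problem is available when it is needed. The triple loop gives O(n^3) time, in contrast to enumerating every structure.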

17 Simplifying Assumptions for Structure Prediction
RNA folds into one minimum free-energy structure.
There are no knots (base pairs never cross).
The energy of a particular base pair in a double-stranded region is calculated independently: neighbors do not influence the energy.

18 Sequence dependent free-energy Nearest Neighbor Model
[Figure: two hairpins with U U loops and different stem base pairs, each followed by a single-stranded 3' tail ending in UCGAC.]
Assign negative energies to interactions between base pair regions. Energy is influenced by the previous base pair (not by the base pairs further down).

19 Sequence dependent free-energy values of the base pairs (nearest neighbor model)
[Figure: two hairpins with U U loops and different stem base pairs, each followed by a single-stranded 3' tail ending in UCGAC.]
These energies are estimated experimentally from small synthetic RNAs.
Example values are given for the stacked pairs GC/AU, GC/GC, GC/CG and GC/UA.
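In the nearest-neighbor model the energy of a helix is a sum over adjacent (stacked) base pairs, which the sketch below computes. The numbers in `TOY_STACK_TABLE` are placeholders invented for this example, not measured parameters; a real calculation would load a published parameter set. Function and variable names are likewise hypothetical.

```python
# Placeholder stacking energies in kcal/mol: illustrative values only, NOT measured data.
# Keys are (outer pair, inner pair), each pair written 5' base then 3' base.
TOY_STACK_TABLE = {
    ("GC", "AU"): -2.0,
    ("AU", "UA"): -1.0,
    ("UA", "CG"): -2.0,
}


def helix_stacking_energy(pairs, stack_table):
    """Sum the stacking energy of each base pair on the pair that precedes it.

    pairs lists the helix from the outermost pair inward, e.g. ["GC", "AU", "UA", "CG"].
    A helix with fewer than two pairs has no stacks and scores 0.
    """
    return sum(stack_table[(pairs[k], pairs[k + 1])] for k in range(len(pairs) - 1))


if __name__ == "__main__":
    # The four stem pairs of the hairpin 5' GAUCUUGAUC 3', outermost first.
    print(helix_stacking_energy(["GC", "AU", "UA", "CG"], TOY_STACK_TABLE))
```

Each term depends on one pair and its immediate neighbor only, which is what makes the model compatible with the dynamic programming recursion.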

20 Mfold: Adding complexity to energy calculations
Positive energy is added for destabilizing regions such as bulges, loops, etc.
More than one structure can be predicted.

21 Free energy computation
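[Figure: hairpin structure, 5' to 3', annotated with free-energy contributions in kcal/mol.]
Contributions: hairpin loop (value not preserved); mismatch of hairpin -1.1; stacking -2.9; 1nt bulge +3.3; stacking -2.9; stacking -1.8; stacking -0.9; stacking -1.8; stacking -2.1; 5' dangling -0.3.
Total: ΔG = -4.6 kcal/mol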

22 Prediction Tools based on Energy Calculation
Fold, Mfold: Zuker & Stiegler (1981) Nuc. Acids Res. 9; Zuker (1989) Science 244:48-52
RNAfold (Vienna RNA secondary structure server): Hofacker (2003) Nuc. Acids Res. 31

23 Insight from Multiple Alignment
Information from multiple sequence alignment (MSA) can help to predict the probability of positions i,j to be base-paired.
GCCUUCGGGC
GACUUCGGUC
GGCUUCGGCC

24 Compensatory Substitutions
Mutations that maintain the secondary structure.
[Figure: hairpin stem in which one base pair is substituted (shown as G-C) without disrupting the stem.]

25 RNA secondary structure can be revealed by identification of compensatory mutations
GCCUUCGGGC
GACUUCGGUC
GGCUUCGGCC
[Figure: consensus hairpin for the alignment: a U U C G loop on a stem of pairs C-G, N-N' and G-C.]

26 Insight from Multiple Alignment
Information from multiple sequence alignment (MSA) can help to predict the probability of positions i,j to be base-paired.
Conservation: no additional information.
Consistent mutations (GC -> GU): support stem.
Inconsistent mutations: do not support stem.
Compensatory mutations: support stem.
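The four categories can be turned into a small classifier for a pair of alignment columns. The sketch below is one possible reading of the slide, with the decision rules stated in the comments; `classify_column_pair` is a name invented here, and the example uses the three aligned sequences shown in the slides.

```python
from itertools import combinations

# Allowed pairs: G-C, A-U and G-U, in either orientation.
PAIRS = {("G", "C"), ("C", "G"), ("A", "U"), ("U", "A"), ("G", "U"), ("U", "G")}


def classify_column_pair(alignment, i, j):
    """Classify alignment columns i and j (0-based) as evidence for a base pair."""
    observed = [(seq[i], seq[j]) for seq in alignment]
    # Any sequence whose two bases cannot pair argues against the stem.
    if any(pair not in PAIRS for pair in observed):
        return "inconsistent"
    # The same pair everywhere: conserved, so no additional information.
    if len(set(observed)) == 1:
        return "conserved"
    # Two sequences that differ at BOTH positions yet still pair: compensatory.
    for (a1, b1), (a2, b2) in combinations(observed, 2):
        if a1 != a2 and b1 != b2:
            return "compensatory"
    # Otherwise only one side changes at a time (e.g. GC -> GU): consistent.
    return "consistent"


if __name__ == "__main__":
    msa = ["GCCUUCGGGC", "GACUUCGGUC", "GGCUUCGGCC"]
    print(classify_column_pair(msa, 1, 8))
    print(classify_column_pair(msa, 0, 9))
```

In this alignment, columns 2 and 9 (1-based) carry C-G, A-U and G-C, so they are reported as compensatory, while the outer G-C pair is merely conserved.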

27 RNAalifold (Hofacker 2002)
From the Vienna RNA package.
Predicts the consensus secondary structure for a set of aligned RNA sequences, using a modified dynamic programming algorithm that adds alignment information to the standard energy model.
Improves prediction accuracy.

28 Other related programs
Sean Eddy's Lab (WU):
COVE: RNA structure analysis using the covariance model (an implementation of the stochastic context-free grammar method).
QRNA (Rivas and Eddy 2001): searching for conserved RNA structures.
tRNAscan-SE: tRNA detection in genome sequences.

29 RNA families: Rfam, a general non-coding RNA database
(most of the data is taken from specific databases)
Includes many families of non-coding RNAs and functional motifs, as well as their alignments and their secondary structures.

30 Rfam / Pfam
Pfam uses HMMER (based on hidden Markov models).
Rfam uses INFERNAL (based on covariance models).

31 Rfam (currently version 7.0)
Different RNA families or functional motifs from mRNA, UTRs, etc.
View and download multiple sequence alignments.
Read family annotation.
Examine species distribution of family members.
Follow links to other databases.

32 An example of an RNA family miR-1 MicroRNAs
mir-1 microRNA precursor family: This family represents the microRNA (miRNA) mir-1 family. miRNAs are transcribed as ~70nt precursors (modelled here) and subsequently processed by the Dicer enzyme to give a ~22nt product. The products are thought to have regulatory roles through complementarity to mRNA.

33 Seed alignment (based on 7 sequences)

34 Predicting microRNA target

35 Predicting microRNA target genes
Why is it hard? Lots of known miRNAs; mostly unknown target genes.
Initial method outline: look at conserved miRNAs; look for conserved target sites.

36 miRNAs in animals
0.5%-1.0% of predicted genes encode miRNA (!!)
One of the more abundant regulatory classes.
Tissue-specific or developmental stage-specific expression.
High evolutionary conservation.

37 TargetScan Algorithm by Lewis et al 2003
The goal: a ranked list of candidate target genes.
Stage 1: Search UTRs in one organism.
Bases 2-8 from the miRNA = "miRNA seed".
Perfect Watson-Crick complementarity.
No wobble pairs (G-U).
7nt matches = "seed matches".
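Stage 1 amounts to an exact string search, sketched below. This is not the TargetScan implementation, only an illustration of the seed rule: both sequences are assumed to be written 5' to 3' in the RNA alphabet, the miRNA in the example is a made-up toy sequence, and the function names are hypothetical.

```python
# Watson-Crick complements only; G-U wobble pairs are not allowed in the seed.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_site(mirna):
    """Return the 7nt UTR site (5'->3') that pairs perfectly with miRNA bases 2-8."""
    seed = mirna[1:8]  # bases 2-8 in 1-based numbering
    # The two strands pair antiparallel, so the site is the reverse complement.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_matches(mirna, utr):
    """Return the 0-based start positions of every seed match in the UTR."""
    site = seed_site(mirna)
    return [pos for pos in range(len(utr) - len(site) + 1)
            if utr[pos:pos + len(site)] == site]


if __name__ == "__main__":
    toy_mirna = "UACGGAUCAGCUAGCUAGCUAG"  # made-up sequence, not a real miRNA
    toy_utr = "AAAGAUCCGUAAACCCGAUCCGUUU"  # made-up UTR carrying two sites
    print(seed_site(toy_mirna), find_seed_matches(toy_mirna, toy_utr))
```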

38 TargetScan Algorithm Stage 2: Extend seed matches
Allow G-U (wobble) pairs.
Extend in both directions.
Stop at mismatches.

39 TargetScan Algorithm Stage 3: Optimize base pairing
Pair the remaining 3' region of the miRNA with the 35 bases of UTR 5' to each seed match, using the RNAfold program (Hofacker et al 1994).

40 TargetScan Algorithm Stage 4: Folding free energy
A folding free energy (ΔG) is assigned to each putative miRNA:target interaction.
Assign a rank to each UTR.
Repeat this process for each of the other organisms with UTR datasets.

41 Predicting RNA-binding protein (RBP) targets

42 Predicting RBP targets
Different types of RBPs: proteins that regulate RNA stability (usually bind at the 3' UTR); splicing factors (bind exonic and intronic regions); ...
Why is it hard? Different proteins bind different sequences. Most RBP sites are short and degenerate (e.g. CTCTCT).

43 Predicting Exon Splicing Enhancers: ESE-finder (Krainer)
1. Build a PSSM for each ESE, based on experimental data (SELEX).

44 ESE-finder
2. A given sequence is tested against 5 PSSMs in overlapping windows.
3. Each position in the sequence is given a score.
4. Positions that fit a PSSM (score above a cutoff) are predicted as ESEs.
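The scanning steps can be sketched for a single matrix as follows. The matrix `TOY_PSSM` and the cutoff are invented for this example and are not ESE-finder's matrices or thresholds; a PSSM is assumed here to be a list of per-position score dictionaries, and `scan_pssm` is a hypothetical name. Scanning against several matrices would repeat the call once per matrix.

```python
# Toy 3-column PSSM with consensus A-C-G: illustrative scores, NOT real ESE data.
TOY_PSSM = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "U": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "U": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "U": -1.0},
]


def scan_pssm(seq, pssm, cutoff):
    """Score every overlapping window of seq and return (start, score) above cutoff."""
    width = len(pssm)
    hits = []
    for start in range(len(seq) - width + 1):
        # The window score is the sum of the per-column scores of its bases.
        score = sum(pssm[k][seq[start + k]] for k in range(width))
        if score > cutoff:
            hits.append((start, score))
    return hits


if __name__ == "__main__":
    print(scan_pssm("UUACGUU", TOY_PSSM, cutoff=2.0))
```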

45 Predicting RBPs target
RNA binding sites can be predicted by general motif finders such as MEME and DRIM.

