
1 RNA Folding

2 RNA Folding Algorithms. Intuitively: given a sequence, find the structure with the maximal number of base pairs. For nested structures, there are four possibilities for S(i,...,j): (1) i and j are paired with each other, added to S(i+1,...,j-1); (2) i is unpaired, added to S(i+1,...,j); (3) j is unpaired, added to S(i,...,j-1); (4) i and j are paired, but not to each other, combining S(i,...,k) and S(k+1,...,j).

3 RNA Folding by DP. Fill in a matrix of S(i,...,j) for every subsequence of S(0,...,seq_length), working from the shortest subsequences to the full sequence.
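A minimal sketch of this dynamic program, in the form commonly known as the Nussinov recurrence. The slides do not specify which bases may pair or whether hairpin loops have a minimum size, so the pair set (Watson-Crick plus G-U wobble) and the min_loop parameter below are assumptions made for illustration, as is the function name.

```python
def max_base_pairs(seq, min_loop=0):
    """Maximum number of nested base pairs in seq (Nussinov-style DP).

    Assumptions (not from the slides): A-U, G-C and G-U may pair, and a
    pair (i, j) is allowed only if j - i > min_loop.
    """
    pairs = {("A", "U"), ("U", "A"), ("G", "C"),
             ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # S[i][j] = best score for seq[i..j]; cells with i > j stay 0.
    S = [[0] * n for _ in range(n)]
    for span in range(1, n):              # shortest subsequences first
        for i in range(n - span):
            j = i + span
            # Cases 2 and 3: i unpaired, or j unpaired.
            best = max(S[i + 1][j], S[i][j - 1])
            # Case 1: i pairs with j.
            if (seq[i], seq[j]) in pairs and span > min_loop:
                best = max(best, S[i + 1][j - 1] + 1)
            # Case 4: i and j pair with other bases; split at k.
            for k in range(i, j):
                best = max(best, S[i][k] + S[k + 1][j])
            S[i][j] = best
    return S[0][n - 1]


print(max_base_pairs("GGGAAACCC"))
```

The triple loop makes this O(n^3) in time and O(n^2) in space. It only counts pairs; recovering the structure itself would need a traceback over the same matrix, which is omitted here.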

4 RNA Folding Assumptions. RNA folding algorithms typically detect only nested structures and do not recognize pseudoknots. Some folding algorithms identify pseudoknots, but they are typically inefficient or limited (e.g., they do not support stacking-dependent pairing models). Current algorithms get about 50-70% of the base pairs correct, on average.

5 MicroRNA Identification

6 MicroRNAs: Introduction. miRNAs are genomically encoded small RNAs that are processed into single-stranded 21-23-mers and incorporated into an RNP complex (miRISC). miRISC binds to 3’UTRs, leading to repression of translation and modest mRNA degradation. [Figure: miRISC with Ago1.] Bartel, Cell 116, 2004

7 MicroRNA Transcription. miRNA genes can be in intergenic and intronic regions. miRNA genes can be clustered and co-expressed. Estimates: 60% singletons, 25% introns, 15% clusters.

8 MicroRNA Examples

9 MicroRNA Gene Conservation. Some miRNAs are highly conserved (e.g. let-7). Conservation must preserve a dsRNA hairpin from which the miRNA is processed by Dicer.

10 MicroRNA Gene Identification. MicroRNA cloning: map cloned ~22nt small RNAs to the genome; predict pre-miRNA secondary structures using mfold; score pre-miRNAs based on known miRNA precursors. Computational identification: identify conserved genomic segments; predict pre-miRNA secondary structures using mfold; score pre-miRNAs based on known miRNA precursors.

11 MicroRNA Gene Identification. More complex methods use additional features: MirScan, MirSeeker, …

12 MiRBase. ~4500 miRNAs in 41 eukaryotes. Examples: 474 human, 78 fly. Eight viruses express microRNAs.

13 MiRBase

14 MicroRNAs: Open Questions. Promoter; transcriptional start site; transcriptional termination; transcriptional complex; regulation of miRNA expression.

15 MicroRNA Targets: Mechanism & Identification

16 Are All RNAs Regulated by miRNAs?

17 The Target Prediction Problem. Target sites show imperfect sequence complementarity: a strong match in the 5’ region of the miRNA (the ‘seed’) and varying complementarity at the 3’ end. Computational target predictions are sensitive to the exact pairing rules: ~100 targets per miRNA within the fly transcriptome, ~25% of the transcriptome under miRNA regulation. Existing algorithms focus on the quality of the sequence match between miRNA and mRNA target and introduce various filters, e.g. evolutionary conservation. [Figure: miRNA paired 3’ to 5’ against the mRNA target, with miRNA positions numbered from the 5’ end and the seed marked; Brennecke et al. 05.]

18 miRanda. Target prediction with sequence-based rules: miRNA-target complementarity (strong in 5’, weaker in 3’). Refinement with binding free energy scores. Use conservation to increase signal to noise.

19 PicTAR: Combinatorial Targets. [Figure: miRNA bound to mRNA at a perfect nucleus and at an imperfect nucleus.] Filter: binding energy over 33% of the binding energy of the mature miRNA to a perfectly complementary site.

20 PicTAR: Combinatorial Targets Anchor

21 PicTAR: Combinatorial Targets

22 [Figure: hidden Markov model generating an mRNA sequence. Hidden states: background (b) and miRNAs 1…m, with prior (transition) probabilities p0, p1, p2, p3, ..., pm; emission probabilities over A, C, U, G.] Assumptions: independence of binding sites (no overlapping); transition does not depend on the current state (memoryless); competition between background and miRNAs.

23 Accessibility: The Missing Component What about target accessibility? miRISC vs.

24 Experimental Method. Drosophila tissue culture cells (S2). No miRNA overexpression: establish the miRNA expression profile and use endogenous miRNAs (50-500 copies per cell): bantam, miR-2 family, miR-184. Dual luciferase reporter assay: mutate the target site sequence; mutate the sequence surrounding the target site to alter mRNA secondary structure. UTR engineering: Renilla 3’UTR as the experiment, firefly as internal control. Mild overexpression of the target sequence (<10-fold); no target degradation (20h transfection); sensitive, quantitative, linear assay.

25 The Role of Secondary Structure. N: ~200 bp fragment, native structure. C: ~200 bp fragment, closed structure. [Figure: 3’UTR with the target site inside a ~200 b fragment; duplex between the rpr target and miR-2; bar chart of normalized luciferase ratio (0.0-0.4) for constructs 3’UTR, N, C, C3, C3+, C5, C5+.]

26 Target Accessibility Matters. [Figure: normalized luciferase ratio (0.0-0.4) for rpr (miR-2) constructs 3’UTR, N, C, C3, C3+, C5, C5+, and for 3’UTR, N, C constructs of grim (miR-2) and hid (bantam), each shown with its target:miRNA duplex.]

27 Accessibility as Important as Sequence. [Figure: normalized luciferase ratio for the rpr (miR-2) accessibility constructs (3’UTR, N, C, C3, C3+, C5, C5+) shown next to target site mutations, with the mutated positions marked on the target:miRNA duplex (miRNA positions 1-8).]

28 Thermodynamic miRNA::RNA Model

29 Thermodynamic miRNA::RNA Model. UTR: ∆G = -25.3, split into ∆G_5 = -15.1 and ∆G_3 = -10.2.

30 Thermodynamic miRNA::RNA Model. Transcript regions: UTR, CDS, Poly(A). ∆G_0 = -28.3, ∆G_1 = -19.5. ∆G_open = ∆G_0 - ∆G_1. Folding area = target + 70 bp.
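The arithmetic on this slide can be checked with a short sketch. The ∆G_open formula is taken from the slide. The ddG function is an assumption: the transcript never defines ddG, and the sketch uses the convention ddG = ∆G_duplex - ∆G_open that is usual for accessibility-based models. Combining the duplex energy of -25.3 from the previous slide with this slide's values also assumes that both slides describe the same site. Function names are hypothetical.

```python
def delta_g_open(dg_native, dg_site_open):
    """Cost of opening the target site: dG_open = dG_0 - dG_1 (from the slide).

    dg_native    -- free energy of the folding area as it folds natively (dG_0)
    dg_site_open -- free energy with the target site kept unpaired (dG_1)
    """
    return dg_native - dg_site_open


def ddg(dg_duplex, dg_open):
    """Assumed convention (not stated in the slides): ddG = dG_duplex - dG_open."""
    return dg_duplex - dg_open


dg_open = delta_g_open(-28.3, -19.5)
print(round(dg_open, 1))               # -8.8
print(round(ddg(-25.3, dg_open), 1))   # -16.5
```

Under this convention a site buried in stable structure has a more negative ∆G_open, so its ddG is less favorable than the duplex energy alone would suggest.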

31 ddG Predicts Measured Repression. 22 constructs altering accessibility of target sites in rpr, hid, grim. [Figure: normalized luciferase ratio plotted against ∆G duplex (r=0.36, p<0.11), against ∆∆G (r=0.7, p<4x10^-4), and against ∆∆G with flank, 17 up, 13 down (r=0.77, p<3x10^-5). Inset: exploring flank size, upstream (bp) vs. downstream (bp), with r between 0.68 and 0.76.]

32 Native Target Analysis. 12 miR-184 targets with weaker 3’ pairing, tested in different backgrounds to alter secondary structure. Non-redundant set of 190 experimentally tested miRNA:mRNA target pairs in Drosophila. [Figure: ddG differential vs. measured repression differential for the miR-184 targets, r=0.87; results for the 190 validated targets; miRNA:mRNA duplex schematic with the seed marked.]

33 Genome-Wide Target Analysis. miRNA target seeds favor highly accessible regions of the genome. [Figure: overrepresentation vs. random as a function of accessibility (∆G open), for fly and human.]

34 Assignment. Download the set of human microRNAs. Download the set of human UTRs. Download the mFold software. For each microRNA, identify the set of targets on each UTR, defined by a perfect match to the microRNA seed, bases 2-8. Partition the targets of each microRNA into conserved and non-conserved targets (define a conservation cutoff). Compare the RNA-accessibility of conserved and non-conserved targets for each microRNA: for each putative target, extract the 100 bases that surround it, and use mFold to compute the free energy of these 100 bases. Create a dot-plot with points being microRNAs, and axes being the median (plot #1) or mean (plot #2) free energy of all conserved (x-axis) or non-conserved (y-axis) targets of the microRNA.
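A possible starting point for the seed-matching and window-extraction steps, as a sketch only. The function names, the treatment of T as U, and the way the 100-base window is centred and clipped at UTR ends are assumptions; the slides do not fix these details. The free-energy step is not shown because it depends on running mFold externally.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Upper-case a sequence and write T as U (assumed input clean-up)."""
    return seq.upper().replace("T", "U")


def seed_site(mirna, start=2, end=8):
    """mRNA motif that pairs perfectly with miRNA bases start..end (1-based)."""
    seed = to_rna(mirna)[start - 1:end]
    # The target runs antiparallel to the miRNA: reverse, then complement.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_targets(utr, mirna):
    """0-based start positions of every perfect seed match in the UTR."""
    utr, site = to_rna(utr), seed_site(mirna)
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)   # step by 1 so overlaps are kept
    return hits


def window(utr, pos, site_len=7, size=100):
    """Up to `size` bases around the site starting at pos (clipped at the ends)."""
    flank = (size - site_len) // 2
    lo = max(0, pos - flank)
    hi = min(len(utr), lo + size)
    return utr[lo:hi]


# Illustrative input only: a 22-nt miRNA and a made-up UTR fragment.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAAACUACCUCGGGGCTACCTC"
print(seed_site(mirna), find_targets(utr, mirna))
```

Each extracted window would then be written to a file and folded with mFold, and the resulting free energies grouped per microRNA into conserved and non-conserved sets before taking the median or mean.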

