RNA-Seq and RNA Structure Prediction


1 RNA-Seq and RNA Structure Prediction
Xiaole Shirley Liu STAT115, STAT215, BIO298, BIST520

2 Outline
RNA-seq: experiments; analysis (read mapping, expression index, isoform inference, differential expression)
RNA structure prediction: covariance model, base-pair maximization, free energy method

3 RNA-seq Mortazavi et al, Nat Meth 2008

4 RNA-Frag has Less 3’ Bias
Wang et al. 2009

5 RNA-Seq: Alternative to Microarrays
General expression profiling
Novel genes
Alternative splicing
Detect gene fusion
Can use on any sequenced genome
Better dynamic range
Cleaner and more informative data
Data analysis challenges

6 Mapping
Mapping with Bowtie or Maq identifies transcribed exons, known or novel. Longer (e.g. 100 bp) paired-end libraries are better.

7 Transcript Abundances
More reads map to longer genes, and more reads map when sequencing is deeper. RPKM (reads per kb of transcript per million reads) normalizes for both: 1 RPKM ~ transcript / cell. Technical noise is low (Poisson distribution), but biological noise is high (overdispersion, negative binomial).
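The RPKM definition on this slide can be written out as a small function. This is a minimal sketch; the function name `rpkm` and its parameter names are illustrative choices, not taken from the lecture or from any particular package.

```python
def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads.

    Hypothetical helper for illustration: divides the raw read count by
    the transcript length in kb and by the library size in millions.
    """
    if transcript_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    length_kb = transcript_length_bp / 1_000
    library_millions = total_mapped_reads / 1_000_000
    return read_count / length_kb / library_millions


# Two genes with the same read count but different lengths:
# the longer gene gets the lower RPKM.
short_gene = rpkm(500, 1_000, 10_000_000)
long_gene = rpkm(500, 2_000, 10_000_000)
print(short_gene, long_gene)
```

Dividing by length corrects for longer genes collecting more reads, and dividing by library size corrects for sequencing depth.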

8 Different Alternative Splicing

9 Isoform Inference
Given a known set of isoforms, estimate the isoform abundances x that maximize the likelihood of observing the read counts n
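One way to make "estimate x to maximize the likelihood of observing n" concrete is sketched below. It assumes a Poisson model in which the expected read count in exon j is the sum over isoforms of `design[i][j] * x[i]`, and it uses the standard multiplicative EM update for that model. The model, the function name `estimate_isoform_abundance`, and the `design` matrix layout are all assumptions made for illustration, not necessarily the method the lecture has in mind.

```python
def estimate_isoform_abundance(counts, design, n_iter=500):
    """Maximum-likelihood isoform abundances under an assumed Poisson model.

    counts[j]    : reads observed in exon j
    design[i][j] : expected reads in exon j per unit abundance of
                   isoform i (0 if isoform i does not contain exon j)
    """
    n_iso, n_exon = len(design), len(counts)
    x = [1.0] * n_iso  # start from equal abundances
    for _ in range(n_iter):
        # Expected reads per exon under the current abundances.
        rates = [sum(design[i][j] * x[i] for i in range(n_iso))
                 for j in range(n_exon)]
        updated = []
        for i in range(n_iso):
            exposure = sum(design[i])
            if exposure == 0:
                updated.append(0.0)
                continue
            # EM step: share each exon's reads among isoforms in
            # proportion to their current expected contribution.
            ratio = sum(counts[j] * design[i][j] / rates[j]
                        for j in range(n_exon) if rates[j] > 0)
            updated.append(x[i] * ratio / exposure)
        x = updated
    return x


# Isoform A contains exons 1-3; isoform B skips exon 2.
design = [[1, 1, 1],
          [1, 0, 1]]
counts = [30, 10, 30]
print(estimate_isoform_abundance(counts, design))
```

In this toy case the exon-2 reads can only come from isoform A, which is what lets the two abundances be separated.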

10 Known Isoform Abundance Inference

11 De novo isoform inference

12 Isoform Inference
With a known isoform set, gene-level expression inference can be accurate even when individual isoform abundances carry large uncertainty (e.g. when the known set is incomplete). De novo isoform inference is a non-identifiable problem with current RNA-seq protocols and (short) read lengths (e.g. when exon and isoform numbers are large).

13 Gene Fusion
Down-regulation of tumor suppressors or up-regulation of oncogenes
Maher et al., Nature 2009

14 A Few Algorithms
Expression index and isoform inference: Cufflinks (Steven Salzberg), Rseq (Wing Wong), Scripture (Aviv Regev)
Differential expression: Cufflinks, DESeq (Wolfgang Huber), edgeR (Gordon Smyth)
Replicates are still preferred! Still need systematic evaluation

15 Why do we Care?
RNA (tRNA, rRNA) structure determines function. Many non-coding RNA genes have special structures, which lead to special functions (ncRNA genes are covered later). The focus is mostly on RNA secondary structure: G-C and A-U pairs, plus G-U wobble pairs.

16 Simple RNA Structures

17 More Complex Interactions
Kissing hairpins, pseudoknots, hairpin-bulge contact

18 RNA Structure Representations

19 Covariance Models
Get related RNA sequences and obtain a multiple sequence alignment, e.g. orthologous RNA from many species, or a family of RNAs believed to have similar structure and function. The sequences must be similar enough to be aligned initially, yet dissimilar enough for covarying substitutions to be detected. Look at every pair of columns and check for covarying substitutions.
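Checking a pair of alignment columns for covarying substitutions is commonly scored with mutual information, which the summary slide also names. The sketch below assumes an ungapped alignment given as equal-length strings. The function name `mutual_information` and the toy alignment are illustrative only.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(alignment)
    pair_counts = Counter((seq[col_a], seq[col_b]) for seq in alignment)
    a_counts = Counter(seq[col_a] for seq in alignment)
    b_counts = Counter(seq[col_b] for seq in alignment)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # Compare the joint frequency with what independence would give.
        mi += p_ab * log2(p_ab / ((a_counts[a] / n) * (b_counts[b] / n)))
    return mi


# Columns 0 and 4 always form a complementary pair but vary across
# sequences; columns 1 and 2 are fully conserved.
alignment = ["GAAAC", "CAAAG", "AAAAU", "UAAAA"]
print(mutual_information(alignment, 0, 4))
print(mutual_information(alignment, 1, 2))
```

Fully conserved columns score zero, which illustrates why the sequences need to be dissimilar enough: without substitutions there is no covariation to detect.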

20 Base-Pair Maximization
Find the structure with the maximum number of base pairs. An efficient dynamic programming solution was introduced by Nussinov (1970s). Compare a sequence against itself in a dynamic programming matrix; since the structure folds upon itself, only half the matrix needs to be calculated. Four rules score the structure at a particular point.

21 Nussinov Algorithm
Initialization: the scores for complementary matches along the main diagonal and the diagonal just below it are set to zero

22 Nussinov Algorithm
Fill matrix: M[i][j] = max of the following four options
M[i+1][j-1] + S(x_i, x_j)
M[i+1][j]
M[i][j-1]
max over i<k<j of (M[i][k] + M[k+1][j])

23 Nussinov Algorithm
Fill diagonal by diagonal (assume no bulge penalty, similar to SW gap penalty)
[Figure: dynamic programming matrix with row index i and column index j]

24 Nussinov Algorithm Trace back from upper right corner to get the structure
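The initialization, fill, and traceback steps on the Nussinov slides can be sketched as follows. This is a minimal illustration under stated assumptions: S(x_i, x_j) is 1 for G-C, A-U, and G-U pairs and 0 otherwise, there is no minimum hairpin loop length (as on the slides), and the function names `can_pair`, `nussinov_fill`, `nussinov_traceback`, and `fold` are made up for this example.

```python
PAIRS = ({"G", "C"}, {"A", "U"}, {"G", "U"})


def can_pair(a, b):
    """Assumed scoring rule: G-C, A-U and G-U count as base pairs."""
    return {a, b} in PAIRS


def nussinov_fill(seq):
    n = len(seq)
    # All cells start at zero, which covers the main diagonal and the
    # diagonal just below it.
    M = [[0] * n for _ in range(n)]
    for span in range(1, n):              # fill diagonal by diagonal
        for i in range(n - span):
            j = i + span
            best = max(M[i + 1][j], M[i][j - 1])
            best = max(best, M[i + 1][j - 1] + int(can_pair(seq[i], seq[j])))
            for k in range(i + 1, j):     # bifurcation into two substructures
                best = max(best, M[i][k] + M[k + 1][j])
            M[i][j] = best
    return M


def nussinov_traceback(seq, M):
    pairs, stack = [], [(0, len(seq) - 1)]  # start at the upper right corner
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if M[i][j] == M[i + 1][j]:
            stack.append((i + 1, j))
        elif M[i][j] == M[i][j - 1]:
            stack.append((i, j - 1))
        elif can_pair(seq[i], seq[j]) and M[i][j] == M[i + 1][j - 1] + 1:
            pairs.append((i, j))
            stack.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j):
                if M[i][j] == M[i][k] + M[k + 1][j]:
                    stack.extend([(i, k), (k + 1, j)])
                    break
    return sorted(pairs)


def fold(seq):
    """Return (number of base pairs, dot-bracket structure)."""
    if not seq:
        return 0, ""
    M = nussinov_fill(seq)
    structure = ["."] * len(seq)
    for i, j in nussinov_traceback(seq, M):
        structure[i], structure[j] = "(", ")"
    return M[0][len(seq) - 1], "".join(structure)


print(fold("GGGAAACCC"))
```

The traceback returns one optimal structure; several structures can share the same maximum number of base pairs, and which one is reported depends on the order in which the four rules are tested.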

25 Free Energy Method
Mfold: Mathews, JMB 1999
Predict the correct secondary structure by minimizing the free energy (ΔG)
Energy: base pairing and base stacking

26 Energy Factors
Consecutive base pairing: good
Internal bulge: bad
Terminal base pairing: not stable
Hairpin, interior, and bulge loops have destabilizing energies

27 Energy Minimization
Assumptions: the most likely structure is the energetically most stable structure; the energy associated with any position is influenced only by local sequence and structure
Does not consider pseudoknot formation
Solved by dynamic programming

28 Energy Minimization

29 Vienna RNA Package Vienna RNA web

30 Summary
RNA-seq
Cufflinks: read mapping, expression index, isoform inference, differential expression
Different technique, analysis, and output for different tasks
Awaiting an RMA for RNA-seq
3rd generation sequencing might read whole transcripts
RNA structure prediction methods
Covariance model: mutual information
Base-pair maximization: Nussinov
Free energy method: Mfold, Vienna RNA
Caution: best is the enemy of the good

