


1 Monte-Carlo Regression Algorithm for Isoform Frequency Estimation from RNA-Seq Data. Alex Zelikovsky, Department of Computer Science, Georgia State University. Joint work with Adrian Caciula (GSU), Serghei Mangul (UCLA), James Lindsay, Ion Mandoiu (UCONN). IEEE ICCABS 2013, New Orleans, LA

2 Outline: RNA-Seq: Introduction; MCReg: Monte Carlo Regression based Algorithm; Experimental Results; Conclusions and Future Work

3 Genome-Guided RNA-Seq Protocol. RNA-Seq enables transcript-level resolution of gene expression. Steps: from RNA, through the process of hybridization, make cDNA and shatter it into fragments; sequence the fragment ends; map the reads to the genome. Downstream analyses: Gene Expression (GE), Isoform Expression (IE), Isoform Discovery (ID) [Nicolae et al., 11]. (Figure labels: A B C D E; ABC, AC, DE.)

4 Outline: RNA-Seq: Introduction; MCReg: Monte Carlo Regression based Algorithm (Observed Read Distribution, MC-Based Estimation of Expected Read Distribution, Regression-Based Estimation of Isoform Frequencies); Experimental Results; Conclusions and Future Work

5 MCReg: Monte-Carlo Regression. Motivation: reducing the error rate is critical for detecting similar transcripts, especially when one transcript is a subset of another. (Figure: Genome Browser screenshot.)

6 General Method Overview. (1) Map paired-end reads onto the library of known isoforms using an ungapped aligner (e.g., Bowtie) [B. Langmead, C. Trapnell, et al., "Ultrafast and memory-efficient alignment of short DNA sequences to the human genome," Genome Biology, vol. 10, no. 3, p. R25, 2009]. (2) Group reads that have been mapped to the same transcripts into classes. (3) Monte-Carlo-based estimation of the expected read distribution, using e.g. the Grinder simulator [F. E. Angly et al., "Grinder: a versatile amplicon and shotgun sequence simulator," Nucleic Acids Research, 2012]. (4) Solve the regression: the least-squares formulation can be solved with a constrained quadratic programming solver [M. S. Andersen et al., "CVXOPT: A Python package for convex optimization," available at cvxopt.org, 2013].
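The four steps above can be sketched on a toy example. This is only an illustration under stated assumptions, not the authors' implementation: exact substring matching stands in for Bowtie, uniform sampling of reads stands in for Grinder, and projected gradient descent stands in for a quadratic programming solver such as CVXOPT. The isoform names T1 and T2, the random exon sequences, the read length, and the constraints (frequencies non-negative and summing to 1) are all hypothetical choices, and "frequency" here means the fraction of reads drawn from each isoform.

```python
import random
from collections import Counter


def read_class(read, isoforms):
    """A read's class is the set of isoforms it matches exactly (ungapped)."""
    return frozenset(name for name, seq in isoforms.items() if read in seq)


def sample_read(seq, read_len, rng):
    """Draw one read uniformly at random from an isoform sequence."""
    start = rng.randrange(len(seq) - read_len + 1)
    return seq[start:start + read_len]


def class_distribution(reads, isoforms):
    """Fraction of reads that fall into each read class."""
    counts = Counter(read_class(r, isoforms) for r in reads)
    return {c: k / len(reads) for c, k in counts.items()}


def project_to_simplex(v):
    """Euclidean projection onto {f : f >= 0, sum(f) = 1}."""
    cumulative, theta = 0.0, 0.0
    for i, u in enumerate(sorted(v, reverse=True), start=1):
        cumulative += u
        t = (cumulative - 1.0) / i
        if u - t > 0:
            theta = t
    return [max(x - theta, 0.0) for x in v]


def fit_frequencies(expected, observed, steps=2000, lr=0.1):
    """Minimize ||D f - o||^2 over the simplex by projected gradient descent.

    expected[name] is a column of D: the class distribution of reads
    simulated from that isoform alone. observed is the vector o.
    """
    names = sorted(expected)
    classes = list(set(observed).union(*(expected[n] for n in names)))
    f = [1.0 / len(names)] * len(names)
    for _ in range(steps):
        residual = [
            sum(expected[n].get(c, 0.0) * fj for n, fj in zip(names, f))
            - observed.get(c, 0.0)
            for c in classes
        ]
        grad = [
            2.0 * sum(expected[n].get(c, 0.0) * r for c, r in zip(classes, residual))
            for n in names
        ]
        f = project_to_simplex([fj - lr * g for fj, g in zip(f, grad)])
    return dict(zip(names, f))


# Hypothetical toy gene: T2 (exons A, C) is a sub-transcript of T1 (A, B, C).
rng = random.Random(0)
exon = {x: "".join(rng.choice("ACGT") for _ in range(40)) for x in "ABC"}
isoforms = {"T1": exon["A"] + exon["B"] + exon["C"], "T2": exon["A"] + exon["C"]}
true_freq = {"T1": 0.3, "T2": 0.7}  # assumed fraction of reads per isoform

# Step 3: Monte Carlo estimate of the expected class distribution per isoform.
expected = {
    name: class_distribution([sample_read(seq, 20, rng) for _ in range(20000)], isoforms)
    for name, seq in isoforms.items()
}

# Steps 1-2: "observed" reads drawn from the mixture, then grouped into classes.
names = list(isoforms)
weights = [true_freq[n] for n in names]
observed_reads = [
    sample_read(isoforms[rng.choices(names, weights)[0]], 20, rng)
    for _ in range(20000)
]
observed = class_distribution(observed_reads, isoforms)

# Step 4: constrained least-squares regression.
estimated = fit_frequencies(expected, observed)
print(estimated)
```

In this toy setup the reads that overlap exon B or span the A-C junction are what separate the two isoforms; reads lying entirely inside A or C fall into the shared class and carry no information about which transcript produced them.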

7 Observed Read Distribution

8 Monte-Carlo-Based Estimation of Expected Read Distribution

9 MC-Based Estimation of Expected Read Distribution

10 Regression-Based Estimation of Isoform Frequencies

11 Regression-Based Estimation of Isoform Frequencies

12 Outline: RNA-Seq: Introduction; MCReg: Monte Carlo Regression based Algorithm; Experimental Results; Conclusions and Future Work

13 Simulation Setup

14 Experimental Results. Frequency estimation accuracy was assessed using the coefficient of determination r^2: r^2 = 0.92 for IsoEM and r^2 = 0.97 for MCReg. The results show better correlation than IsoEM, mainly because of sub-transcript cases, where IsoEM skewed the estimated frequency toward super-transcripts.
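As a minimal sketch of the accuracy measure, r^2 can be computed as the squared Pearson correlation between true and estimated frequencies. The slide does not give the exact formula, so this definition is an assumption, and the numbers in the usage lines are made up for illustration; they are not data from the talk.

```python
def r_squared(true_values, estimates):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(true_values)
    mean_x = sum(true_values) / n
    mean_y = sum(estimates) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(true_values, estimates))
    var_x = sum((x - mean_x) ** 2 for x in true_values)
    var_y = sum((y - mean_y) ** 2 for y in estimates)
    return cov * cov / (var_x * var_y)


# Hypothetical true and estimated isoform frequencies.
true_freqs = [0.10, 0.20, 0.30, 0.40]
estimated_freqs = [0.12, 0.18, 0.33, 0.37]
print(r_squared(true_freqs, estimated_freqs))
```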

15 Thanks!

