1 30 Sept. 2010 Genome Sciences Centre BC Cancer Agency, Vancouver, BC, Canada Malachi Griffith ALEXA-Seq analysis reveals breast cell type specific mRNA isoforms

Presentation transcript:

1 30 Sept. 2010 Genome Sciences Centre BC Cancer Agency, Vancouver, BC, Canada Malachi Griffith ALEXA-Seq analysis reveals breast cell type specific mRNA isoforms

2 In most genes, transcript diversity is generated by alternative expression [Figure panels: Types of alternative expression; Gene expression]

3 Transcript variation is important to the study of human disease
Alternative expression generates multiple distinct transcript variants from most human loci
Specific transcript variants may represent useful therapeutic targets or diagnostic markers (Venables, 2006)

4 Massively parallel RNA sequencing
Tissues/Cell Lines: Luminal, Myoepithelial, hESCs, vHMECs
Isolate RNAs
Generate cDNA, fragment, size select, add linkers
Sequence ends: 263 million paired reads, 21 billion bases of sequence
Map to genome, transcriptome, and predicted exon junctions
Discover isoforms and measure abundance
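As a rough consistency check on the sequencing totals in slide 4, the sketch below estimates the mean read length from the two figures given. It assumes that "263 million paired reads" counts read pairs, so each pair contributes two sequenced ends; the slide does not state this explicitly.

```python
# Rough check of the slide 4 sequencing totals.
# Assumption (not stated on the slide): "263 million paired reads" counts
# read pairs, so each pair contributes two sequenced ends.
read_pairs = 263e6
total_bases = 21e9

reads = read_pairs * 2                    # individual sequenced ends
mean_read_length = total_bases / reads    # bases per sequenced end

print(f"Mean read length: {mean_read_length:.1f} bases")
```

Under that assumption the average comes out near 40 bases, which sits inside the 36-mer to 75-mer range reported on slide 7.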

5 Pipeline overview

6 What is an ALEXA-Seq sequence ‘feature’? Summary of features for human: ~4 million total (14% ‘known’)
– 37k genes
– 62k transcripts
– 278k exons
– 2,210k exon junctions
– 407k alternative exon boundaries
– 560k intron regions
– 227k intergenic regions
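The per-type counts on slide 6 can be tallied to see how they relate to the "~4 million" and "3.8 million" totals quoted in the talk. This is only arithmetic on the numbers shown on the slide; the dictionary keys are labels chosen here for illustration.

```python
# Feature counts from slide 6, in thousands (k).
feature_counts_k = {
    "genes": 37,
    "transcripts": 62,
    "exons": 278,
    "exon junctions": 2210,
    "alternative exon boundaries": 407,
    "intron regions": 560,
    "intergenic regions": 227,
}

total_k = sum(feature_counts_k.values())
junction_share = feature_counts_k["exon junctions"] / total_k

print(f"Total features: {total_k / 1000:.2f} million")
print(f"Exon junctions as a share of all features: {junction_share:.0%}")
```

The total is about 3.78 million, matching the 3.8 million features cited for the output, and exon junctions account for more than half of all features.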

7 Data analyzed to date
ALEXA-Seq processing:
19 projects
–REMC + 18 others
105 libraries (200+ lanes)
3.9 billion paired-end reads
36-mers to 75-mers

8 Output
Expression, differential expression and alternative expression values for 3.8 million features for each library processed
Library quality analysis
Number of features expressed (above background)
–Genes, transcripts, exon regions, junctions, etc.
Differential gene expression
–Ranked lists
Alternative expression
–Ranked lists
–Alternative isoforms involving exon skipping, alternative transcript initiation sites, etc.
–Known or predicted novel isoforms
Candidate peptides
–Ranked lists

9 ALEXA-Seq data browser (using REMC analysis as an example)
Goals
–Visualization, interpretation, design of validation experiments, distribution of results to internal/external collaborators
What kinds of questions does ALEXA-Seq allow us to ask/answer?

10 Is the RNA-Seq library suitable for alternative expression analysis?
– Library summary
– Read quality
– Tag redundancy
– End bias
– Mapping rates
– Signal-to-noise
– hnRNA & gDNA contamination
– Features detected

11 Is my favorite gene expressed? Alternatively expressed?

12 What are the most highly expressed genes, exons, etc. in each library?
Ranked lists of events: expression, differential expression, alternative expression
Provided for each feature type (gene, exon, junction, etc.)

13 e.g. most highly expressed genes

14 What are the top DE and AE genes for each tissue comparison?
Candidate genes; each comparison; DE or AE events; gains or losses

15 Summary page for vHMECs vs. Luminal

16 Candidate features gained in vHMECs CD10 vHMECs vs. Luminal

17 Which exons/junctions and corresponding peptides might be suitable for antibody design?

18 Candidate peptides gained in vHMECs vHMECs vs. Luminal

19 Example housekeeping gene (Actin; no change)

20 CD10 (used to sort myoepithelial cells): 422-fold higher in Myoepithelial than Luminal [Panels: Myoepithelial & vHMECs; Luminal]
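To make the "422-fold" figure concrete, here is a minimal sketch of how a fold change and its log2 value are computed from two expression values. The input values are hypothetical, chosen only to reproduce a 422-fold ratio, and `fold_change` is an illustrative helper; the slides do not describe how ALEXA-Seq normalizes expression or whether it applies a pseudocount.

```python
import math


def fold_change(expr_a: float, expr_b: float, pseudocount: float = 0.0) -> float:
    """Ratio of two expression values; an optional pseudocount guards against zero."""
    return (expr_a + pseudocount) / (expr_b + pseudocount)


# Hypothetical normalized expression values (not taken from the talk).
myoepithelial = 844.0
luminal = 2.0

fc = fold_change(myoepithelial, luminal)
log2_fc = math.log2(fc)

print(f"Fold change: {fc:.0f}x (log2 = {log2_fc:.2f})")
```

On a log2 scale a 422-fold difference is a little under 9 doublings, which is why differential expression is often reported as log2 fold change.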

21 CD227 (used to sort luminal epithelial cells) [Panels: Myoepithelial; Luminal]

22 Differential gene expression of CASP14 (Caspase 14 gained in vHMECs)

23 Novel skipping of PTEN exon 6

24 Exon 12 skipping of DDX5 (p68)

25 Tissue-specific isoforms of CA12 (Luminal, Myoepithelial, vHMECs)

26 Alternative first exons of INPP4B

27 Alternative first exons of SERPINB7

28 FERM domain containing proteins are alternatively expressed (FRMD6, FRMD4A, FRMD4B are AE) (FRMD3, FRMD8 are DE)

29 Novel isoforms observed only in vHMECs (junctions E6-E10 and E7-E10)
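The junction labels on slide 29 can be read as exon-skipping events. The sketch below assumes that a label such as E6-E10 denotes a junction joining exon 6 directly to exon 10, and that exons are numbered consecutively along the gene; `skipped_exons` is a hypothetical helper written for illustration, not part of ALEXA-Seq.

```python
def skipped_exons(junction_label: str) -> list:
    """Return the exon numbers bypassed by a junction label like 'E6-E10'.

    Assumes consecutive exon numbering and an 'E<donor>-E<acceptor>' label.
    """
    donor, acceptor = junction_label.split("-")
    start = int(donor.lstrip("E"))
    end = int(acceptor.lstrip("E"))
    return list(range(start + 1, end))


for label in ("E6-E10", "E7-E10"):
    print(label, "skips exons", skipped_exons(label))
```

Under these assumptions, E6-E10 skips exons 7 to 9 and E7-E10 skips exons 8 and 9, while a junction between adjacent exons returns an empty list.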

30 How reliable are predictions from ALEXA-Seq?
Are novel junctions real?
–What proportion validate by RT-PCR and Sanger sequencing?
Are differential/alternative expression changes observed between tissues accurate?
–How well do DE values correlate with qPCR?
To answer these questions we performed ~400 validations of ALEXA-Seq predictions from a comparison of two cell lines…

31 Validation (qualitative): 33 of 189 assays shown. Overall validation rate = 85%

32 Validation (quantitative): qPCR of 192 exons identified as alternatively expressed by ALEXA-Seq. Validation rate = 88%
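The two validation rates can be pooled to see how they relate to the overall 86% figure quoted in the conclusions. The slides give only percentages and assay counts, not the number of assays that validated, so the sketch below weights each rate by its assay count; treating the 86% figure as this pooled rate is an assumption.

```python
# (number of assays, reported validation rate) from slides 31 and 32.
assays = {
    "qualitative": (189, 0.85),
    "quantitative (qPCR)": (192, 0.88),
}

total_assays = sum(n for n, _ in assays.values())
# Expected validated assays, reconstructed from rounded percentages.
expected_validated = sum(n * rate for n, rate in assays.values())
pooled_rate = expected_validated / total_assays

print(f"{total_assays} assays, pooled validation rate = {pooled_rate:.1%}")
```

The 381 assays are consistent with the "~400 validations" mentioned on slide 30. Because the inputs are rounded percentages, the pooled value is only approximate, but it lands close to the 86% reported overall.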

33 Conclusions
ALEXA-Seq approach provides comprehensive global transcriptome profile
–Input: paired-end RNA sequence data
–Output: expression, differential expression, alternative expression, candidate peptides, etc.
Detection of both known and novel isoforms
–Subset that differ between conditions
Predictions are highly accurate
–86% validation rate by RT-PCR, qPCR and Sanger sequencing

34 Acknowledgements
Supervisor: Marco Marra
Committee: Joseph Connors, Stephane Flibotte, Steve Jones, Gregg Morin
Bioinformatics: Obi Griffith, Ryan Morin, Rodrigo Goya, Allen Delaney, Gordon Robertson, Richard Corbett
Sequencing: Martin Hirst, Thomas Zeng, Yongjun Zhao, Helen McDonald
Laboratory: Trevor Pugh, Tesa Severson
5-FU resistance: Michelle Tang, Isabella Tai, Marco Marra
Multiple Myeloma: Rodrigo Goya, Marco Marra
Neuroblastoma: Olena Morozova, Marco Marra
Morgen: Pamela Hoodless, Jacquie Schein, Inanc Birol, Gordon Robertson, Shaun Jackman
Iressa and Sutent: Obi Griffith, Steven Jones
Lymphoma: Ryan Morin, Marco Marra
Griffith M, Griffith OL, Morin RD, Tang MJ, Pugh TJ, Ally A, Asano JK, Chan SY, Li I, McDonald H, Teague K, Zhao Y, Zeng T, Delaney AD, Hirst M, Morin GB, Jones SJM, Tai IT, Marra MA. Alternative expression analysis by RNA sequencing. In review (Nature Methods).
