RNA-Seq Visualization

Presentation transcript:

RNA-Seq Visualization: cummeRbund in Atmosphere. Jason Williams, iPlant / Cold Spring Harbor Laboratory

*Graphics taken from these publications

The Tuxedo Protocol *TopHat and Cufflinks require a sequenced genome

TopHat: Explain reference-sequence-based NGS read alignments. Explain that we are skipping the Cufflinks step because the Arabidopsis transcriptome is so well annotated that we can use the TAIR gene models as our reference transcripts for Cuffdiff.

TopHat outputs in IGV

Using cummeRbund in Atmosphere

Using cummeRbund in Atmosphere: Visualize and mine Cuffdiff results. Output files from Cuffdiff are reorganized into a local database.

Choose the right image: We will be using “RNA-Seq Visualization” Rmi-BE9C2D12. Any image with R can work, and you could also search for an image with cummeRbund installed.

Installing cummeRbund in R

Installing cummeRbund in R
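
The two installation slides carry no transcript text. As a minimal sketch of one way to install and load the package, assuming a current R, the BiocManager installer from CRAN, network access, and that cummeRbund is still distributed by your Bioconductor release (the helper name `ensure_cummerbund` is hypothetical and not from the slides):

```r
# Hypothetical helper: returns TRUE if cummeRbund is available.
# With install = TRUE it first tries to install the package through
# Bioconductor (this assumes network access and a configured CRAN mirror).
ensure_cummerbund <- function(install = FALSE) {
  if (!requireNamespace("cummeRbund", quietly = TRUE) && install) {
    if (!requireNamespace("BiocManager", quietly = TRUE)) {
      install.packages("BiocManager")
    }
    BiocManager::install("cummeRbund")
  }
  requireNamespace("cummeRbund", quietly = TRUE)
}

# Check without installing, then load the package if it is present.
if (ensure_cummerbund(install = FALSE)) {
  library(cummeRbund)
}
```

On an image that already has cummeRbund installed, only the `library(cummeRbund)` call is needed.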

Reading the data in
> cuff <- readCufflinks()
> cuff
CuffSet instance with:
  2 samples
  33714 genes
  43481 isoforms
  35113 TSS
  32924 CDS
  33621 promoters
  35113 splicing
  27350 relCDS

Visualizing dispersion
> disp <- dispersionPlot(genes(cuff))
> disp
Counts vs. dispersion.
Overdispersion: greater variability in a data set than would be expected based on a given model (in our case, extra-Poisson variation).
If you use a Poisson model on overdispersed data, you will overestimate differential expression.
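
For reference, the two variance models being contrasted can be written as follows. The negative binomial form is the standard textbook parameterization and is an assumption here, not something stated on the slide; $\mu$ is the mean count for a gene and $\phi$ its dispersion:

```latex
\[
\operatorname{Var}_{\text{Poisson}}(Y) = \mu,
\qquad
\operatorname{Var}_{\text{NB}}(Y) = \mu + \phi\,\mu^{2}, \quad \phi \ge 0 .
\]
```

When $\phi > 0$ the variance exceeds the Poisson value. A test that assumes $\operatorname{Var}(Y) = \mu$ then underestimates the noise and calls too many genes differentially expressed.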

Visualizing dispersion: Poisson adequately describes technical variation (http://www.fgcz.ch/education/StatMethodsExpression/03_Count_data_analysis.pdf)

Visualizing dispersion

Squared coefficient of variation (SCV)
> genes.scv <- fpkmSCVPlot(genes(cuff))
> genes.scv
Normalized measure of cross-replicate variability.
Represents the relationship of the standard deviation to the mean.
Differences in SCV can result in lower numbers of differentially expressed genes due to a higher degree of variability between replicate FPKM estimates.
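
To make the quantity concrete, here is a small base-R sketch that computes a per-gene SCV as variance divided by squared mean. The FPKM matrix is made up for illustration, and this is not meant to reproduce what fpkmSCVPlot computes internally:

```r
# Hypothetical FPKM values: rows are genes, columns are replicates
# of a single condition.
fpkm <- matrix(
  c( 10,  12,   8,
    100, 150,  50,
      5,   5,   5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), c("rep1", "rep2", "rep3"))
)

# Squared coefficient of variation per gene: variance / mean^2,
# i.e. (standard deviation / mean)^2.
scv <- apply(fpkm, 1, var) / rowMeans(fpkm)^2
print(round(scv, 3))
```

Here geneB, whose replicates disagree most relative to its mean, has the largest SCV, and geneC, with identical replicates, has an SCV of zero.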

Distributions of FPKM scores across samples
> dens <- csDensity(genes(cuff))
> dens
> densRep <- csDensity(genes(cuff), replicates=T)
> densRep
Non-parametric estimate of the probability density function (pdf)
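
As an illustration of what a non-parametric density estimate looks like in plain R, the sketch below runs a kernel density estimate on simulated FPKM values. The data, the sample means, and the log10(FPKM + 1) transform are all assumptions made for the example, and this is not csDensity's implementation:

```r
set.seed(1)

# Hypothetical FPKM values for two samples (log-normal, purely illustrative).
fpkm <- list(
  WT  = rlnorm(500, meanlog = 2.0, sdlog = 1.5),
  hy5 = rlnorm(500, meanlog = 2.3, sdlog = 1.5)
)

# Kernel density estimate of log10(FPKM + 1) for each sample.
dens <- lapply(fpkm, function(x) density(log10(x + 1)))

# Overlay the two estimated densities on one set of axes.
plot(dens$WT, main = "Density of log10(FPKM + 1)", xlab = "log10(FPKM + 1)")
lines(dens$hy5, lty = 2)
legend("topright", legend = names(dens), lty = c(1, 2))
```

Curves that overlap closely suggest the samples have comparable overall expression distributions, while a shifted or differently shaped curve is worth investigating before testing for differential expression.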

FPKM Pairwise Scatter Plots
> csScatter(genes(cuff), 'WT', 'hy5', smooth=T)

Saving your Plots
1. Plot type (e.g. jpeg, png, pdf), called with (file_path_and_file_name)
2. Plot function
3. dev.off()
> png('csScatter.png') # Will save in working directory
> csScatter(genes(cuff), 'WT', 'hy5', smooth=T)
> dev.off()
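
The same three steps can be tried without any RNA-Seq data. This is a minimal sketch using base graphics and R's built-in `cars` data set. It uses `pdf()` rather than `png()` only because the PDF device is available in every R build, and the file name is arbitrary:

```r
# Write to a temporary directory so the example leaves the working
# directory untouched.
out_file <- file.path(tempdir(), "example_scatter.pdf")

pdf(out_file, width = 7, height = 5)   # 1. open the graphics device
plot(cars$speed, cars$dist,            # 2. draw the plot
     xlab = "speed", ylab = "dist")
dev.off()                              # 3. close the device to write the file

file.exists(out_file)
```

Note that `dev.off` without parentheses only prints the function and leaves the device open, so the file is never completed. Also, cummeRbund plots are ggplot2 objects: when the plotting call sits inside a function, a loop, or a sourced script, wrap it in `print()` so that something is actually drawn to the open device.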

Selecting and Filtering Gene Sets
Using the 'getSig' function # Enables you to get genes at significance n
> sig <- getSig(cuff, alpha=0.05, level='genes') # genes of significance 0.05
> length(sig) # returns the number of genes in the sig object
> sig <- getSig(cuff, alpha=0, level='genes')
> tail(sig, 100) # displays the last 100 genes in the sig object you just made

Selecting and Filtering Gene Sets
Using the 'getGenes' function # Get the gene information
> sigGenes <- getGenes(cuff, sig)
Plot this in another scatter plot:
> csScatter(sigGenes, 'WT', 'hy5')

Heat mapping Similar Expression Values
> sigGenes <- getGenes(cuff, tail(sig, 50)) # last 50 genes in the list of significant genes we created
> csHeatmap(sigGenes, cluster='both')

Heat mapping Similar Expression Values
> csHeatmap(sigGenes, cluster='both', replicates=T)

Expression Plots by Genes
> myGeneId <- "AT5G41471"
> myGene <- getGene(cuff, myGeneId)
> myGene

Expression Plots by Genes
> expressionPlot(myGene, replicates=T)