An Introduction to RNA-Seq Transcriptome Profiling with iPlant (https://pods.iplantc.org/wiki/x/axO)


1 An Introduction to RNA-Seq Transcriptome Profiling with iPlant (https://pods.iplantc.org/wiki/x/axO)

2

3 RNA-Seq in the Discovery Environment. Overview: This training module demonstrates a workflow in the iPlant Discovery Environment that uses RNA-Seq for transcriptome profiling. Question: How can we compare gene expression levels using RNA-Seq data in Arabidopsis WT and hy5 genetic backgrounds?

4 Scientific Objective: LONG HYPOCOTYL 5 (HY5) is a basic leucine zipper transcription factor (TF). Mutations in HY5 cause aberrant phenotypes in Arabidopsis morphology, pigmentation and hormonal response. We will use RNA-Seq to compare WT and hy5 and so identify HY5-regulated genes. Source: http://www.gla.ac.uk/media/media_73736_en.jpg

5 Samples: Experimental data were downloaded from the NCBI Sequence Read Archive (SRA, formerly the Short Read Archive) (GEO:GSM613465 and GEO:GSM613466): two replicates each of RNA-Seq runs for wild-type and hy5 mutant seedlings.

6 RNA-Seq Conceptual Overview Image source: http://www.bgisequence.com

7 RNA-Seq Sample Read Statistics
Genome alignments from TopHat were saved as BAM files, the binary version of SAM (samtools.sourceforge.net/). Reads retained by TopHat are shown below.

Sequence run    WT-1         WT-2         hy5-1        hy5-2
Reads           10,866,702   10,276,268   13,410,011   12,471,462
Seq. (Mbase)    445.5        421.3        549.8        511.3
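The Seq. (Mbase) row appears to be the read count multiplied by the read length. A minimal sketch of that arithmetic, assuming every retained read is 41 bases long (the `length=41` in the FASTQ headers on slide 8) and that 1 Mbase is 10^6 bases; `reads_to_mbase` is a helper name made up for this example:

```shell
#!/bin/sh
# Convert a read count and a fixed read length (in bases) to megabases,
# rounded to one decimal place. Assumes all reads have the same length.
reads_to_mbase() {
    awk -v reads="$1" -v len="$2" 'BEGIN { printf "%.1f\n", reads * len / 1000000 }'
}

# Read counts copied from the table; 41 is the assumed read length.
for entry in WT-1:10866702 WT-2:10276268 hy5-1:13410011 hy5-2:12471462; do
    name=${entry%%:*}     # text before the colon
    reads=${entry##*:}    # text after the colon
    echo "$name $(reads_to_mbase "$reads" 41)"
done
```

Under these assumptions the four values come out as 445.5, 421.3, 549.8 and 511.3, matching the table.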

8 RNA-Seq Data
@SRR070570.4 HWUSI-EAS455:3:1:1:1096 length=41
CAAGGCCCGGGAACGAATTCACCGCCGTATGGCTGACCGGC
+
BA?39AAA933BA05>A@A=?4,9#################
@SRR070570.12 HWUSI-EAS455:3:1:2:1592 length=41
GAGGCGTTGACGGGAAAAGGGATATTAGCTCAGCTGAATCT
+
@=:9>5+.5=?@ A?@6+2?:,%1/=0/7/>48##
@SRR070570.13 HWUSI-EAS455:3:1:2:869 length=41
TGCCAGTAGTCATATGCTTGTCTCAAAGATTAAGCCATGCA
+
A;BAA6=A3=ABBBA84B AB2@>B@/9?
@SRR070570.32 HWUSI-EAS455:3:1:4:1075 length=41
CAGTAGTTGAGCTCCATGCGAAATAGACTAGTTGGTACCAC
+
BB9?A@>AABBBB@BCA?A8BBBAB4B@BC71=?9;B:3B?
@SRR070570.40 HWUSI-EAS455:3:1:5:238 length=41
AAAAGGGTAAAAGCTCGTTTGATTCTTATTTTCAGTACGAA
+
BBB?06-8BB@B17>9)=A91?>>8>*@ >@1:B>(B@
@SRR070570.44 HWUSI-EAS455:3:1:5:1871 length=41
GTCATATGCTTGTCTCAAAGATTAAGCCATGCATGTGTAAG
+
BBBCBCCBBBBBA@BBCCB+ABBCB@B@BB@:BAA@B@BB>
@SRR070570.46 HWUSI-EAS455:3:1:5:1981 length=41
GAACAACAAAACCTATCCTTAACGGGATGGTACTCACTTTC
+
?A>-?B;BCBBB@BC@/>A :
…Now What?
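One small first step with data like the above is to confirm that the file really is well-formed FASTQ. Each record has four lines (header, sequence, `+`, quality), and the quality string should be as long as the sequence. The sketch below checks that with `awk` on two records copied from the slide; `sample_fastq` and `fastq_stats` are illustrative names, not part of any toolkit:

```shell
#!/bin/sh
# Two records copied from the slide (SRR070570.32 and SRR070570.44).
sample_fastq() {
    cat <<'EOF'
@SRR070570.32 HWUSI-EAS455:3:1:4:1075 length=41
CAGTAGTTGAGCTCCATGCGAAATAGACTAGTTGGTACCAC
+
BB9?A@>AABBBB@BCA?A8BBBAB4B@BC71=?9;B:3B?
@SRR070570.44 HWUSI-EAS455:3:1:5:1871 length=41
GTCATATGCTTGTCTCAAAGATTAAGCCATGCATGTGTAAG
+
BBBCBCCBBBBBA@BBCCB+ABBCB@B@BB@:BAA@B@BB>
EOF
}

# Count reads and bases, and flag records whose quality string length
# differs from the sequence length. Assumes unwrapped 4-line records.
fastq_stats() {
    awk 'NR % 4 == 2 { reads++; bases += length($0); seq_len = length($0) }
         NR % 4 == 0 { if (length($0) != seq_len) mismatches++ }
         END { printf "%d reads, %d bases, %d length mismatches\n", reads, bases, mismatches }'
}

sample_fastq | fastq_stats   # prints: 2 reads, 82 bases, 0 length mismatches
```

On a real file the same function can read from standard input, for example `fastq_stats < reads.fq`, where `reads.fq` is a placeholder file name.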

9 [FASTQ records repeated from slide 8] Bioinformatician 0 1 0 0 1 1 0 1 0 1

10

11 The Tuxedo Protocol

12
$ tophat -p 8 -G genes.gtf -o C1_R1_thout genome C1_R1_1.fq C1_R1_2.fq
$ tophat -p 8 -G genes.gtf -o C1_R2_thout genome C1_R2_1.fq C1_R2_2.fq
$ tophat -p 8 -G genes.gtf -o C1_R3_thout genome C1_R3_1.fq C1_R3_2.fq
$ tophat -p 8 -G genes.gtf -o C2_R1_thout genome C2_R1_1.fq C2_R1_2.fq
$ tophat -p 8 -G genes.gtf -o C2_R2_thout genome C2_R2_1.fq C2_R2_2.fq
$ tophat -p 8 -G genes.gtf -o C2_R3_thout genome C2_R3_1.fq C2_R3_2.fq
$ cufflinks -p 8 -o C1_R1_clout C1_R1_thout/accepted_hits.bam
$ cufflinks -p 8 -o C1_R2_clout C1_R2_thout/accepted_hits.bam
$ cufflinks -p 8 -o C1_R3_clout C1_R3_thout/accepted_hits.bam
$ cufflinks -p 8 -o C2_R1_clout C2_R1_thout/accepted_hits.bam
$ cufflinks -p 8 -o C2_R2_clout C2_R2_thout/accepted_hits.bam
$ cufflinks -p 8 -o C2_R3_clout C2_R3_thout/accepted_hits.bam
$ cuffmerge -g genes.gtf -s genome.fa -p 8 assemblies.txt
$ cuffdiff -o diff_out -b genome.fa -p 8 -L C1,C2 -u merged_asm/merged.gtf \
    ./C1_R1_thout/accepted_hits.bam,./C1_R2_thout/accepted_hits.bam,./C1_R3_thout/accepted_hits.bam \
    ./C2_R1_thout/accepted_hits.bam,./C2_R3_thout/accepted_hits.bam,./C2_R2_thout/accepted_hits.bam
Your RNA-Seq Data
Your transformed RNA-Seq Data
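The per-sample commands above differ only in the condition and replicate names, so they can be generated in a loop, which avoids copy-and-paste slips in the file names. The following is a dry-run sketch: it only prints the TopHat and Cufflinks commands, plus the list of assemblies that `cuffmerge` expects in `assemblies.txt`. The function names are made up for this example, and it assumes the `C<condition>_R<replicate>` naming used on the slide.

```shell
#!/bin/sh
# Dry run: print the TopHat and Cufflinks commands for every sample
# instead of running them.
print_alignment_commands() {
    for cond in C1 C2; do
        for rep in R1 R2 R3; do
            sample="${cond}_${rep}"
            echo "tophat -p 8 -G genes.gtf -o ${sample}_thout genome ${sample}_1.fq ${sample}_2.fq"
            echo "cufflinks -p 8 -o ${sample}_clout ${sample}_thout/accepted_hits.bam"
        done
    done
}

# Print one assembled GTF path per line. Cufflinks writes its assembly
# to transcripts.gtf inside the directory given with -o.
list_assemblies() {
    for cond in C1 C2; do
        for rep in R1 R2 R3; do
            echo "./${cond}_${rep}_clout/transcripts.gtf"
        done
    done
}

print_alignment_commands
list_assemblies
```

To use the list, redirect it to the file named in the `cuffmerge` command: `list_assemblies > assemblies.txt`.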

13 RNA-Seq Analysis Workflow (figure): Your Data → iPlant Data Store (FASTQ) → TopHat (Bowtie) → Cufflinks → Cuffmerge → Cuffdiff → CummeRbund. Platforms shown: Discovery Environment, Atmosphere.

14 The iPlant Discovery Environment

15

16

17

18 Staged Data

19 TopHat

20 TopHat in the Discovery Environment

21 TopHat: TopHat is one of many applications for aligning short sequence reads to a reference genome. It uses the Bowtie aligner internally. Alternatives include BWA, MAQ, OLego, Stampy and Novoalign.

22 Assembling the Transcripts

23 Cufflinks in the Discovery Environment

24 Merging the Transcriptomes

25 Cuffmerge in the Discovery Environment

26 Comparing wild-type to hy5 transcriptomes
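On the command line, Cuffdiff takes one argument per condition, with the replicate BAM files of a condition joined by commas, as in the `cuffdiff` command on slide 12. A hedged sketch of building that command for the two-replicate WT and hy5 design follows. The directory names (`WT_rep1_thout` and so on) are hypothetical, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Join all arguments with commas (a b c -> a,b,c).
# The function body runs in a subshell so the IFS change does not leak out.
join_by_comma() (
    IFS=,
    echo "$*"
)

# Hypothetical TopHat output directories, two replicates per condition.
wt_bams=$(join_by_comma WT_rep1_thout/accepted_hits.bam WT_rep2_thout/accepted_hits.bam)
hy5_bams=$(join_by_comma hy5_rep1_thout/accepted_hits.bam hy5_rep2_thout/accepted_hits.bam)

# The order of the -L labels must match the order of the BAM groups.
echo "cuffdiff -o diff_out -b genome.fa -p 8 -L WT,hy5 -u merged_asm/merged.gtf $wt_bams $hy5_bams"
```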

27 Cuffdiff in the Discovery Environment

28 Cuffdiff Results

29 Differentially expressed genes Example filtered Cuffdiff results generated in the Discovery Environment.

30 Differentially expressed transcripts Example filtered Cuffdiff results generated in the Discovery Environment.
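The slides do not say how the results were filtered. One common approach, sketched here as an assumption, is to keep only rows that Cuffdiff tested successfully and flagged as significant. Cuffdiff writes gene-level results to `gene_exp.diff` and transcript-level results to `isoform_exp.diff`, both tab-delimited with `status` in column 7 and `significant` in column 14. The toy table below is invented purely to exercise the filter. It is not real output from this experiment, and `toy_gene_exp` and `filter_significant` are illustrative names.

```shell
#!/bin/sh
# Toy stand-in for a Cuffdiff *.diff table. All values are invented.
toy_gene_exp() {
    printf 'test_id\tgene_id\tgene\tlocus\tsample_1\tsample_2\tstatus\tvalue_1\tvalue_2\tlog2(fold_change)\ttest_stat\tp_value\tq_value\tsignificant\n'
    printf 'XLOC_000001\tXLOC_000001\tgeneA\tChr1:100-900\tWT\thy5\tOK\t10.0\t40.0\t2.0\t4.1\t0.0001\t0.002\tyes\n'
    printf 'XLOC_000002\tXLOC_000002\tgeneB\tChr1:2000-3500\tWT\thy5\tOK\t12.0\t13.0\t0.115\t0.3\t0.7\t0.9\tno\n'
    printf 'XLOC_000003\tXLOC_000003\tgeneC\tChr2:50-700\tWT\thy5\tNOTEST\t0.1\t0.2\t1.0\t0\t1\t1\tno\n'
}

# Keep the header plus rows with status OK (column 7) and significant == yes (column 14).
filter_significant() {
    awk -F '\t' 'NR == 1 || ($7 == "OK" && $14 == "yes")'
}

toy_gene_exp | filter_significant
```

With real output the filter reads the file directly, for example `filter_significant < diff_out/gene_exp.diff`.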

31 https://pods.iplantc.org/wiki/x/axO

