Introductory RNA-seq Transcriptome Profiling of the hy5 mutation in Arabidopsis thaliana.


1 Introductory RNA-seq Transcriptome Profiling of the hy5 mutation in Arabidopsis thaliana

2 Before we start: Align NGS reads to the reference genome. The most time-consuming part of the analysis is aligning the reads (in Sanger FASTQ format) from all replicates to the reference genome.
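To make "Sanger FASTQ format" concrete, here is a minimal sketch of a reader for the common four-line-per-record layout. It assumes the Sanger convention that each quality character encodes a Phred score as its ASCII code minus 33. The function name, the record type, and the example read are made up for illustration and are not part of the module.

```python
from io import StringIO
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    qualities: List[int]  # Phred scores, one per base


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(sequence) != len(quality):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        # Sanger encoding: Phred score = ASCII code - 33
        scores = [ord(char) - 33 for char in quality]
        yield FastqRecord(header[1:], sequence, scores)


# A made-up read, used only to exercise the parser.
example = StringIO("@read1\nACGT\n+\nII5!\n")
records = list(read_fastq(example))
print(records[0])
```

Real FASTQ files from the SRA are large, so a generator like this one keeps only one record in memory at a time.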

3 Overview: RNA-seq in the Discovery Environment. This training module is designed to provide hands-on experience in using RNA-seq for transcriptome profiling. Questions: How well is the annotated transcriptome represented in RNA-seq data in Arabidopsis WT and hy5 genetic backgrounds? How can we compare gene expression levels in the two samples?

4 Scientific Objective: LONG HYPOCOTYL 5 (HY5) is a basic leucine zipper transcription factor (TF). Mutations in the HY5 gene cause aberrant phenotypes in Arabidopsis morphology, pigmentation and hormonal response, because HY5 acts both directly, by binding cis-regulatory sequences, and indirectly, through microRNA-mediated post-transcriptional regulation. We will use RNA-seq to compare the transcriptomes of seedlings from WT and hy5 genetic backgrounds.

5 Samples: Experimental data were downloaded from the NCBI Sequence Read Archive (GEO:GSM613465 and GEO:GSM613466). There are two replicate RNA-seq runs each for wild-type and hy5 mutant seedlings.

6 Specific Objectives: By the end of this module, you should
1) Be more familiar with the DE user interface
2) Understand the starting data for RNA-seq analysis
3) Be able to align short sequence reads with a reference genome in the DE
4) Be able to analyze differential gene expression in the DE
5) Be able to use DE text manipulation tools to explore the gene expression data

7 Quick Summary: Find differentially expressed genes. Download Reads from SRA -> Export Reads to FASTQ -> Align to Genome: TopHat -> View Alignments: IGV -> Differential Expression: CuffDiff

8 Pre-Configured: Getting the RNA-seq Data. Import SRA data from the NCBI SRA, then extract FASTQ files from the downloaded SRA archives.

9 Pause: Align Reads to the Genome. Align FASTQ files to the Arabidopsis genome using TopHat.

10 RNA-Seq Conceptual Overview Source: http://biostat.jhsph.edu/genomics/projects.html

11 RNA-Seq Conceptual Overview

12 TopHat: TopHat is one of many applications for aligning short sequence reads to a reference genome. It uses the Bowtie aligner internally. Alternatives include BWA, MAQ, Stampy, Novoalign, etc.
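Outside the Discovery Environment, the same alignment step would typically be run from the command line. The sketch below only assembles a TopHat command and prints it; it does not run the aligner. It assumes TopHat's usual calling convention (options, then the Bowtie index prefix, then the reads file) and the options -o (output directory), -p (threads) and -G (annotation GTF). The helper name, index prefix and file names are hypothetical.

```python
import shlex
from typing import List, Optional


def tophat_command(
    bowtie_index: str,
    fastq: str,
    out_dir: str,
    threads: int = 1,
    annotation_gtf: Optional[str] = None,
) -> List[str]:
    """Build (but do not run) a TopHat command line as an argv list."""
    cmd = ["tophat", "-o", out_dir, "-p", str(threads)]
    if annotation_gtf:
        # Optional: supply known gene models to guide spliced alignment.
        cmd += ["-G", annotation_gtf]
    # Positional arguments: Bowtie index prefix, then the reads file.
    cmd += [bowtie_index, fastq]
    return cmd


# Hypothetical index prefix and file names, for illustration only.
cmd = tophat_command("TAIR10", "WT_rep1.fastq", "tophat_WT_rep1", threads=4)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string makes it safe to pass to `subprocess.run` later without shell quoting problems.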

13 RNA-seq Sample Read Statistics: Genome alignments from TopHat were saved as BAM files, the binary version of SAM. Reads retained by TopHat are shown below.

Sequence run    WT-1         WT-2         hy5-1        hy5-2
Reads           10,866,702   10,276,268   13,410,011   12,471,462
Seq. (Mbase)    445.5        421.3        549.8        511.3
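As a quick sanity check on these statistics, dividing the sequence total by the read count gives the mean read length for each run. The sketch below uses only the numbers on this slide; the variable names are illustrative.

```python
# (reads retained, sequence in Mbase) for each run, copied from the slide.
runs = {
    "WT-1": (10_866_702, 445.5),
    "WT-2": (10_276_268, 421.3),
    "hy5-1": (13_410_011, 549.8),
    "hy5-2": (12_471_462, 511.3),
}

mean_lengths = {}
for name, (reads, mbase) in runs.items():
    # Convert Mbase to bases, then divide by the number of reads.
    mean_lengths[name] = mbase * 1e6 / reads
    print(f"{name}: {mean_lengths[name]:.1f} bases per read")
```

All four runs come out at about 41 bases per read, which suggests the reads share a single fixed length.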

14 Prepare BAM files for viewing: Index BAM files using SAMtools.

15 Using IGV in Atmosphere
1. We have already launched an instance of NGS Viewers in Atmosphere
2. Use a VNC client to connect to your remote desktop

16 Pre-configured VM for NGS Viewers

17 Integrative Genomics Viewer (IGV): The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated genomic datasets. It supports a wide variety of data types, including array-based and next-generation sequence data, and genomic annotations. Use IGV to inspect outputs from TopHat. http://www.broadinstitute.org/igv/

18 Using the NGS Viewer Atmosphere Instance
1. Configure iDrop
2. Copy .bam files and .bai (index) files from the TopHat output to your Atmosphere instance desktop

19 Using IGV in Atmosphere
1. Launch IGV (Integrative Genomics Viewer)
2. Change the current genome to A. thaliana (TAIR10)

20 ATG44120 (12S seed storage protein) is significantly down-regulated in the hy5 mutant background (> 9-fold, p = 0). Compare with the gene on the right, which shows no differential expression.

21 Other Ways to View Alignment Data WIG->Ensembl

22 CuffDiff: Cufflinks is a program that assembles aligned RNA-Seq reads into transcripts, estimates their abundances, and tests for differential expression and regulation transcriptome-wide. CuffDiff is a program within Cufflinks that compares transcript abundance between samples.
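For readers working outside the Discovery Environment, a CuffDiff comparison of the two genotypes could be assembled roughly as follows. This is a sketch that only builds the command. It assumes CuffDiff's usual calling convention: options, then the annotation GTF, then one comma-separated list of replicate BAM files per condition, with -L giving the condition labels in the same order. The helper name and all file names are hypothetical.

```python
import shlex
from typing import Dict, List


def cuffdiff_command(
    annotation_gtf: str,
    conditions: Dict[str, List[str]],
    out_dir: str,
    threads: int = 1,
) -> List[str]:
    """Build (but do not run) a CuffDiff command line as an argv list."""
    labels = ",".join(conditions)  # dicts keep insertion order
    cmd = ["cuffdiff", "-o", out_dir, "-L", labels, "-p", str(threads)]
    cmd.append(annotation_gtf)
    # One positional argument per condition: replicates joined by commas.
    cmd += [",".join(bams) for bams in conditions.values()]
    return cmd


# Hypothetical file names: two replicates per genotype, as in this module.
cmd = cuffdiff_command(
    "genes.gtf",
    {"WT": ["WT_rep1.bam", "WT_rep2.bam"], "hy5": ["hy5_rep1.bam", "hy5_rep2.bam"]},
    out_dir="cuffdiff_out",
    threads=4,
)
print(shlex.join(cmd))
```

Grouping replicates by condition is what lets CuffDiff estimate variability within each genotype before testing for differences between them.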

23 RNA-Seq Conceptual Overview

24 Pause: Examining Differential Gene Expression

25 Pause: Examining the Gene Expression Data
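The kind of filtering done with the DE text manipulation tools can also be sketched in a few lines of Python. The example assumes a tab-delimited gene-level CuffDiff output with the column names `gene_id`, `status`, `log2(fold_change)` and `significant`, as written by Cufflinks 2.x; older versions may use different columns. The rows below are invented purely to exercise the function and are not results from this experiment.

```python
import csv
from io import StringIO
from typing import List, TextIO, Tuple


def significant_genes(handle: TextIO) -> List[Tuple[str, float]]:
    """Return (gene_id, log2 fold change) for genes flagged significant,
    ordered from largest to smallest absolute fold change."""
    hits = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # Keep only tests that ran successfully and were called significant.
        if row["status"] == "OK" and row["significant"] == "yes":
            hits.append((row["gene_id"], float(row["log2(fold_change)"])))
    return sorted(hits, key=lambda hit: abs(hit[1]), reverse=True)


# Assumed column layout (Cufflinks 2.x style) with made-up example rows.
header = ["test_id", "gene_id", "gene", "locus", "sample_1", "sample_2",
          "status", "value_1", "value_2", "log2(fold_change)",
          "test_stat", "p_value", "q_value", "significant"]
rows = [
    ["GENE_A", "GENE_A", "-", "Chr1:100-900", "WT", "hy5", "OK",
     "90.0", "10.0", "-3.17", "-5.2", "0.0001", "0.002", "yes"],
    ["GENE_B", "GENE_B", "-", "Chr2:100-900", "WT", "hy5", "OK",
     "50.0", "53.6", "0.10", "0.3", "0.7", "0.9", "no"],
    ["GENE_C", "GENE_C", "-", "Chr3:100-900", "WT", "hy5", "NOTEST",
     "0.1", "0.2", "1.0", "0.0", "1", "1", "no"],
    ["GENE_D", "GENE_D", "-", "Chr4:100-900", "WT", "hy5", "OK",
     "10.0", "28.3", "1.5", "3.1", "0.001", "0.01", "yes"],
]
table = "\n".join("\t".join(fields) for fields in [header] + rows) + "\n"

result = significant_genes(StringIO(table))
for gene_id, log2_fc in result:
    print(gene_id, log2_fc)
```

With sample_1 as WT and sample_2 as hy5, a negative log2 fold change means lower expression in the mutant.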


29 I like this one…..

