Published by Νίκη Βαρουξής. Modified over 6 years ago.
1
Bulk RNA-Seq Analysis Using CLC Genomics Workbench
June 26, 2019. Sri Chaparala, Ansuman Chattopadhyay
2
Topics
- Brief introduction to RNA-Seq experiments
- Analyze RNA-Seq data: dexamethasone treatment of airway smooth muscle cells (Himes et al., PLoS ONE 2014)
  - Download sequence reads from EBI-ENA/NCBI SRA
  - Import reads to CLC Genomics Workbench
  - Align reads to the reference genome
  - Estimate expression at the gene level
  - Estimate expression at the transcript isoform level
  - Statistical analysis of the differentially expressed genes and transcripts
  - Create Heat Map, Volcano Plots, and Venn Diagram
3
Differential Gene Expressions
Raw Reads Venn Diagram Volcano Plot
4
Workshop Page
5
Pitt
6
HSLS MolBio
7
SUMMER 2019 Workshop Schedule
8
Partek Flow : Software for scRNA-Seq Data Analysis
9
NGS Software @ HSLS MolBio
NGS Analysis Sanger Seq Analysis
10
RNA-Seq Software @ HSLS MolBio
Enrichment Analysis Differentially Expressed Genes CLC Genomics Workbench Ingenuity Pathway Analysis Functions Diseases Pathways Key Pathway Advisor Upstream Regulators Volcano Plot PCA Plot Venn Diagram Heat Map Any Organism Illumina BaseSpace Correlation Engine Correlated Expression Studies RNA-Seq Reads Variant Detection Ingenuity Variant Analysis Variant Annotation and Prioritization RNA-Seq Analysis Downstream Analysis
12
RNA-Seq data analysis support through HSLS MBIS
13
RNA-Seq Questionnaire
- What is the scientific objective of the RNA-Seq experiment?
- How many classes will be compared?
- Are only coding RNAs (mRNA) expected to be detected, or also long non-coding RNA and miRNA?
- Did all the samples pass RNA quality checks before sequencing?
- Are there biological replicates? If so, how many?
- What type of sequencing platform was used to sequence the reads? (Illumina, Ion Torrent, SOLiD)
- Where was the sequencing performed? (Facility name and contact info)
- When was the sequencing performed? (Year/date)
- Which RNA extraction method was used in the experiment? (Total RNA / poly-A / rRNA depletion; method and kit name and, if possible, a link to the protocol)
- Is the protocol strand specific? (Unstranded / forward / reverse; kit name and, if possible, a link to the protocol)
- Is the data single end or paired end?
- What is the expected read length?
- Do the reads still contain adapters, or have they been removed? If not removed, please provide the adapter sequence, if available, or a link (this information can usually be obtained from the facility)
- What are the experimental conditions for the differential expression analysis?
- Which organism and reference genome should be used for the analysis?
14
CLC Genomics Workbench
15
CLCGx 12 Genomics Workbench BioMedical Workbench
16
Install Plugins
17
CLCbio Genomics Workbench
System Requirements
- Windows: Windows Vista, Windows 7, Windows 8, Windows 10, Windows Server 2012 or 2016
- Mac: OS X 10.10, and macOS 10.12, 10.13, 10.14
- Linux: RHEL 7 and later, SUSE Linux Enterprise Server 11 and later. (The software is expected to run without problem on other recent Linux systems, but we do not guarantee this.)
- 8 GB RAM required; 16 GB RAM recommended
- 1024 x 768 display required; 1600 x 1200 display recommended
- Intel or AMD CPU required
- 500 GB disk space required on the CLC Genomics server
18
CLCBio Genomics Workbench Server
- You can connect your CLC Genomics Workbench software to the core HTC cluster available to University of Pittsburgh researchers through the Center for Research Computing (CRC).
- This allows you to transparently migrate data from your workstation to the cluster and run analyses on the cluster, which then run independently of your workstation (i.e. you can shut down your machine and your analyses will continue unabated).
19
Center for Research Computing (CRC)
20
Request access to CRC
21
CLC Genomics Workbench
- Ensure you have the most up-to-date version of the CLCbio Genomics Workbench (the software should tell you if there is a more recent version when you start it, or you can check on the CLCbio website).
- If you have not already done so, request a user account/allocation on the Center for Research Computing (CRC) HTC cluster by filling out the required information.
- If your computer is not connected to the Pitt network (e.g. you are working from home or on a trip), or you are working from a laptop that is connected to the Pitt wireless system, make sure you set up Pitt VPN so that you can communicate with the CLC Bioserver on the HTC cluster.
- Start the CLC Genomics Workbench.
22
Connect to CLC Server @ CRC
23
Access to CRC-HTC Cluster – CLC Server
If you DO NOT HAVE a CRC-HTC account, use the following for limited access:
- UserID: hslsmolb
- PW: library1#
- Server host: clcbio.crc.pitt.edu
- Server port: 7777
If you have a CRC-HTC account, use:
- Pitt user name; Pitt password
- Server host: clcbio.crc.pitt.edu
- Server port: 7777
24
File Structure at CRC CLC Gx Server
25
Pre-analyzed Results
26
RNA-Seq Data
27
Bulk RNA-seq Study
29
NCBI SRA
30
NCBI SRA
31
NCBI SRA Untreated Vs DEX
32
RNA-Seq Basics
33
Bulk RNA-seq Basic Steps
- Convert to cDNA fragments
- Adaptor ligation
- Short sequence reads
- Align reads to reference genome
Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009 Jan;10(1):57–63.
34
Bulk RNA-Seq Data Analysis Workflow
35
Must Read Cresko Lab, University of Oregon
36
RNA-Seq vs. Microarrays
RNA-Seq, compared with microarrays:
- Covers a wider dynamic range
- Allows discovery of novel transcripts
- Is able to detect SNPs
- Is more costly ($300-$1000/sample) than microarray ($100-$200/sample)
- Generates a larger dataset than microarray (uncompressed RNA-Seq raw files: >5 GB)
Sources: Riki Kawaguchi's blog; Zhao S, Fung-Leung WP, Bittner A, Ngo K, Liu X. Comparison of RNA-Seq and microarray in transcriptome profiling of activated T cells. PLoS ONE. 2014 Jan 16;9(1):e78644.
37
CLC Genomics Software User Interface
38
Contact CLCBio Support Team
CLCGX 12.0 User Manual: manuals/clcgenomicsworkbench/current/index.php?manual=Introduction_CLC_Genomics_Workbench.html
39
Create a Folder in CRC-HTC Cluster
1 2
40
Create Workshop Folder @ HTC-CLC Server
1 2 3
41
CLCGX Tools for RNA-Seq Data Analysis
42
Import FASTQ Reads to CLCGx
Step 1: Import FASTQ Reads to CLCGx
43
Import FASTQ Reads
- Download from SRA or EBI and import into CLC
- NCBI SRA download in CLC
- Import own data from the local computer or from CRC servers
44
NGS Technologies: NCBI Sequence Read Archive (Jan 14, 2019)
- Illumina: 5,043,429
- ABI SOLiD: 25,882
- Ion Torrent: 74,684
- PacBio: ,410
- MinION: ,102
45
STEP 1: Import Reads to CLC
2
46
STEP 1: Import Reads to CLC
3 4 5
47
STEP 1: Import Reads to CLC; Download from NCBI SRA
2
48
CLC SRA Download
49
EBI ENA
50
EBI-ENA
51
Help : Import Illumina Reads
52
FASTQ Format
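A FASTQ record spans four lines: an identifier line starting with "@", the sequence, a "+" separator line, and a quality string with one character per base. The following is a minimal Python sketch of a reader, assuming uncompressed input with exactly four lines per record. The names `FastqRecord` and `parse_fastq` and the example read are illustrative and are not part of CLC Genomics Workbench.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    name: str
    sequence: str
    quality: str


def parse_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a FASTQ stream with four lines per record."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of input
        if not header.strip():
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline()
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError("sequence and quality lengths differ")
        yield FastqRecord(header[1:].strip(), sequence, quality)


# Made-up read, for illustration only.
example = io.StringIO("@read1\nGATTACA\n+\nIIIIHH#\n")
records = list(parse_fastq(example))
```

For real data, an established parser (for example the one in Biopython) is a safer choice than a hand-written reader.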
53
Results By CLC : Imported Illumina Reads
TrainingMaterials Workshops Pre-analyzed Results_RNA-Seq RNASeq _Workshop_DExvsUNT DexvsUnt Reads
54
Results By CLC: Imported Illumina Reads
55
QC for Sequencing Reads
Step 2: QC for Sequencing Reads
56
https://galaxyproject. github
57
FASTQC Project
58
Phred Score wikipedia
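A Phred score Q relates to the estimated base-calling error probability P by Q = -10 * log10(P), so Q20 corresponds to a 1 in 100 chance of error and Q30 to 1 in 1000. The sketch below decodes a quality string and converts scores to probabilities, assuming the Phred+33 ASCII encoding used by Sanger and recent Illumina FASTQ files. The function names and the example string are illustrative.

```python
from typing import List


def decode_phred33(quality: str) -> List[int]:
    """Convert a Phred+33 quality string into integer Phred scores."""
    return [ord(ch) - 33 for ch in quality]


def phred_to_error_probability(q: int) -> float:
    """Invert Q = -10 * log10(P) to get the error probability P."""
    return 10 ** (-q / 10)


# Made-up quality string: 'I' is Q40, '5' is Q20, '#' is Q2.
scores = decode_phred33("II5#")
probabilities = [phred_to_error_probability(q) for q in scores]
```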
59
Taken from Introduction to ChIP-Seq by HPC Tutorial by HBC Training
60
Taken from Introduction to ChIP-Seq by HPC Tutorial by HBC Training
61
Assessing Sequence Data Quality (led by Dawei Lin and Simon Andrews)
62
Step 2: Create Seq QC Report
1 2
63
Results By CLC: Read QC Report
64
Read Trimming (based on quality of reads or adapters)
65
Trim Reads
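Quality trimming can be illustrated with a very simple rule: drop bases from the 3' end of a read until a base at or above a chosen Phred threshold is reached. This is only a sketch of the idea and is not the trimming algorithm CLC Genomics Workbench applies. The threshold of 20, the Phred+33 offset, and the function name `trim_3prime` are assumptions made for the example.

```python
from typing import Tuple


def trim_3prime(sequence: str, quality: str,
                min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Remove trailing bases whose Phred score is below min_q."""
    end = len(sequence)
    while end > 0 and ord(quality[end - 1]) - offset < min_q:
        end -= 1
    return sequence[:end], quality[:end]


# Made-up read: the last base has Q2 ('#'), the one before it Q20 ('5').
trimmed_seq, trimmed_qual = trim_3prime("GATTACAGG", "IIIIIII5#")
```

Only the final base falls below the threshold here, so a single base is removed. Adapter trimming is a separate step that matches known adapter sequences rather than quality values.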
66
Annotate Reads: Create a Metadata Table
67
Step 3: Create a Metadata Table
68
Import Metadata
69
Import Metadata 2 1 3
70
Import Metadata Select All
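A metadata table is essentially a spreadsheet with one row per sample and one column per experimental factor. The sketch below shows, outside of CLC, how such a table maps samples to groups. The column names `Sample` and `Treatment` and the sample names are hypothetical placeholders, not the identifiers used in the workshop data.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, TextIO

# Hypothetical metadata; a real table would come from a CSV or Excel file.
metadata_csv = """Sample,Treatment
sample_1,Untreated
sample_2,Dex
sample_3,Untreated
sample_4,Dex
"""


def group_samples(handle: TextIO, group_column: str) -> Dict[str, List[str]]:
    """Collect sample names under each value of the chosen metadata column."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for row in csv.DictReader(handle):
        groups[row[group_column]].append(row["Sample"])
    return dict(groups)


groups = group_samples(io.StringIO(metadata_csv), "Treatment")
```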
71
STEP 4: Read Mapping
72
Read Mapping Wikipedia
73
Read Mapping Ozsolak et al. Nature Review Genetics
74
CLC Read Mapper Documentation
75
Best Practices
76
Popular Software
77
STEP 4: Read Mapping 5
78
STEP 4: Reads Mapping 7
79
STEP 4: Reads Mapping 8
80
Reference Genome
81
Reference Genomes https://www.ncbi.nlm.nih.gov/grc
82
Reference Genome
Human: GRCh38
Mouse: mm10 (C57BL/6J); 16 other mouse strains are now available
83
Step 4: Read Mapping
84
Step 4: Read Mapping 9
85
Step 4: Reads Mapping 10
86
Step 4: Reads Mapping
87
Normalization and Expression Values
TMM (trimmed mean of M values): a weighted trimmed mean of the log expression ratios, used by edgeR and CLCGx
88
Normalization Methods
89
Step 4: Reads Mapping
90
Results By CLC: Reads Mapping
91
Step 5: Create a Combined RNA-Seq Report
92
Reads Mapping; Gene expression Track GE
93
Reads Mapping; Transcript Level Gene expression Track - TE
94
Transcript Level Expression
95
Step 6: Create a PCA Plot - QC at the sample level
96
Step 7: Differential Expression
- Differential Expressions Between Two Groups, e.g. Treated vs Untreated, KO vs WT
- Differential Expressions Between Multiple Groups
97
Differential Expressions Between Two Groups – Treated vs Untreated
First, select mapped reads from the test samples. Then, select mapped reads from the control samples.
99
Differential Expressions Between Two Groups – Treated vs Untreated
TMM Normalization (Trimmed Mean of M values) calculates effective library sizes, which are then used as part of the per-sample normalization. TMM normalization adjusts library sizes based on the assumption that most genes are not differentially expressed.
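The idea behind the effective library size can be sketched in a few lines of Python. This is a deliberately simplified version: it trims only on the log ratios (M values) and takes an unweighted mean, whereas the published TMM method also trims on absolute expression and uses precision weights. It is not the code used by edgeR or CLC, and the function name, the 30% trim, and the toy counts are assumptions made for the example.

```python
import math
from typing import Sequence


def simplified_tmm_factor(sample: Sequence[float],
                          reference: Sequence[float],
                          trim: float = 0.3) -> float:
    """Scaling factor from an unweighted trimmed mean of log2 ratios."""
    n_sample, n_reference = sum(sample), sum(reference)
    # M value per gene: log2 ratio of library-size-scaled counts.
    m_values = sorted(
        math.log2((s / n_sample) / (r / n_reference))
        for s, r in zip(sample, reference)
        if s > 0 and r > 0
    )
    k = int(len(m_values) * trim)  # genes to drop from each tail
    kept = m_values[k:len(m_values) - k]
    if not kept:
        raise ValueError("no genes left after trimming")
    return 2 ** (sum(kept) / len(kept))


# Toy counts: nine unchanged genes and one strongly induced gene.
reference_counts = [100] * 10
sample_counts = [100] * 9 + [1100]
factor = simplified_tmm_factor(sample_counts, reference_counts)
effective_library_size = sum(sample_counts) * factor
```

In this toy case the single induced gene doubles the raw library size of the sample, but it is trimmed away before averaging, so the factor comes out at about 0.5 and the effective library size returns to roughly that of the reference.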
100
Differential Expressions between Multiple Groups
Use the metadata table to define groups
101
Step 8: Differential Expressions
102
Differential Expressions; Dex vs Unt Gene Level
103
GraphPad Statistics Guide :
104
Differential Expressions; Dex vs Unt Transcript Level
105
Step 8: Differential Expressions; Dex vs Unt Volcano Plot
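A volcano plot places each gene at x = log2(fold change) and y = -log10(p-value), so strongly changed and statistically significant genes sit in the upper left and upper right corners. The sketch below computes those coordinates and flags genes that pass a pair of cutoffs. The gene names, values, and the cutoffs (absolute log2 fold change of at least 1, p-value of at most 0.05) are illustrative assumptions, not results from the Dex vs Unt analysis.

```python
import math
from typing import Iterable, List, NamedTuple, Tuple


class VolcanoPoint(NamedTuple):
    gene: str
    log2_fold_change: float
    neg_log10_p: float
    significant: bool


def volcano_points(rows: Iterable[Tuple[str, float, float]],
                   min_abs_log2fc: float = 1.0,
                   max_p: float = 0.05) -> List[VolcanoPoint]:
    """Turn (gene, fold change, p-value) rows into volcano plot coordinates."""
    points = []
    for gene, fold_change, p_value in rows:
        log2fc = math.log2(fold_change)
        passes = abs(log2fc) >= min_abs_log2fc and p_value <= max_p
        points.append(VolcanoPoint(gene, log2fc, -math.log10(p_value), passes))
    return points


# Made-up values for three hypothetical genes.
rows = [("geneA", 8.0, 0.001), ("geneB", 1.1, 0.0001), ("geneC", 0.25, 0.2)]
points = volcano_points(rows)
```

Only geneA passes both cutoffs: geneB has a small p-value but a negligible fold change, and geneC has a large fold change but a p-value above the cutoff.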
106
Step 9: Create a Heat Map
107
Step 9: Create a Heat Map
108
Step 9: Create a Heat Map
109
Step 9: Create a Heat Map
110
Step 10: Create a Venn Diagram
111
Step 10: Create a Venn Diagram
112
Step 10: Create a Venn Diagram
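The regions of a two-set Venn diagram are plain set operations on lists of differentially expressed genes. The sketch below uses made-up gene identifiers and a hypothetical helper `venn_regions` to show which genes fall in each region.

```python
from typing import Dict, Iterable, Set


def venn_regions(a: Iterable[str], b: Iterable[str]) -> Dict[str, Set[str]]:
    """Split two gene lists into the three regions of a two-set Venn diagram."""
    set_a, set_b = set(a), set(b)
    return {
        "only_a": set_a - set_b,
        "both": set_a & set_b,
        "only_b": set_b - set_a,
    }


# Hypothetical gene lists, e.g. gene-level vs transcript-level results.
gene_level = ["gene1", "gene2", "gene3"]
transcript_level = ["gene2", "gene3", "gene4"]
regions = venn_regions(gene_level, transcript_level)
```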
113
Step 11: Create a Track
114
Step 12: Expression Browser – all in one large spreadsheet
115
Downstream Analysis
116
Downstream Analysis: DEG
Annotates differentially expressed genes from an RNA-seq experiment, using the curated public data from GEO
117
NextBio Research
118
Export Data from CLC
119
Find Correlated Gene Expression Studies from GEO
120
Find Correlated Gene Expression Studies from GEO
121
Ingenuity IPA Analysis
122
Thanks To
- HSLS: Carrie Iwema, David Leung, Michael Sweezer
- Qiagen: Shawn Prince
- Center for Research Computing: Kim F Wong, Mu Fangping