Bulk RNA-Seq Analysis Using CLCGenomics Workbench

Presentation on theme: "Bulk RNA-Seq Analysis Using CLCGenomics Workbench"— Presentation transcript:

1 Bulk RNA-Seq Analysis Using CLCGenomics Workbench
June 26, 2019 SRI CHAPARALA Ansuman Chattopadhyay

2 Topics
- Brief introduction to RNA-Seq experiments
- Analyze RNA-Seq data: dexamethasone treatment on airway smooth muscle cells (Himes et al., PLoS One 2014)
- Download sequence reads from EBI-ENA/NCBI SRA
- Import reads to CLC Genomics Workbench
- Align reads to the reference genome
- Estimate expression at the gene level
- Estimate expression at the transcript isoform level
- Statistical analysis of differentially expressed genes and transcripts
- Create heat map, volcano plots, and Venn diagram

3 Differential Gene Expressions
Raw Reads Venn Diagram Volcano Plot

4 Workshop Page

5 Pitt

6 HSLS MolBio

7 SUMMER 2019 Workshop Schedule

8 Partek Flow : Software for scRNA-Seq Data Analysis

9 NGS Software @ HSLS MolBio
NGS Analysis Sanger Seq Analysis

10 RNA-Seq Software @ HSLS MolBio
Enrichment Analysis Differentially Expressed Genes CLC Genomics Workbench Ingenuity Pathway Analysis Functions Diseases Pathways Key Pathway Advisor Upstream Regulators Volcano Plot PCA Plot Venn Diagram Heat Map Any Organism Illumina BaseSpace Correlation Engine Correlated Expression Studies RNA-Seq Reads Variant Detection Ingenuity Variant Analysis Variant Annotation and Prioritization RNA-Seq Analysis Downstream Analysis

11

12 RNA-Seq data analysis support through HSLS MBIS

13 RNA Seq Questionnaire
- What is the scientific objective of the RNA-Seq experiment?
- How many classes will be compared?
- Is only coding RNA (mRNA) expected to be detected, or also long non-coding RNA and miRNA?
- Did all the samples pass RNA quality checks before sequencing?
- Are there biological replicates? If so, how many?
- What type of sequencing platform was used to sequence the reads? (Illumina, Ion Torrent, SOLiD)
- Where was the sequencing performed? (Facility name and contact info)
- When was the sequencing performed? (Year/date)
- Which RNA extraction method was used in the experiment? (Total RNA / poly A / rRNA depletion; method and kit name and, if possible, a link to the protocol)
- Is the protocol strand specific? (Unstranded / forward / reverse; kit name and, if possible, a link to the protocol)
- Is the data single end or paired end?
- What is the expected read length?
- Do the reads contain adapters, or have they been removed? If not removed, please provide the adapter sequence, if available, or a link (this information is usually available from the facility)
- What are the experimental conditions for the differential expression analysis?
- Which organism and reference genome are to be used for the analysis?

14 CLC Genomics Workbench

15 CLCGx 12 Genomics Workbench BioMedical Workbench

16 Install Plugins

17 CLCbio Genomics Workbench
System Requirements
- Windows: Windows Vista, Windows 7, Windows 8, Windows 10, Windows Server 2012 or 2016
- Mac: OS X 10.10, and macOS 10.12, 10.13, 10.14
- Linux: RHEL 7 and later, SUSE Linux Enterprise Server 11 and later. (The software is expected to run without problem on other recent Linux systems, but we do not guarantee this.)
- 8 GB RAM required; 16 GB RAM recommended
- 1024 x 768 display required; 1600 x 1200 display recommended
- Intel or AMD CPU required
- 500 GB disk space required on the CLC Genomics server

18 CLCBio Genomics Workbench Server
- You can connect your CLC Genomics Workbench software to the core HTC cluster available to University of Pittsburgh researchers through the Center for Research Computing (CRC).
- This allows you to transparently migrate data from your workstation to the cluster and run analyses on the cluster, which then run independently of your workstation (i.e. you can shut down your machine and your analyses will continue unabated).

19 Center for Research computing (CRC)

20 Request access to CRC

21 CLC Genomics Workbench
- Ensure you have the most up-to-date version of the CLCbio Genomics Workbench (the software should tell you if there's a more recent version when you start it, or you can check on the CLCbio website).
- If you have not already done so, request a user account/allocation on the Center for Research Computing (CRC) HTC cluster by filling out the required information.
- If your computer is not connected to the Pitt network (e.g. you are working from home or on a trip), or you are working from a laptop connected to the Pitt wireless system, make sure you set up Pitt VPN so that you can communicate with the CLC Bioserver on the HTC cluster.
- Start the CLC Genomics Workbench.

22 Connect to CLC Server @ CRC

23 Access to CRC-HTC Cluster – CLC Server
If you DO NOT have a CRC-HTC account, use the following for limited access:
- UserID: hslsmolb
- PW: library1#
- Server host: clcbio.crc.pitt.edu
- Server port: 7777
If you have a CRC-HTC account, use:
- Pitt user name and Pitt password
- Server host: clcbio.crc.pitt.edu
- Server port: 7777

24 File Structure at CRC CLC Gx Server

25 Pre-analyzed Results

26 RNA-Seq Data

27 Bulk RNA-seq Study

28

29 NCBI SRA

30 NCBI SRA

31 NCBI SRA Untreated Vs DEX

32 RNA-Seq Basics

33 Bulk RNA-seq Basic Steps
- Convert to cDNA fragments
- Adaptor ligation
- Short sequence reads
- Align reads to reference genome
Wang Z, Gerstein M, Snyder M. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet Jan;10(1):57–63.

34 Bulk RNA-Seq Data Analysis Workflow

35 Must Read Cresko Lab, University of Oregon

36 RNA-Seq vs. Microarrays
RNA-Seq compared with microarrays:
- Covers a wider dynamic range
- Allows discovery of novel transcripts
- Able to detect SNPs
- More costly ($300-$1000/sample) than microarray ($100-$200/sample)
- Generates a larger dataset than microarray; uncompressed RNA-Seq raw files: >5 GB
(Figure panels: Microarray, RNA-Seq)
Riki Kawaguchi’s Blog:
Zhao S, Fung-Leung WP, Bittner A, Ngo K, Liu X. Comparison of RNA-Seq and microarray in transcriptome profiling of activated T cells. PLoS ONE Jan 16;9(1):e78644.

37 CLC Genomics Software User Interface

38 Contact CLCBio Support Team
CLCGX 12.0 User Manual: manuals/clcgenomicsworkbench/current/index.php?manual=Introduction_CLC_Genomics_Workbench.html

39 Create a Folder in CRC-HTC Cluster
1 2

40 Create Workshop Folder@ HTC-CLC Server
1 2 3

41 CLCGX Tools for RNA-Seq Data Analysis

42 Import FASTQ Reads to CLCGx
Step 1: Import FASTQ Reads to CLCGx

43 Import FASTQ Reads
- Download from SRA or EBI and import into CLC
- NCBI SRA download in CLC
- Import own data from local computer or from CRC servers

44 NGS Technologies
NCBI Sequence Read Archive (Jan 14, 2019):
- Illumina: 5,043429
- ABI SOLiD: 25,882
- Ion Torrent: 74,684
- PacBio: ,410
- MinION: ,102

45 STEP 1: Import Reads to CLC
2

46 STEP 1: Import Reads to CLC
3 4 5

47 STEP 1: Import Reads to CLC; Download from NCBI SRA
2

48 CLC SRA Download

49 EBI ENA

50 EBI-ENA

51 Help : Import Illumina Reads

52 FASTQ Format
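A FASTQ record spans four lines: an `@`-prefixed read identifier, the base sequence, a `+` separator line, and a quality string with one character per base. The following is a minimal Python sketch of reading such records; the function name `parse_fastq` and the example read are illustrative assumptions, not part of the workshop data or of CLC.

```python
import io
from typing import Iterator, TextIO, Tuple


def parse_fastq(handle: TextIO) -> Iterator[Tuple[str, str, str]]:
    """Yield (read_id, sequence, quality) for each 4-line FASTQ record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of input
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], sequence, quality


# Hypothetical single-read example (not taken from the workshop reads).
example = io.StringIO("@read1\nGATTACA\n+\nIIIIHHG\n")
records = list(parse_fastq(example))
```

For real data the same function can be given an open file handle instead of the in-memory example.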

53 Results By CLC : Imported Illumina Reads
TrainingMaterials Workshops Pre-analyzed Results_RNA-Seq RNASeq _Workshop_DExvsUNT DexvsUnt Reads

54 Results By CLC: Imported Illumina Reads

55 QC for Sequencing Reads
Step 2: QC for Sequencing Reads

56 https://galaxyproject. github

57 FASTQC Project

58 Phred Score wikipedia
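A Phred quality score Q encodes the estimated base-call error probability P as Q = -10 * log10(P), so Q20 corresponds to a 1 in 100 chance of error and Q30 to 1 in 1000. A short Python sketch of decoding a FASTQ quality string follows; it assumes the Phred+33 ASCII encoding used by current Illumina output, and the function names are illustrative.

```python
from typing import List


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q: int) -> float:
    """Invert Q = -10 * log10(P) to get the base-call error probability."""
    return 10 ** (-q / 10)


scores = phred_scores("I5+!")  # [40, 20, 10, 0]
p_q20 = error_probability(20)  # about 0.01, i.e. 1 error in 100 calls
```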

59 Taken from Introduction to ChIP-Seq by HPC Tutorial by HBC Training

60 Taken from Introduction to ChIP-Seq by HPC Tutorial by HBC Training

61 Assessing Sequence Data Quality (led by Dawei Lin and Simon Andrews)

62 Step 2: Create Seq QC Report
1 2

63 Results By CLC: Read QC Report

64 Read Trimming (based on quality of reads or adapters)
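To illustrate the idea of quality-based trimming, here is a deliberately simple Python sketch that removes low-quality bases from the 3' end of a read. It is not the algorithm CLC Genomics Workbench uses; the threshold of Q20, the Phred+33 encoding, and the function name `trim_3prime` are assumptions made for the example.

```python
from typing import Tuple


def trim_3prime(sequence: str, quality: str,
                min_q: int = 20, offset: int = 33) -> Tuple[str, str]:
    """Drop trailing bases whose Phred score is below min_q."""
    end = len(sequence)
    # Walk back from the 3' end until a base meets the threshold.
    while end > 0 and ord(quality[end - 1]) - offset < min_q:
        end -= 1
    return sequence[:end], quality[:end]


# Hypothetical read: the last two bases have Phred scores 2 and 0.
trimmed = trim_3prime("GATTACAGG", "IIIIIII#!")
```

Adapter trimming works differently (it matches known adapter sequences rather than quality values) and is not covered by this sketch.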

65 Trim Reads

66 Annotate Reads: Create a Metadata Table

67 Step 3: Create a Metadata Table

68 Import Metadata

69 Import Metadata 2 1 3

70 Import Metadata Select All

71 STEP 4: Read Mapping

72 Read Mapping Wikipedia

73 Read Mapping Ozsolak et al. Nature Review Genetics

74 CLC Read Mapper Documentation

75 Best Practices

76 Popular Software

77 STEP 4: Read Mapping 5

78 STEP 4: Reads Mapping 7

79 STEP 4: Reads Mapping 8

80 Reference Genome

81 Reference Genomes https://www.ncbi.nlm.nih.gov/grc

82 Reference Genome
- Human: GRCh38
- Mouse: mm10 (C57BL/6J); 16 other mouse strains are now available

83 Step 4: Read Mapping

84 Step 4: Read Mapping 9

85 Step 4: Reads Mapping 10

86 Step 4: Reads Mapping

87 Normalization and Expression Values
TMM: weighted trimmed mean of the log expression ratios (trimmed mean of M values), used by edgeR and CLCGx

88 Normalization Methods

89 Step 4: Reads Mapping

90 Results By CLC: Reads Mapping

91 Step 5 :Create a Combined RNA-Seq Report

92 Reads Mapping; Gene expression Track GE

93 Reads Mapping; Transcript Level Gene expression Track - TE

94 Transcript Level Expression

95 Step 6: Create a PCA Plot - QC at the sample level

96 Step 7: Differential Expression
- Differential expression between two groups, e.g. Treated vs Untreated, KO vs WT
- Differential expression between multiple groups

97 Differential Expressions Between Two Groups – Treated vs Untreated
- First, select mapped reads from the test samples
- Then, select mapped reads from the control samples

98

99 Differential Expressions Between Two Groups – Treated vs Untreated
TMM normalization (Trimmed Mean of M values) calculates effective library sizes, which are then used as part of the per-sample normalization. TMM normalization adjusts library sizes based on the assumption that most genes are not differentially expressed.
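The idea can be sketched in a few lines of Python. This is a simplified, unweighted version for illustration only: edgeR's TMM additionally trims on absolute expression, applies precision weights, and rescales the factors across samples, and the exact CLC implementation is not shown in these slides. The function name, the 30% trim fraction used here, and the toy counts are assumptions.

```python
import math
from typing import List


def tmm_factor(sample: List[int], reference: List[int], trim: float = 0.3) -> float:
    """Simplified TMM: 2 ** (trimmed mean of per-gene log2 ratios of proportions)."""
    n_s, n_r = sum(sample), sum(reference)
    # M value per gene: log2 of (sample proportion / reference proportion).
    m_values = sorted(
        math.log2((s * n_r) / (r * n_s))
        for s, r in zip(sample, reference)
        if s > 0 and r > 0
    )
    k = int(len(m_values) * trim)  # number of genes trimmed from each tail
    kept = m_values[k:len(m_values) - k] or m_values
    return 2 ** (sum(kept) / len(kept))


# Toy counts: nine unchanged genes plus one strongly induced gene in the sample.
reference = [100] * 10
sample = [100] * 9 + [1100]
factor = tmm_factor(sample, reference)
effective_size = sum(sample) * factor  # effective library size of the sample
```

In this toy case the induced gene doubles the raw library size of the sample, but it is trimmed away, so the effective library size comes out equal to the reference library size and the nine unchanged genes are no longer reported as down-regulated.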

100 Differential Expressions between Multiple Groups
Use the metadata table to define groups
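As an illustration of what defining groups from a metadata table means, the Python sketch below reads a small table and collects sample names per treatment. The column names (`sample`, `treatment`) and the sample names are hypothetical; in the workshop, the grouping is done inside CLC from the imported metadata table.

```python
import csv
import io
from collections import defaultdict

# Hypothetical metadata table; real tables would be loaded from a file.
metadata_csv = """sample,treatment
S1,Untreated
S2,Dex
S3,Untreated
S4,Dex
"""

groups = defaultdict(list)
for row in csv.DictReader(io.StringIO(metadata_csv)):
    groups[row["treatment"]].append(row["sample"])
```

Each key of `groups` is then one comparison group, with the list of samples assigned to it.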

101 Step 8: Differential Expressions

102 Differential Expressions; Dex vs Unt Gene Level

103 GraphPad Statistics Guide :

104 Differential Expressions; Dex vs Unt Transcript Level

105 Step 8: Differential Expressions; Dex vs Unt Volcano Plot
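A volcano plot places each gene at x = log2 fold change and y = -log10(p-value), so strongly changed and statistically significant genes end up in the upper left and upper right corners. The Python sketch below computes the coordinates and a label for one gene; the cutoffs (absolute log2 fold change of at least 1, p-value of at most 0.05) and the function name are assumptions chosen for illustration, not settings taken from the workshop.

```python
import math
from typing import Tuple


def volcano_point(log2_fc: float, p_value: float,
                  fc_cutoff: float = 1.0,
                  p_cutoff: float = 0.05) -> Tuple[float, float, str]:
    """Return (x, y, label) for one gene on a volcano plot."""
    y = -math.log10(p_value)
    if p_value <= p_cutoff and log2_fc >= fc_cutoff:
        label = "up"
    elif p_value <= p_cutoff and log2_fc <= -fc_cutoff:
        label = "down"
    else:
        label = "not significant"
    return log2_fc, y, label


# Hypothetical values for a single gene.
x, y, label = volcano_point(2.0, 0.001)
```

In practice an adjusted p-value (for example an FDR-corrected value) is often used in place of the raw p-value.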

106 Step 9: Create a HeatMap

107 Step 9: Create a HeatMap

108 Step 9: Create a HeatMap

109 Step 9: Create a HeatMap

110 Step 10: Create a Venn Diagram

111 Step 10: Create a Venn Diagram

112 Step 10: Create a Venn Diagram
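The regions of a two-set Venn diagram correspond directly to set operations on the lists of differentially expressed genes. A minimal Python sketch follows, with placeholder gene names and a hypothetical function name.

```python
from typing import Dict, Set


def venn_regions(a: Set[str], b: Set[str]) -> Dict[str, Set[str]]:
    """Split two gene sets into the three regions of a two-set Venn diagram."""
    return {"only_a": a - b, "both": a & b, "only_b": b - a}


# Placeholder gene names standing in for two differential expression results.
comparison_1 = {"geneA", "geneB", "geneC"}
comparison_2 = {"geneB", "geneC", "geneD"}
regions = venn_regions(comparison_1, comparison_2)
```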

113 Step 11: Create a Track

114 Step 12: Expression Browser – all in one large spread sheet

115 Downstream Analysis

116 Downstream Analysis DEG
Annotates differentially expressed genes from an RNA-Seq experiment, using the curated public data from GEO

117 NextBio Research

118 Export Data from CLC

119 Find Correlated Gene Expression Studies from GEO

120 Find Correlated Gene Expression Studies from GEO

121 Ingenuity IPA Analysis

122 Thanks To
- HSLS: Carrie Iwema, David Leung, Michael Sweezer
- Qiagen: Shawn Prince
- Center for Research Computing: Kim F Wong, Mu Fangping

