ChIP-Seq Analysis – Using CLC Genomics Workbench


1 ChIP-Seq Analysis – Using CLC Genomics Workbench
Nov 16, 2017. Ansuman Chattopadhyay, PhD, Health Sciences Library System, University of Pittsburgh

2 Topics
Transcription Factor ChIP-Seq, Histone ChIP-Seq, ATAC-Seq

3

4 Transcription Factor and Histone ChIP-Seq

5 ATAC-Seq Study

6 Graphical User Interface based software
Galaxy: http://galaxy.crc.pitt.edu:8080/
CLC Genomics Workbench

7 HSLS MolBio

8 NGS Software @ HSLS MolBio
NGS Analysis; Sanger Seq Analysis; Human, Mouse and Rat NGS Analysis

9 CLCbio Genomics Workbench
System Requirements
Windows Vista, Windows 7, Windows 8, Windows 10, Windows Server 2008, or Windows Server 2012
Mac OS X 10.7 or later
Linux: Red Hat 5.0 or later, SUSE 10.2 or later, Fedora 6 or later
8 GB RAM required, 16 GB RAM recommended
1024 x 768 display required, 1600 x 1200 display recommended
Intel or AMD CPU required
Minimum 10 GB free disc space in the tmp directory

10 CLC Plugins to Install
CLC Workbench Client Plugin
Histone ChIP-Seq
Advanced Peak Shape Tools Plugin – Beta
Download available at top right corner

11 Integrating with the CLCbio Genomics Server @ CRC

12 You need Secure Remote Access via Pulse to run CLCGx from off-campus locations or Pitt Wireless

13 CLC files at the CRC HTC Cluster
Reference Sequences
Look for folders organized by PI’s name

14 Create Folders at CRC-HTC

15 Create Folder in SaM-HTC Cluster

16 Create Workshop Folder @ FRANK

17 ChIP-Seq Workflow

18 Dataset

19 GEO Dataset

20 Download FASTQ Reads MyoD_Undiff_ChIP-Seq

21 Download FASTQ Reads MyoD_Undiff_ChIP-Seq

22 ENA: Download FASTQ Reads MyoD_Undiff_ChIP-Seq

23 Import: FASTQ Reads MyoD_Undiff_ChIP-Seq

24 Import: FASTQ Reads MyoD_Undiff_ChIP-Seq (single)

25 GEO Dataset – ATAC-Seq

26 STEP 1: Import Reads to CLC (Paired End)

27 STEP 1: Import Reads to CLC (Paired End)

28 FASTQ format

29 FASTQ Reads

30 FASTQC Project

31 Step 2: Create a Seq QC Report

32 Trim Reads – Adapter Seq etc.

33 Create Adapter List

34 Create Adapter List

35 Create FASTQC Report

36 FASTQC Report

37 Read Mapping to Ref Genome

38 Read Mapping to Ref Genome

39 Read Mapping to Ref Genome

40 Read Mapping to Ref Genome

41 Read Mapping to Ref Genome

42 Read Mapping around GM20652: Result from MyoD1 ChIP-Seq

43 Peak Calling
Strino et al., BMC Bioinformatics, June 2016

44 Peak Calling
Strino et al., BMC Bioinformatics, June 2016
Landt et al., Genome Research, 2012

45 Peak Calling
Strino et al., BMC Bioinformatics, June 2016

46 Discovering Obvious Peaks
The CLC shape-based peak caller finds peaks by building a Gaussian filter based on the mean and variance of the fragment length distribution, which are inferred from the cross-correlation profile.
Strino et al., BMC Bioinformatics, June 2016
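
To make the idea on this slide concrete, here is a minimal Python sketch of the general strand cross-correlation approach: forward-strand and reverse-strand read-start counts are correlated at increasing shifts, and the shift with the highest correlation is taken as an estimate of the typical fragment length. This is not CLC's implementation. The function names, the toy counts, and the `sigma` and `half_width` values are assumptions made for illustration, and the slide does not say exactly how CLC turns the mean and variance into its filter.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def cross_correlation_profile(fwd, rev, max_shift):
    """Correlate forward-strand counts with reverse-strand counts shifted by d bases."""
    n = len(fwd)
    return {d: pearson(fwd[: n - d], rev[d:]) for d in range(max_shift + 1)}


def gaussian_kernel(sigma, half_width):
    """Discrete Gaussian weights over offsets -half_width..half_width, summing to 1."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2)
               for k in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Toy data (hypothetical): three binding sites, reverse reads start 20 bp downstream.
fwd = [0] * 150
rev = [0] * 150
for pos, count in [(10, 5), (45, 8), (100, 3)]:
    fwd[pos] = count
    rev[pos + 20] = count

profile = cross_correlation_profile(fwd, rev, max_shift=60)
best_shift = max(profile, key=profile.get)  # shift with the highest correlation
kernel = gaussian_kernel(sigma=5.0, half_width=15)  # sigma is an assumed value
print(best_shift, len(kernel))
```

On real data the profile is noisier, and the spread of the correlation peak gives a rough idea of the variance of the fragment length.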

47 Peak Shape Score
The Peak Shape Score is standardised and follows a standard normal distribution, so a p-value for each genomic position can be calculated as p-value = Φ(−Peak Shape Score of the peak centre), where Φ is the standard normal cumulative distribution function.
Score = genomic coverage * filter, where * is the cross-correlation operator.
The score indicates how likely a genomic position is to be the center of a peak.
Strino et al., BMC Bioinformatics, June 2016
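
The two formulas on this slide can be sketched in a few lines of Python. The sketch assumes the score has already been standardised, a step the slide does not detail. The function names are hypothetical, and `cross_correlate` returns only the raw, unstandardised score with zero padding at the sequence ends.

```python
import math


def cross_correlate(coverage, filt):
    """Raw score at each position: the filter slid along the coverage, not flipped."""
    half = len(filt) // 2
    scores = []
    for i in range(len(coverage)):
        total = 0.0
        for k, weight in enumerate(filt):
            j = i + k - half
            if 0 <= j < len(coverage):  # positions outside the sequence count as 0
                total += coverage[j] * weight
        scores.append(total)
    return scores


def peak_shape_p_value(score):
    """p-value = Phi(-score), with Phi the standard normal CDF."""
    # Phi(-s) = 0.5 * erfc(s / sqrt(2))
    return 0.5 * math.erfc(score / math.sqrt(2.0))


raw = cross_correlate([0, 0, 1, 0, 0], [1, 2, 3])
p_at_zero = peak_shape_p_value(0.0)
p_at_high = peak_shape_p_value(3.0)
print(raw, p_at_zero, p_at_high)
```

A higher standardised score gives a smaller p-value, so a score of 0 maps to 0.5 and large positive scores map to values close to 0.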

48 Peak Shape Filter
Once the positive and negative regions have been identified, the CLC shape-based peak caller learns a filter that matches the average peak shape, which is called the Peak Shape Filter.
Strino et al., BMC Bioinformatics, June 2016

49 Peak Shape Filter
Strino et al., BMC Bioinformatics, June 2016

50 Peak Detection
Peaks are called by first identifying the genomic positions whose p-value passes the specified threshold (that is, a low p-value, corresponding to a high Peak Shape Score) and which do not have any higher score in a window around them. The size of this window is determined by the filter as the longest distance between two positive values in the filter. These maxima define the center of the peak, while the peak boundaries are identified by expanding from the center both left and right until either the score becomes 0 or the peak touches a window boundary.
Strino et al., BMC Bioinformatics, June 2016
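
The peak detection steps on this slide can be sketched as follows. This is a minimal illustration, not CLC's code: it works on a plain list of standardised scores, takes the threshold as a score cutoff (a high score corresponds to a low p-value), and uses hypothetical function names. Tied scores on a plateau are not merged here.

```python
def window_from_filter(filt):
    """Longest distance between two positive values in the filter."""
    positive = [i for i, w in enumerate(filt) if w > 0]
    return positive[-1] - positive[0] if positive else 0


def call_peaks(scores, score_threshold, window):
    """Return (start, center, end) index triples for each called peak."""
    peaks = []
    half = window // 2
    last = len(scores) - 1
    for center, score in enumerate(scores):
        if score <= score_threshold:
            continue
        lo, hi = max(0, center - half), min(last, center + half)
        # Keep only positions with no higher score inside the window.
        if any(scores[j] > score for j in range(lo, hi + 1)):
            continue
        # Expand left and right until the score drops to 0 or the window edge is hit.
        start = center
        while start > lo and scores[start - 1] > 0:
            start -= 1
        end = center
        while end < hi and scores[end + 1] > 0:
            end += 1
        peaks.append((start, center, end))
    return peaks


scores = [0, 0.5, 2, 5, 2, 0.5, 0, -1, 0, 1, 4, 1, 0]
window = window_from_filter([-1, 0.5, 1, 2, 3, 2, 1, 0.5, -1])
peaks = call_peaks(scores, score_threshold=3, window=window)
print(window, peaks)
```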

51 Call Peaks using Peak Shape information

52 Call Peaks using Peak Shape information

53 Call Peaks using Peak Shape information

54 Peak Calls Result

55 Peak Calls Result

56 Annotate Peaks with nearby genes

57 Annotate Peaks with nearby genes

58 5Prime and 3Prime Gene Distance

59 ChIP-Seq Result

60 Compare Datasets

61 Compare Datasets

62 Compare Datasets

63 Compare Datasets

64 Commonly Used Open-Source Tool

65 Comparison of CLC Results with MACS2.0

66 Histone ChIP-Seq
Li et al., Cell

67 Histone ChIP-Seq

68 Histone Modifications
Li et al., Cell

69 Running Histone ChIP-Seq
Classify Regions of variable length by Peak Shape

70 Running Histone ChIP-Seq

71 Running Histone ChIP-Seq

72 Running Histone ChIP-Seq

73 Histone ChIP-Seq Result

74 Histone ChIP-Seq Result
Classified Gene Regions in the genome

75 H3K4me3 – Diff: Result from the Transcription Factor ChIP-Seq tool

76 ATAC-Seq

77 ATAC-Seq Data Analysis

78 Comparison of DNase-Seq Results

79 HSLS-MBIS and Genomics Analysis Core
GAC Ansuman Chattopadhyay, PhD; Uma Chandran, PhD, MSIS; Sri Chaparala; Carrie Iwema, PhD, MLS

80 Thanks To…
HSLS: Sri Chaparala, Carrie Iwema, David Leung, Michael Sweezer
CLCBio: Shawn Prince
Center for Research Computing: Mu Fangping

