Trinity College Dublin, The University of Dublin GE3M25: Data Analysis, Class 4 Karsten Hokamp, PhD Genetics TCD, 07/12/2015

1 Trinity College Dublin, The University of Dublin GE3M25: Data Analysis, Class 4 Karsten Hokamp, PhD Genetics TCD, 07/12/2015 http://bioinf.gen.tcd.ie/GE3M25/project

2 GE3M25 Data Handling. Module content: Python Programming, Bioinformatics, ChIP-Seq analysis. http://bioinf.gen.tcd.ie/GE3M25/project

3 Project report hand-in: 23/12/2015 or 10/01/2016?

4 Class 4: Project overview, Read mapping, Peak detection, Visualisation. http://bioinf.gen.tcd.ie/GE3M25/project

5 ChIP-Seq: Different sets of genes are expressed under different conditions. Expression is regulated through transcription factors that bind to promoters. This binding can be captured by chromatin immunoprecipitation (ChIP), and the bound sequences are revealed through next-generation sequencing (NGS).

7 ChIP-Seq Analysis Goal

8 Steps in this class:
1. Download FastQ data set (ChIP-Seq of TF in yeast)
2. Read mapping (Bowtie2)
3. Generate indexed and sorted BAM file
4. Peak calling
5. Visualisation (IGV)
6. Store BAM and index files

9 Data Download: Start here: bioinf.gen.tcd.ie/GE3M25

10 Data Download: bioinf.gen.tcd.ie/GE3M25/project (optional control file for improved results)

11 Data Download: bioinf.gen.tcd.ie/GE3M25/project/data

12 Steps in this class:
1. Download FastQ data set (ChIP-Seq of TF in yeast) ✔
2. Read mapping (Bowtie2)
3. Generate indexed and sorted BAM file
4. Peak calling
5. Visualisation (IGV)
6. Store BAM and index files

13 Installing Bowtie: Start here: bioinf.gen.tcd.ie/GE3M25/project

14 Installing Bowtie: Switch to Terminal, unpack

15 Generate Index: Download reference sequence from bioinf.gen.tcd.ie/GE3M25/project

16 Generate Index: Run bowtie2-build
./bowtie2-build S288C_reference_sequence_R64-2-1_20150113.fsa yeast
program: ./bowtie2-build; reference sequence: S288C_reference_sequence_R64-2-1_20150113.fsa; name for index: yeast

17 Generate Index: Run bowtie2-build
./bowtie2-build S288C_reference_sequence_R64-2-1_20150113.fsa.txt yeast
program: ./bowtie2-build; reference sequence: S288C_reference_sequence_R64-2-1_20150113.fsa.txt; name for index: yeast
Note the .txt suffix: added by the browser? Use the file name exactly as it appears on disk.
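
Because the downloaded reference may or may not carry the extra .txt suffix, the index step can be written so that it accepts either name. The following is only a sketch under that assumption: pick_reference and build_index are hypothetical helper names, not part of Bowtie2, and ./bowtie2-build is expected in the current directory as on the slides.

```shell
# Sketch: choose whichever reference file name exists, then build the index.
# pick_reference and build_index are hypothetical helpers for illustration.
pick_reference() {
    dir=${1:-.}
    base=S288C_reference_sequence_R64-2-1_20150113.fsa
    # Prefer the plain .fsa name; fall back to the browser-renamed .fsa.txt.
    for candidate in "$dir/$base" "$dir/$base.txt"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "reference sequence not found in $dir" >&2
    return 1
}

build_index() {
    ref=$(pick_reference "${1:-.}") || return 1
    [ -x ./bowtie2-build ] || { echo "./bowtie2-build not found" >&2; return 1; }
    # Same call as on the slides: reference sequence, then name for the index.
    ./bowtie2-build "$ref" yeast
}
```

Calling build_index from the directory that holds both the reference and bowtie2-build should then produce the yeast index files checked on the next slide.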

18 Generate Index: Check output files

19 Read mapping: Run bowtie2
./bowtie2 -x yeast -U 13222222.ChIP.fastq.gz -p 4 -S 13222222.sam
program: ./bowtie2; -x: index name; -U: FastQ file; -p 4: use of four CPU cores; -S: output

20 Read mapping: Run bowtie2
./bowtie2 -x yeast -U 13222222.ChIP.fastq.gz -p 4 > 13222222.sam
program: ./bowtie2; -x: index name; -U: FastQ file; -p 4: use of four CPU cores; > 13222222.sam: output to screen redirected to file

21 Read mapping: Output summary
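
The total number of reads in the mapping summary can be cross-checked against the FastQ file itself. A minimal sketch, assuming a standard gzip-compressed FastQ file with exactly four lines per read (no wrapped sequence lines); count_fastq_reads is a hypothetical helper name.

```shell
# Sketch: count reads in a gzip-compressed FastQ file.
# Assumes four lines per read; count_fastq_reads is a hypothetical helper.
count_fastq_reads() {
    gzip -dc "$1" | awk 'END { printf "%d\n", NR / 4 }'
}
```

For the data set on the slides this would be called as count_fastq_reads 13222222.ChIP.fastq.gz, and the result compared with the read total that Bowtie2 prints.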

22 Read mapping: other options (see Bowtie2 manual)

23 Steps in this class:
1. Download FastQ data set (ChIP-Seq of TF in yeast) ✔
2. Read mapping (Bowtie2) ✔
3. Generate indexed and sorted BAM file
4. Peak calling
5. Visualisation (IGV)
6. Store BAM and index files

24 Installing samtools: Start here: bioinf.gen.tcd.ie/GE3M25/project. Download, then run chmod +x samtools in Terminal.

25 Sorting and indexing: Change from SAM to BAM format
./samtools view -b -S 13222222.sam > 13222222.bam
program: ./samtools; command: view; input file: 13222222.sam; > 13222222.bam: redirection of output to file
Parameters: -b output in binary format; -S input in SAM format

26 Sorting and indexing: Change from SAM to BAM format
./samtools view -b -S -@ 4 13222222.sam > 13222222.bam
-@ 4: use 4 cores for compression

27 Sorting and indexing: Sort BAM file
./samtools sort -@ 4 13222222.bam 13222222.sorted
program: ./samtools; command: sort; -@ 4: use all four cores; input file: 13222222.bam; output prefix: 13222222.sorted (.bam suffix will be added)

28 Sorting and indexing: Create index file
./samtools index 13222222.sorted.bam
program: ./samtools; command: index; input file: 13222222.sorted.bam (index with .bai suffix will be created)
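
The sort call on the slides passes an output prefix as the second argument, which is the form accepted by samtools up to version 1.2. From samtools 1.3 onwards the output file is named with -o instead. The sketch below chains the three steps using the -o form; sam_to_sorted_bam and the SAMTOOLS variable are hypothetical names introduced here, and the -S flag is kept only to match the slides (recent versions detect the input format themselves).

```shell
# Sketch: SAM -> BAM -> sorted BAM -> index, using the -o form of
# "samtools sort" expected by samtools 1.3 and later.
# sam_to_sorted_bam and SAMTOOLS are hypothetical names for illustration.
sam_to_sorted_bam() {
    id=$1
    tool=${SAMTOOLS:-./samtools}
    # Convert the text SAM file to binary BAM.
    "$tool" view -b -S "$id.sam" > "$id.bam" || return 1
    # Sort by coordinate with four threads; the output name is given in full.
    "$tool" sort -@ 4 -o "$id.sorted.bam" "$id.bam" || return 1
    # Create the .bai index next to the sorted BAM file.
    "$tool" index "$id.sorted.bam"
}
```

With the file names from the slides, sam_to_sorted_bam 13222222 would leave 13222222.sorted.bam and its .bai index in the current directory. With an older samtools, keep the prefix form shown on slide 27.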

29 Sorting and indexing: Check output files

30 Steps in this class:
1. Download FastQ data set (ChIP-Seq of TF in yeast) ✔
2. Read mapping (Bowtie2) ✔
3. Generate indexed and sorted BAM file ✔
4. Peak calling
5. Visualisation (IGV)
6. Store BAM and index files

31 Installing macs
In the Terminal: pip install macs --user
Find the location of the tool: find ~/ -iname '*macs*'
/Users/kahokamp//Library/Python/2.7/bin/macs (same as ~/Library/Python/2.7/bin/macs)

32 Peak calling

33 Peak calling
path/to/macs -t 13222222.sorted.bam -n yeast_macs -g 12000000
program: path/to/macs (replace with path to macs); -t: treatment file; -n: output prefix; -g: genome size

34 Peak calling

35 Peak calling

36 Peak calling: pairing model

37 Peak calling: Check output

38 Hide first 19 rows; sort by column G or H
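
The same hide-and-sort step can be done in the Terminal instead of a spreadsheet. This is a sketch only: top_peaks is a hypothetical helper, and it assumes the peaks file is tab-separated, that its leading rows are comment lines starting with #, blank lines and one header row, and that columns G and H are the 7th and 8th fields. Check the header row of your own file to see which score each column holds.

```shell
# Sketch: list peaks sorted by a score column, highest first.
# top_peaks is a hypothetical helper; column 7 = G, column 8 = H.
top_peaks() {
    file=$1
    column=${2:-7}
    count=${3:-10}
    tab=$(printf '\t')
    # Drop comment and blank lines, keep only rows whose second field is a
    # number (this removes the header row), then sort numerically, descending.
    grep -v -e '^#' -e '^$' "$file" \
        | awk -F '\t' '$2 ~ /^[0-9]+$/' \
        | LC_ALL=C sort -t "$tab" -k "$column,$column" -n -r \
        | head -n "$count"
}
```

Assuming the output prefix yeast_macs from the peak calling slide, the peaks table would be yeast_macs_peaks.xls, so top_peaks yeast_macs_peaks.xls 8 5 would show the five peaks with the highest value in column H.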

40 Steps in this class:
1. Download FastQ data set (ChIP-Seq of TF in yeast) ✔
2. Read mapping (Bowtie2) ✔
3. Generate indexed and sorted BAM file ✔
4. Peak calling ✔
5. Visualisation (IGV)
6. Store BAM and index files

41 Visualisation with IGV (Integrative Genomics Viewer)
1. Download IGV (local copy on bioinf)
2. Unpack (on the command line): unzip IGV_2.3.66.app.zip
3. Start by double-click in Finder
4. Load S. cerevisiae (sacCer3) genome
5. Load BAM file

42 Pick a region with a peak; navigate there in IGV
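
IGV accepts a locus of the form chromosome:start-end in its navigation box, so candidate regions can be printed straight from the peaks table. A sketch under the same assumptions about the file layout as the spreadsheet view (tab-separated; chromosome, start and end in the first three columns; a score in the 7th column); peak_loci is a hypothetical helper. The chromosome names come from the reference used for mapping and may need adjusting if they differ from the names in the genome loaded in IGV.

```shell
# Sketch: print "chromosome:start-end" strings for the highest-scoring peaks.
# peak_loci is a hypothetical helper; the score is assumed to be in column 7.
peak_loci() {
    file=$1
    count=${2:-5}
    # Put the score first so a plain numeric sort ranks the peaks,
    # then keep only the locus string.
    grep -v -e '^#' -e '^$' "$file" \
        | awk -F '\t' '$2 ~ /^[0-9]+$/ { print $7 "\t" $1 ":" $2 "-" $3 }' \
        | LC_ALL=C sort -n -r \
        | head -n "$count" \
        | cut -f 2
}
```

Each printed line can be pasted into the IGV locus box to jump to that peak.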

44 Steps in this class:
1. Download FastQ data set (ChIP-Seq of TF in yeast) ✔
2. Read mapping (Bowtie2) ✔
3. Generate indexed and sorted BAM file ✔
4. Peak calling ✔
5. Visualisation (IGV) ✔
6. Store BAM and index files

45 Storage of BAM file: Upload .bam, .bam.bai and MACS files through bioinf.gen.tcd.ie/GE3M25/project

46 Optional steps in this class:
1. Download and map Input file
2. Run MACS with Input file as control
3. Change parameters in Bowtie2, MACS
4. Trim FastQ data
5. Compare results
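
For the optional control run, MACS takes the Input sample through its -c option alongside the treatment file given with -t. The sketch below assumes the Input reads have already been mapped, sorted and indexed in the same way as the ChIP reads; call_peaks_with_control, the MACS variable and the default output prefix are hypothetical names, and the path to macs has to be replaced as on the peak calling slide.

```shell
# Sketch: peak calling with an Input sample as control (-c).
# call_peaks_with_control, MACS and the default prefix are hypothetical;
# replace path/to/macs with the real location of the tool.
call_peaks_with_control() {
    treatment=$1
    control=$2
    name=${3:-yeast_macs_control}
    macs=${MACS:-path/to/macs}
    # Refuse to start if either BAM file is missing.
    for f in "$treatment" "$control"; do
        [ -f "$f" ] || { echo "missing file: $f" >&2; return 1; }
    done
    # Same options as the slide, plus -c for the control file.
    "$macs" -t "$treatment" -c "$control" -n "$name" -g 12000000
}
```

Using a different output prefix from the first run keeps both result sets available for the comparison in step 5.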

47 Don't forget to log out!

