ChIP-seq Robert J. Trumbly

ChIP-seq Robert J. Trumbly Department of Biochemistry and Cancer Biology Block Health Science 448, UTHSC 419-383-4347 robert.trumbly@utoledo.edu

ChIP-seq
ChIP-seq (chromatin immunoprecipitation followed by DNA sequencing) has become the preferred method for analyzing protein-DNA interactions and chromatin structure on a genomic scale.
ChIP-seq has become practical because of rapid developments in NGS (next-generation sequencing).

NGS
The transition from microarrays to NGS creates not just more data but a different type of data.
Microarray data are analog: how much expression (signal) for a gene?
NGS data are digital: e.g., which splicing variant is expressed?

NGS
RNA-seq: can detect splicing variants, allelic expression, and novel mRNAs.
ChIP-seq: can detect differential binding to allelic variants, leading to information about binding specificity.

Park, Oct 2009

TFs: sharp binding sites
RNA Pol II: sharp and extended
Histone modifications: extended domains
Park, Oct 2009

Park, Oct 2009

ChIP-seq and RNA-seq analysis (Pepke et al., Nature Methods 6:S22-S32, 2009)

This example shows a workflow for the analysis of data from chromatin immunoprecipitation followed by sequencing (ChIP–seq). This analysis can be done by a bench scientist using current resources, and a similar strategy could be used for other types of next-generation sequencing data. Blue boxes show steps that can be performed using Galaxy. Integration or cross-sectioning of data can often be done in the University of California-Santa Cruz (UCSC) Genome Browser or by joining lists in Galaxy (purple box). Downstream steps, such as known motif analysis and Gene Ontology analysis, can be achieved with online or stand-alone tools (orange boxes). Galaxy can also be used to establish analytical pipelines for calling SNPs that could then be integrated into sequencing-based data, such as reads from ChIP–seq. CEAS, Cis-regulatory Element Annotation System; MACS, Model-based Analysis of ChIP–Seq; TSS, transcription start site. (Hawkins et al., 2010)

FASTQ files
Output of NGS is usually in FASTQ files. Example record:
@SEQ_ID
GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT
+
!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65
Line 1: @ followed by the sequence ID
Line 2: the sequence
Line 3: +, sometimes followed by text
Line 4: quality score for each base, encoded as an ASCII symbol
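The four-line record layout described on this slide can be read with a few lines of Python. This is a minimal sketch, not part of the original slides: it assumes each record occupies exactly four lines (no wrapped sequences), and the function name read_fastq is a hypothetical helper chosen for illustration.

```python
import io

def read_fastq(handle):
    """Yield (seq_id, sequence, quality) tuples from a 4-line-per-record FASTQ stream.

    Assumption: sequences and qualities are not wrapped across lines.
    """
    while True:
        header = handle.readline()
        if header == "":          # end of file
            return
        header = header.rstrip("\r\n")
        if header == "":          # tolerate blank lines between records
            continue
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@"):
            raise ValueError("line 1 must start with '@': %r" % header)
        if not plus.startswith("+"):
            raise ValueError("line 3 must start with '+': %r" % plus)
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ for %s" % header)
        yield header[1:], seq, qual

# The example record from the slide, held in memory instead of a file.
EXAMPLE = (
    "@SEQ_ID\n"
    "GATTTGGGGTTCAAAGCAGTATCGATCAAATAGTAAATCCATTTGTTCAACTCACAGTTT\n"
    "+\n"
    "!''*((((***+))%%%++)(%%%%).1***-+*''))**55CCF>>>>>>CCCCCCC65\n"
)

records = list(read_fastq(io.StringIO(EXAMPLE)))
for seq_id, seq, qual in records:
    print(seq_id, len(seq), "bases")
```

The same generator works on a real file by passing an open text handle, for example open("reads.fastq") (a hypothetical file name).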

Quality scores
Phred quality score: Q = -10 × log10(p), where p is the probability that the corresponding base call is incorrect.
Example: p = 0.001, log10(0.001) = -3, so Q = -10 × (-3) = 30.
For the FASTQ file, an offset of 33 (the most common encoding) is added to the raw quality score, and the ASCII symbol corresponding to that number is stored and displayed.
There are several variations on the quality score encoding, so programs that interpret the scores must know which version is in use.
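The arithmetic above can be checked directly. The following is a small sketch, assuming the offset-33 encoding mentioned on the slide; the function names are hypothetical and chosen only for illustration.

```python
import math

def phred_from_error_prob(p):
    """Q = -10 * log10(p), where p is the probability of a wrong base call."""
    return -10 * math.log10(p)

def error_prob_from_phred(q):
    """Inverse of the Phred formula: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def decode_quality(qual, offset=33):
    """Convert a FASTQ quality string to a list of integer Phred scores.

    The default offset of 33 is an assumption; other encodings use other offsets.
    """
    return [ord(ch) - offset for ch in qual]

def encode_quality(scores, offset=33):
    """Convert integer Phred scores back to an ASCII quality string."""
    return "".join(chr(q + offset) for q in scores)

print(phred_from_error_prob(0.001))   # the slide's example, Q of about 30
print(decode_quality("!5I"))          # ASCII codes 33, 53, 73 minus the offset
```

Passing the wrong offset silently shifts every score by a constant, which is why a program must know which encoding variant a file uses.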

Integration of External Signaling Pathways with the Core Transcriptional Network in Embryonic Stem Cells
Chen et al., Cell 133, 13 June 2008, Pages 1106–1117
Chromatin immunoprecipitation coupled with ultra-high-throughput DNA sequencing (ChIP-seq) was used to map the locations of 13 sequence-specific TFs (Nanog, Oct4, STAT3, Smad1, Sox2, Zfx, c-Myc, n-Myc, Klf4, Esrrb, Tcfcp2l1, E2f1, and CTCF) and 2 transcription regulators (p300 and Suz12).

Figure 1. Genome-Wide Mapping of 13 Factors in ES Cells by Using ChIP-seq Technology. TFBS profiles for the sequence-specific transcription factors and mock ChIP control at the Pou5f1 and Nanog gene loci are shown.

Figure 2. Identification of Enriched Motifs by Using a De Novo Approach. Matrices predicted by the de novo motif-discovery algorithm Weeder.

ChIP-seq tutorial
ChIP-seq Analysis with Galaxy: from reads to peaks (and motifs)
2 - Obtaining the raw data: accessing ChIP-seq reads from the ArrayExpress database
3 - Upload the reads to the Galaxy server
4 - Some statistics on the raw data
5 - Mapping the reads with Bowtie
6 - Peak calling with MACS
7 - Retrieving the peak sequences
8 - Visualize the peak regions in the UCSC Genome Browser
9 - Try to identify over-represented motifs
http://ngs.molgen.mpg.de/ngsuploads/Cornelius/ESGI/Chip.htm
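To give a feel for what the peak-calling step is looking for, here is a deliberately naive sketch that counts mapped read positions in fixed windows and reports windows where the ChIP sample is enriched over the control. This is not the MACS algorithm and not the tutorial's Galaxy workflow; the window size, thresholds, pseudocount, and function names are all assumptions made for illustration.

```python
from collections import Counter

def window_counts(positions, window=200):
    """Count reads per fixed-size window, keyed by the window start coordinate."""
    return Counter((pos // window) * window for pos in positions)

def call_enriched_windows(chip, control, window=200, min_reads=5, min_fold=3.0):
    """Return (start, end, chip_reads, fold) for windows enriched in the ChIP sample.

    Toy illustration only: control counts are scaled to the ChIP read depth
    and a pseudocount of 1 avoids division by zero. Real peak callers such
    as MACS use a statistical model instead of a fixed fold threshold.
    """
    chip_counts = window_counts(chip, window)
    ctrl_counts = window_counts(control, window)
    scale = len(chip) / len(control) if control else 1.0
    peaks = []
    for start in sorted(chip_counts):
        n = chip_counts[start]
        expected = ctrl_counts.get(start, 0) * scale + 1.0
        fold = n / expected
        if n >= min_reads and fold >= min_fold:
            peaks.append((start, start + window, n, fold))
    return peaks

# Hypothetical read start positions on a single chromosome.
chip_reads = [1010, 1020, 1050, 1100, 1150, 1190, 5000, 9000]
control_reads = [300, 5000, 5050, 9000]
peaks = call_enriched_windows(chip_reads, control_reads)
print(peaks)
```

In this made-up data set, only the window starting at position 1000 holds enough ChIP reads with no matching control reads to pass both thresholds.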

ChIP-seq tutorial
Revisions to the tutorial:
Part 2, step 4: click on the name of the entry.
Part 2, step 5: click on the ENA link at the bottom of the page.
Part 4, step 2: there is no "FASTX-Toolkit for FASTQ data" section; the tools are under the general heading "NGS: QC and manipulation". There is also a new "FastQC:Read QC" tool here that is useful.

References
For the tutorial: Integration of External Signaling Pathways with the Core Transcriptional Network in Embryonic Stem Cells. Chen et al., Cell 133, 13 June 2008, 1106–1117.
The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Cock et al., Nucleic Acids Research 38(6), 2010, 1767–1771.
Computation for ChIP-seq and RNA-seq studies. Pepke et al., Nature Methods Supplement 6(11s), November 2009, S22-S32.
ChIP–seq: advantages and challenges of a maturing technology. Park, Nature Reviews Genetics 10, October 2009, 669-680.
Next-generation genomics: an integrative approach. Hawkins et al., Nature Reviews Genetics 11, July 2010, 477-486.