Published by Evan Woods. Modified over 9 years ago.
1
Chromatin Immunoprecipitation DNA Sequencing (ChIP-seq)
2
2nd and 3rd Generation DNA Sequencers and Applications
Sequencing Platforms
Roche 454 (2nd)
Illumina Solexa (2nd)
ABI SOLiD (2nd)
Helicos (3rd)
Applications
De novo sequencing
Targeted resequencing
Digital Gene Expression (DGE)
RNA-seq
ChIP-seq
3
Why ChIP-seq? Protein-DNA interactions Chromatin States
Transcriptional regulation
Histone modifications (methylation at K and R) play a role in gene regulation, both expression and repression.
Enzymes that catalyze the methylation reaction have been implicated in playing critical roles in development and pathological processes.
Promoter regions of active genes have reduced nucleosome occupancy and elevated histone acetylation.
4
ChIP experiment in a nutshell
Protein cross-linked to DNA in vivo by treating cells with formaldehyde
Shear chromatin (sonication)
IP with specific antibody
Reverse cross-links, purify DNA
PCR amplification*
Identify sequences
Genome-wide association map
* unless using a single-molecule sequencer
5
History: From ChIP-chip to ChIP-seq
ChIP-chip (c. 2000)
Resolution: 30-100 bp
Coverage is limited by the sequences on the array.
Cross-hybridization between probes and non-specific targets creates background noise.
Tiled arrays cover most of the non-repetitive genome.
Cost increases with the size of the genome. For example, yeast has been very well characterized, but human less so because of its genome size.
6
ChIP-seq experiment (2007-present)
7
Sample Prep: Solexa vs. Helicos
8
ChIP-seq Material sample preps with in-house protocols
Solexa sample prep
Normal QC and ChIP steps
Input material typically >30 ng
End-repair (1 h)
Purification (phenol/precipitation) (1.5 h)
A-overhang (1 h)
Adapter oligo ligation (30 min)
Purification (phenol/precipitation) (1.5 h)
Size selection (30 min by E-gel)
Precipitation (1 h)
Amplification PCR (2 h) (12-18 cycles)
Diagnostic gel (30 min)
QC by direct qPCR (4 h)
Amount of library sequenced: approx. 1/10
Unique tags after analysis: >3M (based on our limited ERa ChIP-seq libraries)

Helicos sample prep
Normal QC and ChIP steps
Input material 3-9 ng
RNase A/Proteinase K treatment (2-3 h)
Purification (phenol/precipitation) (1.5 h)
Tailing (1.5 h)
Termination (1.5 h)
Amount of library sequenced: approx. 1/3
Unique tags after analysis: approx. >12M (based on our limited ERa ChIP-seq libraries)

**Slide borrowed from Thomas Westerling
10
Helicos vs Solexa vs ChIP2
Solexa data (red): unique tags 4M; peaks called [A]; negative peaks [B]
Helicos data (blue): unique tags 13M; peaks called; negative peaks 1000 [E]
ChIP2 [C] data (green): array technology, no tags; peaks called; FDR 20 [D]
[Figure: overlap of peaks from 1. Solexa, 2. Helicos and 3. ChIP2; counts shown on the slide: 2900, 2541, 433, 4700, 5293, 3744, 1661]
A) More inclusive (10%) ELAND mapping used (compare to Bowtie in library table).
B) MACS performs a sample swap between ChIP and Input (chromatin) samples and calculates a local λ-value to determine the level of background peaks called in control data. This gives an FDR for each positive peak. Due to the nature of deep sequencing combined with PCR, this parameter is in some samples extremely high and not entirely trustworthy.
C) ChIP2 data published in Carroll et al. Nat Genet Nov;38(11):
D) FDR values of ChIP2 are calculated differently from FDRs by MACS and are not directly comparable.
E) Negative peaks, and thus local FDR values, are at first glance more reliable in Helicos sequencing, at least in part because the lack of amplification removes scientist-introduced artifacts and reduced complexity of the sequenced library.
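The sample-swap idea in note B can be illustrated with a minimal Python sketch. It assumes each peak carries a single score (for example a -10*log10 p-value) and estimates the FDR at a given score as the number of swapped-sample (negative) peaks at or above that score divided by the number of ChIP peaks at or above it. The function name `swap_fdr` and the toy scores are hypothetical and do not come from the slides or from the MACS code itself.

```python
from bisect import bisect_left

def swap_fdr(chip_scores, control_scores):
    """Empirical FDR per ChIP peak from a ChIP/Input sample swap.

    For a ChIP peak with score s:
        FDR(s) = (# negative peaks with score >= s) / (# ChIP peaks with score >= s)
    """
    chip_sorted = sorted(chip_scores)
    ctrl_sorted = sorted(control_scores)
    fdrs = []
    for s in chip_scores:
        # Number of peaks scoring at least s in each list.
        n_chip = len(chip_sorted) - bisect_left(chip_sorted, s)
        n_ctrl = len(ctrl_sorted) - bisect_left(ctrl_sorted, s)
        # n_chip >= 1 because s itself is a ChIP score; cap the ratio at 1.
        fdrs.append(min(1.0, n_ctrl / n_chip))
    return fdrs

# Toy scores (hypothetical): five ChIP peaks and two negative peaks.
chip = [50, 40, 30, 20, 10]
negative = [25, 12]
fdr = swap_fdr(chip, negative)
```

With these toy scores the three strongest ChIP peaks have no negative peak at or above their score, so their estimated FDR is 0, while the weaker peaks pick up a non-zero FDR.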
11
ChIP-seq Analysis
12
ChIP-seq peaks
Only the 5’ ends of fragments are sequenced.
Tags from both the + and - strands are aligned to the reference genome.
13
+/- tag mapping
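Because only the 5’ ends are sequenced, + strand tags pile up upstream of the binding site and - strand tags downstream. A common way to merge the two piles is to shift every tag by half the fragment size toward the fragment centre before counting. The Python sketch below shows the idea under simple assumptions: tags are (position, strand) pairs, the fragment size is known, and `shift_tags` and `tag_counts` are hypothetical helper names.

```python
from collections import Counter

def shift_tags(tags, fragment_size):
    """Shift each 5' tag position by half the fragment size toward the fragment centre.

    tags: iterable of (position, strand) with strand '+' or '-'.
    """
    half = fragment_size // 2
    shifted = []
    for pos, strand in tags:
        if strand == "+":
            shifted.append(pos + half)   # + strand tags move downstream
        elif strand == "-":
            shifted.append(pos - half)   # - strand tags move upstream
        else:
            raise ValueError(f"unknown strand: {strand!r}")
    return shifted

def tag_counts(positions, bin_size):
    """Count shifted tags per fixed-size bin, keyed by bin start coordinate."""
    return Counter((p // bin_size) * bin_size for p in positions)

# Toy example: two + tags and two - tags flanking a site near position 200.
tags = [(100, "+"), (110, "+"), (290, "-"), (300, "-")]
shifted = shift_tags(tags, fragment_size=200)
counts = tag_counts(shifted, bin_size=50)
```

After shifting, the four tags that were about 200 bp apart land within 20 bp of each other, which is what produces a single peak in a shifted tag-count graph.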
14
Types of Analysis
Binding site identification and discovery of binding sequence motifs (non-histone ChIP)
Epigenomic gene regulation and chromatin structure (histone ChIP)
15
Binding Site Detection But where does the meat go?
16
Control: Input DNA
Measuring enrichment
Input DNA: portion of the DNA sample removed before IP
Rozowsky, J. et al. PeakSeq enables systematic scoring of ChIP-seq experiments relative to controls. Nature Biotech. 27, (2009)
17
Why we need to sequence Input DNA
Input DNA does not demonstrate a “flat” or random (Poisson) distribution:
Open chromatin regions tend to be fragmented more easily during shearing.
Amplification bias.
Mapping artifacts: increased coverage of more “mappable” regions (which also tend to be promoter regions) and of repetitive regions, due to inaccuracies in the number of copies in the assembled genome.
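Since the input is not flat, a candidate region is usually tested against a local background estimated from the input rather than a single genome-wide rate. The Python sketch below is one way to do this, loosely modelled on the local λ described for MACS: take the largest of the genome-wide rate and the input-derived rates from several surrounding windows, then compute a Poisson tail probability for the ChIP tag count. The window widths, tag counts, and function names are illustrative assumptions, not values from the slides.

```python
import math

def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed by summing the lower tail."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)      # P(X = 0)
    cdf = term
    for i in range(1, k):      # accumulate P(X = 1) .. P(X = k - 1)
        term *= lam / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def local_lambda(genome_lambda, input_tags_by_width, window, scale):
    """Expected ChIP tags in `window` bp under the most conservative background.

    input_tags_by_width: {width_bp: input tags in a region of that width around the candidate}
    scale: ChIP library size / input library size
    """
    rates = [count * scale * window / width
             for width, count in input_tags_by_width.items()]
    return max([genome_lambda] + rates)

# Toy numbers (hypothetical): 200 bp candidate window, input twice as deep as ChIP.
scale = 1_000_000 / 2_000_000
lam = local_lambda(genome_lambda=2.0,
                   input_tags_by_width={1000: 30, 5000: 100, 10000: 180},
                   window=200, scale=scale)
p_value = poisson_sf(12, lam)  # 12 ChIP tags observed in the window
```

Taking the maximum means a region that is already over-represented in the input (open chromatin, highly mappable or repetitive sequence) needs more ChIP tags to be called significant.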
18
Depth of Sequencing Are we there yet?
19
ERa E2 Helicos MACS peaks 12500 (tag30 mfold30): sequence depth determination by subsampling
[Figure: % of peaks detected (of total peaks per bin) versus % of tags sampled, by fold-change bin; number of total peaks in each bin]
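The subsampling approach can be sketched in Python as follows: draw a random fraction of the mapped tags, call peaks on the subsample, and report what fraction of the full-depth peaks is recovered. The peak caller here is a deliberately crude stand-in (bins with at least `min_tags` tags), not MACS, and all names and toy data are hypothetical.

```python
import random
from collections import Counter

def call_peaks(positions, bin_size, min_tags):
    """Toy peak caller: return the set of bins holding at least min_tags tags."""
    counts = Counter(p // bin_size for p in positions)
    return {b for b, c in counts.items() if c >= min_tags}

def saturation_curve(positions, fractions, bin_size=100, min_tags=5, seed=0):
    """For each sampled fraction of tags, return (fraction, share of full-depth peaks recovered)."""
    rng = random.Random(seed)
    full = call_peaks(positions, bin_size, min_tags)
    curve = []
    for f in fractions:
        n = int(round(f * len(positions)))
        sub = rng.sample(positions, n)          # sample tags without replacement
        found = call_peaks(sub, bin_size, min_tags)
        recovered = len(found & full) / len(full) if full else 0.0
        curve.append((f, recovered))
    return curve

# Toy data: one strong peak (50 tags), one weak peak (6 tags), sparse background.
positions = ([1000 + i for i in range(50)]
             + [2000 + i for i in range(6)]
             + [5000 + 100 * i for i in range(40)])
curve = saturation_curve(positions, fractions=[0.0, 0.25, 0.5, 0.75, 1.0])
```

If the recovered share is still climbing steeply near 100% of tags, the library has probably not been sequenced to saturation. Weak peaks such as the 6-tag one above tend to be the last to appear, which is why the slide splits the curve by fold-change bin.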
20
Statistical Significance
21
MACS shifted tag-count graph – i.e. Peak shapes
[Figure tracks: Helicos Input, Helicos ChIP, Solexa ChIP, Solexa Input]