Chromatin Immuno-precipitation (CHIP)-chip Analysis


1 Chromatin Immuno-precipitation (CHIP)-chip Analysis
11/07/07

2 Experimental Protocol
Step 1: crosslinking (fix protein to DNA)
Step 2: sonication (break DNA)
Kim and Ren 2007

3 Experimental Protocol
Step 1: crosslinking (fix protein to DNA)
Step 2: sonication (break DNA)
Step 3: immunoprecipitation (pull down the target protein with a specific antibody)
Kim and Ren 2007

4 Experimental Protocol
Step 1: crosslinking (fix protein to DNA)
Step 2: sonication (break DNA)
Step 3: immunoprecipitation (pull down the target protein with a specific antibody)
Step 4: hybridization (hybridize input and pulled-down DNA on a microarray)
Kim and Ren 2007

5 Intergenic microarray
Array probes are PCR products of intergenic regions. Binding signal is represented by a single probe.

6 ChIP-array
Regions consistently enriched across repeated ChIP-arrays are selected as the TF binding targets.
Usually there are hundreds of targets, each ~1000 bases long.
We want to know the precise binding site (e.g. 10 bases).

7 Tiling arrays
Microarray probes are oligonucleotide sequences with regular spacing covering a whole genomic region.

8 Tiling Array Data
Each TF binding signal is represented by multiple probes, so more sophisticated statistical tools are needed.
Kim and Ren 2007

9 Methods
Moving average t-test (Keles et al. 2004)
HMM (Li et al. 2005; Yuan et al. 2005)
Tilemap (Ji and Wong 2005)
MAT (Johnson et al. 2006)

10 Keles’ method
Calculate a two-sample t-statistic at each probe i, comparing the ChIP signal (Y2) with the input signal (Y1).
Keles et al. 2004

11 Keles’ method
Calculate a two-sample t-statistic at each probe i, comparing the ChIP signal (Y2) with the input signal (Y1).
Combine neighboring probes with a moving-average scan statistic.
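The two steps on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the exact statistic from Keles et al. 2004: the unequal-variance (Welch-type) denominator, the centered window, and the names two_sample_t, moving_average and half_window are all assumptions made for the example, and the data are simulated.

```python
import numpy as np

def two_sample_t(chip, control):
    """Per-probe two-sample t-statistic (Welch-type denominator, an assumption).

    chip, control: arrays of shape (n_replicates, n_probes) of log-intensities.
    """
    n2, n1 = chip.shape[0], control.shape[0]
    mean_diff = chip.mean(axis=0) - control.mean(axis=0)
    var2 = chip.var(axis=0, ddof=1)
    var1 = control.var(axis=0, ddof=1)
    return mean_diff / np.sqrt(var2 / n2 + var1 / n1)

def moving_average(t, half_window):
    """Average t over a centered window of 2 * half_window + 1 probes."""
    width = 2 * half_window + 1
    kernel = np.ones(width) / width
    # mode="same" keeps the output aligned with the probes; edges are zero-padded.
    return np.convolve(t, kernel, mode="same")

# Simulated data: 3 replicates per condition, 200 probes.
rng = np.random.default_rng(0)
n_probes = 200
control = rng.normal(0.0, 1.0, size=(3, n_probes))
chip = rng.normal(0.0, 1.0, size=(3, n_probes))
chip[:, 95:105] += 3.0  # hypothetical bound region

t = two_sample_t(chip, control)
scan = moving_average(t, half_window=2)
```

Averaging over a window trades resolution for power: a single noisy probe rarely produces a high scan value, while a run of moderately enriched probes does.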

12 Multiple hypothesis testing
Multiple hypothesis testing needs to be considered to control false positive error rates. What is the null distribution of this statistic?

13 Multiple hypothesis testing
Assume the statistic has a t-distribution; approximate by a normal distribution.
Alternatively, a resampling method can be used to estimate the null distribution.
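Both options on this slide can be illustrated with a short sketch. The first function assumes the probe-level scores are independent and roughly N(0, 1) under the null, so that the mean of a window of w scores is N(0, 1/w). The second builds a null sample by shuffling which arrays are labelled ChIP. The function names, the label-permutation scheme, and the simple mean-difference statistic are assumptions for illustration, not the procedure from the cited paper.

```python
import math
import numpy as np

def normal_pvalue(scan_value, window):
    """Upper-tail p-value of a window mean of `window` independent N(0, 1) scores."""
    z = scan_value * math.sqrt(window)
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def permutation_null(data, n_chip, statistic, n_perm, rng):
    """Null sample obtained by shuffling the ChIP/control labels of the arrays.

    data: array of shape (n_arrays, n_probes); the first n_chip rows after
    shuffling are treated as ChIP, the rest as control.
    """
    null = []
    for _ in range(n_perm):
        order = rng.permutation(data.shape[0])
        chip, control = data[order[:n_chip]], data[order[n_chip:]]
        null.append(statistic(chip, control))
    return np.concatenate(null)

def mean_difference(chip, control):
    """Placeholder statistic; any per-probe statistic can be plugged in."""
    return chip.mean(axis=0) - control.mean(axis=0)

rng = np.random.default_rng(1)
data = rng.normal(size=(6, 500))  # simulated: 6 arrays, 500 probes
null_scores = permutation_null(data, 3, mean_difference, 20, rng)
```

With only a few replicates the number of distinct relabellings is small (20 for a 3 versus 3 design), which limits how finely a permutation null can resolve small p-values.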

14 Tilemap
Improvements over Keles’ method:
Use a more robust test statistic.
Estimate the null distribution without prior assumptions.
Ji and Wong 2005

15 Step 1: calculating a t-like test statistic
Model: log-intensity $X_{ijk}$, indexed by probe $i$, condition $j$, and replicate $k$.

16 Step 1: calculating a t-like test statistic
Model: log-intensity pooling data

17 Step 1: calculating a t-like test statistic
Two samples: Multiple samples: Want to have a robust estimate of variance.

18 Step 1: calculating a t-like test statistic
Estimation of the probe variance by variance shrinkage, controlled by a shrinkage factor.
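The idea of variance shrinkage can be sketched as follows. Each probe's sample variance is pulled toward the average variance across probes, which stabilizes the denominator of the t-like statistic when there are few replicates. In this sketch the shrinkage factor is passed in by hand; Tilemap estimates it from the data, and that formula is not preserved in this transcript. The names shrink_variances and t_like are hypothetical.

```python
import numpy as np

def shrink_variances(sample_var, shrinkage):
    """Pull per-probe variances toward their mean across probes.

    shrinkage = 0 keeps the per-probe variances; shrinkage = 1 replaces
    every variance with the mean variance.
    """
    if not 0.0 <= shrinkage <= 1.0:
        raise ValueError("shrinkage must lie in [0, 1]")
    pooled = sample_var.mean()
    return (1.0 - shrinkage) * sample_var + shrinkage * pooled

def t_like(mean_diff, shrunk_var, n1, n2):
    """Two-sample t-like statistic using the shrunken variance."""
    return mean_diff / np.sqrt(shrunk_var * (1.0 / n1 + 1.0 / n2))

# Three probes with the same mean difference but very different sample variances.
sample_var = np.array([0.01, 1.0, 1.99])
mean_diff = np.array([1.0, 1.0, 1.0])

raw_t = t_like(mean_diff, sample_var, n1=2, n2=2)
shrunk = shrink_variances(sample_var, shrinkage=0.5)
robust_t = t_like(mean_diff, shrunk, n1=2, n2=2)
```

The first probe has an implausibly small sample variance, so its raw score is inflated; after shrinkage the three scores are much closer together.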

19 Step 2: Merging data
Moving average
Alternatively, use a hidden Markov model (HMM)
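As a sketch of the HMM alternative, the following decodes the most likely sequence of unbound (0) and bound (1) states along the probes with the Viterbi algorithm. It is a generic two-state model, not the model of Li et al. 2005, Yuan et al. 2005, or Tilemap: the Gaussian emissions with a shared standard deviation, the symmetric transition matrix, and all parameter values are assumptions chosen for the example.

```python
import numpy as np

def viterbi_two_state(scores, means, sd, stay_prob, prior_bound):
    """Most likely path of unbound (0) / bound (1) states, Gaussian emissions."""
    scores = np.asarray(scores, dtype=float)
    means = np.asarray(means, dtype=float)
    # Log emission densities up to a constant shared by both states.
    log_emit = -0.5 * ((scores[:, None] - means[None, :]) / sd) ** 2
    log_trans = np.log(np.array([[stay_prob, 1.0 - stay_prob],
                                 [1.0 - stay_prob, stay_prob]]))
    log_prior = np.log(np.array([1.0 - prior_bound, prior_bound]))

    n = len(scores)
    best = np.empty((n, 2))             # best log-score of a path ending in each state
    back = np.zeros((n, 2), dtype=int)  # best previous state for each (probe, state)
    best[0] = log_prior + log_emit[0]
    for i in range(1, n):
        cand = best[i - 1][:, None] + log_trans  # cand[from_state, to_state]
        back[i] = cand.argmax(axis=0)
        best[i] = cand.max(axis=0) + log_emit[i]

    # Trace back from the best final state.
    path = np.empty(n, dtype=int)
    path[-1] = best[-1].argmax()
    for i in range(n - 1, 0, -1):
        path[i - 1] = back[i, path[i]]
    return path

scores = [0.1, -0.3, 0.2, 3.1, 2.8, 3.4, 0.0, -0.2]
path = viterbi_two_state(scores, means=(0.0, 3.0), sd=1.0,
                         stay_prob=0.9, prior_bound=0.1)
```

Compared with a moving average, the HMM does not need a fixed window width; the transition probabilities control how readily the decoded path switches between states.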

20 Step 3: control FDR Goal: To find null and signal distributions
Idea: assume a mixture model This is unidentifiable!

21 Step 3: control FDR Goal: To find null and signal distributions
Idea: assume a mixture model This is unidentifiable! A clever trick: Look for with

22 How to find g0 and g1
To get g1, can we select probes with the highest t-score? Why or why not?

23 How to find g0 and g1
Idea: signals at neighboring probes are correlated, whereas noise is not (hopefully!).
First select the probes that have the highest t-scores ti.
Use their downstream values ti+1 to estimate g1.
Use the same trick to estimate g0.
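The neighbor trick can be sketched as below. Selecting on ti and then reading off ti+1 avoids the selection bias that would arise from using the selected scores themselves. The sketch assumes that "the same trick" for g0 means selecting the lowest-scoring probes; the quantile-based selection, the fractions, and the name neighbor_samples are also assumptions for illustration.

```python
import numpy as np

def neighbor_samples(t, top_fraction, bottom_fraction):
    """Samples for g1 and g0 taken from the downstream neighbors of selected probes."""
    t = np.asarray(t, dtype=float)
    candidates = t[:-1]  # probes that have a downstream neighbor
    downstream = t[1:]   # downstream[i] is the neighbor of candidates[i]
    hi_cut = np.quantile(candidates, 1.0 - top_fraction)
    lo_cut = np.quantile(candidates, bottom_fraction)
    g1_sample = downstream[candidates >= hi_cut]  # neighbors of high scorers
    g0_sample = downstream[candidates <= lo_cut]  # neighbors of low scorers
    return g0_sample, g1_sample

# Toy scores with two short enriched stretches.
t = [0.1, 0.2, 5.0, 4.8, 0.0, -0.1, 0.3, 5.2, 4.9, 0.2]
g0_sample, g1_sample = neighbor_samples(t, top_fraction=0.25, bottom_fraction=0.25)
```

Because some selected probes sit at the edge of a bound region, the g1 sample still contains some null values. It is a mixture enriched for signal rather than a pure signal sample, and likewise for g0.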

24 Step 3: control FDR Goal: To find null and signal distributions
Idea: assume a mixture model This is unidentifiable! A clever trick: Find Additional assumption: with

26 Step 3: Unbalanced mixture score
with is estimated by fitting

27 False discovery rate (FDR)
Determine TF binding sites at a chosen FDR cutoff.
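Once a null sample of scores is available, the FDR at a given cutoff can be estimated as the expected fraction of null scores above the cutoff divided by the observed fraction above it. The sketch below is a generic empirical estimate and not the unbalanced mixture computation from the slides; the name estimated_fdr, the null_fraction parameter (the assumed proportion of null probes), and the toy data are assumptions.

```python
import numpy as np

def estimated_fdr(observed, null, cutoff, null_fraction=1.0):
    """Empirical FDR estimate for calling every probe with score >= cutoff."""
    observed = np.asarray(observed, dtype=float)
    null = np.asarray(null, dtype=float)
    called = np.mean(observed >= cutoff)
    if called == 0:
        return 0.0  # nothing is called, so there are no false discoveries
    expected_false = null_fraction * np.mean(null >= cutoff)
    return min(1.0, expected_false / called)

observed = [0.1, 0.5, 1.0, 2.5, 3.0, 3.5, 4.0, 0.2, 0.3, 0.4]
null = [-1.0, -0.5, 0.0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.5, 2.2]
fdr_at_2 = estimated_fdr(observed, null, cutoff=2.0)
```

In this toy example 4 of 10 observed scores and 1 of 10 null scores reach the cutoff, so the estimate is 0.1 / 0.4 = 0.25. In practice the cutoff is lowered until the estimate reaches the target FDR.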

28 How to find g0 and g1
Idea: signals at neighboring probes are correlated, whereas noise is not (hopefully!).
First select the probes that have the highest t-scores ti.
Use their downstream values ti+1 to estimate g1.
Use the same trick to estimate g0.
Memory problem!

29 Example: Analysis of a cMyc binding data

30 Comparison of models

31 Simulation results

32 MAT Basic Idea: Baseline level correction
Standardize probe intensity with respect to the expected baseline value (Johnson et al. 2006)
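A heavily simplified sketch of baseline correction follows. The expected log-intensity of each probe is predicted from its nucleotide counts by least squares, and the residual is then standardized. The model in Johnson et al. 2006 is richer than this, so the count-only design matrix, the single global standard deviation, the function names, and the simulated sequences and coefficients are all assumptions for illustration.

```python
import numpy as np

def nucleotide_counts(seq):
    """Counts of A, C, G; the T count is implied by the fixed probe length."""
    return [seq.count(base) for base in "ACG"]

def fit_baseline(sequences, log_intensity):
    """Least-squares prediction of log-intensity from nucleotide counts."""
    design = np.array([[1.0] + nucleotide_counts(s) for s in sequences], dtype=float)
    coef, *_ = np.linalg.lstsq(design, log_intensity, rcond=None)
    return design @ coef

def standardize(log_intensity, baseline):
    """Residual from the baseline, scaled by the residual standard deviation."""
    resid = log_intensity - baseline
    return resid / resid.std(ddof=1)

# Hypothetical 6-mer probes; real probes are longer.
seqs = ["AAAAAA", "CCCCCC", "GGGGGG", "TTTTTT",
        "ACGTAC", "GGCCTT", "ATATCG", "CGCGAT"]
counts = np.array([nucleotide_counts(s) for s in seqs], dtype=float)
# Simulated sequence-dependent baseline with made-up coefficients.
background = 8.0 + counts @ np.array([0.1, 0.3, 0.2])

signal = background.copy()
signal[4] += 2.0  # hypothetical enrichment at one probe
scores = standardize(signal, fit_baseline(seqs, signal))
```

The point of the correction is that a probe with a high raw intensity may simply have a sequence with high baseline affinity; scoring the deviation from the expected baseline removes that sequence effect.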

33 MAT How to estimate the baseline values?

34 Estimated nucleotide effect

35 MAT Standardization

36 (X.S. Liu)

37 Reading List
Keles et al. 2004: developed a multiple hypothesis testing method for tiling array analysis.
Ji and Wong 2005: Tilemap; improved over Keles et al.’s method.
Johnson et al. 2006: MAT; showed that baseline adjustment improved signal detection.

