Published by Jessie Pierce. Modified about 1 year ago.

1
Chromatin Immunoprecipitation (ChIP)-chip Analysis, 11/07/07

2
Experimental Protocol
Step 1: crosslink protein with DNA
Step 2: sonication (break DNA)
(Kim and Ren 2007)

3
Experimental Protocol
Step 1: crosslink (fix protein to DNA)
Step 2: sonication (break DNA)
Step 3: immunoprecipitation (pull down the target protein with a specific antibody)
(Kim and Ren 2007)

4
Experimental Protocol
Step 1: crosslink (fix protein to DNA)
Step 2: sonication (break DNA)
Step 3: immunoprecipitation (pull down the target protein with a specific antibody)
Step 4: hybridization (hybridize input and pulled-down DNA on a microarray)
(Kim and Ren 2007)

5
Intergenic microarray
Array probes are PCR products of intergenic regions.
Binding signal is represented by a single probe.

6
ChIP-array
Regions consistently enriched in repeated ChIP-arrays are selected as the TF binding targets.
Usually hundreds of targets, each ~1000 bases long.
We want to know the precise binding site (e.g. 10 bases).
(Figure labels: TF, Target)

7
Tiling arrays
Microarray probes are oligonucleotide sequences with regular spacing covering a whole genomic region.
(Figure label: chromosome)

8
Tiling Array Data
Each TF binding signal is represented by multiple probes.
Need more sophisticated statistical tools.
(Kim and Ren 2007)

9
Methods
Moving average t-test (Keles et al. 2004)
HMM (Li et al. 2005; Yuan et al. 2005)
Tilemap (Ji and Wong 2005)
MAT (Johnson et al. 2006)

10
Keles’ method
Calculate a two-sample t-statistic at each probe i, comparing the ChIP signal (Y2) with the input signal (Y1).
(Keles et al. 2004)

11
Keles’ method
Calculate a two-sample t-statistic at each probe i, comparing the ChIP signal (Y2) with the input signal (Y1).
Moving average scan statistic: average the t-statistics over a window of size w.
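
The two steps on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of Keles et al.: it assumes a Welch-type (unequal variance) two-sample t-statistic, a window that extends w probes on each side and is truncated at the array ends, and made-up toy intensities. The function names `welch_t` and `moving_average` are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(chip, control):
    """Two-sample t-statistic for one probe (unequal-variance form).

    chip and control are lists of replicate log-intensities (>= 2 each).
    """
    n2, n1 = len(chip), len(control)
    se = math.sqrt(variance(chip) / n2 + variance(control) / n1)
    return (mean(chip) - mean(control)) / se


def moving_average(stats, w):
    """Average each statistic with its neighbours within w probes on each side.

    Windows are truncated at the array ends (an assumption of this sketch).
    """
    out = []
    for i in range(len(stats)):
        lo, hi = max(0, i - w), min(len(stats), i + w + 1)
        out.append(sum(stats[lo:hi]) / (hi - lo))
    return out


# Toy data: 5 probes, 3 replicates per condition; probes 1-3 are enriched.
chip = [[0.1, 0.0, 0.2], [2.1, 1.9, 2.3], [2.4, 2.0, 2.2],
        [1.8, 2.2, 2.0], [0.0, 0.2, 0.1]]
control = [[0.0, 0.1, 0.2], [0.1, 0.0, 0.2], [0.2, 0.1, 0.0],
           [0.0, 0.2, 0.1], [0.1, 0.2, 0.0]]

t_stats = [welch_t(c, k) for c, k in zip(chip, control)]
scan = moving_average(t_stats, w=1)
peak = max(range(len(scan)), key=scan.__getitem__)  # probe with the largest scan value
```

Averaging over neighbouring probes rewards runs of enriched probes, which is why the middle of the enriched stretch scores highest in the toy data.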

12
Multiple hypothesis testing
Multiple hypothesis testing needs to be considered to control false positive error rates.
What is the null distribution of this statistic?

13
Multiple hypothesis testing
Assume the test statistic has a t-distribution.
Approximate it by a normal distribution.
Alternatively, a resampling method can be used to estimate the null distribution.
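
The resampling alternative can be illustrated with a generic label-permutation sketch. It is not the specific procedure of Keles et al.: it assumes a simple mean-difference statistic at a single probe, 2000 random relabelings, and made-up replicate values. The names `permutation_null` and `one_sided_p` are hypothetical.

```python
import random
from statistics import mean


def permutation_null(chip, control, n_perm=2000, seed=0):
    """Null sample of mean differences obtained by shuffling ChIP/input labels."""
    rng = random.Random(seed)
    pooled = list(chip) + list(control)
    n_chip = len(chip)
    null = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        null.append(mean(pooled[:n_chip]) - mean(pooled[n_chip:]))
    return null


def one_sided_p(observed, null):
    """Fraction of null values at least as large as the observed statistic.

    The +1 terms keep the estimate away from exactly zero.
    """
    hits = sum(1 for v in null if v >= observed)
    return (hits + 1) / (len(null) + 1)


# Toy replicate log-intensities for a single, clearly enriched probe.
chip = [2.1, 1.9, 2.3, 2.0, 2.2, 1.8]
control = [0.1, 0.0, 0.2, 0.1, 0.3, 0.2]

observed = mean(chip) - mean(control)
null = permutation_null(chip, control)
p_value = one_sided_p(observed, null)
```

With many probes tested at once, per-probe p-values like this one still need a multiple-testing adjustment, which is the point of the slide.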

14
Tilemap
Improvement over Keles’ method in the following ways:
Use a more robust test statistic.
Estimate the null distribution without prior assumptions.
(Ji and Wong 2005)

15
Step 1: calculating a t-like test statistic
Model for the log-intensity, indexed by probe, condition, and replicate.

16
Step 1: calculating a t-like test statistic
Model for the log-intensity, pooling data.

17
Step 1: calculating a t-like test statistic
Two samples: [equation not preserved]
Multiple samples: [equation not preserved]
Want to have a robust estimate of variance.

18
Step 1: calculating a t-like test statistic
Notation
Estimation of the variance by variance shrinkage
Shrinkage factor
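
The slide's formulas are not available here, so the following is only a generic sketch of what variance shrinkage does: each probe's sample variance is pulled toward the average variance across probes. The linear form, the fixed shrinkage factor `b`, and the toy data are assumptions of this sketch; how Tilemap estimates its shrinkage factor is not reproduced.

```python
from statistics import mean, variance


def shrink_variances(sample_vars, b):
    """Shrink per-probe variances toward their average.

    b is the shrinkage factor in [0, 1]: 0 keeps the raw variances,
    1 replaces every variance with the pooled average.
    """
    if not 0.0 <= b <= 1.0:
        raise ValueError("shrinkage factor must lie in [0, 1]")
    pooled = mean(sample_vars)
    return [(1.0 - b) * v + b * pooled for v in sample_vars]


# Toy replicates for three probes; the first has an implausibly small variance.
replicates = [[1.00, 1.01, 0.99], [0.5, 1.5, 1.0], [2.0, 2.6, 1.4]]
raw = [variance(r) for r in replicates]
shrunk = shrink_variances(raw, b=0.5)  # b = 0.5 is an arbitrary illustrative value
```

With only a few replicates per probe, a tiny sample variance can inflate a t-statistic by chance; shrinking toward the pooled value dampens that effect, which is the sense in which the estimate is more robust.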

19
Step 2: Merging data
Moving average
Alternatively, use a Hidden Markov Model

20
Step 3: control FDR
Goal: to find the null and signal distributions.
Idea: assume a mixture model.
This is unidentifiable!

21
Step 3: control FDR
Goal: to find the null and signal distributions.
Idea: assume a mixture model.
This is unidentifiable!
A clever trick: look for two distributions, g0 and g1, that satisfy a pair of conditions [equations not preserved].

22
How to find g0 and g1
To get g1, can we select the probes with the highest t-scores? Why or why not?

23
How to find g0 and g1
Idea: signals at neighboring probes are correlated, whereas noises are not (hopefully!).
First select the probes that have the highest t-scores t_i.
Use their downstream values t_{i+1} to estimate g1.
Use the same trick to estimate g0.
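
A small Python sketch of the neighbor trick follows. The selection fraction, the choice of the lowest-scoring probes as the starting point for g0, and the toy scores are assumptions made for illustration; the function name `neighbor_samples` is hypothetical.

```python
from statistics import mean


def neighbor_samples(t, frac=0.1):
    """Collect downstream scores t[i + 1] for the lowest- and highest-scoring probes.

    Returns (g0_sample, g1_sample). The last probe is never selected because
    it has no downstream neighbour.
    """
    n = len(t) - 1
    k = max(1, int(frac * n))
    order = sorted(range(n), key=lambda i: t[i])
    low, high = order[:k], order[-k:]
    g0_sample = [t[i + 1] for i in low]   # neighbours of low scorers: mostly noise
    g1_sample = [t[i + 1] for i in high]  # neighbours of high scorers: enriched for signal
    return g0_sample, g1_sample


# Toy t-scores along a chromosome with two bound regions (probes 3-5 and 12-13).
t = [0.1, -0.3, 0.2, 5.0, 6.0, 5.5, 0.0, -0.2, 0.3, 0.1, -0.1,
     0.2, 4.8, 5.2, 0.1, -0.4, 0.0, 0.2, -0.1, 0.3, 0.1]

g0_sample, g1_sample = neighbor_samples(t, frac=0.2)
```

Using the neighbour's score rather than the selected probe's own score avoids the selection bias raised on the previous slide: the selected scores are large by construction, whereas their neighbours are large only if real signal extends across probes. In the toy data the g1 sample still contains some noise values from the edges of bound regions, so it is a mixture that is merely enriched for signal.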

24
Step 3: control FDR
Goal: to find the null and signal distributions.
Idea: assume a mixture model.
This is unidentifiable!
A clever trick: find g0 and g1 that satisfy a pair of conditions [equations not preserved].
Additional assumption: [equation not preserved]

26
Step 3: Unbalanced mixture score
[Equations not preserved: the slide defines the score and states that one of its quantities is estimated by fitting.]

27
False discovery rate (FDR)
Determine TF binding sites at an FDR cutoff.
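
As a rough illustration of choosing a cutoff by FDR, the sketch below estimates the FDR at a score threshold as the expected number of null scores above it divided by the number of probes called. It assumes a sample of null scores is already available and takes the null proportion `pi0` as an input (conservatively 1.0 by default); neither is estimated the way Tilemap does it, and both function names are hypothetical.

```python
def estimated_fdr(observed, null, cutoff, pi0=1.0):
    """Estimated FDR when calling every probe with score >= cutoff."""
    called = sum(1 for t in observed if t >= cutoff)
    if called == 0:
        return 0.0
    null_tail = sum(1 for t in null if t >= cutoff) / len(null)
    expected_false = pi0 * null_tail * len(observed)
    return min(1.0, expected_false / called)


def smallest_cutoff(observed, null, target=0.05, pi0=1.0):
    """Smallest observed score whose estimated FDR is at or below the target."""
    for c in sorted(set(observed)):
        if estimated_fdr(observed, null, c, pi0) <= target:
            return c
    return None


# Toy scores: ten null draws and ten observed probes, three of them clearly bound.
null = [-1.0, -0.5, 0.0, 0.5, 1.0, -0.8, 0.3, -0.2, 0.7, -0.4]
observed = [-0.6, 0.1, 0.4, -0.3, 0.9, 4.0, 4.5, 5.0, 0.2, -0.1]

cutoff = smallest_cutoff(observed, null, target=0.05)
```

Probes scoring at or above the returned cutoff would be reported as binding sites at the chosen FDR level.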

28
How to find g0 and g1
Idea: signals at neighboring probes are correlated, whereas noises are not (hopefully!).
First select the probes that have the highest t-scores t_i.
Use their downstream values t_{i+1} to estimate g1.
Use the same trick to estimate g0.
Memory problem!

29
Example: Analysis of cMyc binding data

30
Comparison of models

31
Simulation results

32
MAT
Basic idea: baseline level correction.
Standardize probe intensity with respect to the expected baseline value.
(Johnson et al. 2006)

33
MAT How to estimate the baseline values?

34
Estimated nucleotide effect
(Figure panels: A, C)

35
MAT Standardization
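
A minimal sketch of standardizing against a baseline is given below. It assumes the expected baseline value of each probe has already been predicted, groups probes with similar baselines into bins, and divides each residual by the spread of residuals in its bin. The binning scheme, the bin count, and the toy numbers are assumptions of this sketch rather than details taken from the slides, and the function name `standardize` is hypothetical.

```python
from statistics import pstdev


def standardize(log_intensity, baseline, n_bins=2):
    """Score each probe as (observed - expected baseline) / spread of its bin.

    Probes are sorted by expected baseline and split into n_bins groups of
    roughly equal size.
    """
    order = sorted(range(len(baseline)), key=lambda i: baseline[i])
    size = -(-len(order) // n_bins)  # ceiling division
    scores = [0.0] * len(order)
    for start in range(0, len(order), size):
        members = order[start:start + size]
        residuals = [log_intensity[i] - baseline[i] for i in members]
        spread = pstdev(residuals) or 1.0  # guard against a zero spread
        for i, r in zip(members, residuals):
            scores[i] = r / spread
    return scores


# Toy data: six probes; the last one is far above its expected baseline.
baseline = [1.0, 1.1, 0.9, 3.0, 3.1, 2.9]
log_intensity = [1.1, 1.0, 0.9, 3.4, 2.7, 6.0]

scores = standardize(log_intensity, baseline)
top = max(range(len(scores)), key=scores.__getitem__)
```

After standardization, probes with very different expected baselines are on a comparable scale, so a single threshold can be applied across the array.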

36
(X.S. Liu)

37
Reading List
Keles et al. 2004: developed a multiple hypothesis testing method for tiling array analysis.
Ji and Wong 2005: Tilemap; improved over Keles et al.’s method.
Johnson et al. 2006: MAT; showed that baseline adjustment improved signal detection.
