Gibbs sampling for motif finding Yves Moreau. 2 Overview Markov Chain Monte Carlo Gibbs sampling Motif finding in cis-regulatory DNA Biclustering microarray.

1 Gibbs sampling for motif finding
Yves Moreau

2 Overview
  Markov Chain Monte Carlo
  Gibbs sampling
  Motif finding in cis-regulatory DNA
  Biclustering microarray data

3 Markov Chain Monte-Carlo
Markov chain with transition matrix T (row: current state X; column: next state)

        A       C       G       T
A  0.0643  0.8268  0.0659  0.0430
C  0.0598  0.0484  0.8515  0.0403
G  0.1602  0.3407  0.1736  0.3255
T  0.1507  0.1608  0.3654  0.3231

[Figure: the four states X = A, X = C, X = G, X = T]

4 Markov Chain Monte-Carlo
Markov chains can sample from complex distributions

ACGCGGTGTGCGTTTGACGA
ACGGTTACGCGACGTTTGGT
ACGTGCGGTGTACGTGTACG
ACGGAGTTTGCGGGACGCGT
ACGCGCGTGACGTACGCGTG
AGACGCGTGCGCGCGGACGC
ACGGGCGTGCGCGCGTCGCG
AACGCGTTTGTGTTCGGTGC
ACCGCGTTTGACGTCGGTTC
ACGTGACGCGTAGTTCGACG
ACGTGACACGGACGTACGCG
ACCGTACTCGCGTTGACACG
ATACGGCGCGGCGGGCGCGG
ACGTACGCGTACACGCGGGA
ACGCGCGTGTTTACGACGTG
ACGTCGCACGCGTCGGTGTG
ACGGCGGTCGGTACACGTCG
ACGTTGCGACGTGCGTGCTG
ACGGAACGACGACGCGACGC
ACGGCGTGTTCGCGGTGCGG

[Figure: percentage of A, C, G and T at each position]
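The sequences on this slide all have length 20 and start with A, which is consistent with simulating the chain of slide 3 from the state A. The sketch below is one way such sequences and the per-position percentages could be produced. The function names, the sample size and the fixed start state are assumptions made for illustration, not taken from the slides.

```python
import random

STATES = "ACGT"
# Transition matrix T from slide 3: T[x][k] = P(next = STATES[k] | current = x).
T = {
    "A": [0.0643, 0.8268, 0.0659, 0.0430],
    "C": [0.0598, 0.0484, 0.8515, 0.0403],
    "G": [0.1602, 0.3407, 0.1736, 0.3255],
    "T": [0.1507, 0.1608, 0.3654, 0.3231],
}

def sample_chain(length, start="A", rng=random):
    """Simulate one sequence of the given length, starting in state `start`."""
    seq = [start]
    while len(seq) < length:
        # Draw the next state from the row of T for the current state.
        seq.append(rng.choices(STATES, weights=T[seq[-1]])[0])
    return "".join(seq)

def position_frequencies(seqs):
    """Fraction of each nucleotide at each position across the sequences."""
    n = len(seqs)
    return [{s: sum(q[i] == s for q in seqs) / n for s in STATES}
            for i in range(len(seqs[0]))]

rng = random.Random(0)  # fixed seed so the run is reproducible
seqs = [sample_chain(20, rng=rng) for _ in range(2000)]
freqs = position_frequencies(seqs)
for pos in (0, 1, 5, 19):
    print(pos, {s: round(f, 3) for s, f in freqs[pos].items()})
```

At the first positions the frequencies still reflect the start state; further along they settle down, which is the behaviour the following slides describe.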

5 Markov Chain Monte-Carlo
Let us look at the transition after two steps
Similarly, after n steps
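The formulas of this slide are not in the transcript. As a reminder of the standard result, and assuming the notation T_{ij} = P(X_{t+1} = j | X_t = i) for the transition matrix of slide 3, the two-step and n-step transition probabilities can be written as:

```latex
\begin{align}
P(X_{t+2}=j \mid X_t=i)
  &= \sum_{k} P(X_{t+1}=k \mid X_t=i)\, P(X_{t+2}=j \mid X_{t+1}=k)
   = \sum_{k} T_{ik}\, T_{kj} = (T^2)_{ij} \\
P(X_{t+n}=j \mid X_t=i) &= (T^n)_{ij}
\end{align}
```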

6 Markov Chain Monte-Carlo
Stationary distribution π
  If the samples are generated according to the distribution π, the samples at the next step will also be generated according to π
  π is a left eigenvector of T (with eigenvalue 1)
Equilibrium distribution
  Rows of T^∞ (the limit of T^n for large n) are stationary distributions
From an arbitrary initial condition and after a sufficient number of steps (burn-in), the successive states of the Markov chain are samples from a stationary distribution
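To make the stationary distribution concrete, here is a minimal sketch that repeatedly applies the transition matrix of slide 3 to a starting distribution until it stops changing. The function names, tolerance and starting distributions are illustrative assumptions.

```python
# Transition matrix T from slide 3; rows and columns are ordered A, C, G, T.
T = [
    [0.0643, 0.8268, 0.0659, 0.0430],
    [0.0598, 0.0484, 0.8515, 0.0403],
    [0.1602, 0.3407, 0.1736, 0.3255],
    [0.1507, 0.1608, 0.3654, 0.3231],
]

def step(dist, T):
    """One step of the chain on distributions: the row vector dist times T."""
    n = len(T)
    return [sum(dist[i] * T[i][j] for i in range(n)) for j in range(n)]

def stationary(T, start, tol=1e-12, max_steps=10_000):
    """Iterate dist <- dist T until the distribution stops changing."""
    dist = list(start)
    for _ in range(max_steps):
        new = step(dist, T)
        if max(abs(a - b) for a, b in zip(new, dist)) < tol:
            return new
        dist = new
    raise RuntimeError("no convergence within max_steps")

# Two different initial conditions: all mass on A, and all mass on T.
pi_from_A = stationary(T, [1.0, 0.0, 0.0, 0.0])
pi_from_T = stationary(T, [0.0, 0.0, 0.0, 1.0])
print([round(p, 4) for p in pi_from_A])
print([round(p, 4) for p in pi_from_T])
```

Both runs should print the same vector, which illustrates that the result does not depend on the initial condition, and applying `step` to it once more leaves it unchanged, which is the left-eigenvector property.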

7 Detailed balance
A sufficient condition for the Markov chain to converge to the stationary distribution π is that the chain and π satisfy the condition of detailed balance
Proof:
Problem: disjoint regions in probability space
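The equations of this slide are not in the transcript. The standard statement of detailed balance and the usual one-line argument, written with the assumed notation T_{ij} = P(X_{t+1} = j | X_t = i), are:

```latex
\begin{align}
\text{Detailed balance:}\quad & \pi_i\, T_{ij} = \pi_j\, T_{ji} \quad \text{for all } i, j \\
\text{Stationarity:}\quad & (\pi T)_j = \sum_i \pi_i\, T_{ij} = \sum_i \pi_j\, T_{ji}
  = \pi_j \sum_i T_{ji} = \pi_j
\end{align}
```

This shows that π is stationary. Convergence to π from any starting point additionally requires that the chain can reach every region of the state space, which is the caveat about disjoint regions noted on the slide.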

8 Gibbs sampling
Markov chain for Gibbs sampling
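The construction on this slide is not in the transcript. As a generic illustration of Gibbs sampling, the sketch below alternately resamples each of two variables from its conditional distribution given the other. The joint table `JOINT` is a made-up example and all names are hypothetical.

```python
import random

# Hypothetical joint distribution P(x, y), x in {0, 1, 2}, y in {0, 1}.
JOINT = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.05,
    (2, 0): 0.15, (2, 1): 0.20,
}
XS, YS = (0, 1, 2), (0, 1)

def sample_x_given_y(y, rng):
    # P(x | y) is proportional to P(x, y); choices() normalizes the weights.
    return rng.choices(XS, weights=[JOINT[(x, y)] for x in XS])[0]

def sample_y_given_x(x, rng):
    # P(y | x) is proportional to P(x, y).
    return rng.choices(YS, weights=[JOINT[(x, y)] for y in YS])[0]

def gibbs(n_samples, burn_in=1000, seed=0):
    """Estimate the joint distribution from Gibbs samples after burn-in."""
    rng = random.Random(seed)
    x, y = 0, 0  # arbitrary initial condition
    counts = {k: 0 for k in JOINT}
    for t in range(burn_in + n_samples):
        x = sample_x_given_y(y, rng)
        y = sample_y_given_x(x, rng)
        if t >= burn_in:
            counts[(x, y)] += 1
    return {k: c / n_samples for k, c in counts.items()}

estimate = gibbs(50_000)
for k in sorted(JOINT):
    print(k, JOINT[k], round(estimate[k], 3))
```

Only the conditionals are ever sampled, yet the empirical frequencies of the pairs (x, y) should approach the joint table.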

9 Gibbs sampling
Detailed balance
  Detailed balance for the Gibbs sampler
  Prove detailed balance
  Bayes' rule
  Q.E.D.
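The derivation itself is not in the transcript. A standard version of the argument, for a single Gibbs update of component i with all other components x_{-i} held fixed (the notation is an assumption), runs as follows. The move from x = (x_i, x_{-i}) to x' = (x_i', x_{-i}) has probability P(x_i' | x_{-i}), and the product rule gives:

```latex
\begin{align}
P(x)\, T(x \to x')
  &= P(x_i \mid x_{-i})\, P(x_{-i})\; P(x_i' \mid x_{-i}) \\
  &= P(x_i' \mid x_{-i})\, P(x_{-i})\; P(x_i \mid x_{-i}) \\
  &= P(x')\, T(x' \to x)
\end{align}
```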

10 Data augmentation Gibbs sampling
Introducing unobserved variables often simplifies the expression of the likelihood
A Gibbs sampler can then be set up
Samples from the Gibbs sampler can be used to estimate parameters

11 Pros and cons
Pros
  Clear probabilistic interpretation
  Bayesian framework
  “Global optimization”
Cons
  Mathematical details not easy to work out
  Relatively slow

12 Motif finding

13 Gibbs sampler
Gibbs sampling for motif finding
  Set up a Gibbs sampler for the joint probability of the motif matrix and the alignment given the sequences
  Sequence by sequence
Lawrence et al.
  One motif of fixed length
  One occurrence per sequence
  Background model based on single nucleotides
  Too sensitive to noise
  Lots of parameter tuning

14 [Figure: sequence region of 500 bp, with the translation start marked]

15 Gibbs motif finding
Initialization
  Sequences
  Random motif matrix
Iteration
  Sequence scoring
  Alignment update
  Motif instances
  Motif matrix
Termination
  Convergence of the alignment and of the motif matrix

22 Gibbs motif finding
Initialization
  Sequences
  Random motif matrix
Iteration
  Sequence scoring
  Alignment update
  Motif instances
  Motif matrix
Termination
  Stabilization of the motif matrix (not of the alignment)
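The loop above can be sketched in a few lines. The code below is a minimal illustration in the spirit of the model of slide 13 (one motif of fixed length, one occurrence per sequence, single-nucleotide background) and is not the implementation behind the slides. It assumes sequences over the letters A, C, G, T only. Assumptions made here: it starts from a random alignment rather than a random motif matrix, it rebuilds the motif matrix from all sequences except the one being updated, it uses a pseudocount of 1, and it runs a fixed number of sweeps instead of testing for stabilization. All function names and the toy sequences are hypothetical.

```python
import random

ALPHABET = "ACGT"

def motif_matrix(sites, pseudocount=1.0):
    """Column-wise nucleotide probabilities estimated from aligned sites."""
    matrix = []
    for j in range(len(sites[0])):
        counts = {a: pseudocount for a in ALPHABET}
        for site in sites:
            counts[site[j]] += 1
        total = sum(counts.values())
        matrix.append({a: counts[a] / total for a in ALPHABET})
    return matrix

def background_model(seqs, pseudocount=1.0):
    """Single-nucleotide background frequencies over all sequences."""
    counts = {a: pseudocount for a in ALPHABET}
    for s in seqs:
        for c in s:
            counts[c] += 1
    total = sum(counts.values())
    return {a: counts[a] / total for a in ALPHABET}

def site_scores(seq, matrix, background):
    """Likelihood ratio motif/background for every start position in seq."""
    width = len(matrix)
    scores = []
    for start in range(len(seq) - width + 1):
        ratio = 1.0
        for j in range(width):
            c = seq[start + j]
            ratio *= matrix[j][c] / background[c]
        scores.append(ratio)
    return scores

def gibbs_motif_sampler(seqs, width, n_sweeps=200, seed=0):
    rng = random.Random(seed)
    background = background_model(seqs)
    # Initialization: one random start position per sequence.
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(n_sweeps):
        for i, seq in enumerate(seqs):  # sequence by sequence
            # Motif matrix from the current motif instances of the other sequences.
            others = [s[p:p + width]
                      for k, (s, p) in enumerate(zip(seqs, starts)) if k != i]
            matrix = motif_matrix(others)
            # Sequence scoring, then alignment update by sampling a position.
            scores = site_scores(seq, matrix, background)
            starts[i] = rng.choices(range(len(scores)), weights=scores)[0]
    sites = [s[p:p + width] for s, p in zip(seqs, starts)]
    return starts, motif_matrix(sites)

# Toy input (hypothetical): the word TTGACGTCAA planted in each sequence.
toy_seqs = [
    "GCTAGTTGACGTCAATCGGA",
    "TTGACGTCAACGATCGTAGC",
    "CGATCGGATTGACGTCAAGT",
    "AGCTTGACGTCAAGCTAGCT",
]
starts, matrix = gibbs_motif_sampler(toy_seqs, width=10, n_sweeps=50)
print(starts)
```

Because positions are sampled rather than maximized, the alignment keeps fluctuating even when the motif matrix has settled, which matches the termination criterion on this slide. A single run can also lock onto a shifted copy of the motif, so in practice several runs with different seeds are usually compared.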

23 Motif Sampler (extended Gibbs sampling)
Model
  One motif of fixed length per round
  Several occurrences per sequence
    Each sequence has a discrete probability distribution over the number of copies of the motif (up to a maximum bound)
  Multiple motifs found in successive rounds by masking occurrences of previous motifs
  Improved background model based on oligonucleotides
  Gapped motifs
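One common reading of a background model based on oligonucleotides is a higher-order Markov model, in which the probability of each nucleotide depends on the few nucleotides before it. The sketch below estimates such a model and scores a segment with it. The order, the pseudocount, the uniform fallback for unseen contexts and all names are assumptions made for illustration, not details from the slides.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"

def train_background(seqs, order=2, pseudocount=1.0):
    """Estimate P(next nucleotide | previous `order` nucleotides) from seqs."""
    counts = defaultdict(lambda: {a: pseudocount for a in ALPHABET})
    for s in seqs:
        for i in range(order, len(s)):
            counts[s[i - order:i]][s[i]] += 1
    model = {}
    for context, c in counts.items():
        total = sum(c.values())
        model[context] = {a: c[a] / total for a in ALPHABET}
    return model

def background_log_prob(segment, context, model):
    """Log-probability of `segment` given the nucleotides just before it.

    `context` must have the same length as the order used for training.
    """
    uniform = {a: 1.0 / len(ALPHABET) for a in ALPHABET}
    order = len(context)
    history = context
    logp = 0.0
    for c in segment:
        probs = model.get(history, uniform)  # unseen context: uniform fallback
        logp += math.log(probs[c])
        # Slide the context window forward by one nucleotide.
        history = (history + c)[-order:] if order else ""
    return logp

model = train_background(["ACGTACGTACGT"], order=2)
print(model["AC"])
print(background_log_prob("GT", "AC", model))
```

In a motif sampler, a score of this kind would take the place of the single-nucleotide background term when comparing a candidate site against the background.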

24 [Figure: sequence region of 500 bp, with the translation start marked]

