Setting Up a Replica Exchange Approach to Motif Discovery in DNA Jeffrey Goett Advisor: Professor Sengupta.


1 Setting Up a Replica Exchange Approach to Motif Discovery in DNA Jeffrey Goett Advisor: Professor Sengupta

2 Protein Synthesis from DNA
[Diagram labels: Transcription; Translation to Proteins; Regulation; RNA polymerase; Binding Proteins; gene; Binding sites]

3 Binding Sites
Sequence A (labels: Binding protein “A”, Binding Site, code for protein):
A - A - C - G - A - C - T - T - G - C - T - G - T - T - C - A - A - C - C - A - A - A - G - T - T - G - G - T -
Sequence B (labels: Binding protein “A”, code for protein):
A - A - G - G - A - C - T - T - C - C - T - G - C - G - T - T - G - C - T - C - G - C - A - A - C - G - A - G -

4 Discovering New Binding Motifs
…ATCG GCTCAG CTAG…
…CACT GATCAG AGTA…
…TTCC GCTCTG TAAC…
…GCTA GCTCAA ATCG…
Motif Probability Model
Motif: GCTCAG
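One way to read the step from aligned instances to a "Motif Probability Model" is as a per-position letter frequency table. The sketch below is an illustration only: it uses the four instances shown on the slide, and the function names and the optional `pseudocount` parameter are assumptions, not taken from the presentation.

```python
from collections import Counter

ALPHABET = "ACGT"

# The four motif instances as they appear on the slide.
instances = ["GCTCAG", "GATCAG", "GCTCTG", "GCTCAA"]

def position_probabilities(sites, pseudocount=0.0):
    """Return one {letter: probability} dict per motif position."""
    width = len(sites[0])
    total = len(sites) + pseudocount * len(ALPHABET)
    model = []
    for j in range(width):
        counts = Counter(site[j] for site in sites)
        model.append({a: (counts[a] + pseudocount) / total for a in ALPHABET})
    return model

def consensus(model):
    """Pick the most probable letter at each position."""
    return "".join(max(column, key=column.get) for column in model)

model = position_probabilities(instances)
print(consensus(model))  # the slide's motif, GCTCAG
print(model[1])          # letter probabilities at the second position
```

With these four instances the most frequent letter at each position reproduces the motif GCTCAG named on the slide.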

5 Modeling Motifs in Sequences
ATATCCGTA
AATCGAGAC
TCGATGTGT
CCACCTGCA
Assume:
- The data are broken into N sequences
- Each sequence has one instance of the motif embedded in a random background
- The motif varies by point mutation, but not by insertion or deletion

6 Modeling Motifs in Sequences
AT ATC CGTA
A ATC GAGAC
TCG ATG TGT
CC ACC TGCA
The “Alignment”: starting position of the motif in each sequence
The “Motif Probability Distribution”: probability of each letter occurring at each motif position

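7 Scoring a Model
“Log-likelihood” score, with one factor per letter: $p_{j,x}$ for letter $x$ at motif position $j$, and $p^0_x$ for letter $x$ in the background.
ATATCCGTA: $p^0_A \; p_{1,T} \, p_{2,A} \, p_{3,T} \; p^0_C \, p^0_C \, p^0_G \, p^0_T \, p^0_A$
AATCGAGAC: $p^0_A \, p^0_A \, p^0_T \, p^0_C \, p^0_G \; p_{1,A} \, p_{2,G} \, p_{3,A} \; p^0_C$
TCGATGTGT: $p^0_T \, p^0_C \, p^0_G \; p_{1,A} \, p_{2,T} \, p_{3,G} \; p^0_T \, p^0_G \, p^0_T$
CCACCTGCA: $p_{1,C} \, p_{2,C} \, p_{3,A} \; p^0_C \, p^0_C \, p^0_T \, p^0_G \, p^0_C \, p^0_A$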
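The score formula itself did not survive in the transcript. A minimal sketch, assuming the score is the logarithm of the product of the per-letter factors listed on the slide (motif probabilities inside the motif window, background probabilities elsewhere), might look like this. The alignment {2, 6, 4, 1} is read off the slide's subscripts; the numeric probability values are hypothetical and chosen only for illustration.

```python
import math

ALPHABET = "ACGT"

def log_likelihood(seqs, starts, width, motif, background):
    """Sum log-probabilities of every letter under a motif-plus-background model.

    starts are 1-indexed motif start positions, one per sequence.
    motif[j][x] is the probability of letter x at motif position j (0-indexed here).
    background[x] is the probability of letter x outside the motif.
    """
    total = 0.0
    for seq, start in zip(seqs, starts):
        first = start - 1  # convert to 0-indexed
        for i, letter in enumerate(seq):
            if first <= i < first + width:
                total += math.log(motif[i - first][letter])
            else:
                total += math.log(background[letter])
    return total

seqs = ["ATATCCGTA", "AATCGAGAC", "TCGATGTGT", "CCACCTGCA"]
starts = [2, 6, 4, 1]  # motif windows TAT, AGA, ATG, CCA, as on the slide
width = 3

# Hypothetical parameter values (each row sums to 1).
background = {a: 0.25 for a in ALPHABET}
motif = [
    {"A": 0.50, "C": 0.25, "G": 0.05, "T": 0.20},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.50, "C": 0.05, "G": 0.20, "T": 0.25},
]

score = log_likelihood(seqs, starts, width, motif, background)
print(score)
```

Working in log space turns the long product into a sum and avoids numerical underflow for longer sequences.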

8 Example Models
Alignment {2, 4, 7, 3}:
A TAT CCGTA
AAT CGA GAC
TCGATG TGT
CC ACC TGCA
Alignment {3, 2, 4, 3}:
AT ATC CGTA
A ATC GAGAC
TCG ATG TGT
CC ACC TGCA

9 The Gibbs Sampler
We want to find the [symbol missing from transcript] that maximizes [expression missing from transcript]

10 The Gibbs Sampler
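The update equations on the Gibbs sampler slides were not captured in the transcript. As a rough illustration, the sketch below follows the widely used site-sampler form of the algorithm: hold out one sequence, estimate the motif from the other sequences' current windows, and resample the held-out start position in proportion to a motif-to-background ratio. This is an assumption about the general technique, not a reconstruction of the author's exact method; all function names, the pseudocount, and the uniform background are hypothetical.

```python
import random

ALPHABET = "ACGT"

def column_probs(windows, pseudocount=1.0):
    """Estimate per-position letter probabilities from motif windows."""
    width = len(windows[0])
    probs = []
    for j in range(width):
        counts = {a: pseudocount for a in ALPHABET}
        for w in windows:
            counts[w[j]] += 1
        total = sum(counts.values())
        probs.append({a: c / total for a, c in counts.items()})
    return probs

def gibbs_sweep(seqs, starts, width, background, rng):
    """Resample each sequence's motif start (0-indexed) once, in place."""
    for i, seq in enumerate(seqs):
        # Build the motif model from every sequence except the held-out one.
        others = [s[st:st + width]
                  for k, (s, st) in enumerate(zip(seqs, starts)) if k != i]
        probs = column_probs(others)
        # Weight each candidate start by its motif/background likelihood ratio.
        weights = []
        for pos in range(len(seq) - width + 1):
            w = 1.0
            for j in range(width):
                letter = seq[pos + j]
                w *= probs[j][letter] / background[letter]
            weights.append(w)
        starts[i] = rng.choices(range(len(weights)), weights=weights)[0]
    return starts

seqs = ["ATATCCGTA", "AATCGAGAC", "TCGATGTGT", "CCACCTGCA"]
width = 3
background = {a: 0.25 for a in ALPHABET}
rng = random.Random(0)

# Start from a random alignment, then run a fixed number of sweeps.
starts = [rng.randrange(len(s) - width + 1) for s in seqs]
for _ in range(50):
    gibbs_sweep(seqs, starts, width, background, rng)
print(starts)
```

Because every start position is drawn at random rather than chosen greedily, the sampler can leave a poor alignment instead of getting stuck in it.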

11 [Plot; axis label: Times visited]
Over time, the frequency distribution approaches [expression missing from transcript]

12 Optimization Technique
If we assume areas of local maximization contribute the most during “integration” to the local maximizations of [expression missing from transcript], biasing our search to these areas may discover the $p_{j,\rho}$ values which maximize [expression missing from transcript] faster.

13 Multiple Gibbs Samplers
By combining results from Gibbs samplers begun at random positions, find the maximizing [symbol missing from transcript] sooner.

14 Replica Exchange/Parallel Tempering
“Low-sensitivity” samplers, which “scout out the area,” periodically swap with “high-sensitivity” samplers, which are good at focused searches, if the swap appears promising.
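The slide does not say how a swap is judged "promising". A common choice in parallel tempering is the Metropolis swap criterion, sketched below under the assumption that each replica samples alignments with probability proportional to exp(beta * score), where the score is something to be maximized (such as a log-likelihood). The replica layout as a list of dicts and all names are hypothetical.

```python
import math
import random

def swap_probability(beta_i, score_i, beta_j, score_j):
    """Metropolis acceptance probability for exchanging two replicas' states."""
    exponent = (beta_i - beta_j) * (score_j - score_i)
    return 1.0 if exponent >= 0 else math.exp(exponent)

def attempt_swaps(replicas, rng):
    """Try to exchange states between neighbouring replicas.

    replicas is a list of dicts with keys 'beta', 'state', 'score',
    ordered by increasing beta. Betas stay fixed; states and scores move.
    """
    for k in range(len(replicas) - 1):
        low, high = replicas[k], replicas[k + 1]
        p = swap_probability(high["beta"], high["score"],
                             low["beta"], low["score"])
        if rng.random() < p:
            low["state"], high["state"] = high["state"], low["state"]
            low["score"], high["score"] = high["score"], low["score"]
    return replicas

# Illustrative values only: the low-beta scout has found a better alignment.
replicas = [
    {"beta": 0.5, "state": [3, 2, 4, 3], "score": -40.0},
    {"beta": 2.0, "state": [2, 4, 7, 3], "score": -45.0},
]
attempt_swaps(replicas, random.Random(1))
print(replicas)
```

Under this rule a swap that hands a better-scoring state to the higher-beta sampler is always accepted, while the reverse is accepted only occasionally.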

15 Controlling Sensitivity
Adjust the relative probability of sampling an $x_i$ by adjusting a new parameter $\beta$ in the distribution: [formula missing from transcript]
Small $\beta$: search the breadth of the space
Large $\beta$: focused search of a region
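The distribution on this slide was an image and is not in the transcript. One common way to introduce such a sensitivity parameter, assumed here purely for illustration, is to raise each sampling weight to the power beta and renormalize. The function name `temper` and the example weights are hypothetical.

```python
def temper(weights, beta):
    """Raise positive sampling weights to the power beta and renormalize."""
    powered = [w ** beta for w in weights]
    total = sum(powered)
    return [p / total for p in powered]

# Illustrative weights for three candidate positions x_1, x_2, x_3.
weights = [1.0, 2.0, 8.0]

for beta in (0.1, 1.0, 2.0):
    print(beta, temper(weights, beta))
```

With small beta the three candidates become nearly equally likely, which broadens the search; with large beta almost all the probability goes to the best candidate, which focuses it.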

16 Testing the Sensitivity
When run on randomly generated sequences to see which motifs are found, samplers of different sensitivity converge to different scores.
Betas 2 1.9.1

17 Predicting Convergence Score
Measure of similarity: magnetization
“Configuration score”: energy
Ex: m=.5 m=.5 E=0 m=1 E=-6J m=0 E=2J m=0 E=2J m=0 E=2J
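The figure for this example is missing, but the values listed are consistent with four spins that all interact with one another, with energy E = -J times the sum of s_i * s_j over pairs and magnetization m = |sum of s_i| / N. That reading is an inference from the numbers, not something the transcript states. A short check:

```python
from itertools import combinations

def energy(spins, J=1.0):
    """Energy of fully connected spins: -J times the sum over all pairs."""
    return -J * sum(a * b for a, b in combinations(spins, 2))

def magnetization(spins):
    """Absolute mean spin."""
    return abs(sum(spins)) / len(spins)

# Energies are printed in units of J.
for spins in ([1, 1, 1, 1], [1, 1, 1, -1], [1, 1, -1, -1]):
    print(spins, "m =", magnetization(spins), "E =", energy(spins), "J")
```

Four aligned spins give m=1 and E=-6J, three-against-one gives m=.5 and E=0, and two-against-two gives m=0 and E=2J, matching the values on the slide.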

18 Alignment Analogue m=.77 E=-5J m=1 E=-9J m=.77 E=-5J m=.77 E=-5J A: B: C:

19 Test Results: $L < |\text{alphabet}|^w$

20 Test Results: $L > |\text{alphabet}|^w$

21 Test Results

23 Hidden Motifs: Gibbs Sampler
Panels: Beta = .1, Beta = .5, Beta = .9, Beta = 1.3, Beta = 1.7, Beta = 2
W=5, l=500

24 Hidden Motifs: Replica Exchange
Betas: .9, .93, .96, 1.8, 1.5

