1
Sequence Alignment
Oct 9, 2002
Joon Lee
Genomics & Computational Biology

2
Dynamic Programming
- Optimization problems: find the best decision, one after another
- Subproblems are not independent
- Subproblems share subsubproblems
- Solve each subproblem once and save its answer in a table

3
Four Steps of DP
1. Characterize the structure of an optimal solution
2. Recursively define the value of an optimal solution
3. Compute the value of an optimal solution in a bottom-up fashion
4. Construct an optimal solution from computed information

4
Sequence Alignment
Sequence 1: G A A T T C A G T T A
Sequence 2: G G A T C G A

5
Align or insert gap

G A A T T C A G T T A
G G A _ T C _ G _ _ A

G _ A A T T C A G T T A
G G _ A _ T C _ G _ _ A

6
Three Steps of SA
1. Initialization: gap penalty
2. Scoring: matrix fill
3. Alignment: trace back

7
Step 1: Initialization
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2
G  -4
A  -6
T  -8
C -10
G -12
A -14
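The initialization step can be sketched in a few lines of Python. This is only an illustration, not code from the lecture: the function name `init_matrix` and the list-of-lists layout (rows for sequence 2, columns for sequence 1) are assumptions, and the gap penalty of -2 is taken from the values shown on the slide.

```python
def init_matrix(seq1, seq2, gap=-2):
    """Build the score matrix and fill only its first row and column.

    Hypothetical helper: rows follow seq2, columns follow seq1.
    """
    rows, cols = len(seq2) + 1, len(seq1) + 1
    S = [[0] * cols for _ in range(rows)]
    for j in range(cols):
        S[0][j] = j * gap  # top row: j leading gaps
    for i in range(rows):
        S[i][0] = i * gap  # left column: i leading gaps
    return S


S = init_matrix("GAATTCAGTTA", "GGATCGA")
print(S[0])                   # top row of the matrix
print([row[0] for row in S])  # left column of the matrix
```

Every interior cell is still 0 at this point; the scoring step overwrites them.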

8
Step 2: Scoring
- A = a_1 a_2 … a_n, B = b_1 b_2 … b_m
- S_ij: score at (i, j)
- s(a_i, b_j): matching score between a_i and b_j
- w: gap penalty
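The recurrence on this slide was a figure and did not survive in the text. A standard form of the global-alignment recurrence with a linear gap penalty, written in the symbols defined above, is sketched below. Treating w as a negative number that is added (w = -2 in the worked slides) is an assumption about the slide's sign convention.

```latex
\begin{aligned}
S_{0,0} &= 0, \qquad S_{i,0} = i\,w, \qquad S_{0,j} = j\,w, \\
S_{i,j} &= \max
\begin{cases}
S_{i-1,j-1} + s(a_i, b_j) & \text{(align } a_i \text{ with } b_j\text{)} \\
S_{i-1,j} + w & \text{(gap opposite } a_i\text{)} \\
S_{i,j-1} + w & \text{(gap opposite } b_j\text{)}
\end{cases}
\end{aligned}
```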

9
Step 2: Scoring
- Match: +2
- Mismatch: -1
- Gap: -2

10
Step 2: Scoring
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2   2
G  -4
A  -6
T  -8
C -10
G -12
A -14
0 + 2 = 2
-2 + (-2) = -4

11
Step 2: Scoring
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2   2   0
G  -4
A  -6
T  -8
C -10
G -12
A -14
-2 + (-1) = -3
-4 + (-2) = -6
2 + (-2) = 0

12
Step 2: Scoring
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2   2   0
G  -4   0
A  -6
T  -8
C -10
G -12
A -14
-2 + 2 = 0
2 + (-2) = 0
-4 + (-2) = -6

13
Step 2: Scoring
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2   2   0  -2  -4  -6  -8 -10 -12 -14 -16 -18
G  -4   0   1  -1  -3  -5  -7  -9  -8 -10 -12 -14
A  -6  -2   2   3   1  -1  -3  -5  -7  -9 -11 -10
T  -8  -4   0   1   5   3   1  -1  -3  -5  -7  -9
C -10  -6  -2  -1   3   4   5   3   1  -1  -3  -5
G -12  -8  -4  -3   1   2   3   4   5   3   1  -1
A -14 -10  -6  -2  -1   0   1   5   3   4   2   3
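The matrix fill can be reproduced with a short Python sketch. The function name `fill_matrix` and the row/column layout are assumptions made for illustration; the scores (match +2, mismatch -1, gap -2) and the two sequences are the ones used in the slides.

```python
def fill_matrix(seq1, seq2, match=2, mismatch=-1, gap=-2):
    """Fill a global-alignment score matrix (rows: seq2, columns: seq1)."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    S = [[0] * cols for _ in range(rows)]
    # Step 1: initialize the borders with cumulative gap penalties.
    for j in range(1, cols):
        S[0][j] = j * gap
    for i in range(1, rows):
        S[i][0] = i * gap
    # Step 2: each cell takes the best of diagonal, up, and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if seq2[i - 1] == seq1[j - 1] else mismatch
            S[i][j] = max(
                S[i - 1][j - 1] + s,  # align the two characters
                S[i - 1][j] + gap,    # gap in seq1
                S[i][j - 1] + gap,    # gap in seq2
            )
    return S


S = fill_matrix("GAATTCAGTTA", "GGATCGA")
for row in S:
    print(" ".join(f"{v:4d}" for v in row))
```

The bottom-right cell holds the score of the best global alignment. The fill takes time and memory proportional to the product of the two sequence lengths.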

14
Step 3: Trace back
        G   A   A   T   T   C   A   G   T   T   A
    0  -2  -4  -6  -8 -10 -12 -14 -16 -18 -20 -22
G  -2   2   0  -2  -4  -6  -8 -10 -12 -14 -16 -18
G  -4   0   1  -1  -3  -5  -7  -9  -8 -10 -12 -14
A  -6  -2   2   3   1  -1  -3  -5  -7  -9 -11 -10
T  -8  -4   0   1   5   3   1  -1  -3  -5  -7  -9
C -10  -6  -2  -1   3   4   5   3   1  -1  -3  -5
G -12  -8  -4  -3   1   2   3   4   5   3   1  -1
A -14 -10  -6  -2  -1   0   1   5   3   4   2   3

15
Step 3: Trace back

G A A T T C A G T T A
G G A _ T C _ G _ _ A

G A A T T C A G T T A
G G A T _ C _ G _ _ A
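A trace back can be sketched as a walk from the bottom-right cell to the top-left, at each step choosing a move that explains the current score. The code below is a minimal illustration under stated assumptions: the names `fill_matrix` and `traceback` are hypothetical, and the tie-break order (diagonal, then up, then left) is an arbitrary choice. A different order can return a different alignment with the same score, which is why the slide shows two.

```python
def fill_matrix(seq1, seq2, match=2, mismatch=-1, gap=-2):
    """Fill a global-alignment score matrix (rows: seq2, columns: seq1)."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    S = [[0] * cols for _ in range(rows)]
    for j in range(1, cols):
        S[0][j] = j * gap
    for i in range(1, rows):
        S[i][0] = i * gap
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if seq2[i - 1] == seq1[j - 1] else mismatch
            S[i][j] = max(S[i - 1][j - 1] + s, S[i - 1][j] + gap, S[i][j - 1] + gap)
    return S


def traceback(S, seq1, seq2, match=2, mismatch=-1, gap=-2):
    """Recover one optimal alignment; ties prefer diagonal, then up, then left."""
    i, j = len(seq2), len(seq1)
    top, bottom = [], []
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if seq2[i - 1] == seq1[j - 1] else mismatch
            if S[i][j] == S[i - 1][j - 1] + s:  # diagonal: align both characters
                top.append(seq1[j - 1])
                bottom.append(seq2[i - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and S[i][j] == S[i - 1][j] + gap:  # up: gap in seq1
            top.append("_")
            bottom.append(seq2[i - 1])
            i -= 1
        else:  # left: gap in seq2
            top.append(seq1[j - 1])
            bottom.append("_")
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


seq1, seq2 = "GAATTCAGTTA", "GGATCGA"
aligned1, aligned2 = traceback(fill_matrix(seq1, seq2), seq1, seq2)
print(aligned1)
print(aligned2)
```

To enumerate every optimal alignment instead of one, the walk would need to branch wherever more than one move matches the current score.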

16
Exercise
Sequence 1: G C A T C C G
Sequence 2: G A T C G
- Match: +2
- Mismatch: -1
- Gap: -2

17
Exercise
        G   C   A   T   C   C   G
    0  -2  -4  -6  -8 -10 -12 -14
G  -2   2   0  -2  -4  -6  -8 -10
A  -4   0   1   2   0  -2  -4  -6
T  -6  -2  -1   0   4   2   0  -2
C  -8  -4   0  -2   2   6   4   2
G -10  -6  -2  -1   0   4   5   6
- Match: +2
- Mismatch: -1
- Gap: -2

G C A T C C G
G A T C G

18
Amino acids
Match/mismatch → Substitution matrix

19
Global & Local alignment
- Global: Needleman-Wunsch Algorithm
- Local: Smith-Waterman Algorithm
From Mount, Bioinformatics, Chap. 3
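The difference between the two algorithms can be illustrated with a small local-alignment scorer. In the Smith-Waterman scheme, cell scores are never allowed to drop below zero, the borders start at zero, and the result is the highest cell anywhere in the matrix rather than the bottom-right one. The sketch below is not from the lecture: the name `local_score` is hypothetical, and reusing the earlier match, mismatch, and gap values for the local case is an assumption.

```python
def local_score(seq1, seq2, match=2, mismatch=-1, gap=-2):
    """Return the best local-alignment score (Smith-Waterman style)."""
    rows, cols = len(seq2) + 1, len(seq1) + 1
    H = [[0] * cols for _ in range(rows)]  # borders stay at 0, no gap penalty
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            s = match if seq2[i - 1] == seq1[j - 1] else mismatch
            H[i][j] = max(
                0,                    # restart the alignment here
                H[i - 1][j - 1] + s,  # align the two characters
                H[i - 1][j] + gap,    # gap in seq1
                H[i][j - 1] + gap,    # gap in seq2
            )
            best = max(best, H[i][j])
    return best


print(local_score("GAATTCAGTTA", "GGATCGA"))
```

A trace back for the local case would start at the highest-scoring cell and stop at the first cell whose score is 0.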

20
References
- Sequence alignment with Java applet: http://linneus20.ethz.ch:8080/5_4_5.html
