Published by Odalys Hankin. Modified about 1 year ago.

1
Eugene W. Myers and Webb Miller

2
Outline
Introduction
Gotoh's algorithm
O(N) space Gotoh's algorithm
Main algorithm
Implementation
Conclusion

3

4
Introduction
Space, not time, is the limiting resource.
Hirschberg's Algorithm: maximizing the similarity score of an alignment.
Gotoh's Algorithm: minimizing the difference score of a conversion.
This work: a linear-space version for affine gap penalties.
With a megabyte of memory:
Myers and Miller: sequences of length [value lost in extraction]
Altschul and Erickson: sequences of length < 1070

5
Transformation (1/2)
Hirschberg's Algorithm / Gotoh's Algorithm
Aligned Pair
Affine Gap Penalties

6
Transformation (2/2) Match = 8, Mismatch = -5, Gap Symbol = -3, Gap-open = -4

7
Example(1/2)
              Hirschberg's Algorithm   Gotoh's Algorithm
Match         8                        0
Mismatch      -5                       13
Gap-open      -4                       4
Gap Symbol    -3                       7
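The two columns of numbers above are consistent with one common way of turning similarity scores into difference costs. The sketch below assumes that mapping, which the slide text itself does not spell out: mismatch cost = match - mismatch, per-symbol gap cost h = match/2 - gap symbol, and gap-open cost g = -gap-open. The function name `to_costs` is hypothetical.

```python
def to_costs(match, mismatch, gap_symbol, gap_open):
    """Map similarity scores to difference costs (assumed transformation).

    Returns (match_cost, mismatch_cost, h, g), where h is the cost per
    gap symbol and g is the cost of opening a gap.
    """
    match_cost = match - match        # a match costs nothing
    mismatch_cost = match - mismatch  # loss relative to a match
    h = match / 2 - gap_symbol        # a gap symbol forgoes half a match
    g = -gap_open                     # the open penalty becomes a positive cost
    return match_cost, mismatch_cost, h, g


costs = to_costs(match=8, mismatch=-5, gap_symbol=-3, gap_open=-4)
print(costs)  # (0, 13, 7.0, 4)
```

Under this assumed mapping, a similarity score and a difference cost for sequences of lengths M and N are related by similarity = match * (M + N) / 2 - cost. For two length-10 sequences that differ in one position, the similarity is 9 * 8 - 5 = 67 and the cost is 13, and indeed 8 * 10 - 13 = 67.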

8
Example(2/2)
1. A: ACGGTTCAAG
   B: ACGGTTCAAG
2. A: ACGGTTCAAG
   B: ACGGATCAAG
3. [not preserved in the transcript]
[Table: Hirschberg's Algorithm vs. Gotoh's Algorithm; Cost C (minimum)]

9
R 黃博平

10
Some notations
A_i: the i-symbol prefix of A
B_j: the j-symbol prefix of B
C(i, j): minimum cost of a conversion of A_i to B_j

11
Simple gap(1/4) gap(k)= h*k

12
Simple gap(2/4) [Figure: dynamic-programming matrix for A = AGTAC and B = AAG] Space = O(n^2)

13
Simple gap(3/4) m/2

14
Simple gap(4/4) Forward score and backward score Space: O(m+n)
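The forward-score and backward-score idea can be sketched as follows. This is a minimal illustration, assuming the simple gap model gap(k) = h*k and a unit mismatch cost; the names `unit_cost`, `last_row`, and `split_column` are hypothetical and not taken from the slides.

```python
def unit_cost(x, y):
    """Assumed replacement cost: 0 for a match, 1 for a mismatch."""
    return 0 if x == y else 1


def last_row(a, b, h, w):
    """Costs of converting all of a to every prefix of b, keeping one row."""
    prev = [j * h for j in range(len(b) + 1)]
    for i, x in enumerate(a, 1):
        cur = [i * h]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j - 1] + w(x, y),  # replace x by y
                           prev[j] + h,            # delete x
                           cur[j - 1] + h))        # insert y
        prev = cur
    return prev


def split_column(a, b, h, w):
    """Return (j, cost): where an optimal path crosses row len(a) // 2."""
    mid = len(a) // 2
    n = len(b)
    fwd = last_row(a[:mid], b, h, w)              # forward scores
    bwd = last_row(a[mid:][::-1], b[::-1], h, w)  # backward scores
    totals = [fwd[j] + bwd[n - j] for j in range(n + 1)]
    best = min(totals)
    return totals.index(best), best


print(split_column("AGTAC", "AAG", 1, unit_cost))  # (1, 3)
```

Only two rows of length N + 1 are alive at any time in `last_row`, which is where the O(m+n) space bound comes from once the two halves are solved recursively.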

15
Affine gap(1/8)
A gap of length k: cost = g + k*h
[Example alignment, layout lost: A T A A C T C G A A T C - - T]

16
Affine gap(2/8)
C(i, j): minimum cost of a conversion of A_i to B_j
D(i, j): minimum cost of a conversion of A_i to B_j that deletes a_i
I(i, j): minimum cost of a conversion of A_i to B_j that inserts b_j

17
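Affine gap(3/8)
C(i, j) = min{D(i, j), I(i, j), C(i-1, j-1) + w(a_i, b_j)}   if i > 0 and j > 0
C(i, j) = g + j*h   if i = 0 and j > 0
C(i, j) = g + i*h   if i > 0 and j = 0
C(i, j) = 0   if i = 0 and j = 0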

18
Affine gap(4/8)
D(i, j) = min{D(i-1, j), C(i-1, j) + g} + h   if i > 0 and j > 0
D(i, j) = C(0, j) + g   if i = 0 and j > 0

19
Affine gap(5/8)
I(i, j) = min{I(i, j-1), C(i, j-1) + g} + h   if i > 0 and j > 0
I(i, j) = C(i, 0) + g   if i > 0 and j = 0

20
Affine gap(6/8)
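A quadratic-space version of the C, D, I computation can be sketched as below. The recurrences in the code are the standard Gotoh-style ones for a gap cost of g + k*h and are an assumption here, since the formulas on these slides did not survive extraction; `gotoh_cost` and `unit_cost` are hypothetical names, and the unit mismatch cost is chosen only for illustration.

```python
import math


def unit_cost(x, y):
    """Assumed replacement cost w: 0 for a match, 1 for a mismatch."""
    return 0.0 if x == y else 1.0


def gotoh_cost(a, b, g, h, w):
    """Minimum conversion cost of a to b, gap of length k costing g + k*h."""
    m, n = len(a), len(b)
    inf = math.inf
    C = [[0.0] * (n + 1) for _ in range(m + 1)]  # best cost overall
    D = [[inf] * (n + 1) for _ in range(m + 1)]  # best cost ending in a delete
    I = [[inf] * (n + 1) for _ in range(m + 1)]  # best cost ending in an insert
    for j in range(1, n + 1):
        C[0][j] = g + j * h
        D[0][j] = C[0][j] + g   # boundary so that row 1 opens a fresh gap
    for i in range(1, m + 1):
        C[i][0] = g + i * h
        I[i][0] = C[i][0] + g   # boundary so that column 1 opens a fresh gap
        for j in range(1, n + 1):
            D[i][j] = min(D[i - 1][j], C[i - 1][j] + g) + h  # extend or open
            I[i][j] = min(I[i][j - 1], C[i][j - 1] + g) + h  # extend or open
            C[i][j] = min(D[i][j], I[i][j],
                          C[i - 1][j - 1] + w(a[i - 1], b[j - 1]))
    return C[m][n]


print(gotoh_cost("AGTAC", "AAG", g=2.0, h=0.5, w=unit_cost))  # 4.0
```

With g = 2.0 and h = 0.5, the cheapest conversion of AGTAC to AAG deletes the two symbols GT as a single gap (cost 3.0) and replaces C by G (cost 1.0).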

21
Affine gap(7/8) [Figure: C, D, and I matrices for A = AGTAC and B = AAG]

22
Affine gap(8/8) [Figure: I, D, and C matrices for A = AGTAC and B = AAG]

23
R 陳彥璋

24
Observation
The i-th row of C and D depends only on rows i and i-1.
The i-th row of I depends only on row i.

25
Linear Space
Use two one-dimensional arrays (CC and DD) and three scalar variables (s, c, and e).

26
Linear Space

27
Algorithm
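The cost-only computation with two arrays and three scalars can be sketched as follows. This is a reading of the algorithm under stated assumptions, not a transcription of the slide (whose listing was an image): CC and DD hold one row of C and D, `e` carries the current I value, `c` the current C value, `s` the diagonal C value, and `t` the running boundary cost. The names `linear_space_cost` and `unit_cost` are hypothetical.

```python
def unit_cost(x, y):
    """Assumed replacement cost w: 0 for a match, 1 for a mismatch."""
    return 0.0 if x == y else 1.0


def linear_space_cost(a, b, g, h, w):
    """Minimum conversion cost of a to b using O(len(b)) working space."""
    n = len(b)
    CC = [0.0] * (n + 1)
    DD = [0.0] * (n + 1)
    t = g
    for j in range(1, n + 1):       # row 0: a single insertion gap
        t += h
        CC[j] = t
        DD[j] = t + g
    t = g
    for x in a:                     # one pass per symbol of a
        s = CC[0]                   # C(i-1, 0), the diagonal for j = 1
        t += h
        c = CC[0] = t               # C(i, 0): a single deletion gap
        e = t + g                   # I(i, 0)
        for j in range(1, n + 1):
            e = min(e, c + g) + h               # I(i, j)
            DD[j] = min(DD[j], CC[j] + g) + h   # D(i, j) from row i-1
            c = min(DD[j], e, s + w(x, b[j - 1]))  # C(i, j)
            s, CC[j] = CC[j], c     # keep the old value as the next diagonal
    return CC[n]


print(linear_space_cost("AGTAC", "AAG", g=2.0, h=0.5, w=unit_cost))  # 4.0
```

In this sketch `t` starts at g = 2.0 and grows by h per row, so after the fifth row it equals 4.5, matching the values of t shown for g = 2.0, h = 0.5, i = 5.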

28
[Figure: C, D, and I matrices for A = AGTAC and B = AAG, with vectors CC and DD] g = 2.0, h = 0.5, t = 2.0

29
[Figure: C, D, and I matrices for A = AGTAC and B = AAG, with vectors CC and DD] g = 2.0, h = 0.5, t = 2.0

30
[Figure: C, D, and I matrices for A = AGTAC and B = AAG, with vectors CC and DD and scalars s, c, e] g = 2.0, h = 0.5, i = 5, t = 4.5

31
[Figure: C, D, and I matrices for A = AGTAC and B = AAG, with vectors CC and DD and scalars s, c, e] g = 2.0, h = 0.5, i = 5, j = 1, t = 4.5

32
[Figure: C, D, and I matrices for A = AGTAC and B = AAG, with vectors CC and DD and scalars s, c, e] g = 2.0, h = 0.5, i = 5, j = 1, t = 4.5

33
[Figure: C, D, and I matrices for A = AGTAC and B = AAG, with vectors CC and DD and scalars s, e, c] g = 2.0, h = 0.5, i = 5, j = 1, t = 4.5

34
[Figure: C, D, and I matrices for A = AGTAC and B = AAG, with vectors CC and DD] Optimal conversion cost.

35
What is the conversion of AGTAC to AAG?

36
B 王柏易

37
Midpoint Hirschberg (1975): recursive divide-and-conquer Backward Computing Forward Computing

38
Gap Penalty [Figure: neighboring cells (i-1, j-1), (i, j-1), (i-1, j), (i, j)]

39
Gap Penalty
CC(j) = minimum cost of a conversion of A_i* to B_j
DD(j) = minimum cost of a conversion of A_i* to B_j that ends with a delete

40
Gap Penalty
RR(N - j) = minimum cost of a conversion of A_i*^T to B_j^T
SS(N - j) = minimum cost of a conversion of A_i*^T to B_j^T that begins with a delete

41
Find Midpoint with Gap Penalty Backward Computing Forward Computing How to compute the midpoint?

42
R 李政緯

43
Midpoint
The difficulty in calculating the midpoint is that when we join the conversions of the two halves into one, two gaps may coalesce into a single gap, so the gap-open cost g would be counted twice.
We therefore consider min{CC + RR, DD + SS - g, II + JJ - g}.

44
Midpoint
Recall that the algorithm above saves space by not keeping II and JJ.
The expression therefore reduces to min{CC + RR, DD + SS - g}.

45
Midpoint
Remember that we should find min over j ∈ [0, N] of min{CC + RR, DD + SS - g, II + JJ - g}.
[Figure labels: i*, j, j+1]

46
Midpoint Type 1 recurrence Type 2 recurrence i* j* i* j*

47
Example
A = agtac, B = aag, i* = 2
agtac
a__ag
Recursive calls on (a, a) and (ac, ag)
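The midpoint search for this example can be sketched as follows, using the reduced form min{CC + RR, DD + SS - g}. The code is an illustration under assumptions: a unit mismatch cost, g = 2.0 and h = 0.5 as in the earlier slides, boundary gap-open costs equal to g on both sides, and 0 < i* < M. The names `forward_vectors`, `midpoint`, and `unit_cost` are hypothetical.

```python
def unit_cost(x, y):
    """Assumed replacement cost w: 0 for a match, 1 for a mismatch."""
    return 0.0 if x == y else 1.0


def forward_vectors(a, b, g, h, w):
    """Return (CC, DD) after converting all of a to each prefix of b."""
    n = len(b)
    CC = [0.0] * (n + 1)
    DD = [0.0] * (n + 1)
    t = g
    for j in range(1, n + 1):
        t += h
        CC[j] = t
        DD[j] = t + g
    t = g
    for x in a:
        s = CC[0]
        t += h
        c = CC[0] = t
        e = t + g
        for j in range(1, n + 1):
            e = min(e, c + g) + h
            DD[j] = min(DD[j], CC[j] + g) + h
            c = min(DD[j], e, s + w(x, b[j - 1]))
            s, CC[j] = CC[j], c
    DD[0] = CC[0]  # converting to the empty prefix always ends with a delete
    return CC, DD


def midpoint(a, b, i_star, g, h, w):
    """Return (cost, j, kind); kind 1 = CC + RR, kind 2 = DD + SS - g."""
    n = len(b)
    CC, DD = forward_vectors(a[:i_star], b, g, h, w)
    # The backward pass is a forward pass over the reversed suffixes.
    RR, SS = forward_vectors(a[i_star:][::-1], b[::-1], g, h, w)
    best = None
    for j in range(n + 1):
        candidates = ((CC[j] + RR[n - j], 1),
                      (DD[j] + SS[n - j] - g, 2))  # one gap, so refund one g
        for cost, kind in candidates:
            if best is None or cost < best[0]:
                best = (cost, j, kind)
    return best


print(midpoint("agtac", "aag", 2, 2.0, 0.5, unit_cost))  # (4.0, 1, 2)
```

The result says that the cheapest path crosses row i* = 2 at j = 1 inside a deletion gap (kind 2) with total cost 4.0, which agrees with the alignment agtac / a__ag and leaves the subproblems (a, a) and (ac, ag).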

48
R 涂宗瑋

49
Implementation
Storage Requirement
Memory vs. Sequence length
Compared with classic dynamic programming algorithm

50
Storage Requirement(1/4)
Vectors: CC, DD, RR, and SS
Space: 4N words
M + N words for an optimal conversion
M = N = 38 40

51
Storage Requirement(2/4)
128*128 = 16384 words for the table (w): replacement costs
w            ASCII[1]  ASCII[2]  ASCII[3]  ASCII[4]  ASCII[…]  ASCII[128]
ASCII[1]     W1,1      W1,2      W1,3      W1,4      W1,…      W1,128
ASCII[2]     W2,1      W2,2      W2,3      W2,4      W2,…      W2,128
ASCII[3]     W3,1      W3,2      W3,3      W3,4      W3,…      W3,128
ASCII[4]     W4,1      W4,2      W4,3      W4,4      W4,…      W4,128
ASCII[…]     W…,1      W…,2      W…,3      W…,4      W…,…      W…,128
ASCII[128]   W128,1    W128,2    W128,3    W128,4    W128,…    W128,128

52
Storage Requirement(3/4)
4*4 = 16 words for the table (w): replacement costs
w    A        T        C        G
A    W(A,A)   W(A,T)   W(A,C)   W(A,G)
T    W(T,A)   W(T,T)   W(T,C)   W(T,G)
C    W(C,A)   W(C,T)   W(C,C)   W(C,G)
G    W(G,A)   W(G,T)   W(G,C)   W(G,G)

53
Storage Requirement(4/4)
M + N bytes for the sequences A and B.
A and B could be compressed: for DNA sequences, only 2(M + N) bits are necessary.

54
Memory vs. Sequence length
Maximum length of sequences that can be aligned in a given amount of memory
Altschul and Erickson: 7MN-bit approach
Memory (bytes)   Linear Space (w/o op.)   Linear Space (with op.)   Altschul and Erickson
64K              k                        k                         k
Formula          N = Memory / (4*4)       N = Memory / (6*4)        N = sqrt(Memory * 8 / 7)
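The three formulas can be evaluated directly. The sketch below assumes 4-byte words (the second factor of 4 in the formulas), M = N, and reads "op." as the M + N words kept for the optimal conversion, which is an interpretation of the slide rather than something it states; the function names are hypothetical.

```python
import math

WORD_BYTES = 4  # assumed word size, matching the factor 4 in the formulas


def max_len_linear(memory_bytes, words_per_symbol):
    """Largest N such that words_per_symbol * N words fit in memory."""
    return memory_bytes // (words_per_symbol * WORD_BYTES)


def max_len_altschul_erickson(memory_bytes):
    """Largest N such that 7 * N * N bits fit in memory (M = N assumed)."""
    return math.isqrt(memory_bytes * 8 // 7)


memory = 64 * 1024
print(max_len_linear(memory, 4))          # 4N words: CC, DD, RR, SS
print(max_len_linear(memory, 6))          # 6N words: plus M + N for the conversion
print(max_len_altschul_erickson(memory))  # 7MN-bit approach
```

For 64K bytes these formulas give 4096, 2730, and 273. These are values computed from the formulas above, not figures recovered from the original table.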

55
Compared with classic dynamic programming algorithm
Classic dynamic programming algorithm (Wagner and Fischer, 1974).

56
Compared with classic dynamic programming algorithm
Space: classic dynamic programming algorithm: O(MN); linear-space algorithm: O(N + lg M)
Time: both O(MN)
In practice, however, the linear-space algorithm is slower than the classic dynamic programming algorithm.
linear-space : classic DP = 2.84 : 1

57
R 林澤豪

58
Reduce problem [Figure: dynamic-programming matrix for CGGATCAT and CTTAACT]

59
Reduce problem(cont.)

60
Reduce problem(cont.) m/2 Partition line
