Presentation is loading. Please wait.

Presentation is loading. Please wait.

1 Chapter 6 Dynamic Programming. 2 Algorithmic Paradigms Greedy. Build up a solution incrementally, optimizing some local criterion. Divide-and-conquer.

Similar presentations


Presentation on theme: "1 Chapter 6 Dynamic Programming. 2 Algorithmic Paradigms Greedy. Build up a solution incrementally, optimizing some local criterion. Divide-and-conquer."— Presentation transcript:

1 1 Chapter 6 Dynamic Programming

2 2 Algorithmic Paradigms Greedy. Build up a solution incrementally, optimizing some local criterion. Divide-and-conquer. Break up a problem into sub-problems, solve each sub-problem independently, and combine solution to sub-problems to form solution to original problem. Dynamic programming. Break up a problem into a series of overlapping sub-problems, and build up solutions to larger and larger sub-problems.

3 3 Dynamic Programming Applications Areas. n Bioinformatics. n Control theory. n Information theory. n Operations research. n Computer science: theory, graphics, AI, compilers, systems, …. Some famous dynamic programming algorithms. n Linux diff for comparing two files. n Smith-Waterman for genetic sequence alignment. n Bellman-Ford for shortest path routing in networks. n Cocke-Kasami-Younger for parsing context free grammars.

4 4 Knapsack Problem Knapsack problem. n Given n objects and a "knapsack." n Item i weighs w i > 0 kilograms and has value v i > 0. n Knapsack has capacity of W kilograms. n Goal: fill knapsack so as to maximize total value. Ex: { 3, 4 } has value 40. Greedy: repeatedly add item with maximum ratio v i / w i. Ex: { 5, 2, 1 } achieves only value = 35  greedy not optimal. 1 value 18 22 28 1 weight 5 6 62 7 # 1 3 4 5 2 W = 11

5 5 Dynamic Programming: False Start Def. OPT(i) = max profit subset of items 1, …, i. n Case 1: OPT does not select item i. – OPT selects best of { 1, 2, …, i-1 } n Case 2: OPT selects item i. – accepting item i does not immediately imply that we will have to reject other items – without knowing what other items were selected before i, we don't even know if we have enough room for i Conclusion. Need more sub-problems!

6 6 Dynamic Programming: Adding a New Variable Def. OPT(i, w) = max profit subset of items 1, …, i with weight limit w. n Case 1: OPT does not select item i. – OPT selects best of { 1, 2, …, i-1 } using weight limit w n Case 2: OPT selects item i. – new weight limit = w – w i – OPT selects best of { 1, 2, …, i–1 } using this new weight limit

7 7 Input: n, W, w 1,…,w N, v 1,…,v N for w = 0 to W M[0, w] = 0 for i = 1 to n for w = 1 to W if (w i > w) M[i, w] = M[i-1, w] else M[i, w] = max {M[i-1, w], v i + M[i-1, w-w i ]} return M[n, W] Knapsack Problem: Bottom-Up Knapsack. Fill up an n-by-W array.

8 8 Knapsack Algorithm n + 1 1 Value 18 22 28 1 Weight 5 6 62 7 Item 1 3 4 5 2  { 1, 2 } { 1, 2, 3 } { 1, 2, 3, 4 } { 1 } { 1, 2, 3, 4, 5 } 0 0 0 0 0 0 0 1 0 1 1 1 1 1 2 0 6 6 6 1 6 3 0 7 7 7 1 7 4 0 7 7 7 1 7 5 0 7 18 1 6 0 7 19 22 1 7 0 7 24 1 28 8 0 7 25 28 1 29 9 0 7 25 29 1 34 10 0 7 25 29 1 34 11 0 7 25 40 1 W + 1 W = 11 OPT: { 4, 3 } value = 22 + 18 = 40

9 9 Knapsack Problem: Running Time Running time.  (n W). n Not polynomial in input size! n "Pseudo-polynomial." n Decision version of Knapsack is NP-complete. [Chapter 8] Knapsack approximation algorithm. There exists a poly-time algorithm that produces a feasible solution that has value within 0.01% of optimum. [Section 11.8]

10 10 String Similarity How similar are two strings? n ocurrance n occurrence ocurrance ccurrenceo - ocurrnce ccurrnceo --a e- ocurrance ccurrenceo - 6 mismatches, 1 gap 1 mismatch, 1 gap 0 mismatches, 3 gaps

11 11 Applications. n Basis for Linux diff. n Speech recognition. n Computational biology. Edit distance. [Levenshtein 1966, Needleman-Wunsch 1970] n Gap penalty  ; mismatch penalty  pq. n In general, 2  >=  pq. n Cost = sum of gap and mismatch penalties. 2  +  CA CGACCTACCT CTGACTACAT TGACCTACCT CTGACTACAT - T C C C  TC +  GT +  AG + 2  CA - Edit Distance

12 12 Goal: Given two strings X = x 1 x 2... x m and Y = y 1 y 2... y n of symbols, find alignment of minimum cost. Def. An alignment M is a set of ordered pairs x i -y j such that each symbol occurs in at most one pair and no crossings. The number of x i and y j that don’t appear in M is the number of gaps. Def. The pair x i -y j and x i' -y j' cross if i j'. Ex: CTACCG vs. TACATG. Sol: M = x 2 -y 1, x 3 -y 2, x 4 -y 3, x 5 -y 4, x 6 -y 6. Sequence Alignment CTACC- TACAT- G G y1y1 y2y2 y3y3 y4y4 y5y5 y6y6 x2x2 x3x3 x4x4 x5x5 x1x1 x6x6

13 13 Sequence Alignment: Problem Structure Def. OPT(i, j) = min cost of aligning strings x 1 x 2... x i and y 1 y 2... y j. n Case 1: OPT matches x i -y j. – pay mismatch for x i -y j + min cost of aligning two strings x 1 x 2... x i-1 and y 1 y 2... y j-1 n Case 2a: OPT leaves x i unmatched. – pay gap for x i and min cost of aligning x 1 x 2... x i-1 and y 1 y 2... y j n Case 2b: OPT leaves y j unmatched. – pay gap for y j and min cost of aligning x 1 x 2... x i and y 1 y 2... y j-1

14 14 Sequence Alignment: Algorithm Analysis.  (mn) time and space. English words or sentences: m, n  10. Computational biology: m = n = 100,000. 10 billions ops OK, but 10GB array? Alignment(m, n, x 1 x 2...x m, y 1 y 2...y n, ,  ) { // A[0..m,0..n]: int array for i = 0 to m A[i, 0] = i  for j = 1 to n A[0, j] = j  for i = 1 to m A[i, j] = min(  [x i, y j ] + A[i-1, j-1],  + A[i-1, j],  + A[i, j-1]) return A[m, n] }

15 15 Sequence Alignment: Algorithm Alignment(m, n, x 1 x 2...x m, y 1 y 2...y n, ,  ) { // A[0..m,0..n]: int array for i = 0 to m A[i, 0] = i  for j = 1 to n A[0, j] = j  for i = 1 to m A[i, j] = min(  [x i, y j ] + A[i-1, j-1],  + A[i-1, j],  + A[i, j-1]) return A[m, n] } Assuming  = 1  [x i, y j ] = 0 if x i =y j  [x i, y j ] = 1 otherwise

16 Goal: Given two strings X = x 1 x 2... x m and Y = y 1 y 2... y n of symbols, find alignment of X and a substring of Y with minimum cost. Ex: CTACCG vs. TXYTACATGAH. Sol: Substring is TACATG and M = x 2 -y 4, x 3 -y 5, x 4 -y 6, x 5 -y 7, x 6 -y 9. Subequence Alignment

17 17 Sequence Alignment: Problem Structure Def. OPT(i, j) = min cost of aligning strings x 1 x 2... x i and y 1 y 2... y j. n Case 1: OPT matches x i -y j. – pay mismatch for x i -y j + min cost of aligning two strings x 1 x 2... x i-1 and y 1 y 2... y j-1 n Case 2a: OPT leaves x i unmatched. – pay gap for x i and min cost of aligning x 1 x 2... x i-1 and y 1 y 2... y j n Case 2b: i < m and OPT leaves y j unmatched. – pay gap for y j and min cost of aligning x 1 x 2... x i and y 1 y 2... y j-1 n Case 2c: i == m and OPT leaves y j unmatched. – pay 0 for y j and min cost of aligning x 1 x 2... x i and y 1 y 2... y j-1

18 18 Subequence Alignment: Algorithm Analysis.  (mn) time and space. Alignment(m, n, x 1 x 2...x m, y 1 y 2...y n, ,  ) { // A[0..m,0..n]: int array for i = 0 to m A[i, 0] = i  for j = 1 to n A[0, j] = 0 for i = 1 to m - 1 A[i, j] = min(  [x i, y j ] + A[i-1, j-1],  + A[i-1, j],  + A[i, j-1]) A[m, j] = min(  [x m, y j ] + A[m-1, j-1],  + A[m-1, j], A[m, j-1]) return A[m, n] }

19 Longest common subsequence The longest common subsequence (not substring) between “democrat” and “republican” is eca. A common subsequence is defined by all the identical character matches in an alignment of two strings. To maximize the number of such matches, we must prevent substitution of non-identical characters, that is, 2  <=  pq for p != q. A[i, j] = min(  [x i, y j ] + A[i-1, j-1],  + A[i-1, j],  + A[i, j-1]) 19

20 Maximum Monotone Subsequence A numerical sequence is monotonically increasing if the ith element is at least as big as the (i - 1)st element. The maximum monotone subsequence problem seeks to delete the fewest number of elements from an input string S to leave a monotonically increasing subsequence. Ex: A longest increasing subsequence of “243519698” is “24569.” Let X be the input sequence and Y be the sorted input sequence. Then a longest increasing subsequence of X is also a longest common subsequence of X and Y, and vice versa. Using the previous idea, we can solve this problem in O(n 2 ) space and time. Can we do better? 20

21 Maximum Monotone Subsequence A numerical sequence is monotonically increasing if the ith element is at least as big as the (i - 1)st element. Given X = x 1 x 2... x n find the longest monotonically increasing subsequence of X. Let OPT(i) be the longest monotonically increasing subsequence ending with x i. Then OPT(1) = 1 and OPT(i) = max(OPT(j)+1 : j < i and x j < x i ) 21 MonotoneSubsequence(x 1 x 2...x n ) { // A[1..n]: int array for i = 1 to n { A[i] = 1 for j = 1 to i - 1 if (x i >= x j ) A[i] = max(A[i], A[j]+1) } return max(A[1..n]) } // O(n) space, O(n 2 ) time

22 Maximum Monotone Subsequence MonotoneSubsequence returns the length of maximum monotone subsequence. How to return the maximum monotone subsequence? 22 MonotoneSubsequence2(x 1 x 2...x n ) { y = MonotoneSubsequence(x 1 x 2...x n ) for k = 1 to n if (A[k] == y) i = k; S = []; while (i > 0) { S = x i + S for j = i – 1 to 1 if (x i >= x j && A[i] == A[j]+1) { i = j; break; } if (j < 1) break; } return S } // O(n) time


Download ppt "1 Chapter 6 Dynamic Programming. 2 Algorithmic Paradigms Greedy. Build up a solution incrementally, optimizing some local criterion. Divide-and-conquer."

Similar presentations


Ads by Google