1
Dynamic Programming UNC Chapel Hill Z. Guo
2
Optimization Problems
Problems in which a set of choices must be made in order to arrive at an optimal (min/max) solution, subject to some constraints. (There may be several solutions that achieve the optimal value.)
Two common techniques:
Dynamic Programming (global)
Greedy Algorithms (local)
3
Dynamic Programming
Dynamic Programming (DP) is an algorithm design technique for optimization problems, which often ask to minimize or maximize some value.
Like divide and conquer, DP solves problems by combining solutions to subproblems.
Unlike divide and conquer, subproblems may overlap: subproblems may share subsubproblems. However, the solution to one subproblem may not affect the solutions to other subproblems of the same problem. (More on this later.)
DP reduces computation by:
Solving subproblems in a bottom-up fashion.
Storing the solution to a subproblem the first time it is solved.
Looking up the solution when the subproblem is encountered again.
Key: determine the structure of optimal solutions.
4
Steps in Dynamic Programming
1. Characterize the structure of an optimal solution.
2. Define the value of an optimal solution recursively.
3. Compute optimal solution values either top-down with caching or bottom-up in a table.
4. Construct an optimal solution from the computed values.
We’ll study these with the help of two examples: Matrix Multiplication and Longest Common Subsequence.
5
Matrix Multiplication
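Let A be a $p \times q$ matrix and B a $q \times r$ matrix, so that C = AB is $p \times r$. In particular, for $1 \le i \le p$ and $1 \le j \le r$,
$C[i, j] = \sum_{k=1}^{q} A[i, k]\, B[k, j]$
Observe that there are $pr$ total entries in C and each takes $O(q)$ time to compute, thus the total time to multiply the 2 matrices is proportional to $pqr$.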
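The count of $pqr$ scalar multiplications can be checked with a minimal Python sketch of the triple loop. It assumes matrices stored as lists of lists; the name `mat_mul` and the `mults` counter are illustrative and not part of the slides.

```python
def mat_mul(A, B):
    """Multiply a p x q matrix A by a q x r matrix B (lists of lists).

    Returns (C, mults), where mults counts scalar multiplications.
    """
    p, q, r = len(A), len(B), len(B[0])
    assert all(len(row) == q for row in A), "inner dimensions must agree"
    C = [[0] * r for _ in range(p)]
    mults = 0
    for i in range(p):          # rows of A
        for j in range(r):      # columns of B
            for k in range(q):  # shared inner dimension
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults


A = [[1, 2, 3], [4, 5, 6]]      # 2 x 3
B = [[1, 0], [0, 1], [1, 1]]    # 3 x 2
C, mults = mat_mul(A, B)
print(C, mults)                 # mults equals p*q*r = 2*3*2 = 12
```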
6
Chain Matrix Multiplication
Given a sequence of matrices $A_1 A_2 \dots A_n$ and dimensions $p_0, p_1, \dots, p_n$, where $A_i$ has dimension $p_{i-1} \times p_i$, determine the multiplication sequence that minimizes the number of operations.
This algorithm does not perform the multiplication; it just figures out the best order in which to perform the multiplications.
7
Example: CMM
Consider 3 matrices: let $A_1$ be $5 \times 4$, $A_2$ be $4 \times 6$, and $A_3$ be $6 \times 2$.
Mult[$((A_1 A_2) A_3)$] = $(5 \cdot 4 \cdot 6) + (5 \cdot 6 \cdot 2) = 180$
Mult[$(A_1 (A_2 A_3))$] = $(4 \cdot 6 \cdot 2) + (5 \cdot 4 \cdot 2) = 88$
Even for this small example, considerable savings can be achieved by reordering the evaluation sequence.
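The two costs can be reproduced with a small Python sketch that evaluates the cost of one given parenthesization. The nested-tuple encoding and the name `chain_cost` are assumptions made for illustration only.

```python
def chain_cost(expr, dims):
    """Return (rows, cols, mults) for one parenthesization.

    expr is either an int i (the matrix A_i, 1-based) or a pair (left, right).
    dims is [p0, p1, ..., pn], so A_i is dims[i-1] x dims[i].
    """
    if isinstance(expr, int):
        return dims[expr - 1], dims[expr], 0
    left, right = expr
    l_rows, l_cols, l_mults = chain_cost(left, dims)
    r_rows, r_cols, r_mults = chain_cost(right, dims)
    assert l_cols == r_rows, "inner dimensions must agree"
    # Cost of both sub-products plus the final (l_rows x l_cols)(r_rows x r_cols) product.
    return l_rows, r_cols, l_mults + r_mults + l_rows * l_cols * r_cols


dims = [5, 4, 6, 2]                       # A1: 5x4, A2: 4x6, A3: 6x2
print(chain_cost(((1, 2), 3), dims)[2])   # 180
print(chain_cost((1, (2, 3)), dims)[2])   # 88
```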
8
Naive Algorithm
If we have just 1 item, then there is only one way to parenthesize.
If we have n items, then there are n-1 places where we could break the list with the outermost pair of parentheses: just after the first item, just after the 2nd item, and so on up to just after the (n-1)th item.
When we split just after the kth item, we create two sub-lists to be parenthesized, one with k items and the other with n-k items. Then we consider all ways of parenthesizing these.
If there are L ways to parenthesize the left sub-list and R ways to parenthesize the right sub-list, then the total number of possibilities is L·R.
9
Cost of Naive Algorithm
The number of different ways of parenthesizing n items is
$P(n) = 1$, if $n = 1$
$P(n) = \sum_{k=1}^{n-1} P(k)\, P(n-k)$, if $n \ge 2$
This is related to the Catalan numbers (which in turn are related to the number of different binary trees on n nodes). Specifically, $P(n) = C(n-1)$, where
$C(n) = \frac{1}{n+1} \binom{2n}{n} \in \Omega(4^n / n^{3/2})$
and $\binom{2n}{n}$ stands for the number of ways to choose n items out of 2n items total.
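As a sanity check on the recurrence and the Catalan identity, here is a short Python sketch. The function names are illustrative; `math.comb` requires Python 3.8 or later.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def num_parenthesizations(n):
    """P(n): number of ways to fully parenthesize a chain of n items."""
    if n == 1:
        return 1
    # Split after item k: P(k) ways on the left times P(n-k) ways on the right.
    return sum(num_parenthesizations(k) * num_parenthesizations(n - k)
               for k in range(1, n))


def catalan(n):
    """C(n) = binom(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


for n in range(1, 11):
    print(n, num_parenthesizations(n), catalan(n - 1))  # the two columns agree
```

The memoization here only keeps the counting itself fast; enumerating every parenthesization would still take time proportional to P(n).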
10
DP Solution (I)
Let $A_{i \dots j}$ be the product of matrices i through j. $A_{i \dots j}$ is a $p_{i-1} \times p_j$ matrix.
At the highest level, we are multiplying two matrices together. That is, for any k with $1 \le k \le n-1$,
$A_{1 \dots n} = (A_{1 \dots k})(A_{k+1 \dots n})$
The problem of determining the optimal sequence of multiplications is broken up into 2 parts:
Q: How do we decide where to split the chain (what k)?
A: Consider all possible values of k.
Q: How do we parenthesize the subchains $A_{1 \dots k}$ and $A_{k+1 \dots n}$?
A: Solve by recursively applying the same scheme.
NOTE: this problem satisfies the “principle of optimality”.
Next, we store the solutions to the sub-problems in a table and build the table in a bottom-up manner.
11
DP Solution (II)
For $1 \le i \le j \le n$, let m[i, j] denote the minimum number of multiplications needed to compute $A_{i \dots j}$.
Example: the minimum number of multiplies for $A_{3 \dots 7}$.
In terms of $p_i$, the product $A_{3 \dots 7}$ has dimensions ____.
12
DP Solution (III)
The optimal cost can be described as follows:
i = j: the sequence contains only 1 matrix, so m[i, j] = 0.
i < j: the product can be split by considering each k, $i \le k < j$, as $A_{i \dots k}$ ($p_{i-1} \times p_k$) times $A_{k+1 \dots j}$ ($p_k \times p_j$).
This suggests the following recursive rule for computing m[i, j]:
$m[i, i] = 0$
$m[i, j] = \min_{i \le k < j} \left( m[i, k] + m[k+1, j] + p_{i-1}\, p_k\, p_j \right)$ for $i < j$
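The recursive rule can be transcribed almost directly into a top-down version with caching. This is a sketch under assumptions: `dims` holds $p_0, \dots, p_n$, and the names `min_mults` and `m` are chosen for illustration.

```python
from functools import lru_cache


def min_mults(dims):
    """Minimum scalar multiplications for the chain A_1..A_n, dims = [p0, ..., pn]."""

    @lru_cache(maxsize=None)          # store each m(i, j) the first time it is solved
    def m(i, j):
        if i == j:                    # a single matrix needs no multiplication
            return 0
        # Try every split point k, i <= k < j, and keep the cheapest.
        return min(m(i, k) + m(k + 1, j) + dims[i - 1] * dims[k] * dims[j]
                   for k in range(i, j))

    return m(1, len(dims) - 1)


print(min_mults([5, 4, 6, 2]))        # 88
```

Without the cache the same recursion recomputes overlapping sub-chains many times; with it, each pair (i, j) is solved once.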
17
Computing m[i, j]
For a specific k,
$(A_i \dots A_k)(A_{k+1} \dots A_j)$
= $A_{i \dots k} (A_{k+1} \dots A_j)$    (m[i, k] mults)
= $A_{i \dots k}\, A_{k+1 \dots j}$    (m[k+1, j] mults)
= $A_{i \dots j}$    ($p_{i-1}\, p_k\, p_j$ mults)
For the solution, evaluate for all k and take the minimum:
$m[i, j] = \min_{i \le k < j} \left( m[i, k] + m[k+1, j] + p_{i-1}\, p_k\, p_j \right)$
18
Matrix-Chain-Order(p)
n ← length[p] - 1
for i ← 1 to n                          // initialization: O(n) time
    do m[i, i] ← 0
for L ← 2 to n                          // L = length of sub-chain
    do for i ← 1 to n - L + 1
        do j ← i + L - 1
           m[i, j] ← ∞
           for k ← i to j - 1
               do q ← m[i, k] + m[k+1, j] + p[i-1] · p[k] · p[j]
                  if q < m[i, j]
                      then m[i, j] ← q
                           s[i, j] ← k
return m and s
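A runnable Python rendering of Matrix-Chain-Order, offered as a sketch rather than a reference implementation. It keeps the 1-based indices of the pseudocode by leaving row and column 0 of each table unused; the variable `length` plays the role of L.

```python
import math


def matrix_chain_order(p):
    """Bottom-up DP over sub-chain lengths; p = [p0, ..., pn].

    Returns (m, s): m[i][j] is the minimum cost for A_i..A_j and
    s[i][j] is the split point k that achieves it (1-based indices).
    """
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]   # m[i][i] = 0 already
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):              # length of the sub-chain
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):               # candidate split points
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s


m, s = matrix_chain_order([5, 4, 6, 2, 7])
print(m[1][4], s[1][4])                         # 158 3
```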
19
Example: DP for CMM
The initial set of dimensions is <5, 4, 6, 2, 7>: we are multiplying $A_1$ ($5 \times 4$) times $A_2$ ($4 \times 6$) times $A_3$ ($6 \times 2$) times $A_4$ ($2 \times 7$).
The optimal sequence is $(A_1 (A_2 A_3)) A_4$.
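Working the recurrence $m[i, j] = \min_{i \le k < j} (m[i, k] + m[k+1, j] + p_{i-1} p_k p_j)$ by hand for these dimensions gives the table entries below. The arithmetic is worked here as an illustration and is not taken from the slides; the value of k noted on each line is the split that attains the minimum.

```latex
\begin{align*}
m[1,2] &= 5\cdot4\cdot6 = 120, \qquad m[2,3] = 4\cdot6\cdot2 = 48, \qquad m[3,4] = 6\cdot2\cdot7 = 84 \\
m[1,3] &= \min(0 + 48 + 5\cdot4\cdot2,\; 120 + 0 + 5\cdot6\cdot2) = \min(88,\, 180) = 88 && (k = 1) \\
m[2,4] &= \min(0 + 84 + 4\cdot6\cdot7,\; 48 + 0 + 4\cdot2\cdot7) = \min(252,\, 104) = 104 && (k = 3) \\
m[1,4] &= \min(0 + 104 + 5\cdot4\cdot7,\; 120 + 84 + 5\cdot6\cdot7,\; 88 + 0 + 5\cdot2\cdot7) \\
       &= \min(244,\, 414,\, 158) = 158 && (k = 3)
\end{align*}
```

The splits $s[1,4] = 3$ and $s[1,3] = 1$ correspond to the sequence $(A_1 (A_2 A_3)) A_4$, with a total of 158 scalar multiplications.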
20
Analysis
The array s[i, j] is used to extract the actual sequence (see example).
There are 3 nested loops and each can iterate at most n times, so the total running time is $\Theta(n^3)$.
21
Extracting Optimum Sequence
Leave a split marker indicating where the best split is (i.e., the value of k leading to the minimum value of m[i, j]). We maintain a parallel array s[i, j] in which we store the value of k providing the optimal split.
If s[i, j] = k, the best way to multiply the sub-chain $A_{i \dots j}$ is to first multiply the sub-chain $A_{i \dots k}$, then the sub-chain $A_{k+1 \dots j}$, and finally multiply the two results together. Intuitively, s[i, j] tells us which multiplication to perform last.
We only need to store s[i, j] when the sub-chain has at least 2 matrices, i.e., j > i.
22
Mult(A, i, j)
if j > i
    then k = s[i, j]
         X = Mult(A, i, k)        // X = A[i]...A[k]
         Y = Mult(A, k+1, j)      // Y = A[k+1]...A[j]
         return X*Y               // multiply X and Y
    else return A[i]              // return the ith matrix
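The same recursion over s can be used to write out the parenthesization instead of multiplying actual matrices. The sketch below makes that substitution deliberately; `split_table` is a hypothetical helper that recomputes s so the example is self-contained, and `parenthesize` mirrors the structure of Mult.

```python
import math


def split_table(p):
    """Recompute the split table s for dimensions p = [p0, ..., pn] (1-based)."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = math.inf
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k
    return s


def parenthesize(s, i, j):
    """Same shape as Mult(A, i, j), but builds a string instead of a product."""
    if j > i:
        k = s[i][j]                       # the multiplication performed last
        left = parenthesize(s, i, k)      # A_i..A_k
        right = parenthesize(s, k + 1, j) # A_{k+1}..A_j
        return f"({left} {right})"
    return f"A{i}"                        # a single matrix


s = split_table([5, 4, 6, 2, 7])
print(parenthesize(s, 1, 4))              # ((A1 (A2 A3)) A4)
```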
23
Finding a Recursive Solution
Figure out the “top-level” choice you have to make (e.g., where to split the list of matrices).
List the options for that decision.
Each option should require smaller sub-problems to be solved.
The recursive function is the minimum (or maximum) over all the options:
$m[i, j] = \min_{i \le k < j} \left( m[i, k] + m[k+1, j] + p_{i-1}\, p_k\, p_j \right)$
24
Longest Common Subsequence
Problem: Given 2 sequences, $X = \langle x_1, \dots, x_m \rangle$ and $Y = \langle y_1, \dots, y_n \rangle$, find a common subsequence whose length is maximum.
Example pairs:
springtime / printing
ncaa tournament / north carolina
basketball / Zhishan
A subsequence need not be consecutive, but must be in order.
25
Naïve Algorithm
For every subsequence of X, check whether it’s a subsequence of Y.
Time: $\Theta(n 2^m)$.
There are $2^m$ subsequences of X to check.
Each subsequence takes $\Theta(n)$ time to check: scan Y for the first letter, then for the second, and so on.
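A brute-force Python sketch of this idea, meant only for tiny inputs. It enumerates subsequences of X from longest to shortest so it can stop at the first hit; the function names are illustrative.

```python
from itertools import combinations


def is_subsequence(z, y):
    """Scan y once, looking for the letters of z in order: Theta(len(y)) time."""
    it = iter(y)
    return all(ch in it for ch in z)   # 'in' advances the iterator past each match


def lcs_brute_force(x, y):
    """Try subsequences of x, longest first; exponential in len(x)."""
    for length in range(len(x), -1, -1):
        for idx in combinations(range(len(x)), length):
            z = "".join(x[i] for i in idx)
            if is_subsequence(z, y):
                return z
    return ""


z = lcs_brute_force("springtime", "printing")
print(len(z), z)
```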
26
Optimal Substructure Theorem
Let $Z = \langle z_1, \dots, z_k \rangle$ be any LCS of X and Y.
1. If $x_m = y_n$, then $z_k = x_m = y_n$ and $Z_{k-1}$ is an LCS of $X_{m-1}$ and $Y_{n-1}$.
2. If $x_m \ne y_n$, then either $z_k \ne x_m$ and Z is an LCS of $X_{m-1}$ and Y, or $z_k \ne y_n$ and Z is an LCS of X and $Y_{n-1}$.
Notation: the prefix $X_i = \langle x_1, \dots, x_i \rangle$ is the first i letters of X.
This says what any longest common subsequence must look like; do you believe it?
27
Optimal Substructure Theorem
Proof (case 1: $x_m = y_n$): Any common subsequence Z’ that does not end in $x_m = y_n$ can be made longer by adding $x_m = y_n$ to the end. Therefore, the longest common subsequence (LCS) Z must end in $x_m = y_n$.
$Z_{k-1}$ is a common subsequence of $X_{m-1}$ and $Y_{n-1}$, and there is no longer CS of $X_{m-1}$ and $Y_{n-1}$, or Z would not be an LCS.
28
Optimal Substructure Theorem
Proof (case 2: $x_m \ne y_n$ and $z_k \ne x_m$): Since Z does not end in $x_m$, Z is a common subsequence of $X_{m-1}$ and Y, and there is no longer CS of $X_{m-1}$ and Y, or Z would not be an LCS.
29
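Recursive Solution
Define c[i, j] = length of an LCS of $X_i$ and $Y_j$. By the optimal substructure theorem,
$c[i, j] = 0$ if $i = 0$ or $j = 0$
$c[i, j] = c[i-1, j-1] + 1$ if $i, j > 0$ and $x_i = y_j$
$c[i, j] = \max(c[i-1, j],\, c[i, j-1])$ if $i, j > 0$ and $x_i \ne y_j$
We want c[m, n].
This gives a recursive algorithm and solves the problem. But does it solve it well?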
30
Recursive Solution
Recursion tree for c[springtime, printing], level by level:
c[springtime, printing]
c[springtim, printing]   c[springtime, printin]
c[springti, printing]   c[springtim, printin]   c[springtim, printin]   c[springtime, printi]
c[springt, printing]   c[springti, printin]   c[springtim, printi]   c[springtime, print]
31
Recursive Solution
Keep track of c[a, b] in a table of nm entries, filled in either top-down or bottom-up.
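A top-down sketch of the table idea in Python. It assumes the usual LCS recurrence: c[i, j] is 0 when either prefix is empty, c[i-1, j-1] + 1 when the last letters match, and otherwise the larger of c[i-1, j] and c[i, j-1]. Here `lru_cache` stands in for the table, and the function names are illustrative.

```python
from functools import lru_cache


def lcs_length_topdown(x, y):
    """Length of an LCS of x and y, computed top-down with caching."""

    @lru_cache(maxsize=None)           # the cache plays the role of the c table
    def c(i, j):
        if i == 0 or j == 0:           # an empty prefix has an empty LCS
            return 0
        if x[i - 1] == y[j - 1]:       # last letters match: extend by one
            return c(i - 1, j - 1) + 1
        return max(c(i - 1, j), c(i, j - 1))

    return c(len(x), len(y))


print(lcs_length_topdown("springtime", "printing"))   # 6
```

For long inputs the recursion depth can approach len(x) + len(y), so a bottom-up table may be preferable in practice.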
32
Computing the length of an LCS
LCS-LENGTH(X, Y)
m ← length[X]
n ← length[Y]
for i ← 1 to m
    do c[i, 0] ← 0
for j ← 0 to n
    do c[0, j] ← 0
for i ← 1 to m
    do for j ← 1 to n
        do if x_i = y_j
              then c[i, j] ← c[i-1, j-1] + 1
                   b[i, j] ← “↖”
              else if c[i-1, j] ≥ c[i, j-1]
                      then c[i, j] ← c[i-1, j]
                           b[i, j] ← “↑”
                      else c[i, j] ← c[i, j-1]
                           b[i, j] ← “←”
return c and b

b[i, j] points to the table entry whose subproblem we used in solving the LCS of $X_i$ and $Y_j$.
c[m, n] contains the length of an LCS of X and Y.
Time: O(mn)
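A bottom-up Python sketch of LCS-LENGTH. It assumes the inputs are strings and uses the labels "diag", "up" and "left" in place of the arrow symbols; these labels are a choice made here, not part of the slides.

```python
def lcs_length(x, y):
    """Fill the c and b tables row by row in O(mn) time.

    c[i][j] is the LCS length of x[:i] and y[:j]; b[i][j] records which
    neighbouring entry was used ("diag", "up" or "left").
    """
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]      # row 0 and column 0 stay 0
    b = [[None] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
                b[i][j] = "diag"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j] = c[i - 1][j]
                b[i][j] = "up"
            else:
                c[i][j] = c[i][j - 1]
                b[i][j] = "left"
    return c, b


c, b = lcs_length("springtime", "printing")
print(c[-1][-1])                                   # 6
```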
33
Constructing an LCS
PRINT-LCS(b, X, i, j)
if i = 0 or j = 0
    then return
if b[i, j] = “↖”
    then PRINT-LCS(b, X, i-1, j-1)
         print x_i
elseif b[i, j] = “↑”
    then PRINT-LCS(b, X, i-1, j)
else PRINT-LCS(b, X, i, j-1)

Initial call is PRINT-LCS(b, X, m, n).
When b[i, j] = “↖”, we have extended the LCS by one character. So the LCS consists of the entries with “↖” in them along the path followed.
Time: O(m+n)
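The traceback can be sketched in Python as a loop that follows the b table from (m, n) back toward row or column 0. This version is iterative rather than recursive and returns the string instead of printing it; both are choices made here for convenience. `lcs_tables` is a helper included only to make the example self-contained, using the labels "diag", "up" and "left" for the arrows.

```python
def lcs_tables(x, y):
    """Build the c (lengths) and b (directions) tables bottom-up."""
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    b = [[""] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j], b[i][j] = c[i - 1][j - 1] + 1, "diag"
            elif c[i - 1][j] >= c[i][j - 1]:
                c[i][j], b[i][j] = c[i - 1][j], "up"
            else:
                c[i][j], b[i][j] = c[i][j - 1], "left"
    return c, b


def read_lcs(b, x, i, j):
    """Follow the direction markers; each "diag" step contributes one letter."""
    out = []
    while i > 0 and j > 0:             # at most i + j steps in total
        if b[i][j] == "diag":
            out.append(x[i - 1])
            i, j = i - 1, j - 1
        elif b[i][j] == "up":
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))      # letters were collected back to front


x, y = "springtime", "printing"
c, b = lcs_tables(x, y)
z = read_lcs(b, x, len(x), len(y))
print(z, len(z))
```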
34
Elements of Dynamic Programming
Optimal substructure
Overlapping subproblems
35
Optimal Substructure
Show that a solution to a problem consists of making a choice, which leaves one or more subproblems to solve.
Suppose that you are given the last choice that leads to an optimal solution.
Given this choice, determine which subproblems arise and how to characterize the resulting space of subproblems.
Show that the solutions to the subproblems used within the optimal solution must themselves be optimal. Usually use cut-and-paste.
Need to ensure that a wide enough range of choices and subproblems is considered.
36
Optimal Substructure
Optimal substructure varies across problem domains:
1. How many subproblems are used in an optimal solution.
2. How many choices there are in determining which subproblem(s) to use.
Informally, running time depends on (# of subproblems overall) × (# of choices).
How many subproblems and choices do the examples considered contain?
Dynamic programming uses optimal substructure bottom-up: first find optimal solutions to subproblems, then choose which to use in an optimal solution to the problem.
Shortest Path vs. NP-hard problems
37
Overlapping Subproblems
The space of subproblems must be “small”: the total number of distinct subproblems is polynomial in the input size.
A naive recursive algorithm is exponential because it solves the same subproblems repeatedly.
If divide-and-conquer is applicable, then each problem solved will be brand new.
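The gap between the number of recursive calls and the number of distinct subproblems can be observed with a small Python sketch. It uses the LCS length recursion without any caching and counts calls; the function name and the test strings are illustrative assumptions.

```python
def naive_lcs_calls(x, y):
    """Run the uncached LCS recursion; return (lcs_length, number_of_calls)."""
    calls = 0

    def c(i, j):
        nonlocal calls
        calls += 1
        if i == 0 or j == 0:
            return 0
        if x[i - 1] == y[j - 1]:
            return c(i - 1, j - 1) + 1
        return max(c(i - 1, j), c(i, j - 1))   # both branches recurse

    return c(len(x), len(y)), calls


x, y = "abcdefgh", "ijklmnop"                  # no letters in common
length, calls = naive_lcs_calls(x, y)
distinct = (len(x) + 1) * (len(y) + 1)         # number of distinct (i, j) pairs
print(length, calls, distinct)                 # calls is far larger than distinct
```

With a table or cache, each of the `distinct` subproblems would be solved only once.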