Matrix Chain Multiplication

Presentation transcript:

Matrix Chain Multiplication

Matrix-chain Multiplication Suppose we have a sequence or chain A1, A2, …, An of n matrices to be multiplied That is, we want to compute the product A1A2…An There are many possible ways (parenthesizations) to compute the product.

Matrix-chain Multiplication To compute the number of scalar multiplications necessary, we must know: Algorithm to multiply two matrices Matrix dimensions Can you write the algorithm to multiply two matrices?

Algorithm to Multiply 2 Matrices
Input: Matrices Ap×q and Bq×r (with dimensions p×q and q×r)
Result: Matrix Cp×r resulting from the product A·B

MATRIX-MULTIPLY(Ap×q, Bq×r)
1. for i ← 1 to p
2.   for j ← 1 to r
3.     C[i, j] ← 0
4.     for k ← 1 to q
5.       C[i, j] ← C[i, j] + A[i, k] · B[k, j]
6. return C

The scalar multiplication in line 5 dominates the time to compute C.
Number of scalar multiplications = pqr
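The pseudocode above can be sketched in Python; the function name and the extra multiplication counter are illustrative additions, not part of the slides. The counter confirms that exactly p·q·r scalar multiplications are performed.

```python
# A sketch of MATRIX-MULTIPLY for list-of-lists matrices.
# Returns the product C = A * B together with the number of scalar
# multiplications performed, which is exactly p * q * r.

def matrix_multiply(A, B):
    p, q = len(A), len(A[0])
    q2, r = len(B), len(B[0])
    if q != q2:
        raise ValueError("incompatible dimensions")
    C = [[0] * r for _ in range(p)]
    scalar_mults = 0
    for i in range(p):
        for j in range(r):
            for k in range(q):
                C[i][j] += A[i][k] * B[k][j]
                scalar_mults += 1
    return C, scalar_mults

C, n = matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(C, n)  # [[19, 22], [43, 50]] 8
```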

Matrix Chain Multiplication A - km matrix B - mn matrix C = AB is defined as kn size matrix with elements 2 1 3 0 1 1 1 2 1 2 -1 0 9 1 2  =

Matrix Chain Multiplication A - km matrix B - mn matrix C = AB Each element of C can be computed in time (m) Matrix C can be computed in time (kmn) Matrix multiplication is associative, i.e. A(BC) = (AB)C

Matrix Chain Multiplication A - 210000 matrix B - 100005 matrix C - 550 matrix (AB)C (AB) - = 2*10000*5 =100000 scalar multiplications (AB)C - = 2*5*50 =500 scalar multiplications 100500 scalar multiplications A(BC) (BC) - = 10000*5*50 = 2500000 scalar multiplications A(BC) - = 2*10000*50 = 1000000 scalar multiplications 3500000 scalar multiplications How we parenthesize a chain of matrices can have a dramatic impact on the cost of evaluating the product.

Matrix Chain Multiplication - Problem For a given sequence of matrices find a parenthesization which gives the smallest number of multiplications that are necessary to compute the product of all matrices in the given sequence.

Counting Number of Parenthesizations

Catalan Numbers
Multiplying n Numbers. Objective: Find C(n), the number of ways to compute the product x1 · x2 · … · xn.

n   multiplication orders
2   (x1 · x2)
3   (x1 · (x2 · x3)), ((x1 · x2) · x3)
4   (x1 · (x2 · (x3 · x4))), (x1 · ((x2 · x3) · x4)), ((x1 · x2) · (x3 · x4)), ((x1 · (x2 · x3)) · x4), (((x1 · x2) · x3) · x4)

n      2   3   4   5    6    7
C(n)   1   2   5   14   42   132

Multiplying n Numbers - small n
Recursive equation (where is the last multiplication?):
C(1) = 1,  C(n) = Σ (k = 1 to n-1) C(k) · C(n-k)
Catalan numbers: C(n) = (1/n) · binom(2n-2, n-1)
Asymptotic value: C(n) = Ω(4^n / n^(3/2))
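The counting recurrence above can be sketched directly in Python (the function name is illustrative); it reproduces the small values of the Catalan table.

```python
# C(1) = 1; C(n) = sum over k of C(k) * C(n - k), where k is the
# position of the last multiplication (the outermost split).
from functools import lru_cache

@lru_cache(maxsize=None)
def num_parenthesizations(n):
    if n == 1:
        return 1
    return sum(num_parenthesizations(k) * num_parenthesizations(n - k)
               for k in range(1, n))

print([num_parenthesizations(n) for n in range(1, 8)])
# [1, 1, 2, 5, 14, 42, 132]
```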

Applying Dynamic Programming Characterize the structure of an optimal solution Recursively define the value of an optimal solution Compute the value of an optimal solution in a bottom-up fashion Construct an optimal solution from the information computed in Step 3

Matrix-Chain Multiplication
Given a chain of matrices A1, A2, …, An, where for i = 1, 2, …, n matrix Ai has dimensions pi-1 × pi, fully parenthesize the product A1 · A2 · … · An in a way that minimizes the number of scalar multiplications.

A1        A2        …  Ai         Ai+1       …  An
p0 × p1   p1 × p2      pi-1 × pi  pi × pi+1     pn-1 × pn

1. The Structure of an Optimal Parenthesization
Step 1: Characterize the structure of an optimal solution.
Ai..j : the matrix that results from evaluating the product Ai Ai+1 Ai+2 ... Aj, i ≤ j.
An optimal parenthesization of the product A1 A2 ... An splits the product between Ak and Ak+1, for some 1 ≤ k < n:
A1..n = (A1 A2 A3 ... Ak) · (Ak+1 Ak+2 ... An)
i.e., first compute A1..k and Ak+1..n and then multiply these two.
The cost of this optimal parenthesization = cost of computing A1..k + cost of computing Ak+1..n + cost of multiplying A1..k · Ak+1..n

1. The Structure of an Optimal Parenthesization…
Ai…j = Ai…k · Ak+1…j
Key observation: Given an optimal parenthesization (A1 A2 A3 ... Ak) · (Ak+1 Ak+2 ... An), the parenthesization of the sub-chain A1 A2 … Ak and the parenthesization of the sub-chain Ak+1 Ak+2 … An must each be optimal if the parenthesization of the chain A1 A2 … An is optimal (why? if a cheaper parenthesization of a sub-chain existed, we could substitute it and obtain a cheaper overall solution, a contradiction). That is, the optimal solution to the problem contains within it optimal solutions to sub-problems, i.e. optimal substructure exists within the optimal solution.

2. A Recursive Solution
Step 2: Define the value of an optimal solution recursively.
Subproblem: determine the minimum cost of parenthesizing Ai…j = Ai Ai+1 ⋯ Aj for 1 ≤ i ≤ j ≤ n.
Let m[i, j] = the minimum number of scalar multiplications needed to compute Ai…j.
Full problem (A1..n): m[1, n]

2. A Recursive Solution ….
Define m recursively, with Ai…j = Ai…k · Ak+1…j:
i = j: Ai…i = Ai, so m[i, i] = 0 for i = 1, 2, …, n, since the sub-chain contains just one matrix; no multiplication at all.
i < j: assume that the optimal parenthesization splits the product Ai Ai+1 ⋯ Aj between Ak and Ak+1, i ≤ k < j. Then
m[i, j] = m[i, k] + m[k+1, j] + pi-1 pk pj
(cost of Ai…k, plus cost of Ak+1…j, plus cost of multiplying Ai…k · Ak+1…j)

2. A Recursive Solution …..
Consider the subproblem of parenthesizing Ai…j = Ai Ai+1 ⋯ Aj for 1 ≤ i ≤ j ≤ n, split as Ai…k · Ak+1…j for i ≤ k < j.
m[i, j] = the minimum number of multiplications needed to compute the product Ai…j:
m[i, j] = m[i, k] + m[k+1, j] + pi-1 pk pj
where m[i, k] is the min # of multiplications to compute Ai…k, m[k+1, j] is the min # of multiplications to compute Ak+1…j, and pi-1 pk pj is the # of multiplications to compute Ai…k · Ak+1…j.

2. A Recursive Solution ….
We do not know the value of k. There are j – i possible values for k: k = i, i+1, …, j-1. Minimizing the cost of parenthesizing the product Ai Ai+1 ⋯ Aj becomes:

m[i, j] = 0                                                      if i = j
m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + pi-1 pk pj }   if i < j
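The recurrence above translates directly into a top-down, memoized sketch in Python (names are illustrative; p is the dimension list, with matrix Ai of dimensions p[i-1] × p[i]):

```python
# Top-down evaluation of the matrix-chain recurrence with memoization.
from functools import lru_cache

def min_mults(p):
    n = len(p) - 1  # number of matrices in the chain

    @lru_cache(maxsize=None)
    def m(i, j):
        if i == j:          # single matrix: no multiplications
            return 0
        # try every split point k between A_i..A_k and A_{k+1}..A_j
        return min(m(i, k) + m(k + 1, j) + p[i - 1] * p[k] * p[j]
                   for k in range(i, j))

    return m(1, n)

print(min_mults([10, 100, 5, 50]))       # 7500
print(min_mults([2, 10000, 5, 50]))      # 100500
```

The second call reproduces the (AB)C cost from the earlier 2×10000, 10000×5, 5×50 example.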

Reconstructing the Optimal Solution Additional information to maintain: s[i, j] = a value of k at which we can split the product Ai Ai+1  Aj in order to obtain an optimal parenthesization

3. Computing the Optimal Costs

m[i, j] = 0                                                      if i = j
m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + pi-1 pk pj }   if i < j

A recursive algorithm may encounter each subproblem many times in different branches of the recursion (overlapping subproblems). Instead, compute a solution using a tabular bottom-up approach.

3. Computing the Optimal Costs …
Fill the table m by increasing chain length:
Length = 0: i = j, i = 1, 2, …, n (entries m[i, i] = 0 on the main diagonal)
Length = 1: j = i + 1, i = 1, 2, …, n-1
and so on, computing each diagonal of the table from previously filled diagonals. m[1, n] gives the optimal cost for the whole problem. In a similar matrix s we keep the optimal values of k.
[A diagram of the n × n table, filled diagonal by diagonal from bottom to top and left to right, appeared here.]

Example: m[i, j] = min over k of { m[i, k] + m[k+1, j] + pi-1 pk pj }

m[2, 5] = min of:
k = 2: m[2, 2] + m[3, 5] + p1 p2 p5
k = 3: m[2, 3] + m[4, 5] + p1 p3 p5
k = 4: m[2, 4] + m[5, 5] + p1 p4 p5

Values m[i, j] depend only on values that have been previously computed.

Example: Compute A1 · A2 · A3
A1: 10 × 100 (p0 × p1), A2: 100 × 5 (p1 × p2), A3: 5 × 50 (p2 × p3)
m[i, i] = 0 for i = 1, 2, 3
m[1, 2] = m[1, 1] + m[2, 2] + p0 p1 p2 = 0 + 0 + 10 · 100 · 5 = 5,000   (A1 A2)
m[2, 3] = m[2, 2] + m[3, 3] + p1 p2 p3 = 0 + 0 + 100 · 5 · 50 = 25,000   (A2 A3)
m[1, 3] = min of:
  m[1, 1] + m[2, 3] + p0 p1 p3 = 0 + 25,000 + 10 · 100 · 50 = 75,000   (A1 (A2 A3))
  m[1, 2] + m[3, 3] + p0 p2 p3 = 5,000 + 0 + 10 · 5 · 50 = 7,500   ((A1 A2) A3)
 = 7,500

Algorithm

Chain-Matrix-Order(p)          // l is the chain length
  n ← length[p] – 1
  for i ← 1 to n
    do m[i, i] ← 0
  for l ← 2 to n
    do for i ← 1 to n – l + 1
      do j ← i + l – 1
        m[i, j] ← ∞
        for k ← i to j – 1
          do q ← m[i, k] + m[k+1, j] + pi-1 · pk · pj
            if q < m[i, j]
              then m[i, j] ← q
                   s[i, j] ← k
  return m and s
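A bottom-up Python sketch of Chain-Matrix-Order (function and table names are illustrative; tables are kept 1-indexed to match the pseudocode, so index 0 is unused):

```python
# Bottom-up matrix-chain order: fills m (minimum costs) and s (split points)
# diagonal by diagonal, by increasing chain length l.
import math

def matrix_chain_order(p):
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][i] = 0
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for l in range(2, n + 1):                  # l is the chain length
        for i in range(1, n - l + 2):
            j = i + l - 1
            m[i][j] = math.inf
            for k in range(i, j):              # try each split point
                q = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if q < m[i][j]:
                    m[i][j] = q
                    s[i][j] = k
    return m, s

# The four-matrix example used below: p0..p4 = 10, 100, 5, 50, 20
m, s = matrix_chain_order([10, 100, 5, 50, 20])
print(m[1][4], s[1][4])  # 11000 2
```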

Example: Dynamic Programming
Problem: Compute the optimal multiplication order for the chain of matrices with dimensions given by
p0 = 10, p1 = 100, p2 = 5, p3 = 50, p4 = 20
(A1: 10 × 100, A2: 100 × 5, A3: 5 × 50, A4: 50 × 20)
The entries m[1,1], m[2,2], m[3,3], m[4,4], m[1,2], m[2,3], m[3,4], m[1,3], m[2,4], m[1,4] are filled diagonal by diagonal.

Main Diagonal
m[i, i] = 0 for i = 1, 2, 3, 4. The entries above the main diagonal (m[1,2], m[2,3], m[3,4], m[1,3], m[2,4], m[1,4]) remain to be filled.

Computing m[3, 4]
m[3, 4] = m[3, 3] + m[4, 4] + p2 · p3 · p4 = 0 + 0 + 5 · 50 · 20 = 5000
s[3, 4] = k = 3

Computing m[2, 3]
m[2, 3] = m[2, 2] + m[3, 3] + p1 · p2 · p3 = 0 + 0 + 100 · 5 · 50 = 25000
s[2, 3] = k = 2

Computing m[1, 2]
m[1, 2] = m[1, 1] + m[2, 2] + p0 · p1 · p2 = 0 + 0 + 10 · 100 · 5 = 5000
s[1, 2] = k = 1

Computing m[2, 4]
m[2, 4] = min(m[2, 2] + m[3, 4] + p1·p2·p4, m[2, 3] + m[4, 4] + p1·p3·p4)
        = min(0 + 5000 + 100·5·20, 25000 + 0 + 100·50·20)
        = min(15000, 125000) = 15000
s[2, 4] = k = 2

Computing m[1, 3]
m[1, 3] = min(m[1, 1] + m[2, 3] + p0·p1·p3, m[1, 2] + m[3, 3] + p0·p2·p3)
        = min(0 + 25000 + 10·100·50, 5000 + 0 + 10·5·50)
        = min(75000, 7500) = 7500
s[1, 3] = k = 2

Computing m[1, 4]
m[1, 4] = min(m[1, 1] + m[2, 4] + p0·p1·p4, m[1, 2] + m[3, 4] + p0·p2·p4, m[1, 3] + m[4, 4] + p0·p3·p4)
        = min(0 + 15000 + 10·100·20, 5000 + 5000 + 10·5·20, 7500 + 0 + 10·50·20)
        = min(35000, 11000, 17500) = 11000
s[1, 4] = k = 2

Final Cost Matrix and Its Order of Computation

Final cost matrix m (split value s shown before each off-diagonal entry):
      j=1   j=2       j=3        j=4
i=1   0     1 | 5000  2 | 7500   2 | 11000
i=2         0         2 | 25000  2 | 15000
i=3                   0          3 | 5000
i=4                              0

Order of computation: diagonal by diagonal, i.e. m[1,1], m[2,2], m[3,3], m[4,4], then m[1,2], m[2,3], m[3,4], then m[1,3], m[2,4], then m[1,4].

k (s) Values Leading to the Minimum m[i, j]
      j=2   j=3   j=4
i=1   1     2     2
i=2         2     2
i=3               3

A second example (the figures match the textbook dimension sequence p = ⟨30, 35, 15, 5, 10, 20, 25⟩):
l = 2: m[2, 3] = 35 · 15 · 5 = 2625, m[5, 6] = 10 · 20 · 25 = 5000, …
l = 3: m[3, 5] = min of
  m[3, 4] + m[5, 5] + 15 · 10 · 20 = 750 + 0 + 3000 = 3750
  m[3, 3] + m[4, 5] + 15 · 5 · 20 = 0 + 1000 + 1500 = 2500
 = 2500

Analysis

Chain-Matrix-Order(p)          // l is the chain length
  n ← length[p] – 1
  for i ← 1 to n
    do m[i, i] ← 0
  for l ← 2 to n
    do for i ← 1 to n – l + 1
      do j ← i + l – 1
        m[i, j] ← ∞
        for k ← i to j – 1
          do q ← m[i, k] + m[k+1, j] + pi-1 · pk · pj
            if q < m[i, j]
              then m[i, j] ← q
                   s[i, j] ← k
  return m and s

The three nested loops give O(n^3) time; the tables m and s require O(n^2) space.

4. Construct the Optimal Solution
Our algorithm computes the minimum-cost table m and the split table s. The optimal solution can be constructed from the split table s: each entry s[i, j] = k shows where to split the product Ai Ai+1 … Aj for the minimum cost.

4. Construct the Optimal Solution …
s[i, j] = value of k such that the optimal parenthesization of Ai Ai+1 ⋯ Aj splits the product between Ak and Ak+1.
A1..n = A1..s[1, n] · As[1, n]+1..n
For the 6-matrix example:
s[1, 6] = 3 → A1..6 = A1..3 · A4..6
s[1, 3] = 1 → A1..3 = A1..1 · A2..3
s[4, 6] = 5 → A4..6 = A4..5 · A6..6
[The full 6 × 6 s table appeared here as a figure.]

4. Construct the Optimal Solution …

PRINT-OPT-PARENS(s, i, j)
  if i = j
    then print “A”i
    else print “(”
         PRINT-OPT-PARENS(s, i, s[i, j])
         PRINT-OPT-PARENS(s, s[i, j] + 1, j)
         print “)”

Example: A1 ⋯ A6
Result: ( ( A1 ( A2 A3 ) ) ( ( A4 A5 ) A6 ) )

Trace:
PRINT-OPT-PARENS(s, 1, 6): s[1, 6] = 3, i = 1, j = 6 → print “(”
  PRINT-OPT-PARENS(s, 1, 3): s[1, 3] = 1, i = 1, j = 3 → print “(”
    PRINT-OPT-PARENS(s, 1, 1) → print “A1”
    PRINT-OPT-PARENS(s, 2, 3): s[2, 3] = 2, i = 2, j = 3 → print “(”
      PRINT-OPT-PARENS(s, 2, 2) → print “A2”
      PRINT-OPT-PARENS(s, 3, 3) → print “A3”
      print “)”
    print “)”
  …
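The printing procedure can be sketched in Python as a function that returns the parenthesized string instead of printing piecewise. The split table below uses the values from the 6-matrix example; s[4][5] = 4 is forced for a two-matrix chain but was not shown explicitly on the slides.

```python
# Reconstruct the optimal parenthesization from the split table s,
# where s[i][j] = k means A_i..A_j splits between A_k and A_{k+1}.

def print_opt_parens(s, i, j):
    if i == j:
        return f"A{i}"          # single matrix: print its name
    k = s[i][j]
    return ("(" + print_opt_parens(s, i, k)
                + print_opt_parens(s, k + 1, j) + ")")

# Split values for the 6-matrix example: s[1][6]=3, s[1][3]=1,
# s[2][3]=2, s[4][6]=5, s[4][5]=4.
s = {1: {3: 1, 6: 3}, 2: {3: 2}, 4: {5: 4, 6: 5}}
print(print_opt_parens(s, 1, 6))  # ((A1(A2A3))((A4A5)A6))
```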

Elements of Dynamic Programming When should we look for a DP solution to an optimization problem? Two key ingredients for the problem Optimal substructure Overlapping subproblems

Optimal Substructure

Overlapping Subproblems