Towards Communication Avoiding Fast Algorithm for Sparse Matrix Multiplication Part I: Minimizing arithmetic operations Oded Schwartz CS294, Lecture #21.

Towards Communication Avoiding Fast Algorithm for Sparse Matrix Multiplication
Part I: Minimizing arithmetic operations
Oded Schwartz, CS294 (Communication-Avoiding Algorithms), Lecture #21, Fall 2011
Based on:
R. Yuster and U. Zwick, Fast Sparse Matrix Multiplication
G. Ballard, J. Demmel, O. Holtz, and O. Schwartz, Communication Avoiding Fast Sparse Matrix Multiplication
Many slides from:

2 How to multiply sparse matrices faster? Outline:
- Strassen-like algorithms
- The naïve algorithm
- The Yuster-Zwick algorithm
Again, minding communication costs:
- Fast rectangular matrix multiplication
- The naïve algorithm
- Retuning the Yuster-Zwick algorithm

3 Matrix Multiplication
C(i,j) = Σ_k A(i,k)·B(k,j), where A, B, and C are n × n matrices.
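The definition translates directly into the classical triple loop. A minimal dense sketch (the function name is illustrative):

```python
def matmul_naive(A, B):
    """Classical O(n^3) product: C(i,j) = sum_k A(i,k) * B(k,j)."""
    n = len(A)
    C = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                C[i][j] += A[i][k] * B[k][j]
    return C
```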

4 Strassen-like algorithms
Compute an n0 × n0 matrix multiplication using only n0^ω0 multiplications (instead of n0^3), and apply recursively (block-wise):
T(n) = n0^ω0 · T(n/n0) + O(n^2), so T(n) = Θ(n^ω0).
ω0 ≤ 2.81 [Strassen 69] (works fast in practice)
2.79 [Pan 78]
2.78 [Bini 79]
2.55 [Schönhage 81]
2.50 [Pan, Romani; Coppersmith, Winograd 84]
2.48 [Strassen 87]
2.38 [Coppersmith, Winograd 90]
2.38 [Cohn, Kleinberg, Szegedy, Umans 05] (group-theoretic approach)
Are these faster on sparse matrices?
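With n0 = 2 and ω0 = log2 7 ≈ 2.81 the recursion above is Strassen's algorithm. A minimal sketch, assuming n is a power of two (a practical version would switch to the classical product below a cutoff):

```python
def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def strassen(A, B):
    # n must be a power of two; recursion bottoms out at 1x1.
    n = len(A)
    if n == 1:
        return [[A[0][0] * B[0][0]]]
    h = n // 2
    # Split both matrices into quadrants.
    A11 = [r[:h] for r in A[:h]]; A12 = [r[h:] for r in A[:h]]
    A21 = [r[:h] for r in A[h:]]; A22 = [r[h:] for r in A[h:]]
    B11 = [r[:h] for r in B[:h]]; B12 = [r[h:] for r in B[:h]]
    B21 = [r[:h] for r in B[h:]]; B22 = [r[h:] for r in B[h:]]
    # Seven recursive products instead of eight: T(n) = 7 T(n/2) + O(n^2).
    M1 = strassen(add(A11, A22), add(B11, B22))
    M2 = strassen(add(A21, A22), B11)
    M3 = strassen(A11, sub(B12, B22))
    M4 = strassen(A22, sub(B21, B11))
    M5 = strassen(add(A11, A12), B22)
    M6 = strassen(sub(A21, A11), add(B11, B12))
    M7 = strassen(sub(A12, A22), add(B21, B22))
    C11 = add(sub(add(M1, M4), M5), M7)
    C12 = add(M3, M5)
    C21 = add(M2, M4)
    C22 = add(sub(add(M1, M3), M2), M6)
    # Reassemble the quadrants into the result.
    return [c1 + c2 for c1, c2 in zip(C11, C12)] + \
           [c1 + c2 for c1, c2 in zip(C21, C22)]
```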

5 Sparse Matrix Multiplication (C = A·B)
n = number of rows and columns; m = number of non-zero elements.
The distribution of the non-zero elements in the matrices is arbitrary!

6 Naïve Sparse Matrix Multiplication
Each non-zero element B(k,j) of B is multiplied by at most n elements from column k of A.
Worst-case complexity: O(mn).
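A minimal sketch of the naïve sparse algorithm, assuming both matrices are given as (row, column, value) triples (a hypothetical coordinate format; the slides do not fix a representation). Each non-zero A(i,k) is combined with every non-zero in row k of B:

```python
from collections import defaultdict

def sparse_matmul(A_entries, B_entries):
    """Naive sparse product; inputs are lists of (row, col, value) triples."""
    B_rows = defaultdict(list)          # row k of B -> [(j, value), ...]
    for k, j, v in B_entries:
        B_rows[k].append((j, v))
    C = defaultdict(int)
    for i, k, a in A_entries:           # A(i,k) meets every non-zero of row k of B
        for j, b in B_rows[k]:
            C[(i, j)] += a * b
    return dict(C)
```

The inner loop runs once per pair (non-zero of column k of A, non-zero of row k of B), which is exactly the operation count analyzed on the next slides.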

7 Matrix multiplication
Authors                       | Complexity
Naïve sparse                  | mn
Coppersmith, Winograd (1990)  | n^2.38
Yuster, Zwick (2005)          | m^0.7 n^1.2 + n^(2+o(1))
Is this as good as it gets?

8 Comparison
[Plot: the complexity exponent γ (Complexity = n^γ) as a function of r, where m = n^r, comparing n^2.38, mn, and m^0.7 n^1.2 + n^2.]

9 A closer look at the naïve algorithm
[Figure: in C = A·B, column k of A interacts only with row k of B.]

10 Complexity of the naïve algorithm
Complexity = Σ_i a_i·b_i, where a_i (and b_i) are the number of non-zeros in the i-th column of A (i-th row of B).
Can it really be that bad?
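The Σ_i a_i·b_i count can be computed directly from the coordinate lists. A small sketch, assuming the same hypothetical (row, column, value) triple format as above:

```python
from collections import Counter

def naive_cost(A_entries, B_entries):
    """Number of scalar multiplications the naive sparse algorithm performs:
    sum over k of (non-zeros in column k of A) * (non-zeros in row k of B)."""
    a = Counter(k for _, k, _ in A_entries)   # a_k: column counts of A
    b = Counter(k for k, _, _ in B_entries)   # b_k: row counts of B
    return sum(a[k] * b[k] for k in a)
```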

11 Regular case: best case for the naïve algorithm
If the non-zeros are spread evenly (a_i = b_i = m/n for every i), the cost is Σ_i a_i·b_i = n·(m/n)^2 = m^2/n.

12 Worst case for the naïve algorithm
Packing the non-zeros into m/n full columns of A (and the matching full rows of B) gives Σ_i a_i·b_i = (m/n)·n^2 = mn, vs. m^2/n in the best case.
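The gap between the two cases can be checked numerically. A sketch using the hypothetical triple format, with illustrative sizes n = 100 and m = 1000:

```python
from collections import Counter

def cost(A_entries, B_entries):
    # Sum over k of (non-zeros in column k of A) * (non-zeros in row k of B).
    a = Counter(k for _, k, _ in A_entries)
    b = Counter(k for k, _, _ in B_entries)
    return sum(a[k] * b[k] for k in a)

n, m = 100, 1000
# Best case: m non-zeros spread evenly, m/n per column of A and per row of B.
even_A = [(i, k, 1) for k in range(n) for i in range(m // n)]
even_B = [(k, j, 1) for k in range(n) for j in range(m // n)]
# Worst case: the same m non-zeros packed into m/n full columns/rows.
packed_A = [(i, k, 1) for k in range(m // n) for i in range(n)]
packed_B = [(k, j, 1) for k in range(m // n) for j in range(n)]

assert cost(even_A, even_B) == m * m // n     # m^2/n multiplications
assert cost(packed_A, packed_B) == m * n      # mn multiplications
```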

13 Rectangular Matrix multiplication
How fast can we multiply (dense) rectangular matrices, e.g., an n × p matrix by a p × n matrix?

14 Fast Rectangular matrix multiplication
Compute matrix multiplication on matrices of dimensions a × b and b × c using only μ = μ(a, b, c) < abc multiplications, and apply recursively (block-wise).
Example: a base case with a = 3, b = 2, c = 4 recurses to a^2 = 9, b^2 = 4, c^2 = 16.
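The recursive blow-up of the slide's example can be sketched as follows; the value μ = 20 is a made-up placeholder (any μ < abc = 24 would illustrate the point):

```python
def dims_and_mults(a, b, c, mu, t):
    """After t levels of block-wise recursion on an a x b by b x c base case
    that uses mu scalar multiplications, the dimensions are (a^t, b^t, c^t)
    and the multiplication count is mu^t (versus (a*b*c)^t for the classical
    algorithm)."""
    return (a ** t, b ** t, c ** t), mu ** t

# One recursion level on the 3x2 by 2x4 base case gives a 9x4 by 4x16 product.
dims, mults = dims_and_mults(3, 2, 4, 20, 2)
assert dims == (9, 4, 16)
assert mults == 400          # versus 24^2 = 576 classically
```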

15 Fast Rectangular matrix multiplication
[Brockett, Dobkin 76]       T(n, n, log n) = n^2 + o(n^2)
[Coppersmith 82]            T(n, n, n^α) = O(n^(2+ε)) for α = 0.172
[Coppersmith 97]            T(n, n, n^α) = O(n^(2+ε)) for α = 0.294
[Coppersmith, Winograd 90]  T(n, n, n^2) = O(n^3.376)
[Huang, Pan 97]             T(n, n, n^2) = O(n^3.334)
[Coppersmith 97]            T(n, n, p) = O(n^(2+ε) + n^1.85 p^0.54)
Is it better than black-box use of fast square matrix multiplication?
[Coppersmith, Winograd 90]: dense square matrix multiplication in Θ(n^2.38), so splitting an n × n by n × n^2 product into n square products costs n·n^2.38.

16 Fast Rectangular matrix multiplication
[Brockett, Dobkin 76]       T(n, n, log n) = n^2 + o(n^2)
[Coppersmith 82]            T(n, n, n^α) = O(n^(2+ε)) for α = 0.172
[Coppersmith 97]            T(n, n, n^α) = O(n^(2+ε)) for α = 0.294
[Coppersmith, Winograd 90]  T(n, n, n^2) = O(n^3.376)
[Huang, Pan 97]             T(n, n, n^2) = O(n^3.334)
[Coppersmith 97]            T(n, n, p) = O(n^(2+ε) + n^1.85 p^0.54)
But Yuster-Zwick need T(n, p, n)…
[Pan 72] T(a, b, c) = T(a, c, b) = T(b, a, c) = T(b, c, a) = T(c, a, b) = T(c, b, a)
Which of these algorithms can we implement?

17 The combined algorithm
Assume: a_1·b_1 ≥ a_2·b_2 ≥ … ≥ a_n·b_n. Choose: 0 ≤ p ≤ n.
Split A = [A_1 | A_2] into its p heaviest columns and the rest, and B accordingly into the matching rows.
Compute: AB = A_1 B_1 + A_2 B_2, where A_1 B_1 uses fast rectangular matrix multiplication and A_2 B_2 uses naïve sparse matrix multiplication.
Complexity: T(n, p, n) + Σ_{i>p} a_i·b_i.
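A sketch of the combined algorithm in the hypothetical triple format, with a plain dense loop standing in for the fast rectangular multiplication of the heavy part (the asymptotic gain comes precisely from replacing that loop with a fast rectangular algorithm):

```python
from collections import defaultdict

def combined_matmul(A_entries, B_entries, n, p):
    """Split the inner dimension by weight a_k*b_k; multiply the p heaviest
    indices densely (stand-in for fast rectangular MM) and the rest naively."""
    a = defaultdict(int); b = defaultdict(int)
    for _, k, _ in A_entries: a[k] += 1
    for k, _, _ in B_entries: b[k] += 1
    heavy = set(sorted(range(n), key=lambda k: -(a[k] * b[k]))[:p])

    C = [[0] * n for _ in range(n)]

    # Part 1: A1*B1 over the p heavy indices, done densely here.
    A1 = [[0] * n for _ in range(n)]
    B1 = [[0] * n for _ in range(n)]
    for i, k, v in A_entries:
        if k in heavy: A1[i][k] = v
    for k, j, v in B_entries:
        if k in heavy: B1[k][j] = v
    for i in range(n):
        for k in heavy:
            if A1[i][k]:
                for j in range(n):
                    C[i][j] += A1[i][k] * B1[k][j]

    # Part 2: A2*B2 over the remaining light indices, done naively.
    B_rows = defaultdict(list)
    for k, j, v in B_entries:
        if k not in heavy: B_rows[k].append((j, v))
    for i, k, va in A_entries:
        if k not in heavy:
            for j, vb in B_rows[k]:
                C[i][j] += va * vb
    return C
```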

18 Analysis of combined algorithm
Theorem: There exists a 1 ≤ p ≤ n for which the total cost T(n, p, n) + Σ_{i>p} a_i·b_i is O(m^0.7 n^1.2 + n^(2+o(1))).
Lemma: Σ_{i>p} a_i·b_i ≤ m^2/p.
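As a sanity check of the tail bound in the Yuster-Zwick analysis (once the products a_i·b_i are sorted in decreasing order and each matrix has at most m non-zeros, the tail Σ_{i>p} a_i·b_i is at most m^2/p), a small randomized experiment:

```python
import random

random.seed(0)
n = 200
a = [random.randint(0, 50) for _ in range(n)]   # a_i: column counts of A
b = [random.randint(0, 50) for _ in range(n)]   # b_i: row counts of B
m = max(sum(a), sum(b))                          # both sums are <= m

# Sort so that a_1*b_1 >= a_2*b_2 >= ... and check the tail bound for every p.
pairs = sorted(zip(a, b), key=lambda ab: -(ab[0] * ab[1]))
for p in range(1, n + 1):
    tail = sum(x * y for x, y in pairs[p:])
    assert tail <= m * m / p
```

The bound follows from Cauchy-Schwarz: the p-th largest product satisfies sqrt(a_p·b_p) ≤ m/p, and the tail sum of sqrt(a_i·b_i) is at most m.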

19 How to multiply sparse matrices faster? Outline:
- Strassen-like algorithms
- The naïve algorithm
- The Yuster-Zwick algorithm
Again, minding communication costs:
- The naïve algorithm
- Fast rectangular matrix multiplication
- Retuning the Yuster-Zwick algorithm

Towards Communication Avoiding Fast Algorithm for Sparse Matrix Multiplication
Oded Schwartz, CS294 (Communication-Avoiding Algorithms), Lecture #21, Fall 2011
Based on:
R. Yuster and U. Zwick, Fast Sparse Matrix Multiplication
G. Ballard, J. Demmel, O. Holtz, and O. Schwartz, Communication Avoiding Fast Sparse Matrix Multiplication
Many slides from:
Thank you!