Published by Madeline Cherry. Modified over 2 years ago.

1
Course Project On: Strassen's Matrix Multiplication
Presented By: Gaurav Jain Lalchand
Under The Guidance Of: Prof. Subodh Kumar

2
Basic Matrix Multiplication
Suppose we want to multiply two matrices of size N x N, for example A x B = C. For N = 2:
$C_{11} = a_{11} b_{11} + a_{12} b_{21}$
$C_{12} = a_{11} b_{12} + a_{12} b_{22}$
$C_{21} = a_{21} b_{11} + a_{22} b_{21}$
$C_{22} = a_{21} b_{12} + a_{22} b_{22}$
A 2x2 matrix multiplication can therefore be accomplished in 8 multiplications ($2^{\log_2 8} = 2^3$).
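The count of 8 multiplications can be checked with a short sketch. This is a minimal illustration in plain Python, not code from the project; the function name `matmul_naive` and the sample matrices are assumptions made for the example.

```python
def matmul_naive(A, B):
    """Multiply two N x N matrices given as lists of lists.

    Returns the product and the number of scalar multiplications used.
    """
    n = len(A)
    C = [[0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                # One scalar multiplication per (i, j, k) triple: N^3 in total.
                C[i][j] += A[i][k] * B[k][j]
                mults += 1
    return C, mults


# Hypothetical sample input for the 2x2 case.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C, mults = matmul_naive(A, B)
print(C)      # [[19, 22], [43, 50]]
print(mults)  # 8
```

For general N the same triple loop performs N^3 scalar multiplications, which is the baseline Strassen's method improves on.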

3
Strassen's Matrix Multiplication

4
$P_1 = (A_{11} + A_{22})(B_{11} + B_{22})$
$P_2 = (A_{21} + A_{22}) B_{11}$
$P_3 = A_{11} (B_{12} - B_{22})$
$P_4 = A_{22} (B_{21} - B_{11})$
$P_5 = (A_{11} + A_{12}) B_{22}$
$P_6 = (A_{21} - A_{11})(B_{11} + B_{12})$
$P_7 = (A_{12} - A_{22})(B_{21} + B_{22})$

5
Strassen's Matrix Multiplication
$P_1 = (A_{11} + A_{22})(B_{11} + B_{22})$
$P_2 = (A_{21} + A_{22}) B_{11}$
$P_3 = A_{11} (B_{12} - B_{22})$
$P_4 = A_{22} (B_{21} - B_{11})$
$P_5 = (A_{11} + A_{12}) B_{22}$
$P_6 = (A_{21} - A_{11})(B_{11} + B_{12})$
$P_7 = (A_{12} - A_{22})(B_{21} + B_{22})$
$C_{11} = P_1 + P_4 - P_5 + P_7$
$C_{12} = P_3 + P_5$
$C_{21} = P_2 + P_4$
$C_{22} = P_1 + P_3 - P_2 + P_6$
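The seven products and four combinations can be applied recursively to the quadrants of larger matrices. The following is a minimal sequential sketch in plain Python, assuming N is a power of two; the helper names `add`, `sub`, `split` and `strassen` are illustrative and do not come from the slides, and no CUDA or MPI is involved.

```python
def add(X, Y):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def sub(X, Y):
    """Element-wise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def split(M):
    """Return the quadrants M11, M12, M21, M22 of a square matrix."""
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])


def strassen(A, B):
    """Return (A x B, number of scalar multiplications).

    Assumption: A and B are N x N with N a power of two.
    """
    if len(A) == 1:
        return [[A[0][0] * B[0][0]]], 1
    A11, A12, A21, A22 = split(A)
    B11, B12, B21, B22 = split(B)
    results = [
        strassen(add(A11, A22), add(B11, B22)),  # P1
        strassen(add(A21, A22), B11),            # P2
        strassen(A11, sub(B12, B22)),            # P3
        strassen(A22, sub(B21, B11)),            # P4
        strassen(add(A11, A12), B22),            # P5
        strassen(sub(A21, A11), add(B11, B12)),  # P6
        strassen(sub(A12, A22), add(B21, B22)),  # P7
    ]
    P1, P2, P3, P4, P5, P6, P7 = [p for p, _ in results]
    mults = sum(m for _, m in results)
    C11 = add(sub(add(P1, P4), P5), P7)
    C12 = add(P3, P5)
    C21 = add(P2, P4)
    C22 = add(add(sub(P1, P2), P3), P6)
    # Stitch the four quadrants back into one matrix.
    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom, mults


C, mults = strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(C)      # [[19, 22], [43, 50]]
print(mults)  # 7
```

Each level of recursion makes 7 recursive calls instead of 8, so a matrix of size N = 2^k needs 7^k scalar multiplications under this scheme, at the cost of extra additions and subtractions.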

6
Strassen's Matrix Multiplication
Ref: Accelerating High Performance Applications with CUDA and MPI

7
Why MPI + CUDA?
➢ The equations are naturally suited to the CUDA environment
➢ Limitation of CUDA: no communication between GPUs on different nodes
➢ MPI: data distribution mechanism
➢ CUDA: main execution engine

8
MPI + CUDA

9
Steps Performed
➢ Divide the input matrix into four equal parts
➢ Send the appropriate part to the corresponding process
➢ Each process computes its corresponding equation
Each node contains a GPU and uses kernels on its own GPU to compute the result.

10
Steps Performed
➢ Divide the input matrix into four equal parts
➢ Send the appropriate part to the corresponding process
➢ Each process computes its corresponding equation
➢ Each process sends its result to the head process of the equation
➢ All heads collect the data
➢ Each head computes its C equation
➢ All heads send their partial results to the master node
➢ The master combines and displays the result
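The first and last steps (dividing a matrix into four equal parts, and combining four parts into the result) can be sketched as follows. This is a plain Python illustration, assuming an even matrix size; the names `split_quadrants` and `join_quadrants` and the dictionary keys are hypothetical, and the actual data transfer between processes is not shown.

```python
def split_quadrants(M):
    """Divide a square matrix of even size into four equal parts."""
    h = len(M) // 2
    return {
        "11": [row[:h] for row in M[:h]],  # top-left
        "12": [row[h:] for row in M[:h]],  # top-right
        "21": [row[:h] for row in M[h:]],  # bottom-left
        "22": [row[h:] for row in M[h:]],  # bottom-right
    }


def join_quadrants(q):
    """Combine four equal parts back into one matrix."""
    top = [left + right for left, right in zip(q["11"], q["12"])]
    bottom = [left + right for left, right in zip(q["21"], q["22"])]
    return top + bottom


# Hypothetical 4x4 input holding the values 1..16.
M = [[4 * i + j + 1 for j in range(4)] for i in range(4)]
parts = split_quadrants(M)
print(parts["12"])                # [[3, 4], [7, 8]]
print(join_quadrants(parts) == M)  # True
```

In the distributed setting, each part would be sent to the process that needs it rather than kept in one dictionary.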

11
Detailed Description – Step 1
$P_1 = (A_{11} + A_{22})(B_{11} + B_{22})$, $P_5 = (A_{11} + A_{12}) B_{22}$
$P_2 = (A_{21} + A_{22}) B_{11}$, $P_6 = (A_{21} - A_{11})(B_{11} + B_{12})$
$P_3 = A_{11} (B_{12} - B_{22})$, $P_7 = (A_{12} - A_{22})(B_{21} + B_{22})$
$P_4 = A_{22} (B_{21} - B_{11})$

12
Detailed Description – Step 2
$P_1, P_5$; $P_2, P_6$; $P_3, P_7$; $P_4$

13
Detailed Description – Step 3
$P_1, P_5$; $P_2, P_6$; $P_3, P_7$; $P_4$
Declare Result
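The three steps can be imitated sequentially to check that the grouping of products still yields the correct result. The sketch below is plain Python with no MPI or CUDA: an ordinary loop stands in for the four processes, and the quadrants are single numbers (the N = 2 case). The worker numbering 0 to 3 and the names `ASSIGNMENT`, `PRODUCTS`, `HEADS` and `simulate` are assumptions made for the example; the slides do not say which process acts as head for which C equation.

```python
# Step 1 grouping: which worker computes which products (numbering assumed).
ASSIGNMENT = {0: ["P1", "P5"], 1: ["P2", "P6"], 2: ["P3", "P7"], 3: ["P4"]}

# The seven products, written over quadrant dictionaries a and b.
PRODUCTS = {
    "P1": lambda a, b: (a["11"] + a["22"]) * (b["11"] + b["22"]),
    "P2": lambda a, b: (a["21"] + a["22"]) * b["11"],
    "P3": lambda a, b: a["11"] * (b["12"] - b["22"]),
    "P4": lambda a, b: a["22"] * (b["21"] - b["11"]),
    "P5": lambda a, b: (a["11"] + a["12"]) * b["22"],
    "P6": lambda a, b: (a["21"] - a["11"]) * (b["11"] + b["12"]),
    "P7": lambda a, b: (a["12"] - a["22"]) * (b["21"] + b["22"]),
}

# Products (with signs) that the head of each C equation needs.
HEADS = {
    "C11": [("P1", 1), ("P4", 1), ("P5", -1), ("P7", 1)],
    "C12": [("P3", 1), ("P5", 1)],
    "C21": [("P2", 1), ("P4", 1)],
    "C22": [("P1", 1), ("P3", 1), ("P2", -1), ("P6", 1)],
}


def simulate(a, b):
    """Run the three steps in one process and return the 2x2 result."""
    # Step 1: every worker computes its assigned products.
    worker_results = {
        worker: {name: PRODUCTS[name](a, b) for name in names}
        for worker, names in ASSIGNMENT.items()
    }
    # Step 2: products are collected (stands in for sending them to the heads).
    gathered = {}
    for results in worker_results.values():
        gathered.update(results)
    # Step 3: each head evaluates its C equation; the master assembles C.
    c = {block: sum(sign * gathered[name] for name, sign in terms)
         for block, terms in HEADS.items()}
    return [[c["C11"], c["C12"]], [c["C21"], c["C22"]]]


# Hypothetical scalar quadrants of A = [[1, 2], [3, 4]] and B = [[5, 6], [7, 8]].
a = {"11": 1, "12": 2, "21": 3, "22": 4}
b = {"11": 5, "12": 6, "21": 7, "22": 8}
print(simulate(a, b))  # [[19, 22], [43, 50]]
```

With this grouping three workers carry two products each and one carries a single product, so the load is not perfectly even across four processes.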

14
Experimental Result - 1

15
Experimental Result - 2

16
Experimental Result - 3

17
References:
➢ Accelerating High Performance Applications with CUDA and MPI: N. P. Karunadasa & D. N. Ranasinghe
➢ Strassen's Matrix Multiplication on GPUs: Junjie Li, Sanjay Ranka

18
Thanks
