Presentation is loading. Please wait.

Presentation is loading. Please wait.

EECC756 - Shaaban #1 lec # 1 Spring 2003 3-11-2003 Systolic Architectures Replace single processor with an array of regular processing elements Orchestrate.

Similar presentations


Presentation on theme: "EECC756 - Shaaban #1 lec # 1 Spring 2003 3-11-2003 Systolic Architectures Replace single processor with an array of regular processing elements Orchestrate."— Presentation transcript:

1 EECC756 - Shaaban #1 lec # 1 Spring 2003 3-11-2003 Systolic Architectures Replace single processor with an array of regular processing elements Orchestrate data flow for high throughput with less memory access Different from pipelining –Nonlinear array structure, multidirection data flow, each PE may have (small) local instruction and data memory Different from SIMD: each PE may do something different Initial motivation: VLSI enables inexpensive special-purpose chips Represent algorithms directly by chips connected in regular pattern Slides from Shaaban

2 EECC756 - Shaaban #2 lec # 1 Spring 2003 3-11-2003 Systolic Array Example: 3x3 Systolic Array Matrix Multiplication b2,2 b2,1 b1,2 b2,0 b1,1 b0,2 b1,0 b0,1 b0,0 a0,2 a0,1 a0,0 a1,2 a1,1 a1,0 a2,2 a2,1 a2,0 Alignments in time Processors arranged in a 2-D grid Each processor accumulates one element of the product Rows of A Columns of B T = 0 Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/

3 EECC756 - Shaaban #3 lec # 1 Spring 2003 3-11-2003 Systolic Array Example: 3x3 Systolic Array Matrix Multiplication b2,2 b2,1 b1,2 b2,0 b1,1 b0,2 b1,0 b0,1 a0,2 a0,1 a1,2 a1,1 a1,0 a2,2 a2,1 a2,0 Alignments in time Processors arranged in a 2-D grid Each processor accumulates one element of the product T = 1 b0,0 a0,0 a0,0*b0,0 Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/

4 EECC756 - Shaaban #4 lec # 1 Spring 2003 3-11-2003 Systolic Array Example: 3x3 Systolic Array Matrix Multiplication b2,2 b2,1 b1,2 b2,0 b1,1 b0,2 a0,2 a1,2 a1,1 a2,2 a2,1 a2,0 Alignments in time Processors arranged in a 2-D grid Each processor accumulates one element of the product T = 2 b1,0 a0,1 a0,0*b0,0 + a0,1*b1,0 a1,0 a0,0 b0,1 b0,0 a0,0*b0,1 a1,0*b0,0 Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/

5 EECC756 - Shaaban #5 lec # 1 Spring 2003 3-11-2003 Systolic Array Example: 3x3 Systolic Array Matrix Multiplication b2,2 b2,1 b1,2 a1,2 a2,2 a2,1 Alignments in time Processors arranged in a 2-D grid Each processor accumulates one element of the product T = 3 b2,0 a0,2 a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0 a1,1 a0,1 b1,1 b1,0 a0,0*b0,1 + a0,1*b1,1 a1,0*b0,0 + a1,1*b1,0 a1,0 b0,1 a0,0 b0,0 b0,2 a2,0 a1,0*b0,1 a0,0*b0,2 a2,0*b0,0 Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/

6 EECC756 - Shaaban #6 lec # 1 Spring 2003 3-11-2003 Systolic Array Example: 3x3 Systolic Array Matrix Multiplication b2,2 a2,2 Alignments in time Processors arranged in a 2-D grid Each processor accumulates one element of the product T = 4 a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0 a1,2 a0,2 b2,1 b2,0 a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1 a1,0*b0,0 + a1,1*b1,0 + a1,2*a2,0 a1,1 b1,1 a0,1 b1,0 b1,2 a2,1 a1,0*b0,1 +a1,1*b1,1 a0,0*b0,2 + a0,1*b1,2 a2,0*b0,0 + a2,1*b1,0 b0,1 a1,0 b0,2 a2,0 a2,0*b0,1 a1,0*b0,2 a2,2 Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/

7 EECC756 - Shaaban #7 lec # 1 Spring 2003 3-11-2003 Systolic Array Example: 3x3 Systolic Array Matrix Multiplication Alignments in time Processors arranged in a 2-D grid Each processor accumulates one element of the product T = 5 a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0 a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1 a1,0*b0,0 + a1,1*b1,0 + a1,2*a2,0 a1,2 b2,1 a0,2 b2,0 b2,2 a2,2 a1,0*b0,1 +a1,1*b1,1 + a1,2*b2,1 a0,0*b0,2 + a0,1*b1,2 + a0,2*b2,2 a2,0*b0,0 + a2,1*b1,0 + a2,2*b2,0 b1,1 a1,1 b1,2 a2,1 a2,0*b0,1 + a2,1*b1,1 a1,0*b0,2 + a1,1*b1,2 b0,2 a2,0 a2,0*b0,2 Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/

8 EECC756 - Shaaban #8 lec # 1 Spring 2003 3-11-2003 Systolic Array Example: 3x3 Systolic Array Matrix Multiplication Alignments in time Processors arranged in a 2-D grid Each processor accumulates one element of the product T = 6 a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0 a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1 a1,0*b0,0 + a1,1*b1,0 + a1,2*a2,0 a1,0*b0,1 +a1,1*b1,1 + a1,2*b2,1 a0,0*b0,2 + a0,1*b1,2 + a0,2*b2,2 a2,0*b0,0 + a2,1*b1,0 + a2,2*b2,0 b2,1 a1,2 b2,2 a2,2 a2,0*b0,1 + a2,1*b1,1 + a2,2*b2,1 a1,0*b0,2 + a1,1*b1,2 + a1,2*b2,2 b1,2 a2,1 a2,0*b0,2 + a2,1*b1,2 Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/

9 EECC756 - Shaaban #9 lec # 1 Spring 2003 3-11-2003 Systolic Array Example: 3x3 Systolic Array Matrix Multiplication Alignments in time Processors arranged in a 2-D grid Each processor accumulates one element of the product T = 7 a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0 a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1 a1,0*b0,0 + a1,1*b1,0 + a1,2*a2,0 a1,0*b0,1 +a1,1*b1,1 + a1,2*b2,1 a0,0*b0,2 + a0,1*b1,2 + a0,2*b2,2 a2,0*b0,0 + a2,1*b1,0 + a2,2*b2,0 a2,0*b0,1 + a2,1*b1,1 + a2,2*b2,1 a1,0*b0,2 + a1,1*b1,2 + a1,2*b2,2 b2,2 a2,2 a2,0*b0,2 + a2,1*b1,2 + a2,2*b2,2 Done Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/


Download ppt "EECC756 - Shaaban #1 lec # 1 Spring 2003 3-11-2003 Systolic Architectures Replace single processor with an array of regular processing elements Orchestrate."

Similar presentations


Ads by Google