Mihir Awatramani Lakshmi kiran Tondehal Xinying Wang Y. Ravi Chandra


1 Mihir Awatramani Lakshmi kiran Tondehal Xinying Wang Y. Ravi Chandra
SPARSE MATRIX VECTOR MULTIPLICATION

2 SPARSE MATRICES
WHAT ARE THEY? Simply, matrices with a large number of zero elements.
WHERE ARE THEY USED? Sparse matrices arise when systems are modelled as large sets of differential equations. Typical domains are image processing, industrial process simulation, and data retrieval.
WHY ARE CONVENTIONAL ALGORITHMS NOT EFFICIENT FOR SPARSE MATRICES? Processing a sparse matrix with a dense algorithm takes a long time, and there is a huge overhead due to storing redundant (zero) elements.

3 BASICS OF SPARSE MATRICES
Compressed Sparse Row / Column to Matrix Market
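As an illustration of the Compressed Sparse Row format named on this slide, here is a minimal Python sketch that builds the three CSR arrays from a small dense matrix and uses them for a matrix-vector product. The function names (`dense_to_csr`, `csr_spmv`) and the example matrix are made up for this sketch and are not taken from the presentation.

```python
def dense_to_csr(dense):
    """Convert a list-of-rows dense matrix to CSR arrays.

    values  : the nonzero entries, row by row
    col_idx : the column index of each nonzero
    row_ptr : row i owns values[row_ptr[i]:row_ptr[i + 1]]
    """
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, entry in enumerate(row):
            if entry != 0:
                values.append(entry)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A * x where A is given in CSR form."""
    y = []
    for i in range(len(row_ptr) - 1):
        acc = 0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]  # indirect access into x
        y.append(acc)
    return y


# Example matrix (hypothetical): 4 nonzeros out of 16 entries.
A = [
    [5, 0, 0, 0],
    [0, 8, 0, 0],
    [0, 0, 3, 0],
    [0, 6, 0, 0],
]
values, col_idx, row_ptr = dense_to_csr(A)
y = csr_spmv(values, col_idx, row_ptr, [1, 2, 3, 4])
```

Only the 4 nonzeros are stored and multiplied. A dense algorithm would store and touch all 16 entries.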

4 FORMAT INDEPENDENCE

5

6 MOTIVATION for ALTERNATE STRATEGIES
Low memory bandwidth
Irregular memory access patterns
High latency of load/store instructions
High ratio of load/store instructions
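The high ratio of load/store instructions can be made concrete with a rough count for a CSR-style sparse matrix-vector product. The sketch below is a back-of-the-envelope model, not a measurement from the presentation. It assumes no caching or register reuse of the vector, three loads per nonzero (value, column index, vector element), one row-pointer load per row boundary, and one store per output row. The function name `spmv_op_counts` is hypothetical.

```python
def spmv_op_counts(row_ptr):
    """Estimate loads, stores and floating-point ops for one CSR SpMV.

    Assumed model (not from the slides):
      - per nonzero: load value, load column index, load x[col] -> 3 loads
      - per nonzero: one multiply and one add                   -> 2 flops
      - row_ptr is read once per row boundary                   -> n_rows + 1 loads
      - each output element is stored once                      -> n_rows stores
    """
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]
    loads = 3 * nnz + (n_rows + 1)
    stores = n_rows
    flops = 2 * nnz
    return loads, stores, flops


# Row pointers of a hypothetical 4-row matrix with one nonzero per row.
loads, stores, flops = spmv_op_counts([0, 1, 2, 3, 4])
mem_ops_per_flop = (loads + stores) / flops
```

Under these assumptions the kernel issues more memory operations than arithmetic operations, so memory bandwidth and latency dominate rather than compute.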

7 CONVEY - A QUICK LOOK INSIDE
The AEH runs scalar instructions and routes memory requests from the AEs.
8 Memory Controllers enable parallel and pipelined access to memory.
256 MB coherent cache for memory requests from coprocessor to host memory.
It has 4 FPGAs for user-defined Application Personalities as well!

8 Details of the C Code
[Diagram labels: SEQUENTIAL PROCESSOR, AEH, AE1, AE2, AE3, AE4, A8, A9, MB1, MB2, MB3]
Memory allocated for array 1 from mem_base 1
Memory allocated for array 2 from mem_base 2
Memory allocated for result from mem_base 3
HOST PROCESSOR POPULATES INPUT MATRICES
COP_CALL ROUTINE PASSES THE BASE ADDRESS TO COPROCESSOR

9 Details of the Assembly Code
[Diagram labels: AEH, AE1, AEG, AEG 1, AEG 2, AEG 31, MB1, MB2, MB3, MAIN MEMORY]
USE ASSEMBLY TO MOVE BASE ADDRESSES TO APPLICATION ENGINE REGISTERS
Logical operations: AND, OR, XOR
Arithmetic operations: multiplication, addition
Complex calculations involving vectors could be done without writing VHDL code

10 MEMORY INTERFACING
[Diagram: OUR MODULE connected to MAIN MEMORY through MC 0 and ROQ 0. Signals: ADDRESS, DATA, REQ ID, DATA VALID, POP. Labels: addresses A 0, A 1; data D 0, D 1; request IDs ID 0 to ID 255; ID-and-data pairs I&D 0 to I&D 4.]
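The diagram labels suggest that each load request carries an ID (0 to 255) and that data comes back paired with its ID before being popped by the module. The sketch below is a small software model of that idea. It assumes the ROQ's job is to hand read data back in request order even when responses arrive out of order. That behavior, the class name `OrderQueueModel`, and its method names are assumptions for illustration, not the actual hardware interface.

```python
class OrderQueueModel:
    """Toy model: returns load data in the order the requests were issued."""

    def __init__(self, depth=256):  # 256 matches "ID 0 ... ID 255" on the slide
        self.depth = depth
        self.slots = [None] * depth
        self.valid = [False] * depth
        self.next_id = 0       # next request ID to hand out
        self.head = 0          # oldest request not yet popped
        self.outstanding = 0

    def allocate(self):
        """Reserve the next request ID for a load."""
        if self.outstanding == self.depth:
            raise RuntimeError("no free request IDs")
        rid = self.next_id
        self.next_id = (self.next_id + 1) % self.depth
        self.outstanding += 1
        return rid

    def fill(self, rid, data):
        """Memory returns an (ID, data) pair, possibly out of order."""
        self.slots[rid] = data
        self.valid[rid] = True

    def data_valid(self):
        """True only when the oldest outstanding request has returned."""
        return self.outstanding > 0 and self.valid[self.head]

    def pop(self):
        """Hand the oldest returned data item to the requester."""
        if not self.data_valid():
            raise RuntimeError("head entry has not returned yet")
        data = self.slots[self.head]
        self.valid[self.head] = False
        self.head = (self.head + 1) % self.depth
        self.outstanding -= 1
        return data


# Hypothetical memory contents and addresses.
memory = {0x00: "D0", 0x08: "D1", 0x10: "D2"}
roq = OrderQueueModel()
requests = [(roq.allocate(), addr) for addr in memory]

# Responses arrive in reverse order, tagged with their request IDs.
for rid, addr in reversed(requests):
    roq.fill(rid, memory[addr])

in_order = []
while roq.data_valid():
    in_order.append(roq.pop())
```

Even though the last request returned first, the data is popped in the order the loads were issued.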

11 IMPLEMENTATION
GENERATE LD SIGNAL
GIVE BASE ADDRESS
GENERATE 21 LOAD SIGNALS
IN THIS WAY, WE DO 21 READS FROM THE DATA BUS
LOAD COMPLETE SIGNAL
WE NOW HAVE THE REQUIRED INPUTS FOR SMVM
AFTER PROCESSING, THE SMVM GIVES A DONE SIGNAL
IN THIS WAY, WE WRITE ALL 11 OUTPUTS TO MEMORY
ONE CYCLE OF COMPUTATION IS COMPLETE!
[Diagram labels: MASTER CONTROL, ADD. DECODER, DATA READ ENGINE, INPUT BUFFER, SMVM, OUTPUT BUFFER, MCs, ROQs AND MEMORY. Signals: START READ, DATA VALID, READ DATA BUS, READ COMPLETE, START SMVM, DONE, START WRITE. Addresses: 0X454C… X454C…..040]

12 Simulation Results - Co-Processor Instruction Execution
Decode Move Instruction (6 Move Instructions)
Base Address & Size values moved to internal Registers
Decode CAEP Instruction
Starts Custom Personality

13 Simulation Results – Load Request to MC
Starts Read Procedure
ID from ROQ
Append ID from ROQ to the Load Request
Check Address
Decoded Address
Send Load Request to respective MC
Start Read Process after sending request to MC
Send 21 Data Load Requests
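The address-decode step can be sketched as a function that picks a memory controller from the load address. The slides do not state the mapping, so this sketch assumes 8 controllers (as on the Convey overview slide), 64-byte interleaving between controllers, and 8-byte data words. The base address 0x1000 and the names `mc_for_address` and `load_requests` are likewise hypothetical.

```python
from collections import Counter

NUM_MCS = 8            # from the overview slide: 8 memory controllers
INTERLEAVE_BYTES = 64  # ASSUMPTION: consecutive 64-byte blocks rotate across MCs
WORD_BYTES = 8         # ASSUMPTION: each load fetches one 8-byte word


def mc_for_address(addr):
    """Decode which memory controller serves a byte address (assumed mapping)."""
    return (addr // INTERLEAVE_BYTES) % NUM_MCS


def load_requests(base, count):
    """List (address, controller) pairs for `count` consecutive word loads."""
    return [
        (base + i * WORD_BYTES, mc_for_address(base + i * WORD_BYTES))
        for i in range(count)
    ]


# 21 loads from a hypothetical, 64-byte-aligned base address.
requests = load_requests(0x1000, 21)
per_mc = Counter(mc for _, mc in requests)
```

Under these assumptions the 21 loads cover 168 bytes and land on three controllers (8, 8 and 5 requests). That matches the MC0 – MC1 – MC2 read order mentioned on the next slide, although the real mapping may differ.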

14 Simulation Results – Receive Data from MC through ROQ
Start Load Process
Valid Data Available at MCs
But, Read Data Sequentially from MC0 – MC1 – MC2
Load Process done after receiving 21 Data Inputs
Start Next Read (if nothing to Write)

15 Simulation Results – Write Back Results from SpMV Engine using MC
Start Write if valid data received from SpMV Engine
Decode Address
Send Store Request to respective MCs
Write Process Done after 11 store operations
Write Process Done and Start next read cycle

16 FUTURE SCOPE
Increasing memory bandwidth
Partitioning SMVM calculations across four Application Engines
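One possible way to partition the work across four Application Engines is to give each engine a contiguous block of rows holding roughly the same number of nonzeros. The slides do not say how the partitioning would be done, so the row-wise split over CSR row pointers below is only an assumption. The function name `partition_rows` and the example row pointers are made up for illustration.

```python
from bisect import bisect_left


def partition_rows(row_ptr, parts=4):
    """Split rows into `parts` contiguous ranges with similar nonzero counts.

    row_ptr is the CSR row-pointer array; row_ptr[i] is the number of
    nonzeros stored before row i. Returns a list of (start_row, end_row)
    pairs, end exclusive.
    """
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]
    bounds = [0]
    for p in range(1, parts):
        target = nnz * p / parts
        # First row boundary whose cumulative nonzero count reaches the target.
        cut = bisect_left(row_ptr, target, lo=bounds[-1], hi=n_rows)
        bounds.append(cut)
    bounds.append(n_rows)
    return [(bounds[i], bounds[i + 1]) for i in range(parts)]


# Hypothetical matrix: 8 rows, 16 nonzeros, uneven row lengths.
row_ptr = [0, 2, 4, 5, 8, 9, 12, 14, 16]
ranges = partition_rows(row_ptr, parts=4)
nnz_per_part = [row_ptr[end] - row_ptr[start] for start, end in ranges]
```

Balancing on nonzeros rather than on row count keeps the engines equally busy when row lengths vary. Each engine would still need access to the full input vector.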

