1 Matrix Multiplication on SOPC Project instructor: Ina Rivkin Students: Shai Amara Shuki Gulzari Project duration: one semester.



2 Project Goals:
- Implement multiplication of matrices of different sizes in hardware on an SOPC.
- In our project we will implement an IP that multiplies N x N integer matrices of various sizes.

3 Determination of Matrix Sizes
- The input matrices will be of type integer (32 bits) and the result matrix will be of type long (64 bits).
- The Virtex-II Pro XC2VP30 has 136 BRAMs of 18 kbit each. Approximately 60-70 of them are used by the processor and buses, which leaves about 64 free BRAMs for the IP's inner memory.
- Since we use 32-bit integers, the core generator has to use the BRAMs in a configuration of 8 x (4 bits x 4k), that is, 8 BRAMs each used as 4k x 4 bits, in order to get 4k rows of integers. Thus 64 BRAMs are sufficient for a total of 8 x 4k = 32k integers.
- The result matrix is of type long (64 bits) and thus requires twice the space of an input matrix. The memory is therefore split into 1 + 1 + 2 = 4 equal parts, so each input matrix can have up to 32k/4 = 8k integers.
- Since the matrices are square, the maximum size (N) of a matrix is round_down( square_root( 8k ) ) = 90.
- This size is an initial estimate and may change (upwards or downwards) in the final implementation.
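The sizing argument on this slide can be checked with a short calculation. The sketch below only restates the slide's numbers; the constant and function names are made up for illustration, and "k" is taken to mean 1024.

```python
import math

BRAM_FREE = 64            # BRAMs assumed left over for the IP (slide estimate)
BRAMS_PER_BANK = 8        # 8 BRAMs x 4 bits wide = one 32-bit word
ROWS_PER_BANK = 4 * 1024  # each BRAM used as 4k x 4 bits

def max_matrix_size(free_brams: int = BRAM_FREE) -> int:
    """Largest N such that A, B (32-bit) and the 64-bit result fit in memory."""
    banks = free_brams // BRAMS_PER_BANK
    total_words = banks * ROWS_PER_BANK   # capacity in 32-bit words
    # A and B take N*N words each; the 64-bit result takes 2*N*N words.
    words_per_input = total_words // 4
    return math.isqrt(words_per_input)    # round_down(square_root(...))

print(max_matrix_size())  # 90
```

With N = 90 the three matrices occupy 4 x 90 x 90 = 32400 words out of the 32768 available.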

4 Implementation: General Hardware Scheme
[Figure: block diagram. Blocks: Processor, Matrix Multiplication, PLB/OPB bridge, UART. Buses: PLB, OPB.]

5 General Block Diagram Scheme
[Figure: block diagram of Mult_Matrix with its Memory and Logic FSM. Signal labels (shown twice): data, address, write enable, clock. R0: start signal and size of matrix. Other labels: data out, Matrix A, Matrix B, the result matrix.]

6 Implementation - continued
- The processor will write the two matrices to the memory and will then notify the IP through a register, which also carries the matrix size (N).
- The IP will perform the multiplication and save the result in its inner memory. The processor will learn that the task has completed either by polling or through an interrupt.

7 The IP's Inner Memory
- For the IP's inner memory we chose a dual-port memory in order to support access both from the software side (the processor writing the two matrices into the memory and reading the final result) and from the IP itself (reading the two matrices and writing the result).

8 The Multiplier
- To perform the multiplication we defined a 32 x 32 bit multiplier with a 64-bit result using the HDL Designer core generator.
Note: The core generator builds this multiplier from small 18 x 18 bit multipliers, and the result is available after only one clock cycle. This will allow us to multiply two integers in one clock cycle.
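To illustrate how a wide product can be assembled from narrower multipliers, here is a minimal sketch that splits each 32-bit operand into 16-bit halves. This is not taken from the slides and is not necessarily how the core generator arranges its 18 x 18 blocks; the function names and the sign handling are assumptions chosen for simplicity.

```python
MASK16 = 0xFFFF

def mul32_unsigned(a: int, b: int) -> int:
    """Multiply two magnitudes of up to 32 bits using four 16 x 16 partial products."""
    a_hi, a_lo = a >> 16, a & MASK16
    b_hi, b_lo = b >> 16, b & MASK16
    return ((a_hi * b_hi) << 32) + ((a_hi * b_lo + a_lo * b_hi) << 16) + a_lo * b_lo

def mul32_signed(a: int, b: int) -> int:
    """Signed 32 x 32 product: multiply magnitudes, then restore the sign."""
    negative = (a < 0) != (b < 0)
    product = mul32_unsigned(abs(a), abs(b))
    return -product if negative else product

# The largest-magnitude case, (-2**31) * (-2**31) = 2**62, still fits in a signed 64-bit result.
print(mul32_signed(-2**31, -2**31) == 2**62)  # True
```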

9 The Matrix Multiplication Algorithm (the Logic FSM)
When the IP unit is given a start signal, it will initialize its own inner signals, among them the matrix size (N). From this point the FSM output signals will drive the second port of the memory block, so that the IP can read and write through this port.

10 [Figure: state diagram of the Logic FSM]
State labels: Idle, Finish.
Assignments shown in the diagram: n <= size; row <= 0; col <= 0; i <= 0; total <= 0; a <= mem[base_a+i+row*n]; b <= mem[base_b+i*n+col]; temp <= a*b; total <= temp+total; i <= i+1; mem[base_c+row*n+col] <= total; col <= col+1; row <= row+1.
Transition conditions: start='0', start='1', i<n-1, i=n-1, col<n-1, col=n-1, row<n-1, row=n.

11 The Multiplication Process - Explanation
In this FSM we have 3 loops:
1. Multiplication of two single numbers (one from a row in matrix A and one from a column in matrix B).
2. Multiplication of the current row with all columns of matrix B, saving the results.
3. Multiplication of all rows of matrix A with the columns of matrix B.
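The three loops can be sketched as a software model that uses the same address expressions as the state diagram (base_a+i+row*n, base_b+i*n+col, base_c+row*n+col). This is only a behavioural sketch, not the VHDL: the function name is hypothetical, and each 64-bit result is stored in a single list slot, whereas in the 32-bit hardware memory it would take two words.

```python
def multiply_in_memory(mem, n, base_a, base_b, base_c):
    """Multiply the n x n matrices stored row-major at base_a and base_b; write to base_c."""
    for row in range(n):              # loop 3: every row of A
        for col in range(n):          # loop 2: current row against every column of B
            total = 0
            for i in range(n):        # loop 1: one product per step
                a = mem[base_a + i + row * n]
                b = mem[base_b + i * n + col]
                total += a * b
            mem[base_c + row * n + col] = total
    return mem

n = 2
# A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]], followed by space for the result
mem = [1, 2, 3, 4,  5, 6, 7, 8,  0, 0, 0, 0]
multiply_in_memory(mem, n, base_a=0, base_b=4, base_c=8)
print(mem[8:])  # [19, 22, 43, 50]
```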

12 The Final Step - Finish
- Polling: the processor will be informed about the completion of the multiplication task by polling. R0 (the configuration register) will be loaded with a finish bit; the processor will poll this register and will know when the operation has completed.
- Interrupt: an optional way to inform the processor about the completion of the task, to be considered later.
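The polling scheme can be sketched from the processor's point of view as follows. The slides do not give the bit layout of R0, so START_BIT and FINISH_BIT are hypothetical positions, and FakeIP is a stand-in for the hardware that simply raises the finish bit after a few register reads.

```python
START_BIT = 0x1    # hypothetical bit position in R0
FINISH_BIT = 0x2   # hypothetical bit position in R0

class FakeIP:
    """Stand-in for the IP: sets the finish bit a few reads after start."""
    def __init__(self, busy_reads=3):
        self.r0 = 0
        self._remaining = busy_reads

    def write_r0(self, value):
        self.r0 = value

    def read_r0(self):
        # While started and not yet finished, count down the simulated busy time.
        if self.r0 & START_BIT and not self.r0 & FINISH_BIT:
            if self._remaining == 0:
                self.r0 |= FINISH_BIT
            else:
                self._remaining -= 1
        return self.r0

def wait_for_finish(ip, max_polls=1000):
    """Poll R0 until the finish bit is set; return the number of reads needed."""
    for polls in range(1, max_polls + 1):
        if ip.read_r0() & FINISH_BIT:
            return polls
    raise TimeoutError("IP did not report completion")

ip = FakeIP(busy_reads=3)
ip.write_r0(START_BIT)
print(wait_for_finish(ip))  # 4
```

Bounding the number of polls keeps the processor from hanging if the finish bit is never set.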

13 Schedule
- Learning EDK. Done √
- Defining an initial system and burning it to the FPGA. Done √
- Implementing a dual-port memory block in HDL Designer and integrating it with our IP, plus performing reads and writes to it. Done √
- Initial design of the FSM logic block. Done √
- Continuation of the design and testing the FSM for correctness (18/6 – 15/8) [exam period (8/7 – 3/8)]
- Adding the FSM to the IP's VHDL code and integrating it with the PPC. Testing and validation. (15/8 – 30/9)
- Final documentation (30/9 – 10/10)

14 Thank you!

