Technion – Israel Institute of Technology Department of Electrical Engineering High Speed Digital Systems Lab Written by: Haim Natan Benny Pano Supervisor:

1 Technion – Israel Institute of Technology, Department of Electrical Engineering, High Speed Digital Systems Lab. Written by: Haim Natan, Benny Pano. Supervisor: Gregory Mironov. Spring 2004. Project no. D0623.

2 Complex computations are currently performed on a standard processor or a DSP, neither of which is optimal for matrix inversion. To reduce the time spent on matrix inversion, we use dedicated hardware to perform the inversion. This leaves the CPU free for other tasks and moves the heavy computation to the faster hardware.

3 Designing and implementing FPGA circuitry that inverts a 625x625 matrix.

4 A standalone system.
The matrix is of size 625x625.
Matrix elements are 64-bit double-precision floating-point numbers.
Calculation time < 20 ms.

5 Suggested Solutions
Two algorithms were considered:
–Linear algorithm of order O(N^3)
–Monte-Carlo algorithm of order O(N^2)
The selected hardware was the Virtex-II Pro.
The selected algorithm was the Monte-Carlo algorithm.

6 The Monte-Carlo Algorithm (simplified version)

b_{i,j} := 0;
For c := 1 to N do {
    k_0 := i; w_0 := 1;
    For t := 1 to T do {
        k_t := MP(k_{t-1});
        w_t := sign(d_{k_{t-1},k_t}) * w_{t-1} * E_{k_t};
        if k_t = j then b_{i,j} += w_t;
    }
}
b_{i,j} /= N;

N – number of Markov chains
T – length of each chain
b_{i,j} – an element of the inverse matrix
MP() – a chain generator

7 The MC Algorithm (continued)
D = I – A
E_i = Σ_j |d_{i,j}| (the weights vector)
P is a transition probability matrix such that p_{i,j} = |d_{i,j}| / E_i. It is used for generating the Markov chains.
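The definitions of D, E and P above can be sketched in a few lines of Python. This is only an illustration: the function name `build_mc_tables` is hypothetical, the input is the 2x2 matrix A from the demonstration slide, and exact fractions are used so the probabilities print as 3/4 rather than 0.75. It assumes integer entries and that no row of D is all zeros.

```python
from fractions import Fraction


def build_mc_tables(A):
    """Return (D, E, P) for a square integer matrix A given as a list of rows.

    Hypothetical helper for illustration; not part of the project's code.
    """
    n = len(A)
    # D = I - A
    D = [[(1 if r == c else 0) - A[r][c] for c in range(n)] for r in range(n)]
    # E_i = sum over j of |d_ij|  (the weights vector)
    E = [sum(abs(x) for x in row) for row in D]
    # p_ij = |d_ij| / E_i  (assumes every E_i is non-zero)
    P = [[Fraction(abs(x), e) for x in row] for row, e in zip(D, E)]
    return D, E, P


A = [[7, -2], [-3, 4]]
D, E, P = build_mc_tables(A)
print("D =", D)
print("E =", E)
print("P =", [[str(p) for p in row] for row in P])
```

Each row of P sums to 1 by construction, which is what makes it usable as a transition probability matrix.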

8 A Small Demonstration

A = [  7  -2 ]     D = [ -6   2 ]     E = [ 8 ]     P = [ 3/4  1/4 ]
    [ -3   4 ]         [  3  -3 ]         [ 6 ]         [ 1/2  1/2 ]

t   rand#   k_t   w_t    b_{1,2}
0   none    1     1      0
1   0.2     1     -8     0
2   0.9     2     -48    -48
3   0.49    1     -384   -48
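The demonstration can be replayed with a short Python sketch that feeds in the slide's three random numbers instead of a random generator. The function names `next_state` and `walk_chain` are hypothetical, and the way a random number is mapped to the next state (cumulative sums over one row of P) is an assumption; the slides do not say how MP() does this, but the mapping reproduces the k_t column. The weight update follows the slide's pseudocode, including its use of E at the new state k_t.

```python
def next_state(P_row, u):
    """MP(): choose the next 0-based state from one row of P.

    Assumption: inverse-CDF sampling, i.e. the first state whose
    cumulative probability exceeds the random number u.
    """
    cumulative = 0.0
    for idx, p in enumerate(P_row):
        cumulative += p
        if u < cumulative:
            return idx
    return len(P_row) - 1  # guard against floating-point rounding


def walk_chain(D, E, P, i, j, randoms):
    """Follow one chain from state i; return rows (t, rand, k_t, w_t, b)."""
    k, w, b = i, 1.0, 0.0
    rows = [(0, None, k + 1, w, b)]          # k is reported 1-based
    for t, u in enumerate(randoms, start=1):
        k_next = next_state(P[k], u)
        sign = 1.0 if D[k][k_next] >= 0 else -1.0
        w = sign * w * E[k_next]             # slide's update: E at the new state
        k = k_next
        if k == j:
            b += w                           # accumulate when the chain hits j
        rows.append((t, u, k + 1, w, b))
    return rows


D = [[-6.0, 2.0], [3.0, -3.0]]
E = [8.0, 6.0]
P = [[0.75, 0.25], [0.5, 0.5]]

# i = 1, j = 2 in the slide's 1-based indexing
rows = walk_chain(D, E, P, i=0, j=1, randoms=[0.2, 0.9, 0.49])
for row in rows:
    print(row)
```

A full estimate of b_{i,j} would repeat this walk N times with fresh random numbers and divide the accumulated sum by N.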

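9 Algorithm’s Architecture
[Figure: a chain of T stages, each made of an MP, an SW and an A block. Inputs: k = i, E_1 … E_n, and an initial accumulator value of 0. Output: b_{i,j}.]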

10 Switch & Accumulator

Switch (SW)
Inputs: K in, T in, E in, R in. Outputs: K out, T out, E out, R out.
E out = E in
R out = R in
K out = K in
If R in = K in Then T out = E in Else T out = T in

Accumulator (A)
Inputs: K in, W in, T in, C in, V in. Outputs: C out, V out, W out. Internal signal: W int.
C out = C in
W out = W in * T in
W int = W out
If C in = K in Then V out = V in + W int Else V out = V in
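The two blocks' equations can be modelled as plain Python functions to check their behaviour before writing VHDL. This is a behavioural sketch only: the function names are hypothetical, timing and floating-point formats are ignored, and two points are interpretations rather than facts from the slides: that C carries the target column j, and that one switch cell per E_r is chained so that T ends up holding E at index K.

```python
def switch(k_in, t_in, e_in, r_in):
    """SW cell: pass K, E and R through; load E into T when R matches K."""
    t_out = e_in if r_in == k_in else t_in
    return k_in, t_out, e_in, r_in           # (K out, T out, E out, R out)


def accumulator(k_in, w_in, t_in, c_in, v_in):
    """A block: multiply the running weight by T, add it to V when C matches K."""
    w_out = w_in * t_in
    w_int = w_out
    v_out = v_in + w_int if c_in == k_in else v_in
    return c_in, v_out, w_out                # (C out, V out, W out)


# Assumed composition: chain one switch cell per weight E_r (r = 1..n).
E = [8, 6]
k = 2                                        # current chain state
t = 0
for r, e in enumerate(E, start=1):
    _, t, _, _ = switch(k, t, e, r)
print("selected T =", t)

# One accumulator step with a running weight of -8 and target column C = 2.
c_out, v_out, w_out = accumulator(k_in=k, w_in=-8, t_in=t, c_in=2, v_in=0)
print("W out =", w_out, "V out =", v_out)
```

Note that the accumulator equations contain no sign term, so the sign of d must reach W by some other path; the sketch does not model that.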

11 Architecture Demonstration
[Figure: three MP-SW-A stages with inputs k = 1, E_1 = 8, E_2 = 6 and an initial accumulator value of 0, producing b_{1,2}. Signal values labelled along the chain: K out = 1, 2; T out = 8, 6; W out = -8, -48, -384; V out = 0, -48.]

12 Basic Block Diagram
[Figure: blocks labelled RAM, Memory Controller, Algorithm and FPGA, with matrices A and B. Connections labelled Elements request, Elements transfer and Read/Write.]

13 Some scales
64 bit * 625 * 625 ≈ 3 MB
Two matrices needed → 6 MB
20 ms / (625^2) = 51.2 ns per matrix element → 20 MHz
Considering an O(N^3) algorithm → 12.2 GHz
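The figures on this slide follow from simple arithmetic, reproduced here as a short Python sketch. The variable names are illustrative, and the rates assume one matrix element (or one operation, in the O(N^3) case) per clock cycle, which is how the slide's numbers appear to be derived.

```python
N = 625                  # matrix dimension
BYTES_PER_ELEMENT = 8    # 64-bit double precision
DEADLINE_S = 20e-3       # required calculation time: 20 ms

matrix_bytes = BYTES_PER_ELEMENT * N * N        # storage for one matrix
time_per_element_s = DEADLINE_S / (N * N)       # time budget per element
rate_n2_hz = (N * N) / DEADLINE_S               # one element per cycle
rate_n3_hz = (N ** 3) / DEADLINE_S              # one operation per cycle, O(N^3)

print(f"one matrix:        {matrix_bytes / 1e6:.3f} MB")
print(f"two matrices:      {2 * matrix_bytes / 1e6:.3f} MB")
print(f"time per element:  {time_per_element_s * 1e9:.1f} ns")
print(f"O(N^2) clock rate: {rate_n2_hz / 1e6:.2f} MHz")
print(f"O(N^3) clock rate: {rate_n3_hz / 1e9:.1f} GHz")
```

The slide rounds 3.125 MB down to 3 MB and 19.53 MHz up to 20 MHz.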

14 Encountered obstacles
Studying the Monte-Carlo algorithm and some of its mathematical basics.
The architecture requires a lot of FPGA cells.
Finding a floating-point library and adjusting it to our needs.
Getting to know all the software used in FPGA development.

15 Encountered obstacles (cont.)
The floating-point units have long delays (130 ns for the division unit alone).
The Monte-Carlo algorithm needs delicate tuning and many iterations to achieve reasonable accuracy.
A very wide bus is needed to transfer the matrix elements.

16 Project achievements
Studied the Monte-Carlo algorithm and its architecture.
Wrote a C simulation to check the Monte-Carlo method.
Studied the VHDL language.
Found a floating-point library and adjusted it to the project's needs.
Ran a simulation of the floating-point unit.

17 Project achievements (cont.)
Implemented the switch and accumulator blocks in VHDL.
Implemented a basic chain using the switch and accumulator blocks.
Implemented a circuit that uses the floating-point library and loaded it onto the V2P.

18 Things to do
Implement the MP block, the memory controller and the computation control circuit.
Improve the floating-point delays.
Design a communication interface to load and send the matrix.

