
Mid presentation, Part A project. By: Netanel Yamin & Shahar Zuta. Advisor: Moshe Porian. Dual semester project, November 2012.


1 Mid presentation, Part A project. By: Netanel Yamin & Shahar Zuta. Advisor: Moshe Porian. Dual semester project, November 2012.

2 Contents: Project overview; Project goals; Requirements; Architecture; Micro architecture; Problems & solutions; Conclusions; Testability; Methodology; Schedule.

3 Algorithm overview: INPUT FILE (literal items only) -> LZRW3 COMPRESSOR -> OUTPUT FILE (groups of items, literal or copy). A literal item consists of one byte, which represents itself. A copy item consists of two bytes that represent from 3 to 18 bytes.
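A minimal Python sketch of how a two-byte copy item can stand for 3 to 18 bytes: 4 bits hold the length minus 3 and the remaining 12 bits hold a value in the range 0 to 4095. The exact bit layout and the names `pack_copy_item` and `unpack_copy_item` are assumptions made for illustration; the slides do not show the byte format.

```python
def pack_copy_item(index, length):
    """Pack a 12-bit index and a length of 3..18 into two bytes (assumed layout)."""
    if not 0 <= index <= 0xFFF:
        raise ValueError("index must fit in 12 bits")
    if not 3 <= length <= 18:
        raise ValueError("a copy item represents 3 to 18 bytes")
    first = ((index & 0xF00) >> 4) | (length - 3)  # high index nibble + length nibble
    second = index & 0xFF                          # low index byte
    return bytes([first, second])


def unpack_copy_item(item):
    """Inverse of pack_copy_item: return (index, length)."""
    first, second = item
    index = ((first & 0xF0) << 4) | second
    length = (first & 0x0F) + 3
    return index, length
```

Because only 4 bits are available for the length, 16 values are possible, which matches the 3 to 18 byte range stated on the slide.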

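4 Mechanism (demonstration diagram): the INPUT FILE "Expression_compress_ion" is read three bytes at a time. The HASH FUNCTION maps each three-byte group to an INDEX (0 to 4095) of the hash table (rows labelled XXX, YYY, ZZZ, UUU), where the group's offset is stored. "Exp" is entered with offset value 0 and "res" with offset 3; both go to the output as literal items (L.I). NOTE: The next 3 bytes should be "x p r", then "p r e" and only then "r e s"; we didn't demonstrate all the actions, for simplicity. "L.I" stands for "Literal Item".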

5 Mechanism (demonstration diagram, continued): INPUT FILE "Expression_compress_ion". The groups "sio" and "n_c" are hashed and their offsets (6 and 9) are entered in the hash table. Output so far: Exp (L.I), res (L.I), sio (L.I), n_c (L.I).

6 Mechanism (demonstration diagram, continued): INPUT FILE "Expression_compress_ion". The group "omp" is hashed and its offset (12) is entered in the hash table. Output so far: Exp (L.I), res (L.I), sio (L.I), n_c (L.I), omp (L.I).

7 Mechanism (demonstration diagram, continued): INPUT FILE "Express_compress_io". The group "res" at offset 15 hashes to index XXX, which already holds offset 3, so the output gains a copy item (C.I) for index XXX with length 3+0. Output so far: Exp (L.I), res (L.I), sio (L.I), n_c (L.I), omp (L.I), C.I. "C.I" stands for "Copy Item".

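8 Flow chart: START -> hash 3 bytes -> hash table [index]. If the indexed field is empty: enter the offset, output (O.F.) a literal item, move forward (FWD) 1 byte. Otherwise get the stored offset and compare: if the 3 bytes are not the same, output a literal item and move forward 1 byte; if they are the same, do Length++ while more bytes are the same, output a copy item and move forward 3 + Length bytes.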
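The flow above can be sketched in Python as follows. This is a simplified model under stated assumptions, not the project's implementation: items are returned as tuples instead of being packed into groups, a copy item carries the stored offset directly, the table is overwritten on every lookup, and `length` counts all matched bytes (the slide's 3 + Length). The names `hash3` and `lzrw3_items` are hypothetical, and the hash is assumed to be the one from the public LZRW3 reference code.

```python
def hash3(data, pos):
    """12-bit hash of the three bytes starting at pos (assumed LZRW3 hash)."""
    key = (data[pos] << 8) ^ (data[pos + 1] << 4) ^ data[pos + 2]
    return ((40543 * key) >> 4) & 0xFFF


def lzrw3_items(data):
    """Return a list of ("literal", byte) and ("copy", offset, length) items."""
    table = [None] * 4096          # None models an empty field
    items = []
    pos = 0
    while pos < len(data):
        candidate = None
        if pos + 3 <= len(data):   # three bytes are needed to hash
            index = hash3(data, pos)
            candidate = table[index]
            table[index] = pos     # enter the current offset
        if candidate is not None and data[candidate:candidate + 3] == data[pos:pos + 3]:
            length = 3
            while (length < 18 and pos + length < len(data)
                   and data[candidate + length] == data[pos + length]):
                length += 1        # more same bytes
            items.append(("copy", candidate, length))
            pos += length          # forward 3 + extra bytes
        else:
            items.append(("literal", data[pos]))
            pos += 1               # forward 1 byte
    return items
```

For example, `lzrw3_items(b"abcabcabc")` yields three literal items followed by `("copy", 0, 6)`, because the second "abc" hashes to the row that already holds offset 0 and the match then extends to the end of the input.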

9 Project goals: Implement the LZRW3 data compression algorithm. Implement strong debugging capabilities via a GUI.

10 Requirements: VHDL implementation. DE2 development board featuring an Altera Cyclone II FPGA. FPGA to host communication via the UART protocol. Use internal memory on the FPGA, with no interface to external memory. Adapted to data templates of 2 Kbyte to 32 Kbyte. High performance: data transfer of 1 Gbps.

11 Requirements: VHDL implementation. XUPV5 development board featuring a Xilinx Virtex-5 FPGA. FPGA to host communication via the UART protocol. Use internal memory on the FPGA, with no interface to external memory. Adapted to data templates of 2 Kbyte to 32 Kbyte. High performance: data transfer of 1 Gbps.

12 Architecture (block diagram): GUI <-> UART <-> XILINX VIRTEX 5 ON XUVP505 BOARD. Inside the FPGA: Rx PATH -> INPUT BLOCK memory -> LZRW3 COMPRESSOR CORE -> COMPRESSED FILE memory -> Tx PATH.


14 LZRW3 COMPRESSOR CORE (port diagram). Signals: clk, reset, Lzrw3_go, Lzrw3_mode, data_input_byte(7..0), data_input_valid, data_input_taken, End_of_file, Lzrw3_busy, Lzrw3_done, Lzrw3_output_group_size(4..0), data_output_bytes(13..0), data_output_valid, data_output_taken, data_output_last.

15

16

17 STAGE 1 – three bytes buffer (port diagram). Signals: clk, reset, enable, New_byte(7..0), Newer_byte(7..0), Mid_byte(7..0), Older_byte(7..0).
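A behavioural Python sketch of a three-byte buffer as a shift register, assuming that each enabled clock shifts the incoming byte into the newest position and pushes the others one step older. The class name `ThreeByteBuffer` and the reset value of 0 are assumptions; only the port names come from the slide.

```python
class ThreeByteBuffer:
    """Holds the three most recent bytes (older, mid, newer)."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Assumed reset state: all three registers cleared.
        self.older = self.mid = self.newer = 0

    def clock(self, new_byte, enable=True):
        """Model one rising clock edge and return (older, mid, newer)."""
        if enable:
            self.older, self.mid, self.newer = self.mid, self.newer, new_byte & 0xFF
        return self.older, self.mid, self.newer
```

After three enabled clocks the buffer holds a full group, which a hash stage can consume on every following clock.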

18

19 STAGE 2 – hash function (port diagram). Signals: clk, reset, enable, older_byte(7..0), middle_byte(7..0), Newer_byte(7..0), Table_index(11..0).

20 TABLE INDEX = (((40543*(((*(PTR))<<8)^((*((PTR)+1))<<4)^(*((PTR)+2))))>>4) & 0xFFF). PTR points to the first byte. TABLE INDEX range: 0 to 4095.
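A short Python sketch of the table index computation, assuming the slide's expression is the hash from the public LZRW3 reference code: the three bytes are combined with shifts and XOR, multiplied by 40543, shifted right by 4 and masked to 12 bits. The function name `table_index` and its argument names are illustrative.

```python
def table_index(older, middle, newer):
    """Map three bytes (first byte = older) to a hash table index in 0..4095."""
    key = (older << 8) ^ (middle << 4) ^ newer   # at most 16 bits wide
    return ((40543 * key) >> 4) & 0xFFF          # keep 12 bits
```

For the bytes of "abc" (97, 98, 99) the combined key is 26435 and the resulting index is 2749. The product never exceeds 32 bits, since the key is at most 0xFFFF.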

21 STAGE 2- RTL view

22 STAGE 3 – hash table (port diagram). Blocks: HASH TABLE and Offset counter. Signals: clk, reset, clear, enable, Table_index(0..11), Current_offset(19..0), Offset(19..0), Data_out_valid.

23 Hash table structure (diagram): a memory of 4096 rows, 21 bits wide, with valid bits per row. INDEX drives ADDRESS, and Current_offset from the offset counter drives DATA_IN. The outputs are Offset and Data_out_valid.
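A behavioural Python sketch of a hash table with valid bits, assuming that each access returns the previously stored offset together with a valid flag and then stores the current offset in the same row. The class name `HashTable` and the read-then-write ordering are assumptions; the slide only shows the ports.

```python
class HashTable:
    """4096-row table of offsets with one valid bit per row."""

    ROWS = 4096

    def __init__(self):
        self.clear()

    def clear(self):
        # Clearing only needs to drop the valid bits; old offsets are ignored.
        self.valid = [False] * self.ROWS
        self.offset = [0] * self.ROWS

    def access(self, index, current_offset):
        """Return (data_out_valid, stored_offset), then store current_offset."""
        result = (self.valid[index], self.offset[index])
        self.valid[index] = True
        self.offset[index] = current_offset
        return result
```

Keeping separate valid bits lets a new input file start with an empty table without rewriting every row's offset.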

24 STAGE 4 – input file memory

25 Stage 4 implementation: The input file memory should supply three bytes at the same time.

26 How do we choose the bank when a byte arrives?

27 SOLUTION: Instead of counting in stage 3 and dividing in stage 4, we increment the offset by one only after every three clock cycles. In this configuration we expand the offset by 2 bits (a tag) that select which bank the data is written into. The hash table size is now 4096 x (19+2). Example entry: 100101010100111001110 (19 offset bits followed by 2 tag bits).
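A Python sketch of a tagged offset counter, assuming the 2-bit tag cycles through 0, 1, 2 (one value per bank) and the 19-bit address increments each time the tag wraps, so no division by three is ever needed. The class name `TaggedOffsetCounter` and the placement of the tag in the two low bits are assumptions.

```python
class TaggedOffsetCounter:
    """Counts bytes as (address, tag) where tag selects one of three banks."""

    ADDRESS_BITS = 19

    def __init__(self):
        self.address = 0
        self.tag = 0

    def clock(self):
        """Advance by one byte."""
        if self.tag == 2:
            self.tag = 0
            self.address = (self.address + 1) % (1 << self.ADDRESS_BITS)
        else:
            self.tag += 1

    def value(self):
        """Pack the 19-bit address and 2-bit tag into one 21-bit word."""
        return (self.address << 2) | self.tag
```

After n bytes the counter holds address n // 3 and tag n % 3, which is the same result a divide-by-three stage would produce.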

28 Solution costs (memory units): Memory usage at stage 3, from synplify_pro: same as before. LUT usage: 36 Kbit.

29 Back to stage 4

30 Stage 4 block diagram. Blocks: input file memory banks (bank 0, 1, 2), comparator, tentative next address, addresses alignment, counter. Signals: clk, offset, TAG, Offset_valid, Offset_tag, Tentative_tag, Tentative_taken, Comprison_valid, Compare_success, Compare_success_P, Item_length_p, Older_byte_P, Continue. The INDEX TAG indicates the order of the bytes in the banks.

31 Stage 4 block diagram (next animation step). Blocks: input file memory banks (bank 0, 1, 2), comparator, tentative next address, addresses alignment, counter. Signals: clk, offset, TAG, Offset_valid, Offset_tag, Tentative_tag, Tentative_taken, Comprison_valid, Compare_success, Compare_success_P, Item_length_p, Older_byte_P, Continue.

32 Problems & solutions

33 Problem (1): In stage 4, we first implemented the counter of successful comparisons inside the comparator, which is built as an asynchronous process. It passed simulation but was not synthesizable.

34 Solution (1): We changed the architecture of the units so that the counter is implemented in a synchronous unit. It receives a signal from the asynchronous comparator indicating whether the comparison was successful and responds accordingly.

35 Problem (2): In stage 4, to compare the current three bytes in the pipe with three bytes from the RAM, we need to read three consecutive bytes from different addresses within one clock period.

36 Solution (2): We split the single memory into three RAM banks that hold consecutive addresses, so three consecutive bytes can be read by taking one byte from each bank.
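A Python sketch of the three-bank arrangement, assuming byte number p of the input file is stored in bank p % 3 at row p // 3. Reading three consecutive bytes then touches every bank exactly once. The class name `BankedMemory` and the row computation are assumptions for illustration.

```python
class BankedMemory:
    """Input file memory split into three banks of consecutive bytes."""

    def __init__(self, rows):
        self.banks = [[0] * rows for _ in range(3)]

    def write(self, position, byte):
        self.banks[position % 3][position // 3] = byte

    def read_banks(self, position):
        """Read the three bytes starting at position, one per bank.

        The result is in bank order (bank 0, bank 1, bank 2), which is not
        necessarily the order of the bytes in the file.
        """
        row, tag = divmod(position, 3)
        out = []
        for bank in range(3):
            # Banks below the starting bank hold the wrapped-around bytes,
            # which live one row further on.
            bank_row = row + 1 if bank < tag else row
            out.append(self.banks[bank][bank_row])
        return out
```

With the bytes of "ABCDEFGHI" stored, `read_banks(3)` returns the codes of D, E, F in file order, while `read_banks(4)` returns G, E, F: the right three bytes, but rotated.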

37 Problem (3): In stage 4, the current pipe bytes arrive at the comparator in their arrival order, but the three bytes read from the banks are not necessarily in the right order.

38 Reading configurations: 1. SAME ADDRESSES

39 Reading configurations: 2. DIFFERENT ADDRESS

40 Reading configurations: 3. DIFFERENT ADDRESS #2

41 Solution (3): We used the TAG, which represents the addresses of the extracted bytes, to determine which extracted byte is compared with which current pipe byte.
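A small Python sketch of tag-based alignment, assuming the tag is the number (0, 1 or 2) of the bank that holds the first of the three bytes, so that a rotation by the tag restores file order before the comparison. The function names `align_by_tag` and `compare_with_pipe` are hypothetical.

```python
def align_by_tag(bank_bytes, tag):
    """Rotate bytes given in bank order (bank 0, 1, 2) into file order."""
    return [bank_bytes[(tag + k) % 3] for k in range(3)]


def compare_with_pipe(bank_bytes, tag, pipe_bytes):
    """Compare the aligned bank bytes with the pipe bytes (older, mid, newer)."""
    return align_by_tag(bank_bytes, tag) == list(pipe_bytes)
```

For example, bank outputs `[G, E, F]` with tag 1 are aligned to `[E, F, G]`. The three tag values correspond to the three reading configurations: tag 0 needs no rotation, while tags 1 and 2 each need a different one.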

42 Problem (4): In stage 4, the RAM banks must receive the address to be read on the next clock before the end of the current clock.

43 Solution (4): We created two units that hold the two possible next addresses (the tentative address unit and the address alignment unit).

44 Conclusions: Writing code for synthesis is different from writing code for simulation. In an asynchronous implementation, all the signals need to be in the sensitivity list. Reset should not pass through any logic. Think hardware when writing VHDL code for synthesis. Keep things simple to achieve more flexibility.

45 Testability (diagram, labelled "2048"): A random input generator, or an input file, supplies the same bytes A, B, C to both the synthesisable hash function block and an unsynthesisable simulation function. The two results are compared with an assert and the outcome is reported to the console.
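The same checking idea can be sketched in Python: two independent models of the hash are driven with identical random bytes and their outputs are compared. This is only an analogy for the VHDL testbench. The shift-and-add model stands in for the hardware block, the count of 2048 random inputs is an assumed reading of the slide's label, and all function names are hypothetical.

```python
import random


def hash_reference(a, b, c):
    """Reference model: direct form of the assumed LZRW3 hash."""
    return ((40543 * ((a << 8) ^ (b << 4) ^ c)) >> 4) & 0xFFF


def hash_shift_add(a, b, c):
    """Second model: the multiplication is built from shifts and adds."""
    key = (a << 8) ^ (b << 4) ^ c
    product = sum(key << bit for bit in range(16) if (40543 >> bit) & 1)
    return (product >> 4) & 0xFFF


def count_mismatches(samples=2048, seed=0):
    """Feed both models the same random bytes and count disagreements."""
    rng = random.Random(seed)
    mismatches = 0
    for _ in range(samples):
        a, b, c = (rng.randrange(256) for _ in range(3))
        if hash_reference(a, b, c) != hash_shift_add(a, b, c):
            mismatches += 1
    return mismatches
```

A run passes when `count_mismatches()` returns 0. Fixing the seed keeps a failing case reproducible.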

46 Methodology (cycle): Stage data flow review. Writing VHDL code. Writing VHDL testbench. Simulation & debugging. Code review and debugging. Synthesis check (synplify). Check RTL view. Check CLK constraints. Commit SVN folders and update data flow if needed. Next stage data flow review.

47 Schedule 1/2
Date: Goals
24/4/2012 – 1/5/2012: Project characterization & algorithm interpretation
2/5/2012: Characterization presentation
2/5/2012 – 16/5/2012: Full characterization of all blocks
17/5/2012 – 1/7/2012: System blocks VHDL design
1/7/2012 – 27/7/2012: Work on project paused for exams
29/7/2012 – 11/11/2012: System blocks VHDL design (cont.); writing a simulation testbench for every unit

48 Schedule 2/2
Date: Goals
12/11/2012: Mid presentation
13/11/2012 – 19/12/2012: System blocks VHDL design (cont.); writing a simulation testbench for every unit
20/1/2012: Part A final, core simulation vs. golden model
21/1/2012 – 15/2/2012: Assemble all units and FPGA synthesis
16/2/2012 – 28/2/2012: GUI implementation
1/3/2012 – 10/3/2012: Final overall tests & debug
11/3/2012 – 31/3/2012: Editing and finishing project portfolio
1/4/2012: Final presentation

