1 Implementation in Hardware of Video Processing Algorithm Performed by: Yony Dekell & Tsion Bublil Supervisor : Mike Sumszyk SPRING 2008 High Speed Digital.


1 Implementation in Hardware of Video Processing Algorithm. Performed by: Yony Dekell & Tsion Bublil. Supervisor: Mike Sumszyk. Spring 2008. High Speed Digital System Lab.

2 Project Goals: Real-time video signal filtering based on a nonlinear diffusion algorithm. Studying the nonlinear diffusion algorithm. Studying the Synplify DSP work environment. Implementing a real-time video processing algorithm on an FPGA.

3 Nonlinear Diffusion Filtering: Nonlinear diffusion is an iterative algorithm that smooths the picture locally while preserving its edges. The figure shows three steps of the iterative process. [Figure panels: Original image, Step one, Step two, Step three]
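The slide does not give the diffusion equation, so the following is only a minimal sketch of one common form of nonlinear diffusion. The diffusivity function `g = 1 / (1 + (d/k)^2)`, the parameters `k` and `step`, and all function names are assumptions for illustration, not the authors' implementation.

```python
def diffusivity(difference, k=10.0):
    # Assumed edge-stopping function: close to 1 for small differences
    # (flat regions are smoothed), close to 0 for large ones (edges are kept).
    return 1.0 / (1.0 + (difference / k) ** 2)


def diffusion_step(image, k=10.0, step=0.2):
    """One explicit iteration on a 2-D list of floats (borders are replicated)."""
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            centre = image[r][c]
            update = 0.0
            # Visit the four direct neighbours, clamping at the image border.
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr = min(max(r + dr, 0), rows - 1)
                nc = min(max(c + dc, 0), cols - 1)
                diff = image[nr][nc] - centre
                update += diffusivity(abs(diff), k) * diff
            result[r][c] = centre + step * update
    return result


def nonlinear_diffusion(image, iterations=3, k=10.0, step=0.2):
    # Repeat the step; each pass corresponds to one "step" panel in the figure.
    for _ in range(iterations):
        image = diffusion_step(image, k, step)
    return image
```

With these assumed parameters, a small difference such as 10 versus 14 shrinks after one iteration, while a large jump such as 0 versus 200 is left almost unchanged.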

4 Project stages: 1) Simulink design based on existing Matlab code. 2) Adaptation of the Simulink design to SynplifyDSP components and constraints. 3) Synthesis of the VHDL code produced by SynplifyDSP, using Synplify Pro. 4) Integration of the resulting RTL component within the Gidel card architecture, using Quartus II and ProcWizard. 5) Place and route using Quartus II. 6) Loading the RBF file onto Gidel's Procstar II card using ProcWizard.

5 Comparison between SynplifyDSP and direct VHDL implementation. Pros: The SynplifyDSP tool plugs into the familiar Simulink environment. Development is fast. Cons: It is hard to obtain an optimal implementation (the critical path is not optimal). The generated VHDL code is hard to understand, which makes changes difficult.

6 Simulink design

8 From Simulink to SynplifyDSP. We had to change our design because: 1) We chose not to use any buffer between the DVI connection and the processing of the input. 2) In the Simulink design we use matrices to represent images, but SynplifyDSP can only use vectors.

9 Image representations. [Figure panels: Image as matrix, Image as vector]

10 Computing the derivative. [Figure panels: Matrix derivative, Vector derivative; labels: true result, false result]
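The figure for this slide is not preserved, so the following is a hedged sketch of the kind of problem the matrix-to-vector change can cause. When an image is flattened row by row, a plain neighbour difference also subtracts the last pixel of one row from the first pixel of the next, which gives a value that has no counterpart in the matrix form. The function names and the choice of writing zero at the end of each row are assumptions for illustration.

```python
def row_derivative_matrix(image):
    # Horizontal difference inside each row; the last column is set to 0.
    return [[row[c + 1] - row[c] for c in range(len(row) - 1)] + [0]
            for row in image]


def flatten(image):
    # Row-by-row serialization, as pixels arrive on a video stream.
    return [pixel for row in image for pixel in row]


def naive_vector_derivative(vector):
    # Difference between consecutive samples, ignoring row boundaries.
    return [vector[i + 1] - vector[i] for i in range(len(vector) - 1)] + [0]


def masked_vector_derivative(vector, width):
    # Same difference, but forced to 0 at the last pixel of every row.
    result = []
    for i in range(len(vector)):
        last_in_row = (i % width) == width - 1
        result.append(0 if last_in_row else vector[i + 1] - vector[i])
    return result


image = [[1, 2, 3],
         [7, 8, 9]]
print(row_derivative_matrix(image))                 # [[1, 1, 0], [1, 1, 0]]
print(naive_vector_derivative(flatten(image)))      # [1, 1, 4, 1, 1, 0]
print(masked_vector_derivative(flatten(image), 3))  # [1, 1, 0, 1, 1, 0]
```

The `4` in the naive result comes from `7 - 3`, a difference across two rows. Masking by the row width restores the matrix result.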

11 SynplifyDSP design

12 SynplifyDSP design

13 ROM component in SynplifyDSP design. In SynplifyDSP we cannot implement the required mathematical expression directly (the expression is not reproduced in this transcript). To overcome this problem we use ROM components that function as look-up tables (LUTs). The ROM is loaded by creating an array of precomputed values. For the LOG function, SynplifyDSP uses a LUT automatically.
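As an illustration of the ROM-as-LUT idea, the sketch below fills an array with precomputed function values and then replaces the calculation with an address lookup. The expression used on the slide is not available, so `math.log` stands in for it. The ROM depth, the input range and the output scaling are hypothetical values chosen for the example, not the project's settings.

```python
import math

ADDRESS_BITS = 8    # hypothetical ROM depth: 2**8 = 256 entries
FRACTION_BITS = 12  # hypothetical number of fraction bits per stored word


def build_rom(func, x_min, x_max):
    """Precompute func over [x_min, x_max] as fixed-point integers."""
    depth = 1 << ADDRESS_BITS
    step = (x_max - x_min) / (depth - 1)
    scale = 1 << FRACTION_BITS
    return [round(func(x_min + address * step) * scale)
            for address in range(depth)]


def rom_lookup(rom, x, x_min, x_max):
    """Map x to the nearest ROM address and return the stored value."""
    depth = len(rom)
    address = round((x - x_min) / (x_max - x_min) * (depth - 1))
    address = min(max(address, 0), depth - 1)  # clamp out-of-range inputs
    return rom[address] / (1 << FRACTION_BITS)


# Stand-in function: natural log sampled at the integers 1..256.
LOG_ROM = build_rom(math.log, 1.0, 256.0)
approx = rom_lookup(LOG_ROM, 128.0, 1.0, 256.0)
print(approx, math.log(128.0))
```

In hardware the array contents become the ROM initialization data and the lookup is a single memory read, so the cost of the function moves from logic to memory.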

14 Fixed point precision. In Matlab and Simulink we work at full (floating-point) precision, but when we implement the above design on an FPGA we have to work with fixed-point precision. Hence we need to estimate how many bits to use per signal in order to keep the error acceptable. It appears that using 12 bits for the fraction of each signal provides satisfactory precision.
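The effect of the fraction width can be sketched as follows. The helper names are hypothetical, and round-to-nearest is an assumption, since the slide does not say which rounding mode the design uses. With round-to-nearest, the error of a single value is bounded by half of the least significant bit, which is 2^-13 for 12 fraction bits.

```python
def to_fixed(value, fraction_bits=12):
    # Scale by 2**fraction_bits and round to the nearest integer.
    return round(value * (1 << fraction_bits))


def to_float(fixed, fraction_bits=12):
    # Undo the scaling to see which real value the integer represents.
    return fixed / (1 << fraction_bits)


def quantize(value, fraction_bits=12):
    return to_float(to_fixed(value, fraction_bits), fraction_bits)


def max_quantization_error(values, fraction_bits):
    # Worst-case absolute error over a set of sample values.
    return max(abs(v - quantize(v, fraction_bits)) for v in values)


samples = [i / 1000 for i in range(1000)]
for bits in (4, 8, 12):
    print(bits, max_quantization_error(samples, bits))
```

Sweeping `bits` in this way is one simple method of choosing a width: increase it until the measured error is acceptable.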

15 Matlab and Synplify comparison. We measure the error between the output of the Matlab code and the output of SynplifyDSP. For 1 iteration: relative root MSE = 1%. [Figure panels: Matlab result, Synplify result]
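The slide does not define "relative root MSE". One common definition, assumed here, is the root of the summed squared error divided by the summed squared reference values. The function name and the flat pixel lists are illustrative choices.

```python
import math


def relative_root_mse(reference, test):
    """sqrt(sum((ref - test)^2) / sum(ref^2)) over two equal-length pixel lists."""
    squared_error = sum((r - t) ** 2 for r, t in zip(reference, test))
    squared_reference = sum(r ** 2 for r in reference)
    return math.sqrt(squared_error / squared_reference)


# Toy data: every test pixel is off by 1 on a reference level of 100.
reference_pixels = [100.0, 100.0, 100.0, 100.0]
test_pixels = [101.0, 101.0, 101.0, 101.0]
print(relative_root_mse(reference_pixels, test_pixels))
```

For this toy data the ratio is 0.01, that is 1%, the same order as the figure quoted for one iteration.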

16 SynplifyDSP – VHDL code

17 Synplify Pro

18 Synplify Pro

19 Synplify Pro: Performance Summary
Worst slack in design: 13.447 ns

Starting Clock   Requested Frequency   Estimated Frequency   Requested Period [ns]   Estimated Period [ns]   Slack [ns]
clk              44.0 MHz              107.8 MHz             22.727                  9.280                   13.447

Requested Frequency: the minimum frequency we want to achieve.
Estimated Frequency: the frequency the current design can run at.
Requested Period: the maximum clock period we can accept for a single cycle.
Estimated Period: the single-cycle time of the current design.
Slack: the spare time left in a single cycle (requested period minus estimated period). A negative value indicates that the timing constraints could not be met.
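The numbers in the report are related by simple arithmetic, sketched below. The function names are illustrative; the values are those from the report above.

```python
def period_ns(frequency_mhz):
    # A frequency in MHz corresponds to a period of 1000 / f nanoseconds.
    return 1000.0 / frequency_mhz


def frequency_mhz(period):
    # Inverse conversion: period in nanoseconds to frequency in MHz.
    return 1000.0 / period


def slack_ns(requested_mhz, estimated_period_ns):
    # Slack is the requested period minus the estimated period.
    return period_ns(requested_mhz) - estimated_period_ns


print(round(period_ns(44.0), 3))        # requested period
print(round(frequency_mhz(9.280), 1))   # estimated frequency
print(round(slack_ns(44.0, 9.280), 3))  # slack
```

A positive slack means the design could be clocked faster than requested, or that the synthesis tool has room to trade speed for area.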

20 ProcWizard + Quartus. In ProcWizard we create the interface between the FPGA and the DVI port on the daughter board. Quartus performs the place and route according to the ProcWizard interface and the node-level netlist produced by Synplify Pro.

21 ProcWizard

22 Block Diagram. [Figure labels: Video In (DVD), DVI Connector, DVI Receiver, DVI Daughter Board, Procstar II Board, Top Level Design, DVI Transmitter, DVI Connector, Video Out (Computer Screen). Signals: CLK, I2C, Data, Pixel Data, Clock, VSYNC, HSYNC]

23 Rates & Frequencies. The DVI connection provides one pixel (24 bits) per clock. The DVI frame rate is 60 frames per second. The minimum clock frequency of the DVI standard is 25.175 MHz. Our goal was 43 MHz (for 800×600). Achieved frequency: 107.8 MHz. We achieved our goal by pipelining the design. The bit rate is 43 M × 24 bit ≈ 1 Gbit/s.
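The rate figures on this slide can be checked with a few lines of arithmetic. The function names are illustrative. The active-pixel rate is included only for comparison: a real pixel clock is higher than this value because each frame also contains blanking intervals.

```python
def bit_rate_bits_per_s(pixel_clock_hz, bits_per_pixel=24):
    # One pixel is transferred per clock, so bit rate = clock * pixel width.
    return pixel_clock_hz * bits_per_pixel


def active_pixel_rate(width, height, frames_per_s):
    # Visible pixels per second only; blanking is not counted.
    return width * height * frames_per_s


def clock_headroom(achieved_mhz, required_mhz):
    # How many times faster the design runs than required.
    return achieved_mhz / required_mhz


print(bit_rate_bits_per_s(43e6) / 1e9)    # Gbit/s at the 43 MHz target
print(active_pixel_rate(800, 600, 60))    # visible pixels per second
print(round(clock_headroom(107.8, 43.0), 2))
```

At the 43 MHz target the stream carries 1.032 Gbit/s, and the achieved 107.8 MHz leaves roughly a factor of 2.5 in clock headroom.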

24 Memory. For 10 iterations we use ten 55 KB ROMs, three 0.4 KB log ROMs and three 8 KB ROMs. Total ROM size = 3 × 0.4 KB + 3 × 8 KB + 10 × 55 KB ≈ 575 KB.
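The ROM budget is a sum of count times size over the three ROM types. A small sketch, with an illustrative data layout, makes it easy to see how the total scales with the number of iterations. With the sizes as listed on the slide, the sum comes to 575.2 KB.

```python
# (description, count, size in KB), taken from the slide.
ROM_BUDGET = [
    ("log ROM", 3, 0.4),
    ("8 KB ROM", 3, 8.0),
    ("55 KB ROM, one per iteration", 10, 55.0),
]


def total_rom_kb(budget):
    # Sum of count * size over every ROM type.
    return sum(count * size_kb for _, count, size_kb in budget)


print(total_rom_kb(ROM_BUDGET))
```

The ten per-iteration ROMs account for 550 KB of the total, so the memory cost grows almost linearly with the iteration count.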

25 Time table

