1 Implementation in Hardware of Video Processing Algorithm Performed by: Yony Dekell & Tsion Bublil Supervisor : Mike Sumszyk Semesterial project SPRING.


1 Implementation in Hardware of Video Processing Algorithm. Performed by: Yony Dekell & Tsion Bublil. Supervisor: Mike Sumszyk. Semesterial project, SPRING 2008. High Speed Digital System Lab.

2 Project Goals: Real-time video signal filtering based on the nonlinear diffusion algorithm. Studying the nonlinear diffusion algorithm. Studying the Synplify DSP work environment. Implementing a real-time video processing algorithm on an FPGA.

3 NON LINEAR DIFFUSION It aims at filtering an image. The filtered image is the solution of a nonlinear differential equation. This equation is called the nonlinear diffusion equation: [equation] Since no analytic solution is known, we need to solve it iteratively.

4 ALGORITHM FEATURES It smooths the image without damaging the edges. It is an iterative algorithm: the more iterations, the stronger the effect. It is highly computationally demanding, so real-time implementation is possible only in hardware. [Figure: original image, and results after 15, 40 and 80 iterations]
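The iterative, edge-preserving smoothing described on this slide can be sketched in software as follows. This is a minimal sketch only: the transcript does not preserve the project's actual equation, so the code assumes a Perona-Malik style scheme. The function names, the diffusivity `1 / (1 + (d / k)^2)` and the parameters `beta` and `k` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def diffusion_step(img, beta=0.1, k=0.1):
    """One explicit update step (assumed Perona-Malik form).

    beta is the step size and k the edge threshold; both are illustrative.
    """
    padded = np.pad(img, 1, mode="edge")
    # Differences between each pixel and its four neighbours.
    north = padded[:-2, 1:-1] - img
    south = padded[2:, 1:-1] - img
    west = padded[1:-1, :-2] - img
    east = padded[1:-1, 2:] - img

    def g(d):
        # Edge-stopping diffusivity: close to 1 in flat areas, small across edges.
        return 1.0 / (1.0 + (d / k) ** 2)

    flux = g(north) * north + g(south) * south + g(west) * west + g(east) * east
    return img + beta * flux

def nonlinear_diffusion(img, iterations=15, beta=0.1, k=0.1):
    """Apply the update repeatedly; more iterations give stronger smoothing."""
    out = np.asarray(img, dtype=np.float64)
    for _ in range(iterations):
        out = diffusion_step(out, beta, k)
    return out
```

Each iteration reads every pixel and its four neighbours, which gives a sense of why the slide calls the algorithm computationally demanding for real-time video.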

5 CONTENTS Part 1: Simulation in the Simulink environment. Part 2: Working directly with the FPGA.

6 PART 1 Short reminder of what was done up to the midterm presentation.

7 FROM Simulink to SynplifyDSP We had to change the original Matlab/Simulink design because: 1) We chose not to use any buffer between the DVI connection and the processing of the input. 2) In the Simulink design we use matrices to represent images, but SynplifyDSP can only use vectors.

8 REAL TIME DERIVATION [Figure: matrix derivation and vector derivation, with the true and false results marked]
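One plausible reading of this slide, stated here as an assumption, is that a derivative computed on the pixel stream (a vector) gives a false result wherever the difference crosses from the end of one image row to the start of the next, and that forcing those positions to zero restores the matrix result. The sketch below illustrates that reading; the function names are hypothetical.

```python
import numpy as np

def matrix_dx(img):
    """Horizontal forward difference per row; the last column is set to 0."""
    d = np.zeros_like(img)
    d[:, :-1] = img[:, 1:] - img[:, :-1]
    return d

def stream_dx_naive(stream):
    """Forward difference on the flattened pixel stream, ignoring row ends."""
    d = np.zeros_like(stream)
    d[:-1] = stream[1:] - stream[:-1]
    return d

def stream_dx_masked(stream, width):
    """Same as the naive version, but zeroed at the last pixel of every row."""
    d = stream_dx_naive(stream)
    d[width - 1::width] = 0
    return d

img = np.array([[1, 2, 4, 7],
                [0, 5, 5, 9],
                [3, 3, 8, 2]])
stream = img.ravel()  # pixels in raster order, as they would arrive over DVI
```

In this example the naive stream derivative differs from `matrix_dx(img)` only at the row ends, and the masked version matches it everywhere.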

9 SynplifyDSP DESIGN

11 FIXED POINT PRECISION In Matlab and Simulink we work at full precision, but when working with an FPGA one needs to use fixed-point precision. We checked how many bits each block uses.
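The effect of moving from full precision to fixed point can be explored in software before committing to a word length. The sketch below is a generic signed fixed-point quantizer with saturation. The word and fraction lengths are hypothetical, since the transcript does not state the bit widths chosen for each block.

```python
import numpy as np

def to_fixed_point(x, word_bits=16, frac_bits=8):
    """Quantize x to signed fixed point and return the represented real value.

    word_bits is the total width including sign; frac_bits is the number of
    fractional bits. Values outside the representable range saturate.
    """
    scale = 1 << frac_bits
    lowest = -(1 << (word_bits - 1))
    highest = (1 << (word_bits - 1)) - 1
    q = np.clip(np.round(np.asarray(x, dtype=np.float64) * scale), lowest, highest)
    return q / scale
```

Comparing a block's output with and without `to_fixed_point` applied gives a rough per-block estimate of how many bits are needed.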

12 SynplifyDSP SIMULATION

13 SynplifyDSP SIMULATION

14 SIMULATION RESULTS [Figure: original image and SynplifyDSP result after 30 iterations]

15 Matlab AND SynplifyDSP COMPARISON We measured the error between the Matlab code output and the SynplifyDSP output. For 30 iterations: relative root MSE = 1.9481% per pixel.
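A relative root MSE of this kind can be computed as below. The slide does not give its exact formula, so the normalization used here (root mean squared error divided by the root mean square of the reference image) is an assumption, and the function name is hypothetical.

```python
import numpy as np

def relative_rmse_percent(reference, result):
    """Root mean squared error as a percentage of the reference RMS value."""
    reference = np.asarray(reference, dtype=np.float64)
    result = np.asarray(result, dtype=np.float64)
    rmse = np.sqrt(np.mean((result - reference) ** 2))
    return 100.0 * rmse / np.sqrt(np.mean(reference ** 2))
```

Here `reference` would be the full-precision Matlab output and `result` the fixed-point SynplifyDSP output for the same frame.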

16 ß PARAMETER Let’s simplify the diffusion equation: [equation] Now let’s show how one can get an iterative solution: [equation] We define ß in the following way: [equation] In our implementation this parameter can be changed online!
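The slide's own equations are not preserved in this transcript. As a hedged illustration only, the generic way an iterative solution is obtained from a diffusion equation is an explicit time discretization. The form of the equation, the diffusivity g and the identification of ß with the time step are all assumptions here, not the authors' definitions.

```latex
\frac{\partial I}{\partial t} = \nabla \cdot \bigl( g(|\nabla I|)\, \nabla I \bigr)
\;\Longrightarrow\;
\frac{I^{n+1} - I^{n}}{\Delta t} \approx \nabla \cdot \bigl( g(|\nabla I^{n}|)\, \nabla I^{n} \bigr)
\;\Longrightarrow\;
I^{n+1} = I^{n} + \beta\, \nabla \cdot \bigl( g(|\nabla I^{n}|)\, \nabla I^{n} \bigr),
\qquad \beta := \Delta t
```

Under this reading, a larger ß smooths more per iteration, which would explain why being able to change it online is useful.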

17 [Figure: original image]

18 [Figure: NLD image, ß=0.1, iterations=10]

19 WORK FLOW – Matlab & Simulink STAGE Algorithm design. Simulation and error measurement.

20 WORK FLOW – SynplifyDSP STAGE VHDL code generation.

21 WORK FLOW – SynplifyPRO STAGE Synthesizes the VHDL code into a logic schematic. Creates a VQM file.

22 WORK FLOW – ProcWizard STAGE Builds the VHDL code of the board interface.

23 WORK FLOW – Quartus STAGE Configuration of the interface VHDL code. Links the VHDL interface to the VQM file. Place and route. Creates an RBF file.

24 WORK FLOW – ProcWizard STAGE Loads the FPGA with the RBF file.

25 PART 2 Working with the FPGA

26 PROJECT PROGRESS Memory lack; Frequency problem (pipeline); Simple designs; Checking the card at Gidel; Gidel’s advice; Why does it work; Receiver configuration; Ideal control signal waveforms; Control signals on scope; Maximum iteration minimum frequency; Blanking check.

27 MEMORY LACK We use a ROM block to implement the “pow” function. There isn’t enough ROM to load more than one iteration on the FPGA. High MSE.

28 MEMORY LACK We replaced the ROM with a “DIV” block and 3 “CONVERTER” blocks. This solution gives us a 0.2% MSE for one iteration.

30 FREQUENCY PROBLEM (PIPELINE) Highest frequency: 18 MHz. Pipelining was done at the hardware level, not at the logic level.

31 FREQUENCY PROBLEM (PIPELINE) To implement a correct pipeline we use the SynplifyDSP program. This solution gave us a frequency of 107 MHz.

33 SIMPLE DESIGNS We still had noise coming from the DVI input. To understand the problem we built 2 simple designs: 1. “delay”, 2. “overhead_test”. We got noise.

35 CHECKING THE CARD AT GIDEL Cleaning the card at the lab and switching the DVI cables. Checking the card at Gidel: 1. Automatic card test. 2. Test with a new PSDB. The board and the daughter board worked fine.

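37 DVI CONNECTION [Diagram: the DVI receiver takes the signal from the graphics card and passes 24 data bits, 3 control bits and a clk to the FPGA; the FPGA passes 12 double-data-rate bits (12 MSB, 12 LSB), 3 control bits and a clk to the DVI transmitter, which drives the screen. The receiver and transmitter sit on the DVI PSDB.]

38 INVALID DVI CONNECTION [Diagram: the DVI receiver passes 24 data bits, 3 control bits and a clk from the graphics card to the Synplify DSP VHDL code in the FPGA; a MUX combines the 12 MSB and 12 LSB data bits into the 12 double-data-rate bits sent, with the 3 control bits and the clk, to the DVI transmitter and on to the screen. The receiver and transmitter sit on the DVI PSDB.]

39 VALID DVI CONNECTION [Diagram: the DVI receiver passes 24 data bits, 3 control bits and a clk from the graphics card to the Synplify DSP VHDL code in the FPGA; a DDR block combines the 12 MSB and 12 LSB data bits into the 12 double-data-rate bits; a PLL produces a phased clk, and a second DDR block with inputs ‘1’ and ‘0’ is also shown; the 3 control bits go to the DVI transmitter, which drives the screen. The receiver and transmitter sit on the DVI PSDB.]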
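The data path in these diagrams, where each 24-bit pixel leaves the FPGA as two 12-bit words on a double-data-rate bus, can be modelled in a few lines. This is a behavioural sketch only: the helper names are hypothetical, and sending the LSB half before the MSB half is an assumption, since the transcript does not state which half goes on which clock edge.

```python
def split_pixel(pixel24):
    """Split a 24-bit pixel into its 12 MSB and 12 LSB halves."""
    if not 0 <= pixel24 < (1 << 24):
        raise ValueError("pixel must fit in 24 bits")
    return (pixel24 >> 12) & 0xFFF, pixel24 & 0xFFF

def to_ddr_words(pixels):
    """Emit two 12-bit words per pixel, one per clock edge (LSB first, assumed)."""
    words = []
    for pixel in pixels:
        msb, lsb = split_pixel(pixel)
        words.extend([lsb, msb])
    return words

def from_ddr_words(words):
    """Reassemble 24-bit pixels from consecutive (LSB, MSB) word pairs."""
    return [(msb << 12) | lsb for lsb, msb in zip(words[0::2], words[1::2])]
```

A round trip through `to_ddr_words` and `from_ddr_words` returns the original pixels, which is the property the transmitter side relies on.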

41 WHY DOES IT WORK [Timing diagram: the 12 MSB and 12 LSB pass through a flip-flop and a mux to form the DDR 12-bit data sent to the transmitter and on to the screen. Traces: DDR clk, DDR data out (1# LSB, 1# MSB, 2# LSB, 2# MSB, 3# MSB) and transmitter clk, annotated with the delays Tcd-ff, Tpd-ff, Tcd-mux, Tpd-mux and the transmitter Tsu and Thold windows.]

42 WHY DOES IT WORK [Timing diagram: the same data path with a PLL added. Traces: DDR clk, phased clk, DDR data out (1# LSB, 1# MSB, 2# LSB, 2# MSB, 3# MSB) and transmitter clk, annotated with the delays Tcd-ff, Tpd-ff, Tcd-mux, Tpd-mux and the transmitter Tsu and Thold windows.]

44 DVI CONNECTION [Diagram: the DVI receiver passes 24 data bits, 3 control bits and a clk from the graphics card to the FPGA; the FPGA passes 12 double-data-rate bits, 3 control bits and a clk to the DVI transmitter, which drives the screen.]

46 RECEIVER CONFIGURATION “overhead_test” worked perfectly, but “delay” and “NLD” still had noise. We found that the solution was to configure the receiver differently. [Figure: bad configuration and good configuration, showing the valid data and control signals and the FPGA clk obtained from the receiver]

48 IDEAL CONTROL SIGNAL WAVEFORMS “NLD” works perfectly. We need to check the control signals with a scope.

50 CLOCK SIGNAL [Figure: clock measured at the scope and expected clock]

51 CONTROL SIGNALS AT SCOPE [Figure: hsync, vsync and enable traces, with an excerpt from the 19” TFT LCD SXGA monitor data sheet] This signal is caused when vsync=‘1’.

53 ITERATION AND FREQUENCY TRADEOFF The more we pipelined our design in order to get higher frequencies, the fewer iterations we could load: 7 iterations at 53 MHz, 12 iterations at 24.01 MHz.

55 BLANKING CHECK We built a special design which counts the clock cycles for row, blanking and data.
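The counting design mentioned here can be mimicked in software on a sampled control signal. The sketch below assumes, as an illustration only, that the counts are taken from a data-enable signal sampled once per pixel clock (1 during active data, 0 during blanking). The function names and the returned keys are hypothetical.

```python
def count_runs(enable):
    """Collapse a per-clock sequence of 0/1 samples into (level, length) runs."""
    runs = []
    for bit in enable:
        if runs and runs[-1][0] == bit:
            runs[-1][1] += 1
        else:
            runs.append([bit, 1])
    return [(level, length) for level, length in runs]

def row_timing(enable):
    """Return clock counts for the first active run and the blanking after it.

    The capture should span at least two rows, so that the blanking run being
    measured is complete rather than cut off at the end of the capture.
    """
    runs = count_runs(enable)
    for (level_a, n_a), (level_b, n_b) in zip(runs, runs[1:]):
        if level_a == 1 and level_b == 0:
            return {"data": n_a, "blanking": n_b, "row": n_a + n_b}
    raise ValueError("no complete row found in the samples")
```

For a toy capture such as `[0] * 3 + ([1] * 8 + [0] * 4) * 2`, `row_timing` reports 8 data cycles, 4 blanking cycles and 12 cycles per row.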

56 AND THE BIG BONUS…

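57 JOINING OF FOUR FPGA’S [Diagram: the receiver (REC) on the DVI PSDB sends control signals, data and clk to the 1st FPGA; the 1st, 2nd, 3rd and 4th FPGAs each hold an 11-iteration design (vqm) with a PLL, and control and data signals and a PLL clk pass between them; the output reaches the transmitter (TRNS) on the DVI PSDB through DDR.]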

58 THANKS… Our great supervisor Mike Sumszyk, Lab staff, Michael Yampolsky, Gadi Tuchman

59 Our God in the sky. We are happy to invite you to our demonstration at the lab.

60 APPENDIX

61 Image processing [Figure: Beltrami smoothing and Gaussian smoothing]

62 Comparison between SynplifyDSP and direct VHDL implementation. Pros: The SynplifyDSP tool plugs into the familiar Simulink environment. The development is fast. Cons: It is hard to obtain an optimal implementation (non-optimal critical path). The generated VHDL code is hard to understand, and it is therefore difficult to make changes.

63 DVI The Digital Visual Interface (DVI) is a video interface standard designed to maximize the visual quality of digital display devices.

64 Simulink design

65 Simulink design

66 SynplifyDSP – VHDL code

67 Synplify Pro

68 Synplify Pro

69 ProcWizard + Quartus In ProcWizard we create the interface between the FPGA and the daughter board DVI port. Quartus performs the place and route according to the ProcWizard interface and the SynplifyPRO node-level netlist.

70 Project stages: Simulink design of an existing Matlab code. Adaptation of the Simulink design to SynplifyDSP components and constraints. Synthesis of the VHDL code produced by SynplifyDSP using SynplifyPro. Integration of the above RTL component within the Gidel card architecture using Quartus II and ProcWizard. Place and route using Quartus II. Loading the RBF file to Gidel’s Procstar II card using ProcWizard.

