MOTION ESTIMATION IMPLEMENTATION IN VERILOG


1 MOTION ESTIMATION IMPLEMENTATION IN VERILOG
ECE 734 PROJECT SHREYAS SRIVASTAVA

2 OBJECTIVE AND OUTCOMES
Develop a block-based motion estimation implementation. Design criteria: fast enough for real-time HD video with “reasonable power consumption”. Outcome: developed and tested a Verilog implementation [1] of H.264 block-based motion estimation capable of encoding 720p HD video.

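3 PROBLEM STATEMENT For a given block in the current frame, find the best match among the candidate blocks in the reference frame. Matching criterion: sum of absolute differences (SAD), $\mathrm{SAD}(u, v) = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} |C(i, j) - R(i + u, j + v)|$, where $C$ is the current $N \times N$ block, $R$ is the reference frame, and $(u, v)$ is the candidate displacement.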
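The matching problem above can be sketched as a small software reference model. This is a minimal illustration, not the project's Verilog: the function names `sad` and `full_search` are hypothetical, frames are plain lists of lists of pixel values, and ties are broken by keeping the first candidate in raster order, which is an assumption.

```python
def sad(cur, ref, top, left):
    """Sum of absolute differences between block `cur` and the same-sized
    window of `ref` whose top-left corner is at row `top`, column `left`."""
    return sum(
        abs(cur[i][j] - ref[top + i][left + j])
        for i in range(len(cur))
        for j in range(len(cur[0]))
    )


def full_search(cur, ref):
    """Evaluate every window position in `ref` and return
    (best_sad, (top, left)). Ties keep the first position in raster order."""
    n_rows, n_cols = len(cur), len(cur[0])
    best = None
    for top in range(len(ref) - n_rows + 1):
        for left in range(len(ref[0]) - n_cols + 1):
            cost = sad(cur, ref, top, left)
            if best is None or cost < best[0]:
                best = (cost, (top, left))
    return best
```

Calling `full_search(block, search_area)` with a block copied out of the search area returns a SAD of 0 at the position the block was copied from, provided no earlier window is identical.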

4 H.264 BASED MOTION ESTIMATION
Variable macroblock-size motion estimation. Each pair of reference and current 16x16 macroblocks generates 41 motion vectors, one per partition (16 4x4, 8 8x4, 8 4x8, 4 8x8, 2 16x8, 2 8x16 and 1 16x16).
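The count of 41 follows from the H.264 partition sizes, and the larger-partition SADs can be built by summing 4x4 SADs. The sketch below illustrates that bookkeeping in software; `combine_sads` and its data layout are hypothetical names chosen for illustration, and the ordering of results within each list is an assumption rather than the order used in the hardware.

```python
def combine_sads(s4x4):
    """Combine the 16 SADs of the 4x4 sub-blocks of one 16x16 macroblock.

    s4x4[r][c] is the SAD of the 4x4 sub-block in row r, column c (0..3).
    Returns a dict mapping a "widthxheight" label to its list of SADs.
    """
    def region(r0, c0, h, w):
        # Sum the 4x4 SADs covered by a partition of h x w sub-blocks.
        return sum(
            s4x4[r][c]
            for r in range(r0, r0 + h)
            for c in range(c0, c0 + w)
        )

    # (label, height, width), with height and width in 4x4 sub-block units.
    shapes = [
        ("4x4", 1, 1), ("8x4", 1, 2), ("4x8", 2, 1), ("8x8", 2, 2),
        ("16x8", 2, 4), ("8x16", 4, 2), ("16x16", 4, 4),
    ]
    return {
        label: [region(r, c, h, w)
                for r in range(0, 4, h)
                for c in range(0, 4, w)]
        for label, h, w in shapes
    }
```

Summing the list lengths of the returned dict gives 16 + 8 + 8 + 4 + 2 + 2 + 1 = 41 values per candidate position; keeping the minimum of each over all candidates gives the 41 motion vectors.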

5 STEPS TAKEN Estimate the design requirements for supporting H.264-based variable block size motion estimation (VBSME); the design should support 720p HD. Select a suitably fast architecture for the actual SAD computation. Write a Verilog design. Validate the design. Evaluate the design's performance.

6 PERFORMANCE SPECIFICATION
Desired throughput rate for the design: 108,000 macroblocks per second for 720p HD video. [Table: number of macroblocks to be processed per second, from [3]]
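The 108,000 figure can be reproduced with a short calculation. This sketch assumes 720p means 1280x720 pixels at 30 frames per second and 16x16 macroblocks; the function names are hypothetical.

```python
MB_SIZE = 16  # macroblock edge length in pixels


def macroblocks_per_frame(width, height):
    """Number of 16x16 macroblocks covering a frame (edges rounded up)."""
    cols = -(-width // MB_SIZE)   # ceiling division
    rows = -(-height // MB_SIZE)
    return cols * rows


def macroblocks_per_second(width, height, fps):
    """Macroblock throughput needed to keep up with the frame rate."""
    return macroblocks_per_frame(width, height) * fps
```

For 1280x720 this gives 80 * 45 = 3600 macroblocks per frame, and 3600 * 30 = 108,000 macroblocks per second.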

7 POWER SPECIFICATION
Power figures for the motion estimation block alone are hard to obtain, so the power reported below is for the encoder block. These are ballpark figures indicating the range of possible power values.
Product: power; description
Broadcom: 450 fps 720p HD video + audio encoder
Shen et al. [2]: 423 mW; full-search motion estimation
Koziri et al. [4]: 95 mW; low-power 720p motion estimation

8 SELECTED ARCHITECTURE OVERVIEW[1]
Motion estimation engine with a 16*16 PE array: computes 256 absolute differences (SAD terms) per cycle.

9 ARCHITECTURE OVERVIEW
[Block diagram: control unit; AGU; search-area SRAMs SA0 to SA15; 16*16 PE array producing 16 4x4 SADs; adder and comparator producing 41 minimum SADs and 41 MVs]

11 PERFORMANCE Macroblock and search-data pipelining ensures no stalls except when switching rows. Cycles required to process 30 frames of 720p HD video: (30 frames) * (3600 macroblocks) * (1024 cycles/macroblock) + (30 * 48 * (720/16) stall cycles) = 110,656,800 cycles, so the frequency required at 30 fps is about 110.7 MHz. Reported frequency: 115 MHz (TSMC 40 nm). Reported power: 405 mW.
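The cycle budget on this slide can be checked with a few lines of arithmetic. In the sketch below the parameter names are hypothetical, and reading the factor of 48 as stall cycles per macroblock row is an assumption based on the expression on the slide.

```python
def required_cycles_per_second(
    fps=30,
    mbs_per_frame=3600,        # 1280x720 frame in 16x16 macroblocks
    cycles_per_mb=1024,        # one candidate per cycle, 32*32 search range
    stall_cycles_per_row=48,   # assumption: stalls occur once per macroblock row
    mb_rows=720 // 16,
):
    """Clock cycles needed per second to sustain `fps` frames per second."""
    compute = fps * mbs_per_frame * cycles_per_mb
    stalls = fps * stall_cycles_per_row * mb_rows
    return compute + stalls


def required_mhz(**kwargs):
    """Minimum clock frequency in MHz for the same parameters."""
    return required_cycles_per_second(**kwargs) / 1e6
```

With the default values the compute term is 110,592,000 cycles and the stall term is 64,800 cycles, which leaves roughly 4% headroom at the reported 115 MHz.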

12 DESIGN COMPARISON
Parameter: [3] | Original paper [1] | My implementation
# of PE: 16*16
Process: 0.35 um | 0.18 um | 40 nm
Gate count: 106 K | 154 K | 270 K (includes memory)
On-chip memory: 24 Kbits | 60 Kbits | 60 Kbits
Frequency: 66 MHz | 100 MHz | 115 MHz
Throughput (blocks/sec, search range 32*32): 61,218 | 97,560 | 112,304

13 VALIDATION Hierarchical validation: the processing element, the processing array and the control unit were tested individually. A testbench initialized the search-area SRAM, and RTL simulation was used to check that the data-path computations are correct at each cycle.
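One common way to initialize a memory in a Verilog testbench is to load a hex text file with the `$readmemh` system task. The sketch below generates such a file's contents from a 2-D pixel array. It is an illustration only: the source does not say how the SRAM was initialized, and the layout used here (one 8-bit pixel per address, raster order, a single memory rather than 16 banks) and the name `to_readmemh` are assumptions.

```python
def to_readmemh(pixels):
    """Format 8-bit pixels as text loadable by Verilog's $readmemh.

    Assumed layout: raster order, one two-digit hex word per line.
    """
    lines = []
    for row in pixels:
        for p in row:
            if not 0 <= p <= 255:
                raise ValueError(f"pixel {p} does not fit in 8 bits")
            lines.append(f"{p:02x}")
    return "\n".join(lines) + "\n"
```

The same pixel array can then be passed to a software SAD model, so that the values expected from the RTL simulation are computed from exactly the data loaded into the simulated memory.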

14 FUTURE WORK A low-power design could involve faster search algorithms such as diamond search and hexagon-based search.

REFERENCES
[1] Kim et al., A Fast VLSI Architecture for Full-Search Variable Block Size Motion Estimation in MPEG-4 AVC/H.264.
[2] Jun-Fu Shen, Tu-Chih Wang, and Liang-Gee Chen, A Novel Low-Power Full Search Block-Matching Motion-Estimation Design for H.263+, IEEE Transactions on Circuits and Systems for Video Technology, vol. 11, no. 7, July 2001.
[3]
[4] Koziri et al., Novel Low-Power Motion Estimation Design for H.264.

