NCTU, EE, Vision Lab: Implementation of H.264 Based System on Multi-DSPs Board. 陳奕安, 2008.02.13


2 Outline
• System description: Architecture, MEX Board, TMS320DM642
• Communication interface
• Software development
• Error resilience

3 Architecture
[Figure: system architecture. Sending side (PC 1, MEX Board 1): Capture Frame, H.264 Encode, Send to Network. Receiving side (PC 2, MEX Board 2): Receive from Network, H.264 Decode, Display.]

4 MEX Board
• The MEX board is composed of:
○ 4 TMS320DM642 DSPs, with their memory, for data stream compression (video/audio)
○ 2 FPGAs for a flexible architecture
○ 8 SAA7113H video chips (ADC)
○ 4 CS4221 stereo audio chips (ADC)

5 MEX Board
[Figure: block diagram of the MEX board, showing the 4 DM642 DSPs, the 2 FPGAs, and the video/audio chips [1].]

6 MEX Board Block Diagram
[Figure: block diagram of the MEX board [1].]

7 TMS320DM642
• TMS320DM642
○ Performance: 4000-4800 MIPS
○ Two-level cache: L2 256 KB, L1P 16 KB, L1D 16 KB
○ 3 video ports
○ 8-bit McASP
○ Ethernet MAC
○ 32-bit HPI
○ 66 MHz PCI
○ 64-bit EMIF
[Figure: DM642 DSP block diagram [2].]

8 TMS320DM642
• Peripherals that will be used:
○ Enhanced DMA (EDMA)
○ Video ports (VP0 to VP2)
○ Inter-integrated circuit (I2C) bus
○ External memory interface (EMIF)
○ Ethernet media access controller (EMAC)
○ Management data input/output (MDIO)

9 Outline
• System description
• Communication interface: Host/MEX communication, Video capturing/displaying, Network transmit
• Software development
• Error resilience

10 Host/MEX Communication
[Figure: data transfer from the 4 DSPs (SDRAM) to PCI [1], drawn as two flowcharts, one for the DSP (MEX) side and one for the PCI (PC) side.]
• DSP side, main steps: DSP started (fill memory); initialize transfer; DSP-to-PCI transfer request; start transfer; transfer finished
• DSP side, actions: set DSP FIFO direction; set FIFO Full Flag value; DSP FIFO is reset; start EDMA; unreset DSP1 FIFO; clear PCI interrupt
• PCI side, main steps: PCI started (wait for interrupt); initialize transfer; PCI-to-DSP start transfer request; wait for transfer finished; transfer finished
• PCI side, actions: set transfer size; set PCI FIFO direction; select DSP data sources; set transfer destination address; start PCI FIFO; clear DSP interrupt

11 Video Capture
[Figure: capture path. Camera, then the SAA7113H video chip (ADC) on the MEX board, then the DM642 video ports (VP0, VP1, VP2), then raw data moved by DMA. An I2C bus is also shown.]
• Camera output: NTSC (analog, 525 lines per frame, 30 frames per second) or PAL (analog, 625 lines per frame, 25 frames per second)
• Video chip output: ITU656 (digital, for PAL or NTSC)

12 TMS320DM642 Video Port
[Figure: TMS320DM642 video port [3].]

13 Network Architecture
[Figure: network architecture. On each of MEX Board 1 and MEX Board 2, the DM642 EMAC and MDIO connect to an LXT971ALC PHY; the two boards are linked through RJ45.]

14 TMS320DM642 EMAC
• DM642 networking using EMAC and MDIO
[Figure: DM642 networking [4].]

15 Outline
• System description
• Communication interface
• Software development: H.264 codec, Optimization, Parallelization, Memory issue
• Error resilience

16 H.264 Encoder Block Diagram
[Figure: H.264 encoder block diagram.]

17 H.264 Decoder Block Diagram
[Figure: H.264 decoder block diagram.]

18 Optimization on Single Chip
Realization and Optimization of DSP Based H.264 Encoder [5]
• Optimization of H.264 on the DSP platform
○ Code porting and primary optimization
○ Optimization of the key modules
○ Using the TI C64x IMGLIB
• Data scheduling and storage allocation
○ Data scheduling with EDMA
○ Storage allocation (code section / data section)

19 Parallelization on Chips
• One GOP per DSP
○ Each DSP handles one group of pictures (GOP), e.g. IPPP… or IBBPBB…
○ There are no dependencies between GOPs.
• One frame or one macroblock per DSP
○ Each DSP handles one frame or one macroblock.
○ There are dependencies between frames and between macroblocks.
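
The GOP-per-DSP option can be sketched as a simple round-robin assignment. This is a minimal illustration, not the scheduler used in the project: the function names, the GOP length, and the default of 4 DSPs (taken from the MEX board description) are assumptions made for the example.

```python
from typing import List


def split_into_gops(num_frames: int, gop_size: int) -> List[range]:
    """Cut a sequence of frame indices into consecutive GOPs."""
    return [
        range(start, min(start + gop_size, num_frames))
        for start in range(0, num_frames, gop_size)
    ]


def assign_gops(num_frames: int, gop_size: int, num_dsps: int = 4) -> List[List[range]]:
    """Hand out GOPs to DSPs in round-robin order.

    Because GOPs do not depend on each other, each DSP can encode
    its list without waiting for data from the other DSPs.
    """
    schedule: List[List[range]] = [[] for _ in range(num_dsps)]
    for index, gop in enumerate(split_into_gops(num_frames, gop_size)):
        schedule[index % num_dsps].append(gop)
    return schedule


if __name__ == "__main__":
    for dsp, gops in enumerate(assign_gops(num_frames=45, gop_size=10)):
        print(f"DSP {dsp}: {[(g.start, g.stop - 1) for g in gops]}")
```

The price of this simplicity is latency and memory: every DSP has to buffer a whole GOP before its output can be placed in display order.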

20 Macroblock Dependencies
• Data dependencies induced by inter prediction: the motion vector MV_cur is predicted from MV_A to MV_D.
[Figure: data dependencies induced by MV prediction, showing MV_A, MV_B, MV_C, MV_D, and MV_cur in the current frame and the reference frame [6].]
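
To make the MV dependency concrete, here is a simplified sketch of H.264-style median motion vector prediction. It assumes A is the left neighbour, B the upper, C the upper-right, and D the upper-left, with D standing in for C when C is unavailable. It deliberately ignores the reference-index rules and the special cases for 16x8 and 8x16 partitions, so it is an illustration rather than a conforming implementation; the function names are made up for the example.

```python
from typing import Optional, Tuple

MV = Tuple[int, int]


def median3(a: int, b: int, c: int) -> int:
    """Middle value of three integers."""
    return sorted((a, b, c))[1]


def predict_mv(mv_a: Optional[MV], mv_b: Optional[MV],
               mv_c: Optional[MV], mv_d: Optional[MV] = None) -> MV:
    """Component-wise median of the neighbouring motion vectors.

    None marks a neighbour that is not available (outside the picture
    or slice, or not yet coded).
    """
    if mv_c is None:
        mv_c = mv_d                      # D replaces a missing C
    if mv_b is None and mv_c is None and mv_a is not None:
        return mv_a                      # only the left neighbour exists
    zero: MV = (0, 0)
    a, b, c = (mv if mv is not None else zero for mv in (mv_a, mv_b, mv_c))
    return (median3(a[0], b[0], c[0]), median3(a[1], b[1], c[1]))


if __name__ == "__main__":
    print(predict_mv((1, 2), (3, -4), (5, 0)))
```

Since the predictor needs the left and upper neighbours, a macroblock cannot start motion vector coding until those neighbours are finished, which is what limits macroblock-level parallelism.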

21 Macroblock Dependencies
• Data dependencies induced by intra prediction: the left, upper-left, upper, and upper-right MBs.
[Figure: data dependencies induced by intra prediction [6].]

22 Macroblock Dependencies
• Data dependencies induced by the deblocking filter: the top 4 rows of pixels and the leftmost 4 columns.
[Figure: data dependencies induced by the deblocking filter [6].]

23 Macroblock Dependencies
• Possible spatial data dependencies for a macroblock
[Figure: possible spatial data dependencies for the current MB. The neighbouring MBs are labelled with the tools that need them: intra prediction, MV prediction, and the deblocking filter [6].]

24 Macroblock Dependencies
• Macroblock dependencies:
○ Data dependencies between frames
○ Data dependencies between MB rows in the same frame
○ Data dependencies within the same MB row

25 Wave-front Parallelization
• Partition for MB regions
[Figure: wave-front of macroblock region partitions [7].]
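
A wave-front order can be derived directly from the spatial dependencies listed on the earlier slides (left, upper-left, upper, and upper-right neighbours). The sketch below works at single-macroblock granularity, which is an assumption made for brevity; the scheme in [7] partitions the frame into MB regions and may differ in detail. The function name is hypothetical.

```python
from typing import Dict, List, Tuple


def wavefront_schedule(mb_cols: int, mb_rows: int) -> List[List[Tuple[int, int]]]:
    """Group macroblocks (x, y) into waves that can run in parallel.

    MB (x, y) needs (x-1, y), (x-1, y-1), (x, y-1) and (x+1, y-1).
    Giving it the step x + 2*y places every one of those neighbours
    in a strictly earlier step.
    """
    waves: Dict[int, List[Tuple[int, int]]] = {}
    for y in range(mb_rows):
        for x in range(mb_cols):
            waves.setdefault(x + 2 * y, []).append((x, y))
    return [waves[step] for step in sorted(waves)]


if __name__ == "__main__":
    for step, wave in enumerate(wavefront_schedule(mb_cols=6, mb_rows=4)):
        print(step, wave)
```

The number of macroblocks per wave grows and then shrinks again, so the DSPs are not all busy at the start and end of a frame; coarser regions trade some of that idle time for less synchronization.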

26 Wave-front Parallelization
• Partition for frames
[Figure: wave-front of frame partitions [7].]

27 Memory Issue
• Limited memory of the DM642
• Use a memory buffer to reduce memory accesses
[Figure: two-level cache architecture of the DM642. DSP core; L1P cache, direct mapped, 16 Kbytes total; L1D cache, 2-way set associative, 16 Kbytes total; L2 cache/memory, 256 Kbytes total; EDMA controller; peripherals.]

28 Memory Issue
• Memory hierarchy for inter prediction
[Figure: memory hierarchy [8].]

29 Memory Issue
• Slice memory buffer for intra prediction and the deblocking filter
[Figure: slice memory [9].]

30 Outline
• System description
• Communication interface
• Software development
• Error resilience: Error-resilience tools in H.264/AVC, Error resilience of the JM source code

31 Error Resilience Tools in H.264/AVC
• Redundant slices (RSs) [10]
○ For a MB, an encoder can place redundant representations of the same MB into the same bit stream.
○ e.g. one slice is coded twice, using different quantization parameters (QPs).
○ If the slice coded with the low QP is available, the decoder discards the RS; otherwise, the decoder reconstructs the RS.
[Figure: Slice A coded with QP1 and Slice A coded with QP2, both sent to the decoder.]
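
The decoder-side rule for redundant slices can be sketched as follows. The `Slice` record and the `choose_slices` function are hypothetical names introduced for this example and do not come from the JM software; the point is only the selection logic: use the primary slice if it arrived, fall back to the redundant copy, and otherwise leave the slice to error concealment.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Slice:
    slice_id: int      # which picture region this slice covers
    redundant: bool    # True for the redundant (coarser QP) copy
    qp: int            # quantization parameter used by the encoder


def choose_slices(received: Iterable[Slice],
                  expected_ids: List[int]) -> Dict[int, Optional[Slice]]:
    """Pick one representation per expected slice.

    None in the result means neither copy arrived, so the region
    has to be concealed.
    """
    primary: Dict[int, Slice] = {}
    backup: Dict[int, Slice] = {}
    for s in received:
        (backup if s.redundant else primary).setdefault(s.slice_id, s)
    return {sid: primary.get(sid, backup.get(sid)) for sid in expected_ids}


if __name__ == "__main__":
    arrived = [Slice(0, False, 24), Slice(0, True, 36), Slice(1, True, 36)]
    for sid, s in choose_slices(arrived, expected_ids=[0, 1, 2]).items():
        print(sid, s)
```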

32 Error Resilience Tools in H.264/AVC
• Parameter sets [10]
○ They include the picture size, entropy coding method, MV resolution, and so on.
○ Sequence parameter set (SPS): contains all information related to the picture sequence between two IDR (Instantaneous Decoding Refresh) pictures.
○ Picture parameter set (PPS): contains all information related to all slices in a picture.
○ e.g. multiple copies of an SPS can be sent to raise the chance that one arrives.
○ e.g. SPSs can be sent out-of-band.

33 Error Resilience Tools in H.264/AVC
• Flexible macroblock ordering (FMO) [10]
○ 7 modes
○ The bit overhead depends strongly on the picture format, the content, and the QP.
○ < 5% penalty at QP = 16; on average 20% at QP = 28.
[Figure: 6 modes of FMO [10].]
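
As one concrete FMO example, the sketch below builds the "dispersed" macroblock-to-slice-group map (slice group map type 1), using the mapping formula as I understand it from the H.264 specification. The slides do not say which FMO mode the project uses, so the choice of this mode, the function name, and the picture size are assumptions for illustration.

```python
from typing import List


def dispersed_slice_group_map(width_in_mbs: int, height_in_mbs: int,
                              num_slice_groups: int) -> List[List[int]]:
    """Slice group index for every macroblock, dispersed pattern.

    For MB address i in raster order:
        group = ((i % W) + (((i // W) * N) // 2)) % N
    where W is the picture width in MBs and N the number of slice groups.
    """
    rows: List[List[int]] = []
    for y in range(height_in_mbs):
        row = []
        for x in range(width_in_mbs):
            i = y * width_in_mbs + x
            group = ((i % width_in_mbs)
                     + (((i // width_in_mbs) * num_slice_groups) // 2)) % num_slice_groups
            row.append(group)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in dispersed_slice_group_map(8, 4, 2):
        print(" ".join(str(g) for g in row))
```

With two slice groups the map is a checkerboard, so when one group is lost every missing macroblock still has correctly received neighbours on all four sides to conceal from.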

34 Error Concealment of H.264/AVC
• Error concealment scheme provided in JM
○ Intra
○ Inter
[Figure: error concealment for macroblocks [11].]

35 Future Work
• Optimize the H.264 codec for real-time operation
• Implement different concealment methods
• Propose corresponding error resilience methods

36 References
[1] VITEC MULTIMEDIA, "MEX User Manual, Revision 1.7."
[2] Texas Instruments, Incorporated, "TMS320C64x DSP Generation Product Bulletin" (sprt236).
[3] Texas Instruments, Incorporated, "TMS320DM64x Video Port to Video Port Communication" (spraaf3).
[4] Texas Instruments, Incorporated, "TMS320C6000 DSP Ethernet Media Access Controller (EMAC)/Management Data Input/Output (MDIO) Module Reference Guide" (spru628a).
[5] Zhe Wei and Canhui Cai, "Realization and Optimization of DSP Based H.264 Encoder," ISCAS 2006, May 2006.
[6] Y. Chen, E. Li, X. Zhou, and S. Ge, "Implementation of H.264 Encoder and Decoder on Personal Computers," Journal of Visual Communication and Image Representation, vol. 17, 2006.
[7] Zhuo Zhao and Ping Liang, "Data partition for wave-front parallelization of H.264 video encoder," 31st IEEE International Conference on Acoustics, Speech, and Signal Processing, 2006.
[8] K. Denolf, De Vleeschouwer, et al., "Memory centric design of an MPEG-4 video encoder," IEEE Trans. CSVT, vol. 15, no. 5, pp. 609-619, May 2005.
[9] Tsu-Ming Liu et al., "A 125μW, Fully Scalable MPEG-2 and H.264/AVC Video Decoder for Mobile Applications," ISSCC Digest of Technical Papers, pp. 402-403, Feb. 2006.
[10] S. Wenger, "H.264/AVC over IP," IEEE Trans. Circuits Syst. Video Technol., vol. 13, pp. 645-656, July 2003.
[11] "Non-normative error concealment algorithms," ITU-T VCEG-N62, Sept. 2001.

37 H.264 Partitions
[Figure: frame partitions and macroblock partitions, with 16x16, 8x8, and 4x4 blocks.]

38 H.264 Intra-Mode Decision
[Figure: H.264 intra-mode decision.]

39 H.264 Intra-Mode Decision
[Figure: intra prediction examples: 16x16 plane mode and 4x4 horizontal mode.]

40 Fast integer & fractional pixel motion estimation
Integer pixel search scheme
[Figure: search pattern on a grid with both axes running from -15 to 15.]
Assume the guessed starting point is (0,0).
• Step 1. Unsymmetrical-cross search
• Step 2-1. Local full search around the starting point
• Step 2-2. Uneven multi-hexagon search
• Step 3-1. Extended hexagon-based search. The search continues until the point with the minimal matching error is the center of the new hexagon.
• Step 3-2. Center-biased search
The scheme covers both small and large motions. The search point that gives the smallest matching error in one step is the starting point of the next step. Around 130 points are searched in this algorithm, so the saving over a full search is (33x33 - 130)/(33x33) ≈ 88%, i.e. roughly 90%. If 3 starting points are tried, the saving is around 64%.
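
The savings quoted on this slide are plain arithmetic and can be checked with a few lines. The search range of ±16 is an assumption inferred from the 33x33 window; the figure of about 130 points per starting point is taken from the slide, and the function name is made up for the example.

```python
def search_saving(points_searched: int = 130,
                  starting_points: int = 1,
                  search_range: int = 16) -> float:
    """Fraction of full-search candidates that the fast search skips.

    A full search over +/-search_range visits (2*search_range + 1)**2
    positions; the fast search visits about points_searched positions
    for each starting point it tries.
    """
    full = (2 * search_range + 1) ** 2          # 33 * 33 = 1089
    visited = points_searched * starting_points
    return (full - visited) / full


if __name__ == "__main__":
    print(f"1 starting point:  {search_saving():.1%}")
    print(f"3 starting points: {search_saving(starting_points=3):.1%}")
```

The exact value for one starting point is about 88%, which the slide rounds to 90%; with three starting points it is about 64%.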

41 Fast integer & fractional pixel motion estimation
Fractional pixel search scheme
Start from the best matching integer point found by the integer motion search.
1. Search its 1/2-pixel neighbors.
2. Search its 1/4-pixel neighbors.
3. Search its 1/8-pixel neighbors.
The optimal point of each step is the search center of the next step.
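
The three refinement steps can be sketched as a coarse-to-fine search. Positions are stored in 1/8-pixel units so that all arithmetic stays in integers, and the matching error is passed in as a function. Both the `cost` callback and the function name are assumptions for the example; a real encoder would compute the cost from interpolated reference pixels.

```python
from typing import Callable, Tuple

Point = Tuple[int, int]   # motion vector in 1/8-pixel units


def fractional_refine(best_integer: Point,
                      cost: Callable[[Point], float]) -> Point:
    """Refine an integer-pel result at 1/2, 1/4, then 1/8 pel.

    best_integer is given in whole pixels. At every step the centre
    and its 8 neighbours at the current spacing are compared, and the
    winner becomes the centre of the next, finer step.
    """
    center = (best_integer[0] * 8, best_integer[1] * 8)
    for step in (4, 2, 1):            # 1/2, 1/4, 1/8 pixel in 1/8 units
        candidates = [
            (center[0] + dx * step, center[1] + dy * step)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)
        ]
        center = min(candidates, key=cost)
    return center


if __name__ == "__main__":
    # Toy cost with its minimum at (15/8, -7/8) pixel.
    toy_cost = lambda p: (p[0] - 15) ** 2 + (p[1] + 7) ** 2
    mv = fractional_refine((1, -1), toy_cost)
    print(mv, (mv[0] / 8, mv[1] / 8))
```

Each step evaluates at most 9 positions, so the whole refinement costs at most 27 evaluations instead of testing every 1/8-pixel position around the integer result.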

