
Video Coding Using Spatially Varying Transform Cixun Zhang, Kermal Ugur, Jani Lainema, Antti Hallapuro and Moncef IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 21, NO. 2, FEBRUARY 2011

Outline – Introduction – SVT (spatially varying transform): selection of SVT block size; selection and coding of candidate LPs; filtering of SVT block boundaries – Implementing SVT in H.264/AVC – FSVT – Experimental Results – Conclusion

Introduction Why SVT? Some drawbacks of H.264/AVC – Most standards do not align the underlying transform with possible edge locations. – Directional DCTs [4] were proposed to improve efficiency for directional edges, but they are not efficient for vertical, horizontal, and nondirectional edges. – Coding the entire prediction error signal may not be the best choice in terms of the RD tradeoff, as SKIP mode illustrates. [4] B. Zeng and J. Fu, “Directional discrete cosine transforms: A new framework for image coding,” IEEE Trans. Circuits, Syst. Video Technol., vol. 18, no. 3, pp. 305–313, Mar. 2008.

Rate-Distortion – The classical method of making encoding decisions is for the video encoder to choose the result that yields the highest-quality output image. However, that choice might require more bits while giving comparatively little quality benefit. One common example is motion estimation, in particular quarter-pixel-precision motion estimation. Adding the extra precision to the motion of a block might increase quality, but in some cases that extra quality isn't worth the extra bits needed to encode the motion vector to a higher precision.

Introduction (cont.) Basic idea of SVT: – Do not restrict transform coding to regular block boundaries. – Instead, select and code the best portion of the prediction error to improve coding efficiency in terms of the RD tradeoff. – SVT can be considered a special SKIP mode in which part of the macroblock is skipped (not coded into the bitstream).

Introduction (cont.) Shifting the transform has been used in denoising [6]–[9], often as post-processing (e.g., in-loop filter). It has a bad effect when applied at boundaries or to a small area (e.g., a macroblock). [6] A. Nosratinia, “Denoising JPEG images by re-application of JPEG,” in Proc. IEEE Workshop MMSP, Dec. 1998, pp. 611–615. [7] R. Samadani, A. Sundararajan, and A. Said, “Deringing and deblocking DCT compression artifacts with efficient shifted transforms,” in Proc. IEEE ICIP, Oct. 2004, pp. 1799–1802. [8] J. Katto, J. Suzuki, S. Itagaki, S. Sakaida, and K. Iguchi, “Denoising intra-coded moving pictures using motion estimation and pixel shift,” in Proc. IEEE ICASSP, Mar. 2008, pp. 1393–1396. [9] O. G. Guleryuz, “Weighted averaging for denoising with overcomplete dictionaries,” IEEE Trans. Image Process., vol. 16, no. 12, pp. 3020–3034, Dec. 2007.

Introduction (cont.) The proposed method does not have the drawbacks mentioned above. The location parameter (LP) is coded in the bitstream so that the decoder can reconstruct the macroblock (MB). Drawback – High encoding complexity due to the brute-force search used to select the best LP. – Solution: FSVT

SVT Transform coding is widely used to decorrelate the prediction error and achieve high compression rates. Drawbacks of traditional transform coding – If the prediction error at fixed locations has a structure that does not suit the underlying transform, many high-frequency transform coefficients are generated (more bits to code). – Notorious visual artifacts (e.g., ringing) may appear when these coefficients are quantized.

SVT (cont.) What’s new in SVT: – Transform coding is not restricted to regular block boundaries (it can be applied to any portion of the prediction error). – The selection is due to the reduction of complexity. This means that the position and shape of the transform block are variable, and this information (shape and position) is signaled to the decoder.

SVT (cont.) Three issues of SVT – Selection of SVT block-size – Selection and coding of candidate LP – Filtering of SVT block boundaries

Selection of SVT Block-Size An M*N SVT is applied to a selected M*N block inside a macroblock (size 16*16), and only this block is transform coded. There are (17-M)*(17-N) possible LPs. Factors in choosing M and N – Larger M and N result in fewer possible LPs. – Larger M and N result in lower distortion but need more bits to code the transform coefficients. – A larger block-size transform is more suitable for flat areas, and a smaller one is more suitable for sharp edges.
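
The (17-M)*(17-N) count follows from sliding an M*N block one pixel at a time inside the 16*16 macroblock. A minimal sketch of that arithmetic is given here; the function name and the `mb_size` parameter are illustrative and do not come from the paper.

```python
def count_location_parameters(m, n, mb_size=16):
    """Count the positions an m-by-n block can take inside a square macroblock.

    Each dimension allows (mb_size - size + 1) offsets, which for a
    16*16 macroblock gives (17 - m) * (17 - n).
    """
    if not (1 <= m <= mb_size and 1 <= n <= mb_size):
        raise ValueError("block must fit inside the macroblock")
    return (mb_size + 1 - m) * (mb_size + 1 - n)


# Block sizes named on the slides (0*0 is SKIP mode and needs no position).
for m, n in [(8, 8), (4, 16), (16, 4)]:
    print(f"{m}*{n}: {count_location_parameters(m, n)} possible LPs")
```

The numbers show the tradeoff stated above: the 8*8 block has far more candidate positions than the 4*16 or 16*4 blocks, so its position costs more to search and to signal.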

Selection of SVT Block-Size (cont.) To facilitate the transform design, M = 2^m and N = 2^n. Four SVT block sizes are used in this chapter: 8*8, 4*16, 16*4, and 0*0 (meaning SKIP mode). The block sizes can be changed for different sequences for better performance.

Selection of SVT Block Size (cont.) As with the well-established variable block-size transform (VBT), variable block-size SVT performs better than fixed block-size SVT. Drawback of using different block sizes: – As the number of SVT block sizes grows, more bits are needed to code the LPs.

Selection of SVT Block Size (cont.) As mentioned before, VBT can be used for SVT. For 8*8 SVT, the transform kernel in H.264 can be used. For 4*16 and 16*4 SVT, the 4*4 transform kernel in H.264 and the 16*16 transform kernel in [14] can be used with the butterfly structure of 8*8. [14] S. Ma and C.-C. Kuo, “High-definition video coding with supermacroblocks,” in Proc. SPIE Vis. Commun. Image Process., vol. 6508, 650816, Jan. 2007, pp. 1–12.

Selection and Coding of Candidate LPs When the SVT block has nonzero transform coefficients, its location needs to be coded and transmitted. The best LP is selected according to rate-distortion optimization (RDO) [15]. [15] T. Wiegand, H. Schwarz, A. Joch, F. Kossentini, and G. J. Sullivan, “Rate-constrained coder control and comparison of video coding standards,” IEEE Trans. Circuits, Syst. Video Technol., vol. 13, no. 7, pp. 688–703, Jul. 2003.
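
RDO in the sense of [15] is commonly written as minimizing a Lagrangian cost J = D + λR over the candidates. The sketch below applies that generic form to LP selection. The function name, the `cost_fn` callback, and the toy distortion and rate numbers are assumptions for illustration; a real encoder would obtain D and R by actually transforming, quantizing, and entropy coding each candidate.

```python
def select_best_lp(candidates, cost_fn, lam):
    """Return (best_lp, best_cost) minimizing J = D + lam * R.

    cost_fn(lp) is a hypothetical callback returning (distortion, rate_bits)
    for coding the SVT block at location parameter lp.
    """
    best_lp, best_cost = None, float("inf")
    for lp in candidates:
        distortion, rate = cost_fn(lp)
        j = distortion + lam * rate  # Lagrangian RD cost
        if j < best_cost:
            best_lp, best_cost = lp, j
    return best_lp, best_cost


# Toy (distortion, rate) values, made up purely for illustration.
toy_costs = {(0, 0): (100.0, 20), (4, 4): (60.0, 30), (8, 8): (55.0, 50)}

print(select_best_lp(toy_costs, toy_costs.get, lam=1.0))
print(select_best_lp(toy_costs, toy_costs.get, lam=5.0))
```

With a small λ the lower-distortion position wins; with a larger λ the cheaper position wins, which is the tradeoff the Rate-Distortion slide describes.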

Selection and Coding of Candidate LPs (cont.)

As mentioned before, a 6-bit fixed-length code is needed to represent the LP index. In the chroma case:

Filtering of SVT Block Boundaries When SVT is used, the deblocking process needs to be adjusted because the selected SVT block may not align with the regular block boundaries. Both the edges of the selected SVT block and the edges of the macroblock may be filtered.

Filtering of SVT Block Boundaries (cont.)

Implementing SVT in H.264/AVC

Implementing SVT in H.264/AVC (cont.) Several key parts of H.264/AVC need to be adjusted. – Macroblock types – Coded block pattern – Entropy coding – Deblocking

Macroblock Type

Coded Block Pattern In experiments, the luma CBP is often equal to 1 in high-fidelity video coding. Based on this observation, the new macroblock modes are set to have luma CBP equal to 1.

Entropy Coding In H.264, CAVLC uses a different coding table depending on the total number of nonzero coefficients. For SVT, a fixed coding table is used. To derive information about the number of nonzero coefficients in each 4*4 luma block, the following two steps are used:

Entropy Coding (cont.) Step 1: If a luma block overlaps with a coded block that has nonzero coefficients in the selected SVT block, mark it as having nonzero coefficients (this is used for deblocking). Step 2: The number of nonzero transform coefficients for each 4*4 block is set empirically. Finally, the total number of nonzero transform coefficients is distributed among the blocks marked as having nonzero coefficients.
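
Step 1 is a rectangle-overlap test against the regular 4*4 luma grid. The sketch below shows only that marking step, under the assumption that the coded blocks with nonzero coefficients are given as (x, y, width, height) rectangles in macroblock coordinates; the function name and this representation are illustrative, and Step 2 is not modeled because its rule is not given on the slide.

```python
def mark_nonzero_luma_blocks(coded_rects, mb_size=16, blk=4):
    """Mark each 4*4 luma block that overlaps any coded rectangle.

    coded_rects: iterable of (x, y, w, h) rectangles (assumed representation)
    for the coded blocks inside the SVT block that have nonzero coefficients.
    Returns a grid marked[row][col] of booleans.
    """
    n = mb_size // blk
    marked = [[False] * n for _ in range(n)]
    for x, y, w, h in coded_rects:
        for by in range(n):
            for bx in range(n):
                x0, y0 = bx * blk, by * blk
                # Two axis-aligned rectangles overlap iff they overlap on both axes.
                if x < x0 + blk and x0 < x + w and y < y0 + blk and y0 < y + h:
                    marked[by][bx] = True
    return marked


# An 8*8 SVT block at offset (2, 2) straddles a 3-by-3 patch of 4*4 blocks.
grid = mark_nonzero_luma_blocks([(2, 2, 8, 8)])
for row in grid:
    print("".join("X" if cell else "." for cell in row))
```

An SVT block that is not aligned to the 4*4 grid touches more luma blocks than an aligned one of the same size, which is why the marking has to be derived rather than read off directly.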

Deblocking As described above in the SVT section (Filtering of SVT Block Boundaries).

FSVT (Fast Algorithms for SVT) The encoding complexity of SVT is high because of the brute-force search in RDO. Typically, transform, quantization, entropy coding, etc. must be performed for each candidate in RDO. The basic idea for reducing the encoding complexity is to reduce the number of LPs tested.

FSVT (cont.) There are two cases: – 1. Macroblock level: skip testing SVT for macroblocks for which SVT is unlikely to be useful (by examining the RD cost). – 2. Block level: select LPs based on the motion difference and use a hierarchical search algorithm to select the best LP.

Macroblock Level Fast Algorithm SVT is applied to a macroblock mode only if a condition on the following RD costs holds: J is the minimum RD cost without SVT coding, and J mode is the RD cost of the current macroblock mode to be tested with SVT.

Macroblock Level Fast Algorithm (cont.) The threshold represents an empirical upper limit on the bitrate reduction.

Block-Level Fast Algorithm 1. Selection of Available Candidate LPs Based on Motion Difference 2. Hierarchical Search Algorithm

Selection of Available Candidate LPs Skip testing a candidate LP if one of the following conditions is true: – 1. The SVT block at that position overlaps with at least two neighboring motion-compensation blocks, and the difference between the motion vectors of these blocks is larger than or equal to a predefined threshold. – 2. The reference frames of these neighboring blocks are different.
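
The two skip conditions can be sketched as a small predicate. Everything below is an assumed representation rather than the paper's implementation: motion-compensation blocks are dictionaries with hypothetical keys `rect`, `mv`, and `ref`, and the motion difference is measured as the larger absolute component difference, which the slide does not specify.

```python
from itertools import combinations


def rects_overlap(a, b):
    """True if two (x, y, w, h) rectangles share at least one pixel."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def skip_candidate_lp(svt_rect, mc_blocks, mv_threshold):
    """Decide whether a candidate LP can be skipped.

    mc_blocks: list of dicts {"rect": (x, y, w, h), "mv": (mvx, mvy), "ref": int}
    (hypothetical layout). The candidate is skipped when the SVT block spans
    two or more blocks whose reference frames differ or whose motion vectors
    differ by at least mv_threshold (assumed metric: max absolute component).
    """
    covered = [b for b in mc_blocks if rects_overlap(svt_rect, b["rect"])]
    for a, b in combinations(covered, 2):
        if a["ref"] != b["ref"]:
            return True
        diff = max(abs(a["mv"][0] - b["mv"][0]), abs(a["mv"][1] - b["mv"][1]))
        if diff >= mv_threshold:
            return True
    return False


# Two 8*16 partitions with different motion; values are illustrative only.
partitions = [
    {"rect": (0, 0, 8, 16), "mv": (0, 0), "ref": 0},
    {"rect": (8, 0, 8, 16), "mv": (4, 0), "ref": 0},
]
print(skip_candidate_lp((4, 4, 8, 8), partitions, mv_threshold=2))
print(skip_candidate_lp((0, 0, 8, 8), partitions, mv_threshold=2))
```

The intuition is that an SVT block straddling partitions with very different motion is unlikely to contain a prediction error that one transform codes well, so its RD cost is not worth computing.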

Hierarchical Search Algorithm Idea: find the best LP at a relatively coarse resolution and refine the result at a finer resolution. Step 1: Find the candidate LP with the lowest RD cost (set 1) and its two neighbors (set 2). Step 2: Find the best zone. A zone is available if and only if all three of its candidate LPs are available. Step 3: Select the best LP from set 1, set 2, and the best zone.
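
The coarse-to-fine idea can be illustrated with a generic two-stage grid search. This is not the three-step set and zone procedure from the slide, whose details are not fully recoverable here; it only shows how a coarse pass followed by local refinement cuts the number of RD cost evaluations. The function name, the step size, and the toy cost function are assumptions.

```python
def coarse_to_fine_search(cost_fn, max_offset=8, coarse_step=4):
    """Two-stage search over offsets (dx, dy) in [0, max_offset]^2.

    Stage 1 evaluates cost_fn on a coarse grid; stage 2 evaluates every
    offset within (coarse_step - 1) of the coarse winner.
    Returns (best_offset, number_of_cost_evaluations).
    """
    coarse = [(x, y)
              for x in range(0, max_offset + 1, coarse_step)
              for y in range(0, max_offset + 1, coarse_step)]
    best = min(coarse, key=cost_fn)

    r = coarse_step - 1
    fine = [(x, y)
            for x in range(max(0, best[0] - r), min(max_offset, best[0] + r) + 1)
            for y in range(max(0, best[1] - r), min(max_offset, best[1] + r) + 1)]
    return min(fine, key=cost_fn), len(coarse) + len(fine)


def toy_cost(lp):
    """Illustrative smooth cost with its minimum at offset (5, 2)."""
    return (lp[0] - 5) ** 2 + (lp[1] - 2) ** 2


best_lp, evaluations = coarse_to_fine_search(toy_cost)
print(best_lp, evaluations)
```

Like any hierarchical search, this can miss the global minimum when the cost surface is not smooth, which is the price paid for testing fewer LPs than the brute-force search.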

Experimental Environment VBSVT and FVBSVT are tested in both HD and lower-resolution video coding. Coding parameters used – High Profile – QP I = 22, 27, 32, 37; QP P = QP I + 1 – CAVLC/CABAC – Frame structure IPPP – MV search range 64/32 pixels for 720p/CIF – RDO in the high-complexity mode

Experimental Environment Intel® Core™2 Quad CPU Q6600 @ 2.40 GHz, 2G. The average bitrate reduction compared to H.264/AVC is measured using the Bjontegaard tool [20]. Two configurations are tested – Low-complexity configuration: the 4*4 transform is not used – High-complexity configuration: codec with full usage of the tools provided in H.264

Experimental Result

Experimental Result (cont.)

Conclusion By varying the position of the transform block and its size, the prediction error is better localized, and coding efficiency is improved. The encoding complexity of SVT is relatively high because of the brute-force search in RDO. To address this, FSVT is proposed, which skips testing SVT for most macroblocks that are not suitable for it.
