
EE 5359: MULTIMEDIA PROCESSING PROJECT
PERFORMANCE ANALYSIS OF INTEGER DCT OF DIFFERENT BLOCK SIZES USED IN H.264, AVS CHINA AND WMV9
Guided by: Dr. K. R. Rao
Presented by: Suvinda Mudigere Srikantaiah
UTA ID: 1000646539

Aim and Abstract
Aim: To analyze the performance of the integer DCT with block sizes 8x8, 16x16 and 32x32 as used in H.264, AVS China and WMV9.
Abstract: This project discusses how larger transforms can improve coding performance, especially for high-resolution video. In particular, transforms larger than 4x4 or 8x8, namely 16x16 and 32x32, are proposed because they are better suited to decorrelating high-resolution video signals.

Introduction to IntDCT
- The discrete cosine transform (DCT) has served as a basic element of video coding systems.
- The integer discrete cosine transform (IntDCT) is an integer approximation of the DCT.
- It can be implemented exclusively with integer arithmetic.
- This makes it highly advantageous in cost and speed for hardware implementations [1].

DCT to IntDCT
- The DCT matrix elements are real numbers. For an order-16 DCT, 8 bits are needed to represent them so that image reconstruction errors due to finite-length number representation remain negligible.
- If the transform matrix elements are integers, it may be possible to use fewer bits and still have zero truncation error.
- Moreover, the cosine values are difficult to approximate with fixed-precision integers, which produces rounding errors in practical applications. These rounding errors can accumulate in the computations and destroy the orthogonality of the transform.
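The loss of orthogonality under naive rounding can be checked numerically. The following is a minimal sketch, assuming an order-8 orthonormal DCT-II matrix and an arbitrary scale factor of 32 (the helper names `dct_matrix` and `max_off_diagonal` and the scale factor are illustrative choices, not taken from the slides or from any standard):

```python
import math

def dct_matrix(n):
    """Return the orthonormal DCT-II matrix of order n as a list of rows."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def max_off_diagonal(matrix):
    """Largest absolute dot product between two different rows."""
    worst = 0.0
    for i in range(len(matrix)):
        for j in range(i + 1, len(matrix)):
            dot = sum(a * b for a, b in zip(matrix[i], matrix[j]))
            worst = max(worst, abs(dot))
    return worst

SCALE = 32  # illustrative scale factor (assumption), not from any standard

exact = dct_matrix(8)
# Naive integer approximation: scale every entry and round to nearest integer.
rounded = [[round(SCALE * v) for v in row] for row in exact]

print("exact DCT, worst row dot product  :", max_off_diagonal(exact))
print("rounded DCT, worst row dot product:", max_off_diagonal(rounded))
```

With this scale factor the rounded rows are no longer mutually orthogonal, which is why integer transforms are designed so that orthogonality holds exactly rather than obtained by simple rounding.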

Definition
The ICT matrix has the form [2,3]:
I = KJ
where
- I is the orthogonal ICT matrix;
- K is a diagonal matrix whose elements scale the rows of J so that the relative magnitudes of the elements of I are similar to those of the DCT matrix;
- J is an orthogonal matrix whose elements are all integers.
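The factorization I = KJ can be illustrated with a small sketch. It assumes the widely published order-8 integer matrix of the H.264 High profile as J (this matrix does not appear in the transcript) and assumes K holds the reciprocals of the row norms, which is one natural choice that makes I orthonormal; the names `gram`, `k_diag` and `ict` are illustrative:

```python
import math

# Assumed J: order-8 integer transform matrix of the H.264 High profile.
J = [
    [ 8,   8,   8,   8,   8,   8,   8,   8],
    [12,  10,   6,   3,  -3,  -6, -10, -12],
    [ 8,   4,  -4,  -8,  -8,  -4,   4,   8],
    [10,  -3, -12,  -6,   6,  12,   3, -10],
    [ 8,  -8,  -8,   8,   8,  -8,  -8,   8],
    [ 6, -12,   3,  10, -10,  -3,  12,  -6],
    [ 4,  -8,   8,  -4,  -4,   8,  -8,   4],
    [ 3,  -6,  10, -12,  12, -10,   6,  -3],
]

def gram(matrix):
    """Matrix of dot products between all pairs of rows (M times M transposed)."""
    return [[sum(a * b for a, b in zip(r1, r2)) for r2 in matrix]
            for r1 in matrix]

# Diagonal of K: one scale factor per row of J (assumption: 1 / row norm).
k_diag = [1.0 / math.sqrt(sum(v * v for v in row)) for row in J]

# I = K J: scale each integer row by its diagonal entry.
ict = [[k * v for v in row] for k, row in zip(k_diag, J)]

print("squared row norms of J:", [gram(J)[i][i] for i in range(8)])
```

The rows of J are exactly orthogonal but have unequal norms, so J alone is not orthonormal. In practice the scaling by K is usually folded into quantization, so that only integer arithmetic is needed for the transform itself.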

Transforms used in some standards

Standard                     Transform
1. MPEG-4 Part 10 / H.264    8x8, 4x4 integer DCT
2. WMV-9                     8x8, 8x4, 4x8, 4x4 integer DCT
3. AVS China                 Asymmetric 8x8 integer DCT

Table 1: Transforms used in the H.264, WMV-9 and AVS China standards [4].

DCT
The forward discrete cosine transform (DCT) of N samples is given by [11]

$$F(u) = \sqrt{\frac{2}{N}}\; C(u) \sum_{x=0}^{N-1} f(x) \cos\left[\frac{(2x+1)\,u\,\pi}{2N}\right], \qquad u = 0, 1, \ldots, N-1,$$

where $C(u) = 1/\sqrt{2}$ for $u = 0$ and $C(u) = 1$ otherwise. Here f(x) is the value of the x-th sample of the input signal, and F(u) is the DCT coefficient for u = 0, 1, ..., N - 1. For image data, the transform is applied first to the rows and then to the columns of the image data matrix.

IDCT
The inverse discrete cosine transform (IDCT) of N samples is given by

$$f(x) = \sqrt{\frac{2}{N}} \sum_{u=0}^{N-1} C(u)\, F(u) \cos\left[\frac{(2x+1)\,u\,\pi}{2N}\right], \qquad x = 0, 1, \ldots, N-1,$$

where $C(u) = 1/\sqrt{2}$ for $u = 0$ and $C(u) = 1$ otherwise. Here f(x) is the value of the x-th sample of the signal, and F(u) is the DCT coefficient for u = 0, 1, ..., N - 1. The IDCT is used for image decompression.
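The forward and inverse transforms can be checked against each other with a short sketch. It assumes the orthonormal convention (factor sqrt(2/N), with C(0) = 1/sqrt(2) and C(u) = 1 otherwise); the function names and the sample values are illustrative only:

```python
import math

def c(u):
    """Normalization constant: 1/sqrt(2) for u = 0, otherwise 1."""
    return 1.0 / math.sqrt(2.0) if u == 0 else 1.0

def dct_1d(f):
    """Forward DCT of the sample list f (direct evaluation of the sum)."""
    n = len(f)
    return [math.sqrt(2.0 / n) * c(u) *
            sum(f[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for x in range(n))
            for u in range(n)]

def idct_1d(coeffs):
    """Inverse DCT: rebuilds the samples from the coefficient list."""
    n = len(coeffs)
    return [math.sqrt(2.0 / n) *
            sum(c(u) * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for u in range(n))
            for x in range(n)]

samples = [52, 55, 61, 66, 70, 61, 64, 73]   # arbitrary example data
coeffs = dct_1d(samples)
restored = idct_1d(coeffs)

print("largest reconstruction error:",
      max(abs(a - b) for a, b in zip(samples, restored)))
```

In floating point the round trip reproduces the samples up to rounding noise. This direct evaluation costs on the order of N squared operations; practical codecs use fast factorizations instead.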

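DCT-II
- The DCT-II is probably the most commonly used form and is often simply called "the DCT" [6].
- Given an input function f(i,j) over two integer variables i and j (a piece of an image), the 2D DCT transforms it into a new function F(u,v), with integers u and v running over the same ranges as i and j. The general definition of the transform is

$$F(u,v) = \frac{2\,C(u)\,C(v)}{\sqrt{MN}} \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} \cos\left[\frac{(2i+1)\,u\,\pi}{2M}\right] \cos\left[\frac{(2j+1)\,v\,\pi}{2N}\right] f(i,j),$$

where i, u = 0, 1, ..., M − 1; j, v = 0, 1, ..., N − 1; and the constants C(u) and C(v) are given by

$$C(l) = \begin{cases} 1/\sqrt{2}, & l = 0 \\ 1, & \text{otherwise} \end{cases} \qquad \text{where } l = u, v.$$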
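The 2D definition is separable, so it can be computed as a 1D DCT on every row followed by a 1D DCT on every column. The sketch below compares the direct double sum with the row-column method on a small non-square block; it assumes the normalization 2 C(u) C(v) / sqrt(MN) with C(0) = 1/sqrt(2), and all function names and test data are illustrative:

```python
import math

def c(l):
    """Normalization constant: 1/sqrt(2) for l = 0, otherwise 1."""
    return 1.0 / math.sqrt(2.0) if l == 0 else 1.0

def dct_2d_direct(f):
    """Evaluate the 2D DCT double sum directly for an M x N block f."""
    m, n = len(f), len(f[0])
    out = [[0.0] * n for _ in range(m)]
    for u in range(m):
        for v in range(n):
            acc = 0.0
            for i in range(m):
                for j in range(n):
                    acc += (math.cos((2 * i + 1) * u * math.pi / (2 * m)) *
                            math.cos((2 * j + 1) * v * math.pi / (2 * n)) *
                            f[i][j])
            out[u][v] = 2.0 * c(u) * c(v) / math.sqrt(m * n) * acc
    return out

def dct_1d(s):
    """Orthonormal 1D DCT of the list s."""
    n = len(s)
    return [math.sqrt(2.0 / n) * c(u) *
            sum(s[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                for x in range(n))
            for u in range(n)]

def dct_2d_separable(f):
    """Row-column method: 1D DCT on rows, then on columns."""
    rows = [dct_1d(r) for r in f]                      # rows[i][v]
    cols = [dct_1d(list(col)) for col in zip(*rows)]   # cols[v][u]
    return [list(r) for r in zip(*cols)]               # out[u][v]

block = [[1, 2, 3, 4],
         [5, 6, 7, 8]]   # arbitrary 2 x 4 example
direct = dct_2d_direct(block)
separable = dct_2d_separable(block)
```

Both routes give the same coefficients up to floating-point rounding, but the separable form needs far fewer multiplications for larger blocks.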

OVERVIEW OF CODING STANDARDS H.264, AVS CHINA AND WMV9

Int DCT in H.264
- The H.264 video coding standard uses a transform to reduce spatial correlation, quantization for bit-rate control, motion-compensated prediction to reduce temporal correlation, and entropy coding to reduce statistical redundancy.
- One of the important changes introduced in H.264 to achieve better coding performance is the integer transform. It is multiplier-free and reduces implementation complexity.
- In general, transform and quantization require several multiplications, which makes implementation complex. To simplify implementation, the exact transform is modified to avoid these multiplications, and the transform and quantization are then combined into a modified integer forward transform, quantization and scaling.
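The multiplier-free property can be illustrated with the 4x4 forward core transform of H.264, which needs only additions, subtractions and one-bit shifts. This is a minimal sketch: the matrix `CF` is the commonly published 4x4 core matrix, while the function names and the sample block are illustrative, and the scaling that H.264 folds into quantization is deliberately left out:

```python
# 4x4 forward core transform matrix of H.264 (integer entries only).
CF = [[1,  1,  1,  1],
      [2,  1, -1, -2],
      [1, -1, -1,  1],
      [1, -2,  2, -1]]

def core_1d(x):
    """One 4-point butterfly: additions, subtractions and shifts only."""
    s03, s12 = x[0] + x[3], x[1] + x[2]
    d03, d12 = x[0] - x[3], x[1] - x[2]
    return [s03 + s12,           # row [1,  1,  1,  1]
            (d03 << 1) + d12,    # row [2,  1, -1, -2]
            s03 - s12,           # row [1, -1, -1,  1]
            d03 - (d12 << 1)]    # row [1, -2,  2, -1]

def core_2d(block):
    """Apply the butterfly to every row, then to every column."""
    rows = [core_1d(r) for r in block]
    cols = [core_1d(list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]

def core_2d_reference(block):
    """Reference result CF * X * CF^T using plain matrix multiplication."""
    def matmul(a, b):
        return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
                 for j in range(len(b[0]))] for i in range(len(a))]
    cf_t = [list(r) for r in zip(*CF)]
    return matmul(matmul(CF, block), cf_t)

block = [[ 5, 11,  8, 10],
         [ 9,  8,  4, 12],
         [ 1, 10, 11,  4],
         [19,  6, 15,  7]]   # arbitrary residual block for illustration
coeffs = core_2d(block)
```

The butterfly version produces exactly the same integers as the matrix product, with no multiplications and no rounding, so encoder and decoder stay bit-exact.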

Int DCT in AVS China
- The Audio Video Coding Standard (AVS) is the national standard of China. Its Enhanced Profile (EP) targets high-definition video coding.
- Larger transforms are expected to provide higher coding gain, especially for high-resolution video.
- The proposed order-16 and order-32 transforms are extended versions of the order-8 ICT adopted in AVS.
- The order-8 transform matrix can be extended to order-16 and order-32 transform matrices without a significant increase in complexity.

Int DCT in WMV9
- Windows Media 9 Series includes a variety of audio and video codecs, which are key components for authoring and playback of digital media.
- Floating-point arithmetic is ruled out on the decoder side in WMV9 for several reasons, chiefly the need to minimize decoder complexity and the need for decoders that match the specification exactly, so as to avoid mismatch.
- Floating-point operations are not very portable across processors: their definitions usually involve some tolerance, which makes them unsuitable for perfectly matching implementations.
- Low-precision integer arithmetic is therefore widely regarded as a desirable feature.

EXTENDING ORDER 8 INTEGER TRANSFORM TO ORDER 16 AND ORDER 32

Dyadic symmetry
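As the principle is usually stated following [3], a vector [a_0, ..., a_(N-1)], with N a power of two, has the S-th dyadic symmetry if a_j = s · a_(j XOR S) for every index j, where s = +1 gives even symmetry and s = -1 gives odd symmetry. The sketch below tests this property on two rows of the H.264 order-8 integer matrix, used here only as a convenient example; the function name and the 'E'/'O' return values are illustrative assumptions:

```python
def dyadic_symmetry(vec, s_index):
    """Return 'E' (even), 'O' (odd) or None for the s_index-th dyadic symmetry.

    The partner of element j is element j XOR s_index.
    """
    n = len(vec)
    if all(vec[j] == vec[j ^ s_index] for j in range(n)):
        return "E"
    if all(vec[j] == -vec[j ^ s_index] for j in range(n)):
        return "O"
    return None

# Two rows of the H.264 order-8 integer transform (example data only).
row1 = [12, 10, 6, 3, -3, -6, -10, -12]
row2 = [8, 4, -4, -8, -8, -4, 4, 8]

for s in range(1, 8):
    print(s, dyadic_symmetry(row1, s), dyadic_symmetry(row2, s))
```

For S = 7 the partner index j XOR 7 equals 7 - j, so the 7th dyadic symmetry is the ordinary mirror symmetry of DCT basis vectors: odd for `row1` and even for `row2`.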

Equation (1): T8, the order-8 transform matrix [5].

Extending order 8 to order 16
Even symmetry is denoted by 'E' and odd symmetry by 'O'. About the solid line, even symmetry is a mirror image and odd symmetry is a negative mirror image.

Equation (2): T16, the order-16 transform matrix derived from the order-8 transform matrix [5].
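One simple way to see that an orthogonal order-8 integer matrix can be extended to order 16 and order 32 without losing orthogonality is a Kronecker-style construction: each order-8 row r yields one row with every element repeated as (x, x) and one row with every element expanded to (x, -x). This is an illustrative sketch only and is not claimed to be the exact matrix or row ordering of T16 in [5]; the H.264 order-8 matrix is used as a stand-in for T8, and `extend` and `is_row_orthogonal` are hypothetical helper names:

```python
# Stand-in for T8 (assumption): H.264 order-8 integer transform matrix.
T8 = [
    [ 8,   8,   8,   8,   8,   8,   8,   8],
    [12,  10,   6,   3,  -3,  -6, -10, -12],
    [ 8,   4,  -4,  -8,  -8,  -4,   4,   8],
    [10,  -3, -12,  -6,   6,  12,   3, -10],
    [ 8,  -8,  -8,   8,   8,  -8,  -8,   8],
    [ 6, -12,   3,  10, -10,  -3,  12,  -6],
    [ 4,  -8,   8,  -4,  -4,   8,  -8,   4],
    [ 3,  -6,  10, -12,  12, -10,   6,  -3],
]

def extend(t):
    """Double the order: rows (x, x) pattern first, then rows (x, -x) pattern."""
    upper = [[v for x in row for v in (x, x)] for row in t]
    lower = [[v for x in row for v in (x, -x)] for row in t]
    return upper + lower

def is_row_orthogonal(matrix):
    """True if every pair of different rows has a zero dot product."""
    n = len(matrix)
    return all(sum(a * b for a, b in zip(matrix[i], matrix[j])) == 0
               for i in range(n) for j in range(i + 1, n))

T16 = extend(T8)    # 16 x 16
T32 = extend(T16)   # 32 x 32

print(len(T16), is_row_orthogonal(T16))
print(len(T32), is_row_orthogonal(T32))
```

With this structure, an order-16 transform reduces to sums and differences of adjacent input pairs followed by two order-8 transforms, which is consistent with the claim that the extension adds little complexity. The rows here are not sorted by frequency.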

H.264 The transform matrices of order 8, 16 and 32 for H.264 are shown below. Note the Orthogonality in all three cases:

AVS China The transform matrices of order 8, 16 and 32 for AVS China are shown below. Note the Orthogonality in all three cases:

WMV9 The transform matrices of order 8, 16 and 32 for WMV9 are shown below. Note the Orthogonality in all three cases:

PERFORMANCE ANALYSIS

Performance Evaluation
To evaluate the efficiency of the integer DCT, standard images are used as input signals. The transforms considered are the DCT and the integer DCTs of different block sizes. The following are computed for the performance analysis:
a) Variance distribution for a first-order Markov process with ρ = 0.9 (plotted and tabulated)
b) Normalized basis restriction error versus the number of basis functions (plotted and tabulated)
c) Fractional correlation for 0 < ρ < 1 (plotted)
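Measures a) and b) can be sketched for the order-8 floating-point DCT as follows. The sketch assumes a unit-variance first-order Markov source with covariance R[i][j] = ρ^|i-j|, takes the coefficient variances as the diagonal of T R T^T for an orthonormal T, and defines the basis restriction error as the percentage of total variance left in the discarded coefficients after sorting variances in decreasing order. The function names are illustrative and this is not the project's own code:

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix of order n as a list of rows."""
    rows = []
    for u in range(n):
        scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                     for x in range(n)])
    return rows

def markov_covariance(n, rho):
    """Covariance matrix of a unit-variance first-order Markov process."""
    return [[rho ** abs(i - j) for j in range(n)] for i in range(n)]

def coefficient_variances(t, r):
    """Diagonal of T R T^T: variance of each transform coefficient."""
    n = len(t)
    variances = []
    for row in t:
        r_row = [sum(r[i][j] * row[j] for j in range(n)) for i in range(n)]
        variances.append(sum(row[i] * r_row[i] for i in range(n)))
    return variances

def basis_restriction_error(variances):
    """Entry m: percent of total variance lost when only the m largest-variance
    coefficients are kept (m = 0 .. N-1)."""
    ordered = sorted(variances, reverse=True)
    total = sum(ordered)
    return [100.0 * sum(ordered[m:]) / total for m in range(len(ordered))]

variances = coefficient_variances(dct_matrix(8), markov_covariance(8, 0.9))
errors = basis_restriction_error(variances)

for k, (var, err) in enumerate(zip(variances, errors), start=1):
    print(f"{k}: variance {var:.4f}, basis restriction error {err:.4f}")
```

Entry m of `errors` corresponds to row m + 1 of the basis restriction error tables, so the first row is always 100. For an integer transform, the same functions apply once its rows have been normalized to unit length.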

Comparison of performances of 8x8 ICT
a) Variances of transform coefficients
Columns: N, then DCT, H.264, WMV9, AVS China. Rows with fewer than four values are given as they appear in the source.
1: 6.1855 5.9638
2: 1.0059 1.0014 1.0048 1.4042
3: 0.3461 0.3447 0.3457 0.5565
4: 0.1659 0.1674 0.1645 0.2647
5: 0.1046 0.4275
6: 0.0757 0.0767 0.0761 0.0955
7: 0.0616 0.0629 0.0620 0.1008
8: 0.0547 0.0567 0.0568 0.0420

b) Normalized basis restriction error versus the number of basis functions (order 8)
Columns: N, then DCT, H.264, WMV9, AVS China. Rows with fewer than four values are given as they appear in the source.
1: 100.0000
2: 22.6811 32.6507
3: 10.1076 10.1632 10.1214 16.7932
4: 5.7813 5.8539 5.7997 10.5088
5: 3.7072 3.7616 3.7429 7.5200
6: 2.4000 2.4544 2.4357 2.6919
7: 1.4535 1.4956 1.4847 1.6132
8: 0.6836 0.7088 0.7102 0.4747

Graph 1

Graph 2

Graph 3

Comparison of performances of 16x16 ICT
a) Variances of transform coefficients
Columns: N, then DCT, H.264, WMV9, AVS China. Rows with fewer than four values are given as they appear in the source.
1: 9.8346
2: 2.9327 2.8888 2.9069 2.9125
3: 1.2108 1.1627 1.1655 1.1668
4: 0.5815 0.5448 0.5295 0.5313
5: 0.3483 0.3066
6: 0.2314 0.1996 0.1962 0.1944
7: 0.1685 0.1445 0.1418 0.1405
8: 0.1295 0.1183 0.1189 0.1133
9: 0.1047 0.1049
10: 0.0877 0.1049
11: 0.0759 0.1047
12: 0.0675 0.1044
13: 0.0616 0.1038
14: 0.0574 0.1020
15: 0.0547 0.0973 0.0972
16: 0.0531 0.0780

b) Normalized basis restriction error versus the number of basis functions (order 16)
Columns: N, then DCT, H.264, WMV9, AVS China. Rows with fewer than four values are given as they appear in the source.
1: 100.0000
2: 38.5335
3: 20.2039 20.4784 20.3654 20.3303
4: 12.6365 13.2114 13.0812 13.0381
5: 9.0023 9.8061 9.7716 9.7174
6: 6.8257 7.8902 7.8556 7.8015
7: 5.3794 6.6426 6.6294 6.5863
8: 4.3265 5.7394 5.7434 5.7083
9: 3.5172 5.0000
10: 2.8628 4.3442 4.3441
11: 2.3148 3.6888 3.6887 3.6886
12: 1.8402 3.0343 3.0342 3.0341
13: 1.4180 2.3817 2.3816 2.3815
14: 1.0330 1.7333 1.7329 1.7328
15: 0.6740 1.0955 1.0952 1.0951
16: 0.3321 0.4876

Graph 1

Graph 2

Graph 3

Comparison of performances of 32x32 ICT
a) Variances of transform coefficients
Columns: N, then DCT, H.264, WMV9, AVS China. Rows with fewer than four values are given as they appear in the source.
1: 13.5681
2: 6.8470 6.7176 6.7876 6.7987
3: 3.7202 3.5420 3.5445 3.5496
4: 2.0226 1.8507 1.7919 1.8000
5: 1.2440 1.0464
6: 0.8317 0.6665 0.6527 0.6445
7: 0.5962 0.4534 0.4508 0.4457
8: 0.4480 0.3513 0.3540 0.3429
9: 0.3505 0.3105 0.3104 0.3106
10: 0.2823 0.3093 0.3094
11: 0.2333 0.3070 0.3071 0.3072
12: 0.1967 0.3028
13: 0.1689 0.2939 0.2946 0.2945
14: 0.1471 0.2753 0.2752
15: 0.1299 0.2403 0.2395 0.2394
16: 0.1161 0.1648
17: 0.1048
18: 0.0955 0.1046
19: 0.0878 0.1045
20: 0.0814 0.1044
21: 0.0760 0.1044
22: 0.0714 0.1044
23: 0.0676 0.1044
24: 0.0643 0.1044
25: 0.0616 0.1043
26: 0.0593 0.1040
27: 0.0575 0.1034 0.1035
28: 0.0559 0.1024
29: 0.0547 0.1001 0.1003
30: 0.0538 0.0955 0.0954
31: 0.0531 0.0867 0.0865 0.0864
32: 0.0528 0.0677

b) Normalized basis restriction error versus the number of basis functions (order 32)
Columns: N, then DCT, H.264, WMV9, AVS China. Rows with fewer than four values are given as they appear in the source.
1: 100.0000
2: 57.5995
3: 36.2028 36.6070 36.3883 36.3537
4: 24.5773 25.5384 25.3116 25.2612
5: 18.2568 19.7549 19.7119 19.6360
6: 14.3693 16.4851 16.4421 16.3661
7: 11.7703 14.4022 14.4024 14.3519
8: 9.9072 12.9854 12.9936 12.9590
9: 8.5071 11.8875
10: 7.4119 10.9173 10.9174 10.9170
11: 6.5298 9.9506 9.9500
12: 5.8008 8.9913 8.9908 8.9899
13: 5.1861 8.0450 8.0445 8.0437
14: 4.6584 7.1264 7.1239 7.1233
15: 4.1987 6.2661 6.2637 6.2633
16: 3.7926 5.5151
17: 3.4299 5.0000
18: 3.1024 4.6725
19: 2.8039 4.3456
20: 2.5294 4.0190
21: 2.2751 3.6926
22: 2.0377 3.3663
23: 1.8145 3.0400
24: 1.6033 2.7138
25: 1.4022 2.3875
26: 1.2097 2.0616 2.0615
27: 1.0242 1.7366 1.7364
28: 0.8447 1.4134 1.4133 1.4131
29: 0.6700 1.0935 1.0934 1.0932
30: 0.4990 0.7806 0.7799 0.7798
31: 0.3309 0.4823 0.4817 0.4816
32: 0.1649 0.2115

Graph 1

Graph 2

Graph 3

References:
1. N. Ahmed, T. Natarajan, and K. R. Rao, "Discrete cosine transform", IEEE Trans. Computers, vol. C-23, pp. 90-93, Jan. 1974.
2. W. K. Cham and Y. T. Chan, "An order-16 integer cosine transform", IEEE Trans. Signal Processing, vol. 39, no. 5, pp. 1205-1208, May 1991.
3. W. K. Cham, "Development of integer cosine transforms by the principle of dyadic symmetry", Proc. Inst. Electr. Eng. I: Commun. Speech Vis., vol. 136, no. 4, pp. 276-282, Aug. 1989.
4. S. Kwon, A. Tamhankar, and K. R. Rao, "Overview of H.264/MPEG-4 part 10", Special issue on "Emerging H.264/AVC video coding standard", J. Visual Communication and Image Representation, vol. 17, pp. 183-552, Apr. 2006.
5. W. Cham and C. Fong, "Simple order-16 integer transform for video coding", IEEE ICIP 2010, Hong Kong, Sept. 2010.
6. R. Joshi, Y. A. Reznik, and M. Karczewicz, "Efficient large size transforms for high-performance video coding", SPIE Optics + Photonics, vol. 7798, paper 7798-31, San Diego, CA, Aug. 2010.
7. M. Costa and K. Tong, "A simplified integer cosine transform and its application in image compression", Communications Systems Research Section, TDA Progress Report pp. 42-119, Nov. 1994.
8. A. T. Hinds, "Design of high-performance fixed-point transforms using the common factor method", SPIE Optics + Photonics, vol. 7798, paper 7798-29, San Diego, CA, Aug. 2010.
9. S. Chokchaitam, M. Iwahashi, and N. Kambayashi, "Optimum word length allocation of integer DCT and its error analysis", Signal Processing: Image Communication, vol. 19, pp. 465-478, July 2004.
10. C. Wei, P. Hao, and Q. Shi, "Integer DCT-based image coding", National Lab on Machine Perception, Peking University, Beijing 100871, China.
11. P. C. Yip and K. R. Rao, "The transform and data compression handbook", Boca Raton, FL: CRC Press, 2001.
12. Y. Zeng, et al., "Integer DCTs and fast algorithms", IEEE Trans. Signal Processing, vol. 49, no. 11, Nov. 2001.
13. P. Chen, Y. Ye, and M. Karczewicz, "Video coding using extended block sizes", ITU-T Q.6/SG16, T09-SG16-C-0123, Geneva, Jan. 2009.
14. B. Lee, et al., "A 16×16 transform kernel with quantization for (ultra) high definition video coding", ITU-T Q.6/SG16 VCEG, VCEG-AK13, Yokohama, Japan, Apr. 2009.
15. G. Mandyam, N. Ahmed, and N. Magotra, "Lossless image compression using the discrete cosine transform", Journal of Visual Communication and Image Representation, vol. 8, no. 1, pp. 21-26, Mar. 1997.
16. W. Gao, et al., "AVS - The Chinese next-generation video coding standard", Joint Development Lab., Institute of Computing Science, Chinese Academy of Sciences, Beijing, China.
17. S. Srinivasan, et al., "Windows Media Video 9: Overview and applications", Signal Processing: Image Communication, vol. 19, pp. 851-875, Oct. 2004.

THANK YOU!!!
