H.265 Bitstream To H.264 Bitstream Transcoder

By Deepak Hingole. Advising Professor: Dr. K. R. Rao
Good morning, everyone! Before I proceed with my talk today, let me thank all of you for being here for my thesis defense. My thesis topic is the implementation of an H.265 bitstream to H.264 bitstream transcoder.

Outline
Introduction to video coding standards; transcoding and the need for it; comparison of H.264 and H.265; performance of a transcoder; different transcoding algorithms; choice of transcoding scheme; H.265 decoder and H.264 encoder; results and conclusions; future work.
This is the outline for today's talk. I start with an introduction to video coding standards, followed by what transcoding is and why it is required in this case. Next come the key differences between the two coding standards involved, then a study of different transcoding architectures and why one of them was chosen for this thesis. Finally, I present the results of the proposed method and the conclusions drawn from them. Last but not least, I'll talk about future work in this direction.

Introduction to Video coding standards
Remove redundancies. Two standards bodies: ITU-T and ISO/IEC. JVT: H.264/AVC (May 2003). JCT-VC: H.265 (April 2013).
The objective of any compression scheme is to remove the redundancies present in the data so that its storage or transmission requires fewer bits. The redundancies that can be reduced are spatial, temporal, statistical and perceptual. Spatial redundancies are removed by intra prediction, temporal redundancies by inter prediction, statistical redundancies by entropy coding, and perceptual redundancies by quantization based on the human visual system (HVS). The most important developments in video coding standards have come from two international standards bodies: ITU-T (International Telecommunication Union, Telecommunication Standardization Sector) and ISO/IEC (International Organization for Standardization / International Electrotechnical Commission). As the name suggests, ITU-T has focused on standards for telecommunication applications, whereas ISO/IEC has focused on multimedia storage and transmission. A collaborative group of experts from these two bodies, the JVT, developed H.264, while the JCT-VC is responsible for the development of H.265, i.e., HEVC.

Transcoding [15]
Figure: Bitstream 1 (set 1 of attributes) is transcoded into Bitstream 2. Attributes: bitrate, spatial resolution, temporal resolution, format.
A bitstream is a string of bits representing compressed video, following a particular syntax set by the standard. Transcoding is the change of one or more attributes of this bitstream so as to meet target requirements. The attributes that can be changed include bitrate, spatial resolution, temporal resolution and format.

Coding comparison HEVC with other standards [52]
Need for transcoding
The high coding efficiency of HEVC comes at the cost of implementation complexity.
The rate-distortion plot compares how various standards perform against each other. HEVC was developed with the goal of achieving the same video quality as H.264 at half the bitrate. This coding efficiency comes at the cost of increased complexity.

Need for transcoding, continued
HEVC-capable hardware is not widely available; H.264-capable hardware is widely available; avoid the need to work on the original raw video to generate H.264 encoded video.
H.264 is very well established and widely used in industry, whereas HEVC is still finding its ground. Manufacturers have only recently started to support hardware capable of running HEVC, while H.264 is supported on almost all devices. For content encoded with H.265 to be played on H.264-capable devices, without requiring the uncompressed original video, a format-conversion transcoder is necessary. In this thesis my focus was on format conversion from H.265 to H.264.

Key Differences in H.264/AVC and HEVC [54]
Tool               H.264 (AVC)                  H.265 (HEVC)
Partition size     16×16 block (macroblock)     64×64 block (coding tree unit)
Intra prediction   Up to 9 predictors           35 predictors
Transform size     8×8 and 4×4                  32×32, 16×16, 8×8 and 4×4
Here is a quick comparison of the two standards involved in the transcoding, focusing on the key differences.

9 predictors used for 4×4 Luma block in H.264/AVC [12]
33 angular + 1 DC + 1 Planar prediction used in HEVC [13]
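To make the partition-size difference concrete, the sketch below counts how many top-level blocks cover one CIF frame (352×288, the resolution of the test sequences) under each standard. The function name `block_grid` is hypothetical, and the 64×64 CTU size is assumed from the comparison table above.

```python
import math

def block_grid(width, height, block):
    """Return (columns, rows, total) of block x block units covering a frame."""
    cols = math.ceil(width / block)   # partial blocks at the right edge still count
    rows = math.ceil(height / block)  # partial blocks at the bottom edge still count
    return cols, rows, cols * rows

cif_width, cif_height = 352, 288
macroblocks = block_grid(cif_width, cif_height, 16)  # H.264 macroblocks
ctus = block_grid(cif_width, cif_height, 64)         # HEVC CTUs, assuming 64x64
print("H.264 macroblocks:", macroblocks)
print("HEVC CTUs:", ctus)
```

A CIF frame therefore splits into far fewer top-level units in HEVC, each of which can be subdivided further by the encoder.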

Performance of a Transcoder
Bitstream quality; use of information; cost; complexity.
The performance of a transcoder is measured with these parameters. To achieve optimum results after transcoding: the quality of the transcoded bitstream should be comparable to that obtained by directly decoding and re-encoding the input stream; the information contained in the input stream should be reused as much as possible to avoid multigenerational deterioration; and the process should be cost efficient and low in complexity while achieving the highest quality possible.

Open Loop Transcoding Architecture [16]
Simple implementation; no frame memory required; no IDCT required; subject to drift.
Let's have a look at some transcoding architectures, starting with open loop. This scheme can be used for bitrate conversion. As the figure shows, in the open-loop system the bitstream is variable-length decoded (VLD) to extract the variable-length codewords corresponding to the quantized DCT coefficients, as well as the macroblock (MB) data corresponding to the motion vectors and other MB-level information. The quantized transform coefficients are inverse quantized and then simply requantized to satisfy the new output bit rate. Finally, the requantized coefficients and the stored MB-level information are variable-length coded (VLC). The advantages of this scheme are a simple implementation, lower cost since no frame memory is required, and no need for an IDCT. The disadvantage is that the scheme is prone to drift, since requantization is a lossy process.
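The requantization step at the heart of the open-loop scheme can be sketched as follows. This is a minimal illustration that assumes a plain uniform scalar quantizer; real codecs use quantization matrices and dead zones, and the function names and sample values here are hypothetical.

```python
def dequantize(levels, step):
    """Inverse quantization: scale each level back to a coefficient value."""
    return [level * step for level in levels]

def quantize(coeffs, step):
    """Uniform quantization, rounding half away from zero."""
    out = []
    for c in coeffs:
        magnitude = int(abs(c) / step + 0.5)
        out.append(magnitude if c >= 0 else -magnitude)
    return out

def requantize(levels, step_in, step_out):
    """Open-loop requantization: no IDCT and no frame memory are involved."""
    return quantize(dequantize(levels, step_in), step_out)

# Hypothetical quantized coefficients of one block, coarser step on output.
levels_in = [12, -5, 3, 1, 0, -1]
levels_out = requantize(levels_in, step_in=4, step_out=10)
print(levels_out)
```

The small high-frequency levels collapse to zero under the coarser step. Because the encoder that produced the input predicted later frames from the unmodified reconstruction, this loss accumulates in the decoder as drift.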

Closed Loop Transcoding Architecture [16]
Eliminates drift; requires one reconstruction loop; requires one DCT and one IDCT; requires frame storage.
Drift in the open-loop architecture is caused by the loss of high-frequency information in the quantization process. The closed-loop architecture tries to eliminate this drift by approximating the cascaded decoder-encoder architecture. It requires one reconstruction loop, one DCT, one IDCT and frame storage.

Cascaded Decoder Encoder Architecture [17]
Here is a block diagram of the cascaded decoder-encoder architecture, also known as the cascaded pixel-domain architecture. R1(t) and R2(t) are the input and output rates. Xn and Yn are the locally decoded pictures of the input and output bitstreams respectively, and Xn-1 and Yn-1 are those of the previous frame. ΔXn = Xn – Xn-1 is the reconstructed error from the incoming bitstream. As in the previous cases, the input bitstream at rate R1(t) is variable-length decoded to extract the codewords corresponding to the compressed coefficients and the motion vector information. At point A we have the inverse-quantized coefficients, which are inverse transformed to get ΔXn. The decoded frame Xn is obtained by adding the previously decoded frame to the error ΔXn. The locally decoded frame Xn, in the pixel domain, is then used as the input to the encoder. The prediction from the previously decoded frame Yn-1 is subtracted from Xn to obtain the residual frame, which then goes through the transform. At point B we have the transformed coefficients, which are quantized and variable-length coded to get the bitstream at rate R2(t). The advantages of this scheme are easy implementation and low implementation complexity, since existing software decoders and encoders can be reused; decoding is also fast because it requires no motion estimation. The disadvantage is that motion estimation has to be performed again in the re-encoding phase.

Choice of a transcoding scheme
Figure: Frame-based comparison of the open-loop, closed-loop and cascaded pixel-domain architectures [16].
The figure shows a frame-based comparison of quality among the cascaded pixel-domain, open-loop and closed-loop architectures for a group of pictures (GOP) with N = 30 and M = 3. Here N is the distance between two full images (I-frames), while M is the distance between anchor frames (I or P) in the GOP. As can be seen, the open-loop scheme suffers from severe drift, whereas the quality of the closed-loop scheme is very close to that of the cascaded scheme. The cascaded decoder-encoder architecture gives the best results in terms of complexity, quality and cost. It also offers greater flexibility, in that it can be used for bit rate conversion, spatial or temporal resolution reduction, or format conversion. The cascaded scheme is also considered an ideal transcoder, since it consists of a full decoder and a full encoder. The benefits of the cascaded decoder-encoder architecture outweigh its disadvantages, and hence I chose it as my preferred algorithm.
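The GOP parameters N and M described above determine a frame-type pattern. The sketch below lists the frame types in display order, assuming the common convention that non-anchor frames are B-frames; the function name `gop_frame_types` is hypothetical.

```python
def gop_frame_types(n, m):
    """Frame types of one GOP in display order.

    n: distance between two I-frames (GOP length)
    m: distance between anchor frames (I or P)
    """
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")   # each GOP starts with a full image
        elif i % m == 0:
            types.append("P")   # anchor frame every m frames
        else:
            types.append("B")   # assumed: frames between anchors are B-frames
    return "".join(types)

pattern = gop_frame_types(30, 3)
print(pattern)
```

With N = 30 and M = 3 this yields one I-frame, nine P-frames and twenty B-frames per GOP, which is why drift in an open-loop transcoder can build up over many predicted frames before the next I-frame resets it.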

Block diagram for Transcoding
Here is a simple block diagram of the transcoder implemented in this thesis. Raw video is first encoded with the HEVC encoder to get the H.265 bitstream. Our objective is to transcode this H.265 bitstream into an H.264 bitstream. To do so, we first decode it with the HEVC decoder, i.e., the TAppDecoder application in the HM software. The reconstructed video is then given as input to the H.264 encoder, i.e., lencod in the JM software, and the final result is the H.264 bitstream. The quality of this transcoded bitstream is compared with that obtained by directly encoding the raw video with the JM software. The next slides show the general block diagrams of the HEVC and AVC codecs.
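The decode-then-re-encode chain described above could be scripted roughly as below. This is a sketch under assumptions: the `-b`/`-o` options of TAppDecoder and the `-d`/`-p` options and parameter names of lencod are given as I recall them from the HM and JM manuals and should be checked against the installed versions; the file names and the `encoder.cfg` path are hypothetical, and both executables are assumed to be on the PATH.

```python
import subprocess

def build_commands(hevc_bitstream, yuv_path, h264_bitstream, width, height,
                   jm_config="encoder.cfg"):
    """Return the (decode, encode) command lines of the cascaded transcoder."""
    # Step 1: HM decoder turns the H.265 bitstream into raw YUV.
    decode = ["TAppDecoder", "-b", hevc_bitstream, "-o", yuv_path]
    # Step 2: JM encoder re-encodes the reconstructed YUV as H.264.
    # Each -p overrides one parameter of the configuration file (assumed names).
    encode = [
        "lencod", "-d", jm_config,
        "-p", f"InputFile={yuv_path}",
        "-p", f"OutputFile={h264_bitstream}",
        "-p", f"SourceWidth={width}",
        "-p", f"SourceHeight={height}",
    ]
    return decode, encode

def transcode(*args, **kwargs):
    """Run both stages in order, stopping if either tool reports an error."""
    for command in build_commands(*args, **kwargs):
        subprocess.run(command, check=True)

decode_cmd, encode_cmd = build_commands(
    "akiyo_cif.bin", "akiyo_cif_dec.yuv", "akiyo_cif.264", 352, 288)
print(" ".join(decode_cmd))
print(" ".join(encode_cmd))
```

Keeping command construction separate from execution makes it easy to inspect the exact invocations before running a long encode.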

HEVC encoder block diagram [13]

HEVC decoder block diagram [53]
Here is a simple block diagram of the HEVC decoder. The input bitstream is first entropy decoded (in this case using CABAC), followed by inverse quantization and inverse transform, as in any general decoder. Intra or inter mode is selected to form the prediction, which is added to the decoded residual to get the reconstructed frame. The reconstructed frame goes through in-loop filtering before it is saved in the buffer for future predictions or displayed.

H.264/AVC encoder block diagram [12]
Here is a block diagram of the H.264/AVC encoder. An input frame or field Fn is processed in units of a macroblock. Each macroblock is encoded in intra or inter mode and, for each block in the macroblock, a prediction P is formed from reconstructed picture samples uF'n, where u stands for unfiltered. Dn = Fn - P is the difference block, which is transformed and quantized. The quantized coefficients at point X are reordered so that entropy encoding can reduce statistical redundancies. There is also a decoder loop inside the encoder. Its purpose is to make the encoder predict from the same reconstructed frames as the decoder, which prevents the errors that would arise if the two sides used different frames for prediction.

H.264/AVC decoder block diagram [12]

Test sequences [40]
akiyo_cif.y4m: 352×288, 30 fps
city_cif.y4m: 352×288
These are images from the test sequences used in the experimental simulations.

Test sequences, continued
crew_cif.y4m: 352×288
flower_cif.y4m: 352×288
football_cif.y4m: 352×288

Test conditions
HEVC reference software: HM 16.7
H.264 reference software: JM 19.0
IDE: Microsoft Visual Studio 2013
OS: Windows 10 Home Edition (64-bit)
Processor: Intel(R) Core(TM) 2.00GHz 2.50GHz
RAM: 8 GB
These are the details of the test environment used for all the simulations. HM 16.7 was used as the HEVC reference software for encoding and decoding the raw video. Similarly, JM 19.0 was used as the H.264 reference software for re-encoding and for direct encoding. Visual Studio 2013 was used to compile the code.

PSNR vs. QP Results

PSNR vs. QP results continued

PSNR vs. QP results continued
Test sequence    % decrease in avg. PSNR
akiyo_cif        3.047
city_cif         4.205
crew_cif         2.622
flower_cif       2.621
football_cif     3.741
As these plots of YUV PSNR (dB) against QP show, the PSNR of the transcoder and that of JM 19.0 are comparable. The percentage decrease in average PSNR for each sequence is shown in the table.
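For reference, PSNR for one plane of 8-bit samples can be computed as in the sketch below. This is the textbook definition and not necessarily the exact routine used by HM or JM, which report PSNR per colour component; the function name and the sample values are hypothetical.

```python
import math

def psnr(reference, test, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(reference) != len(test):
        raise ValueError("sample lists must have the same length")
    # Mean squared error between original and reconstructed samples.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)

original = [100, 100, 100, 100]
decoded = [101, 99, 101, 99]   # every sample off by one, so MSE = 1
print(round(psnr(original, decoded), 2))
```

Because the scale is logarithmic, a decrease of a few percent in average PSNR corresponds to a change on the order of 1 dB at typical operating points of 30 to 40 dB.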

Bitrate vs. QP Results

Bitrate vs. QP Results continued…

Bitrate vs. QP Results continued…
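Test sequence    % increase in bitrate (kbps)
akiyo_cif        7.586
city_cif         21.480
crew_cif         -0.464
flower_cif       8.671
football_cif     -2.2
These values are taken from the bitrate vs. QP plots. A positive percentage represents an increase in the bitrate of the transcoded stream relative to direct encoding, and a negative percentage represents a decrease.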
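A percentage of the kind tabulated above can be obtained by averaging the bitrate over the tested QPs and comparing the two averages. The sketch below assumes that this is how the figures were derived; the bitrate values in it are hypothetical and are not the thesis data.

```python
def percent_change(reference, value):
    """Signed change of value relative to reference, in percent."""
    return 100.0 * (value - reference) / reference

def mean(values):
    return sum(values) / len(values)

# Hypothetical bitrates in kbps at four QPs, for illustration only.
direct_kbps = [400.0, 200.0, 100.0, 50.0]       # direct H.264 encoding
transcoded_kbps = [430.0, 215.0, 108.0, 53.0]   # H.265 to H.264 transcoding

increase = percent_change(mean(direct_kbps), mean(transcoded_kbps))
print(f"{increase:.3f} % increase in average bitrate")
```

The same helper returns a negative number when the transcoded stream needs fewer bits than the directly encoded one, which matches the sign convention of the table.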

Rate Distortion Plots

Conclusions
Video quality of the transcoded video is comparable to that of directly encoded video. The implementation was cost effective, although encoding time can be a constraint since it involves motion estimation. About 2.6% to 4.2% decrease in average PSNR. About -2.2% to 21.5% increase in average bitrate.
Here are some of the conclusions that can be drawn from the results in the previous slides. From the PSNR plots, bitrate plots and R-D plots, it is evident that the quality of the transcoded video is comparable to that of directly encoded video. The implementation was also found to be simple and cost effective, although re-encoding time is a constraint, since motion estimation is required. A decrease of about 2.6% to 4.2% was observed in average PSNR, and a change of about -2.2% to 21.5% was observed in average bitrate, where negative values are decreases.

Future work
Parallelization for reducing re-encoding latency; transcoding for different profiles and levels of HEVC and H.264; mapping the 35 intra prediction modes of HEVC to the 9 intra prediction modes of H.264; spatial resolution reduction.
During my internship at Adobe, I worked on reducing encoding latency using parallelization techniques. Along similar lines, I had planned to reduce the encoding time of the re-encoding phase but could not get to that part, and hence I list it under future work. Apart from that, in this thesis the raw video was encoded using the Main profile of HEVC and re-encoded to the High profile of H.264. Experiments should be done with different combinations of profiles and levels of HEVC and H.264 to confirm that the cascaded decoder-encoder algorithm behaves as an ideal transcoder. As mentioned in the key differences slide, HEVC has 35 intra prediction modes whereas H.264 has 9. We can exploit this fact and remap the prediction modes from HEVC to H.264 instead of searching for them again. Finally, down-sampling in the spatial domain can be implemented after the decoding phase, before re-encoding, to achieve spatial resolution reduction.
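One possible shape for the intra mode remapping is sketched below. It is only an illustration of the idea, not a method from the thesis: the anchor table is my own hand-picked assumption, pairing each directional H.264 4×4 luma mode with the HEVC angular mode of roughly the same direction, and every other angular mode is assigned to its nearest anchor by mode index. Planar has no 4×4 counterpart in H.264, so it falls back to DC here.

```python
# H.264 4x4 luma intra prediction modes (standard numbering).
H264_VERTICAL, H264_HORIZONTAL, H264_DC = 0, 1, 2
H264_DIAG_DOWN_LEFT, H264_DIAG_DOWN_RIGHT = 3, 4
H264_VERTICAL_RIGHT, H264_HORIZONTAL_DOWN = 5, 6
H264_VERTICAL_LEFT, H264_HORIZONTAL_UP = 7, 8

# Assumed anchors: HEVC angular mode -> H.264 mode of similar direction.
# Modes 10 (horizontal), 18 (diagonal), 26 (vertical) and 34 match exactly;
# the remaining four anchors are approximate choices for illustration.
ANCHORS = {
    5: H264_HORIZONTAL_UP,
    10: H264_HORIZONTAL,
    15: H264_HORIZONTAL_DOWN,
    18: H264_DIAG_DOWN_RIGHT,
    21: H264_VERTICAL_RIGHT,
    26: H264_VERTICAL,
    31: H264_VERTICAL_LEFT,
    34: H264_DIAG_DOWN_LEFT,
}

def map_intra_mode(hevc_mode):
    """Map an HEVC intra mode (0..34) to a candidate H.264 4x4 luma mode."""
    if not 0 <= hevc_mode <= 34:
        raise ValueError("HEVC intra modes are numbered 0 to 34")
    if hevc_mode in (0, 1):          # planar and DC
        return H264_DC
    nearest = min(ANCHORS, key=lambda anchor: abs(anchor - hevc_mode))
    return ANCHORS[nearest]

print([map_intra_mode(mode) for mode in range(35)])
```

In a real transcoder such a mapped mode would more plausibly serve as a starting candidate that narrows the H.264 mode search than as a final decision, since block sizes and reference samples differ between the two standards.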

Acronyms
List of the acronyms used in the slides.
ASP: Advanced Simple Profile
AVC: Advanced Video Coding
CABAC: Context Adaptive Binary Arithmetic Coder
DBLK: De-blocking
DCT: Discrete Cosine Transform
GIF: Graphics Interchange Format
HEVC: High Efficiency Video Coding
HLP: High Latency Profile
IDCT: Inverse Discrete Cosine Transform
IEC: International Electrotechnical Commission
ISO: International Organization for Standardization
ITU-T: International Telecommunication Union, Telecommunication Standardization Sector

Acronyms, continued
JCT-VC: Joint Collaborative Team on Video Coding
JVT: Joint Video Team
MC: Motion Compensation
ME: Motion Estimation
MP: Main Profile
MPEG: Moving Picture Experts Group
NAL: Network Abstraction Layer
Q: Quantization
Q-1: Inverse Quantization
QIF: Quarter Intermediate Format
QP: Quantization Parameter
SAO: Sample Adaptive Offset
T: Transform
VLC: Variable Length Coding
VLD: Variable Length Decoding

References
List of the references used in the thesis.
[1] I. E. G. Richardson, "Video Codec Design: Developing Image and Video Compression Systems", Wiley, 2002.
[2] J. Dubhashi, "Complexity Reduction of Motion Estimation in HEVC", M.S. Thesis, EE Department, UTA, Dec. 2014.
[3] K. R. Rao, D. N. Kim and J. J. Hwang, "Video Coding Standards: AVS China, H.264/MPEG-4 Part 10, HEVC, VP6, DIRAC and VC-1", Springer, 2014.
[4] ISO/IEC 14496-10:2003, "Coding of Audiovisual Objects - Part 10: Advanced Video Coding", 2003; also ITU-T Recommendation H.264, "Advanced Video Coding for Generic Audio-Visual Services".
[5] Joint Video Team (JVT), ITU-T website.
[6] S. Kwon, A. Tamhankar and K. R. Rao, "Overview of the H.264/MPEG-4 Part 10", Journal of Visual Communication and Image Representation, vol. 17, is. 9, April 2006.
[7] T. Wiegand and G. J. Sullivan, "The H.264 video coding standard", IEEE Signal Processing Magazine, vol. 24, March 2007.
[8] K. R. Rao and J. J. Hwang, "Techniques and Standards for Image/Video/Audio Coding", Prentice-Hall, 1996.
[9] A. Puri et al., "Video coding using the H.264/MPEG-4 AVC compression standard", Signal Processing: Image Communication, vol. 19, pp. 793-849, Oct.
[10] D. Marpe and T. Wiegand, "H.264/MPEG4-AVC Fidelity Range Extensions: Tools, Profiles, Performance, and Application Areas", Proc. IEEE International Conference on Image Processing 2005, vol. 1, pp. I-596, Sept.
[11] J. Ostermann et al., "Video coding with H.264/AVC: Tools, Performance, and Complexity", IEEE Circuits and Systems Magazine, vol. 4, issue 1, pp. 7-28, First Quarter 2004.
[12] I. E. G. Richardson, "H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia", Wiley, 2003.
[13] G. J. Sullivan et al., "Overview of the High Efficiency Video Coding (HEVC) standard", IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 12, Dec.
[14] H. Samet, "The quadtree and related hierarchical data structures", Comput. Survey, vol. 16, no. 2, pp. 187-260, Jun.
[15] J. Xin, C. W. Lin and M. T. Sun, "Digital Video Transcoding", Proceedings of the IEEE, vol. 93, issue 1, pp. 84-97, January 2005.
[16] A. Vetro, C. Christopoulos and H. Sun, "Video transcoding architectures and techniques: an overview", IEEE Signal Processing Magazine, vol. 20, issue 2, March 2003.
[17] P. Assunção and M. Ghanbari, "Post-processing of MPEG-2 coded video for transmission at lower bit-rates", in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing, Atlanta, GA, 1996.
[18] J. L. Wu, S. J. Huang, Y. M. Huang, C. T. Hsu and J. Shiu, "An efficient JPEG to MPEG-1 transcoding algorithm", IEEE Trans. Consumer Electron., vol. 42, Aug.
[19] N. Memon and R. Rodilia, "Transcoding GIF images to JPEG-LS", IEEE Trans. Consumer Electron., vol. 43, Aug.
[20] N. Feamster and S. Wee, "An MPEG-2 to H.263 transcoder", in Proc. SPIE Conf. Voice, Video Data Communications, Boston, MA, Sept.
[21] H. Kato, H. Yanagihara, Y. Nakajima and Y. Hatori, "A fast motion estimation algorithm for DV to MPEG-2 conversion", in Proc. IEEE Int. Conf. Consumer Electronics, Los Angeles, CA, June 2002.
[22] W. Lin, D. Bushmitch, R. Mudumbai and Y. Wang, "Design and implementation of a high-quality DV50-MPEG2 software transcoder", in Proc. IEEE Int. Conf. Consumer Electronics, Los Angeles, CA, June 2002.
[23] S. F. Chang and D. G. Messerschmidt, "Manipulation and compositing of MC-DCT compressed video", IEEE J. Select. Areas Commun., vol. 13, pp. 1-11, Jan.
[24] H. Sun, A. Vetro, J. Bao and T. Poon, "A new approach for memory-efficient ATV decoding", IEEE Trans. Consumer Electron., vol. 43, Aug.
[25] J. Wang and S. Yu, "Dynamic rate scaling of coded digital video for IVOD applications", IEEE Trans. Consumer Electron., vol. 44, Aug.
[26] P. Assunção and M. Ghanbari, "A frequency-domain video transcoder for dynamic bit-rate reduction of MPEG-2 bitstreams", IEEE Trans. Circuits Syst. Video Technol., vol. 8, Dec.
[27] S. Sharma, "Transcoding of H.264 bitstream to MPEG-2 bitstream", M.S. Thesis, EE Department, UT Arlington, May 2007.
[28] HEVC and VP9 video codecs - try them yourself.
[29] HEVC resources.
[30] How to use HM software.
[31] Exercise on running video codec.
[32] H.264 reference software.
[33] YUV video sequences.
[34] HEVC wavefront parallel processing animation.
[35] HEVC and its extensions.
[36] HEVC walkthrough by vcodex.
[37] HEVC analyzers.
[38] Relax its only HEVC.
[39] HEVC reference software.
[40] Test sequences.
[41] HM_16.7 location.
[42] JM_19.0 location.
[43] C. Fogg, "Suggested figures for the HEVC specification", ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) document JCTVC-J0292rl, July 2012.
[44] G. Sullivan, P. Topiwala and A. Luthra, "The H.264/AVC Advanced Video Coding Standard: Overview and Introduction to the Fidelity Range Extensions", SPIE Conference on Applications of Digital Image Processing XXVII, Special Session on Advances in the New Emerging Standard: H.264/AVC, August 2004.
[45] MPL website: http://www.uta.edu/faculty/krrao/dip/
[46] Video and image compression techniques blog.
[47] M. Wien, "High Efficiency Video Coding: Coding Tools and Specification", Springer, 2015.
[48] V. Sze, M. Budagavi and G. Sullivan, "High Efficiency Video Coding (HEVC): Algorithms and Architectures", Springer, 2014.
[49] JM software manual.
[50] HM software manual.
[51] D. Grois et al., "HEVC/H.265 Video Coding Standard including the Range Extensions, Scalable Extensions, and Multiview Extensions", IEEE International Conference on Image Processing (ICIP), Quebec, 2015.
[52] J. R. Ohm et al., "Comparison of the Coding Efficiency of Video Coding Standards - Including High Efficiency Video Coding (HEVC)", IEEE Trans. on CSVT, vol. 22, no. 12, Dec.
[53] Understanding in-loop filtering in the HEVC video standard.
[54] Next-Generation Video Transcoding.

Thank you!
Thank you to Dr. Rao for being my advisor. I am grateful to Dr. Dillon and Dr. Schizas for taking time out of their busy schedules and agreeing to be on my thesis committee. A big thank you to both of you. I also thank my MPL lab mates, my friends and my family for helping me whenever it was needed.

Questions?
I am happy to take any questions that you may have.
