Compression Efficiency and Delay Tradeoffs for Hierarchical B-Pictures and Pulsed-Quality Frames Athanasios Leontaris, Pamela C. Cosman Univ. of California.

Slides:



Advertisements
Similar presentations
Low-Complexity Transform and Quantization in H.264/AVC
Advertisements

Introduction to H.264 / AVC Video Coding Standard Multimedia Systems Sharif University of Technology November 2008.
INTERNATIONAL CONFERENCE ON TELECOMMUNICATIONS, ICT '09. TAREK OUNI WALID AYEDI MOHAMED ABID NATIONAL ENGINEERING SCHOOL OF SFAX New Low Complexity.
A Performance Analysis of the ITU-T Draft H.26L Video Coding Standard Anthony Joch, Faouzi Kossentini, Panos Nasiopoulos Packetvideo Workshop 2002 Department.
Basics of MPEG Picture sizes: up to 4095 x 4095 Most algorithms are for the CCIR 601 format for video frames Y-Cb-Cr color space NTSC: 525 lines per frame.
-1/20- MPEG 4, H.264 Compression Standards Presented by Dukhyun Chang
SWE 423: Multimedia Systems
H.264/AVC Baseline Profile Decoder Complexity Analysis Michael Horowitz, Anthony Joch, Faouzi Kossentini, and Antti Hallapuro IEEE TRANSACTIONS ON CIRCUITS.
Limin Liu, Member, IEEE Zhen Li, Member, IEEE Edward J. Delp, Fellow, IEEE CSVT 2009.
SCHOOL OF COMPUTING SCIENCE SIMON FRASER UNIVERSITY CMPT 820 : Error Mitigation Schaar and Chou, Multimedia over IP and Wireless Networks: Compression,
Recursive End-to-end Distortion Estimation with Model-based Cross-correlation Approximation Hua Yang, Kenneth Rose Signal Compression Lab University of.
Video Coding with Linear Compensation (VCLC) Arif Mahmood, Zartash Afzal Uzmi, Sohaib A Khan Department of Computer.
Efficient Moving Object Segmentation Algorithm Using Background Registration Technique Shao-Yi Chien, Shyh-Yih Ma, and Liang-Gee Chen, Fellow, IEEE Hsin-Hua.
Video Transmission Adopting Scalable Video Coding over Time- varying Networks Chun-Su Park, Nam-Hyeong Kim, Sang-Hee Park, Goo-Rak Kwon, and Sung-Jea Ko,
An Error-Resilient GOP Structure for Robust Video Transmission Tao Fang, Lap-Pui Chau Electrical and Electronic Engineering, Nanyan Techonological University.
1 Static Sprite Generation Prof ︰ David, Lin Student ︰ Jang-Ta, Jiang
SWE 423: Multimedia Systems Chapter 7: Data Compression (1)
Motion-compensation Fine-Granular-Scalability (MC-FGS) for wireless multimedia M. van der Schaar, H. Radha Proceedings of IEEE Symposium on Multimedia.
MPEG Audio Compression by V. Loumos. Introduction Motion Picture Experts Group (MPEG) International Standards Organization (ISO) First High Fidelity Audio.
Scalable Wavelet Video Coding Using Aliasing- Reduced Hierarchical Motion Compensation Xuguang Yang, Member, IEEE, and Kannan Ramchandran, Member, IEEE.
Encoder and Decoder Optimization for Source-Channel Prediction in Error Resilient Video Transmission Hua Yang and Kenneth Rose Signal Compression Lab ECE.
Efficient Fine Granularity Scalability Using Adaptive Leaky Factor Yunlong Gao and Lap-Pui Chau, Senior Member, IEEE IEEE TRANSACTIONS ON BROADCASTING,
Investigation of Motion-Compensated Lifted Wavelet Transforms Information Systems Laboratory Department of Electrical Engineering Stanford University Markus.
Source-Channel Prediction in Error Resilient Video Coding Hua Yang and Kenneth Rose Signal Compression Laboratory ECE Department University of California,
Scalable Rate Control for MPEG-4 Video Hung-Ju Lee, Member, IEEE, Tihao Chiang, Senior Member, IEEE, and Ya-Qin Zhang, Fellow, IEEE IEEE TRANSACTIONS ON.
Rate-Distortion Optimized Motion Estimation for Error Resilient Video Coding Hua Yang and Kenneth Rose Signal Compression Lab ECE Department University.
A Sequence-Based Rate Control Framework for Consistent Quality Real-Time Video Bo Xie and Wenjun Zeng CSVT 2006.
H.264/AVC for Wireless Applications Thomas Stockhammer, and Thomas Wiegand Institute for Communications Engineering, Munich University of Technology, Germany.
Xinqiao LiuRate constrained conditional replenishment1 Rate-Constrained Conditional Replenishment with Adaptive Change Detection Xinqiao Liu December 8,
4/24/2002SCL UCSB1 Optimal End-to-end Distortion Estimation for Drift Management in Scalable Video Coding H. Yang, R. Zhang and K. Rose Signal Compression.
Statistical Multiplexer of VBR video streams By Ofer Hadar Statistical Multiplexer of VBR video streams By Ofer Hadar.
09/24/02ICIP20021 Drift Management and Adaptive Bit Rate Allocation in Scalable Video Coding H. Yang, R. Zhang and K. Rose Signal Compression Lab ECE Department.
Variable Bit Rate Video Coding April 18, 2002 (Compressed Video over Networks: Chapter 9)
An Introduction to H.264/AVC and 3D Video Coding.
On Error Preserving Encryption Algorithms for Wireless Video Transmission Ali Saman Tosun and Wu-Chi Feng The Ohio State University Department of Computer.
GODIAN MABINDAH RUTHERFORD UNUSI RICHARD MWANGI.  Differential coding operates by making numbers small. This is a major goal in compression technology:
Kai-Chao Yang Hierarchical Prediction Structures in H.264/AVC.
Philipp Merkle, Aljoscha Smolic Karsten Müller, Thomas Wiegand CSVT 2007.
 Coding efficiency/Compression ratio:  The loss of information or distortion measure:
Frame by Frame Bit Allocation for Motion-Compensated Video Michael Ringenburg May 9, 2003.
Video Coding. Introduction Video Coding The objective of video coding is to compress moving images. The MPEG (Moving Picture Experts Group) and H.26X.
Picture typestMyn1 Picture types There are three types of coded pictures. I (intra) pictures are fields or frames coded as a stand-alone still image. These.
Audio Compression Usha Sree CMSC 691M 10/12/04. Motivation Efficient Storage Streaming Interactive Multimedia Applications.
Rate-distortion modeling of scalable video coders 指導教授:許子衡 教授 學生:王志嘉.
Low Bit Rate H Video Coding: Efficiency, Scalability and Error Resilience Faouzi Kossentini Signal Processing and Multimedia Group Department of.
Adaptive Multi-path Prediction for Error Resilient H.264 Coding Xiaosong Zhou, C.-C. Jay Kuo University of Southern California Multimedia Signal Processing.
Compression video overview 演講者:林崇元. Outline Introduction Fundamentals of video compression Picture type Signal quality measure Video encoder and decoder.
Rate-distortion Optimized Mode Selection Based on Multi-channel Realizations Markus Gärtner Davide Bertozzi Classroom Presentation 13 th March 2001.
Compression of Real-Time Cardiac MRI Video Sequences EE 368B Final Project December 8, 2000 Neal K. Bangerter and Julie C. Sabataitis.
Guillaume Laroche, Joel Jung, Beatrice Pesquet-Popescu CSVT
Advance in Scalable Video Coding Proc. IEEE 2005, Invited paper Jens-Rainer Ohm, Member, IEEE.
Fast motion estimation and mode decision for H.264 video coding in packet loss environment Li Liu, Xinhua Zhuang Computer Science Department, University.
Proxy-Based Reference Picture Selection for Error Resilient Conversational Video in Mobile Networks Wei Tu and Eckehard Steinbach, IEEE Transactions on.
UNDER THE GUIDANCE DR. K. R. RAO SUBMITTED BY SHAHEER AHMED ID : Encoding H.264 by Thread Level Parallelism.
Page 11/28/2016 CSE 40373/60373: Multimedia Systems Quantization  F(u, v) represents a DCT coefficient, Q(u, v) is a “quantization matrix” entry, and.
Flow Control in Compressed Video Communications #2 Multimedia Systems and Standards S2 IF ITTelkom.
Encoding Stored Video for Streaming Applications IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 11, NO. 2, FEBRUARY 2001 I.-Ming.
COMPARATIVE STUDY OF HEVC and H.264 INTRA FRAME CODING AND JPEG2000 BY Under the Guidance of Harshdeep Brahmasury Jain Dr. K. R. RAO ID MS Electrical.
Motion Estimation Multimedia Systems and Standards S2 IF Telkom University.
A Frame-Level Rate Control Scheme Based on Texture and Nontexture Rate Models for HEVC IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY,
Implementation and comparison study of H.264 and AVS china EE 5359 Multimedia Processing Spring 2012 Guidance : Prof K R Rao Pavan Kumar Reddy Gajjala.
Image Processing Architecture, © Oleh TretiakPage 1Lecture 5 ECEC 453 Image Processing Architecture Lecture 5, 1/22/2004 Rate-Distortion Theory,
Multi-Frame Motion Estimation and Mode Decision in H.264 Codec Shauli Rozen Amit Yedidia Supervised by Dr. Shlomo Greenberg Communication Systems Engineering.
MPEG Video Coding I: MPEG-1 1. Overview  MPEG: Moving Pictures Experts Group, established in 1988 for the development of digital video.  It is appropriately.
Complexity varying intra prediction in H.264 Supervisors: Dr. Ofer Hadar, Mr. Evgeny Kaminsky Students: Amit David, Yoav Galon.
Introduction to H.264 / AVC Video Coding Standard Multimedia Systems Sharif University of Technology November 2008.
Injong Rhee ICMCS’98 Presented by Wenyu Ren
Fully Scalable Multiview Wavelet Video Coding
Standards Presentation ECE 8873 – Data Compression and Modeling
Kyoungwoo Lee, Minyoung Kim, Nikil Dutt, and Nalini Venkatasubramanian
Presentation transcript:

Compression Efficiency and Delay Tradeoffs for Hierarchical B-Pictures and Pulsed-Quality Frames Athanasios Leontaris, Pamela C. Cosman Univ. of California at San Diego IEEE Transactions on Image Processing, July 2007

Outline Introduction End-to-End Delay Effect of branch removal from HIER coders Delay due to encoder output buffer Proposed framework for rate allocation Motivation Theoretical background Proposed estimate Rate allocation algorithm Rate control Results Conclusion

Introduction Constraining delay is critical for real-time communication and live event broadcast Compression efficiency can be improved by Increasing the buffering delay (bit rate allocated to each frame can vary) More flexible motion-compensated prediction structures When temporal correlation among several neighboring frames is better exploited, additional delay is incurred Example: Tradeoffs of delay and compression in MCTF, which delay was reduced by selectively removing the update step Delay is an issue for hierarchical bi-predictive structures, as well The delay in the hierarchical case depends on the GOP size Delay can’t be reduced by removing update steps while keeping the GOP size intact But in this work, it can be reduced by removing the MPC branches

Introduction One can also have increased delay when using a single- direction (forward) prediction One codec that employs two reference frames, short-term (ST) and long-term (LT) At constant transmission bit rate, the LT frames will take longer to transmit, introducing delay Compression efficiency can improved for certain sequences, but how about delay? A key element of a delay-constrained video encoder is the rate control scheme Rate control algorithms: Test Model 5 for MPEG-2, TM5 of replacing the block variance with the block SAD, and quadratic rate distortion model adopted in H.264

Introduction When the bit rate is distributed unevenly among the frames, extra buffering delay at the encoder output and decoder input is incurred So, given constraints on bit rate and buffering delay, such a R-D model can yield an efficient rate allocation To obtain a model for hierarchical prediction, we need to account for the temporal prediction distance In [16], the rate and distortion were calculated as functions of the power spectral density of the prediction error This model introduced the concept of MC accuracy In this work, we tend to use the accuracy to model the temporal prediction distance [16] B. Griod, “The efficiency of motion-compensating prediction for hybrid coding of video sequences”

End-to-End Delay The encoder is free to allocate the rate within the frames of the time unit as to optimize some quality criterion, while making sure that the unit as a whole adheres to the CBR target rate The encoder output buffer determines how tightly the rate allocation and rate control must operate Allowing the encoder output buffer to be larger leads to higher video quality

End-to-End Delay Bits buffered in decoder input buffer is the same as the encoder output buffer The encoder buffer fullness and the decoder buffer fullness are always complementary to each other and have a constant sum equal to the max size of each buffer The source coding end-to-end delay D e2e is : Assuming the size of the encoder output buffer and the decoder input buffer is B, we can obtain

Five types of encoders (1) (2)

Five types of encoders (3)

Five types of encoders (4)(5) IBBBP coder, where all B- coded pictures are disposable and use only I-and P-coded pictures as references

End-to-End Delay For all codecs apart from IBBBP For IBBBP Source coding end-to-end delay For IPPPP and PULSE codecs (N GOP = 1), the end- to-end delay is

Effect of branch removal from HIER coders

Truncated branch brings down the structural delay by one half The structure is similar to a GOP size 2 structure Differences: Still have 3 hierarchical levels and allows more granular temporal scalability or network condition Frame 4 is predicted frame 0, instead of being predicted from frame 2 as for GOP size 2. This means the compression performance will be worse than for a GOP size 2 structure For hierarchical B-pictures, the tradeoff of compression efficiency for delay becomes a tradeoff of compression efficiency for increased temporal scalability and bit-stream resilience and decreased delay

Delay due to encoder output buffer To avoid a buffer overflow during encoding, the necessary condition is The encoder can estimate the encoder output buffer length from the bit rate allocation A useful and intuitive lower bound is It translates the delay constraint into a rate allocation constraint

Proposed Framework for Rate Allocation

Motivation In the JM reference software uses a single QP for the entire frame however, rate allocation under tight delay constraints can’t use the same QP for the entire frame Extend the rate control algorithm to offer per-block decisions of QP, and seek to avoid buffer overflow and underflow and satisfy the target rate Goal : establish the bit rate allocation for different hierarchical levels with B-coded pictures We found through experiments that the efficiency of the bi-directional prediction of a frame depends on the distance from its references

Motivation Assumption a) Frames within a temporal decomposition level have similar entropy → can be afforded the same number of bits We seek a solution that doesn’t depend on video content: fixed proportion of bits for each temporal decomposition level The requirement for fixed ratios is a result of computational and delay constrains

Motivation Assumption b) Closed-loop coding Refers to using as references the previously reconstructed version of the frames Approaches for rate allocation in open loop MCTF are based on temporal propagation of the error didn’t take into account the temporal distance between the frames and lack any delay constraints Not appropriate for this work since we can’t afford the delay and the computational complexity

Motivation Assumption c) High-rate operation Closed-loop prediction at high rates doesn’t alter the signal significantly Effect of quantization error on prediction efficiency can be neglected for fine quantization Then, using a closed-loop video coder with the optimal open-loop rate allocation performs close the optimal closed-loop rate allocation

Theoretical background Rate-distortion (R-D) modeling scheme from [16] and [24] [16] B. Girod, “The efficiency of motion-compensating prediction for hybrid coding of video sequences” [24] B. Girod, “Efficiency analysis of multihypothesis motion-compensated prediction for video coding”

Theoretical background The prediction error:

Theoretical background Assume the p.d.f. of the displacement and is a function of the temporal prediction distance If the power spectral density of the prediction error is known, then the error variance is given as The well-known rate distortion function for memoryless coding is

Theoretical background Power spectrum is calculated for N- hypothesis prediction in [24] N=1 N=2

Theoretical background The power spectrum of the signal s is found in (19) in [17] as The noise power spectrum is The displacement error p.d.f Fourier transforms of the above p.d.f

Theoretical background and represent the Fourier transform of the spatial filter for single and double hypothesis For N=1 hypotheses, For N=2 hypotheses, Continue the derivation of the power spectral densities

Proposed estimate seems to be a logarithmic function of the temporal distance

Proposed estimate Replace with the expression derived the error variance Term produces approximately logarithmically spaced rate- distortion function

Proposed estimate For fixed the standard deviation of the motion compensation displacement error varies approximately linearly with the temporal prediction distance The final rate-distortion model is written as The motivation behind adding the term to the denominator of R l is that hybrid video coding is closed-loop and thus a case of dependent video coding

Rate allocation algorithm

An alternative approach to calculate the rate ratios was proposed by a reviewer for the HIER structure with N GOP = 4 Constraining D t, solve the equation (21) to find D r. After D r has been calculated, the calculation of R 0, R 1, and R 2 is straightforward

Rate control For the IPPPP codec, the rate control algorithm is included in JM 10.1 reference software is directed used To ensure accurate rate control under tight delay constraints we adopt the rate control approach of the PULSE codec, with multiple rate-control paths, each of which maintains its own quadratic model For a hierarchical stream, the number of rate control bins is equal to the number of temporal decomposition levels

Results Video sequences: Akiyo : very static image sequence Carphone : include localized motion of various kinds. The majority of activity is due to the instability of the camera inside the car. There is repetitive translational global motion Flower : high freq. content, and the motion is global and follows mainly the affine model Football : extremely active with local object motion Mobile : substantial high freq. content and the motion is mostly global due to the horizontal camera pan Stefan : sports clip featuring a tennis court with very high motion

Results PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Akiyo” CIF 352x288 at 15 fps. Initial QP 39. (b) “Akiyo” CIF at 30 fps. Initial QP 30.

Results PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Carphone” QCIF 176x144 at 15 fps. Initial QP 31. (b) “Carphone” QCIF at 30 fps. Initial QP 29

Results PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Flower-Garden” SIF 352x240 at 30 fps. Initial QP 29. (b) “Football” SIF at 30 fps. Initial QP 33

Results PSNR versus delay (in seconds) for fixed source coding bit rate. (a) “Mobile-Calendar” QCIF 176x144 at 30 fps. Initial QP 29. (b) “Stefan” CIF 352x288 at 30 fps. Initial QP 31.

Results

Conclusion The study of the delay tradeoffs yielded the following conclusions: IPPPP performs well for low delay applications and for sequences with high motion PULSE is advantageous for relatively static sequences with repetitive content N GOP > 1 structures benefit from static sequences and from sequences with global motion As N GOP increases, the gain is nontrivial only if the sequence is either static, or if the global motion is translational For the sequences we evaluated, the delay thresholds are as follows: between 40 and 80 ms, IPPPP is the best choice, between 80 and 125 ms PULSE performs well, the large space between 125 and 270 ms is dominated by N GOP = 2, and for delays larger than 270 ms, then N GOP = 4 is the best choice. Delays larger than 270 ms are only however useful in cases of live event broadcast or streaming of stored content. They are prohibitive for real-time interactive communication The truncated N GOP = 4 codec underperforms the N GOP = 2 codec but has similar delay with the added advantage of increased temporal scalability