Frame by Frame Bit Allocation for Motion-Compensated Video Michael Ringenburg May 9, 2003.


The Problem
Given a maximum bit budget $B$ and a video with $F$ frames, how many bits $b_i$ should we allocate to each frame $f_i$ in order to maximize the overall quality of the video? Formally, our constraint is: $\sum_{i=1}^{F} b_i \le B$. We assume a lossy, embedded coding scheme, so we can choose the exact number of bits to allocate to each frame.

Rate-Distortion Curves
If we increase the number of bits allocated to a frame (the frame's bit rate) and hold everything else constant, the frame's distortion (the mean squared error, or MSE) decreases. The distortion decays exponentially with the number of bits $b$ (as $2^{-b}$).

Motion Compensation
Each frame is predicted from the previous frame. We find blocks in the previous frame that are similar to blocks in the current frame, and calculate motion vectors that estimate the displacement between the previous and current blocks. We encode only the difference, or "residue", between the predicted and actual frame. This complicates bit allocation, because the quality of a frame depends not only on its own rate but also on the rates of all previous frames.

Measuring Video Quality
We can measure the quality of individual frames with MSE (distortion) or PSNR (peak signal-to-noise ratio), but how do we measure the overall quality of the whole video?
– Method 1 (MMSE): minimize the mean MSE.
– Method 2 (MINMAX): minimize the maximum MSE. This leads to constant quality, which may be more visually appealing.
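
To make the difference between the two criteria concrete, here is a minimal Python sketch. The per-frame MSE values are hypothetical, and the PSNR helper assumes 8-bit samples (peak value 255), which the slides do not state.

```python
import math

def psnr(mse, peak=255.0):
    """PSNR in dB for a given MSE, assuming 8-bit samples (peak = 255)."""
    return 10.0 * math.log10(peak * peak / mse)

def mean_mse(frame_mse):
    """MMSE criterion: average distortion over all frames."""
    return sum(frame_mse) / len(frame_mse)

def max_mse(frame_mse):
    """MINMAX criterion: distortion of the worst frame."""
    return max(frame_mse)

# Hypothetical per-frame MSE values for two allocations of the same budget.
steady = [20.0, 20.0, 20.0, 20.0]
bursty = [5.0, 5.0, 5.0, 45.0]

for name, frames in (("steady", steady), ("bursty", bursty)):
    worst = max_mse(frames)
    print(name, "mean MSE:", mean_mse(frames), "max MSE:", worst,
          "worst-frame PSNR (dB):", round(psnr(worst), 2))
```

In this toy data the bursty allocation wins under MMSE (mean 15 versus 20) while the steady allocation wins under MINMAX (maximum 20 versus 45), which is why the two criteria can lead to different bit allocations.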

Outline of Talk
– Cheng-Li-Kuo '97 algorithm for MMSE
– Yang-Hemami '99 algorithm for MINMAX
– If time permits:
  – Adapting these algorithms for Group Testing for Video (GTV)
  – A new algorithm for the MINMAX metric

Preliminaries
– I frames: "independent" frames, which are not predicted. The distortion of an I frame depends only on its own bit rate.
– P frames: "predicted" frames. The distortion depends on the frame's own bit rate, plus the bit rates of the most recent I frame and all P frames in between.
– Group of Pictures (GOP): an I frame followed by some number of P frames.
Both algorithms optimize individual GOPs.

Cheng-Li-Kuo '97 Algorithm
From "Rate Control for an Embedded Wavelet Video Coder", by Po-Yuen Cheng, Jin Li, and C.-C. Jay Kuo.
– Minimizes the mean MSE of each Group of Pictures.
– Based on experimental observations of the rate-distortion and motion-compensation behavior of typical videos.
– Uses the Lagrange multiplier method to derive optimal allocations.

Rate-Distortion Curves
The authors experimentally determined that rate-distortion curves can be approximated by: $D(R) = D_{max} \cdot 2^{-\beta R}$. $D_{max}$ is the distortion at rate 0. This is equivalent to the variance of the wavelet coefficients.

More on the β parameter
A larger β indicates more efficient coding. $\beta_I$ is typically larger than $\beta_P$, because the I frame quality depends only on its own allocation. The ratio $\beta_I / \beta_P$ is usually between 1.1 and 1.4.
Examples:
– Flower: $\beta_I$ = 2.07, $\beta_P$ = 1.50
– Mobile: $\beta_I$ = 1.65, $\beta_P$ = 1.50
– Tennis: $\beta_I$ = 1.08, $\beta_P$ = 0.86
– Cheer: $\beta_I$ = 1.72, $\beta_P$ = 1.43
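
As a rough illustration of what β controls, the sketch below evaluates an exponential rate-distortion curve of the form $D(R) = D_{max} \cdot 2^{-\beta R}$ and its inverse. The functional form is an assumption consistent with the exponential decay described in these slides; the `d_max` value of 100 and the rate unit are hypothetical. Only the β values come from the examples above.

```python
import math

def distortion(d_max, beta, rate):
    """Assumed model: distortion decays as d_max * 2^(-beta * rate)."""
    return d_max * 2.0 ** (-beta * rate)

def rate_for(d_max, beta, target_d):
    """Inverse of the model: rate needed to reach target_d (never negative)."""
    return max(0.0, math.log2(d_max / target_d) / beta)

# (beta_I, beta_P) pairs quoted on the slide.
BETAS = {
    "Flower": (2.07, 1.50),
    "Mobile": (1.65, 1.50),
    "Tennis": (1.08, 0.86),
    "Cheer": (1.72, 1.43),
}

for name, (beta_i, beta_p) in BETAS.items():
    # Rate an I frame and a P frame would each need to cut distortion
    # from a hypothetical d_max of 100 down to 10.
    print(name,
          "ratio:", round(beta_i / beta_p, 2),
          "I rate:", round(rate_for(100.0, beta_i, 10.0), 2),
          "P rate:", round(rate_for(100.0, beta_p, 10.0), 2))
```

Under this model a larger β means fewer bits are needed for the same drop in distortion, which matches the reading of β as coding efficiency.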

Frame Dependency
If $e$ is the residue of motion compensation, $f$ is the predicted frame, $d$ is the displacement, and $g$ is the reference frame after lossy encoding, then: $e(x) = f(x) - g(x - d)$. If $E$ denotes the expected value, the variance of the residue is: $\sigma^2 = E[e^2] = E[(f(x) - g(x - d))^2]$.

Frame Dependency
Let $a$ be the actual reference frame (as opposed to the reference frame after lossy encoding): $e(x) = [f(x) - a(x - d)] + [a(x - d) - g(x - d)]$. This is the residue with respect to the original reference frame, plus the coding error of the reference frame. We assume the two terms are not correlated.

Frame Dependency
Let $\sigma_a^2$ be the variance if the reference frame were perfectly coded: $\sigma^2 = \sigma_a^2 + E[(a(x - d) - g(x - d))^2]$. The second term is the distortion of the motion-translated reference frame, which is linearly related to the actual distortion $D_{ref}$ of the non-translated reference frame. Thus: $\sigma^2 = \sigma_a^2 + \alpha D_{ref}$.

More on the α parameter
– Typically ranges from 0.5 to 0.9.
– A higher α indicates better quality motion compensation.
– α decreases if there is violent motion or a scene change.
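
The dependency described above can be sketched as a simple recursion. This is an illustration under stated assumptions, not the authors' code: each frame's residue variance is taken to be its perfectly-coded variance plus a dependency coefficient (called `alpha` here) times the previous frame's distortion, and distortion is taken to decay as $2^{-\beta R}$. All numeric values are hypothetical.

```python
def propagate_distortion(sigma_a2, alpha, beta, rates):
    """Return per-frame distortions for a GOP under the assumed model.

    sigma_a2[i]: residue variance of frame i if its reference were perfect
    alpha[i]:    dependency on the previous frame's distortion (0 for an I frame)
    beta[i]:     coding efficiency of frame i
    rates[i]:    bits allocated to frame i
    """
    distortions = []
    d_prev = 0.0
    for s2, a, b, r in zip(sigma_a2, alpha, beta, rates):
        variance = s2 + a * d_prev          # sigma^2 = sigma_a^2 + alpha * D_ref
        d_prev = variance * 2.0 ** (-b * r)  # assumed exponential R-D curve
        distortions.append(d_prev)
    return distortions

# Hypothetical two-frame GOP: one I frame followed by one P frame.
print(propagate_distortion([100.0, 40.0], [0.0, 0.5], [1.0, 1.0], [1.0, 1.0]))

# Starving the I frame raises the distortion of the P frame as well.
print(propagate_distortion([100.0, 40.0], [0.0, 0.5], [1.0, 1.0], [0.0, 1.0]))
```

The second call shows why allocation is harder with motion compensation: bits removed from the reference frame degrade every frame predicted from it.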

Lagrange multiplier method
Using the experimentally observed rate-distortion model and the frame dependency we just derived, we can solve for the optimal allocation using the Lagrange multiplier method. Let $R_{GOP}$ be the number of bits assigned to a group of pictures, and let $R_i$ and $D_i$ be the rate and distortion of frame $i$. We minimize: $J = \sum_i D_i + \lambda \left( \sum_i R_i - R_{GOP} \right)$.
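
The Lagrange multiplier idea can be illustrated numerically in a deliberately simplified setting. The sketch below assumes independent frames (no frame dependency) with exponential rate-distortion curves $D_i = D_{max,i} \cdot 2^{-\beta_i R_i}$, so it is not the authors' closed-form solution. At the optimum every frame with a nonzero rate has the same slope $|dD_i/dR_i| = \lambda$, and the code bisects on $\lambda$ until the rates sum to the budget. All names and numbers are hypothetical.

```python
import math

LN2 = math.log(2.0)

def rates_for_slope(d_max, beta, lam):
    """Rates at which each frame's R-D slope magnitude equals lam."""
    rates = []
    for dm, b in zip(d_max, beta):
        slope_at_zero = b * LN2 * dm  # |dD/dR| at R = 0
        if slope_at_zero > lam:
            rates.append(math.log2(slope_at_zero / lam) / b)
        else:
            rates.append(0.0)         # frame is not worth any bits at this slope
    return rates

def allocate(d_max, beta, r_gop, iters=200):
    """Bisect on the multiplier until the rates sum to r_gop."""
    hi = max(b * LN2 * dm for dm, b in zip(d_max, beta))  # all rates are 0 here
    lo = hi
    while sum(rates_for_slope(d_max, beta, lo)) < r_gop:
        lo /= 2.0                     # smaller slope means more bits
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(rates_for_slope(d_max, beta, mid)) > r_gop:
            lo = mid
        else:
            hi = mid
    return rates_for_slope(d_max, beta, hi)

rates = allocate([400.0, 60.0, 60.0, 80.0], [1.65, 1.5, 1.5, 1.5], 12.0)
print([round(r, 3) for r in rates], "total:", round(sum(rates), 6))
```

Extending this to dependent frames means the derivative of the objective with respect to one frame's rate also includes its effect on later frames, which is what the frame-dependency model above is needed for.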

Solution
The authors solve the minimization, and derive:

Parameters
We need to determine the variance $\sigma_a^2$, the coding efficiency β, and the dependency α for every frame in the Group of Pictures before we begin coding it. Alternatively, if this is too expensive, we can estimate the values using the previous GOP. $K$ is an adjustable parameter. We perform a binary search until the rate constraint is met.

Experimental Results

Yang-Hemami '99 Algorithm
From "MINMAX Frame Rate Control Using a Rate-Distortion Optimized Wavelet Coder", by Yan Yang and Sheila Hemami.
– Minimizes the maximum distortion of any frame in the Group of Pictures.
– Leads to constant quality within a GOP.

Outline of Algorithm
Let $N$ be the number of frames in a GOP. Recall that the first frame is an I frame and the rest are P frames.
Step 1: Find rates $R_I$ and $R_P$ such that $R_I + (N-1) R_P = R_t$ and $D_1(R_I) = D_2(R_P) = D$.
Step 2: Code the rest of the frames to distortion $D$. If the rate starts to get too high, adaptively raise $D$. If the rate starts to get too low, adaptively lower $D$.

Finding the Initial Rates
Initially assume that all P frames have identical rate-distortion curves.
Binary search:
– Let $R(D) = R_1(D) + (N-1) R_2(D)$.
– Find a $D_1$ and $D_2$ such that $R(D_1) < R_t < R(D_2)$.
– Repeat until $|R(D) - R_t| < \varepsilon$, where $\varepsilon = R_t \times 1\%$:
  – Let $D = (D_1 + D_2)/2$.
  – If $R(D) < R_t$, let $D_1 = D$. Else, let $D_2 = D$.
Variant: force $D_I$ to be slightly less than $D_P$.
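
The binary search above can be sketched as follows. In a real coder $R_1(D)$ and $R_2(D)$ would come from actually encoding the frames; here they are replaced by an assumed exponential model $D = D_{max} \cdot 2^{-\beta R}$ so the example is self-contained. The curve parameters and the function names are hypothetical.

```python
import math

def rate_for(d_max, beta, d):
    """Stand-in for R_k(D): rate needed to reach distortion d under the assumed model."""
    return max(0.0, math.log2(d_max / d) / beta)

def initial_rates(i_curve, p_curve, n_frames, r_t, tol_frac=0.01, max_iters=200):
    """Find a common distortion D with R_I + (N-1) * R_P close to r_t (r_t > 0).

    i_curve and p_curve are (d_max, beta) pairs for the I frame and a P frame.
    """
    def total_rate(d):
        return rate_for(*i_curve, d) + (n_frames - 1) * rate_for(*p_curve, d)

    d1 = max(i_curve[0], p_curve[0])  # rate is 0 here, so R(d1) < r_t
    d2 = d1
    while total_rate(d2) <= r_t:      # shrink until R(d2) > r_t
        d2 /= 2.0

    d = 0.5 * (d1 + d2)
    for _ in range(max_iters):
        if abs(total_rate(d) - r_t) < tol_frac * r_t:
            break
        if total_rate(d) < r_t:
            d1 = d                    # too few bits: move toward lower distortion
        else:
            d2 = d                    # too many bits: move toward higher distortion
        d = 0.5 * (d1 + d2)
    return d, rate_for(*i_curve, d), rate_for(*p_curve, d)

d, r_i, r_p = initial_rates((100.0, 1.65), (60.0, 1.5), n_frames=10, r_t=20.0)
print("D:", round(d, 3), "R_I:", round(r_i, 3), "R_P:", round(r_p, 3),
      "total:", round(r_i + 9 * r_p, 3))
```

Because total rate falls monotonically as the target distortion rises, the bracket always contains a distortion that meets the rate target, so the loop stops once the total is within 1% of `r_t`.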

Adaptive Adjustment Algorithm
In reality, the P frames are not identical. Iterate over the P frames, coding each to the current target distortion $D_P$. Let $\Delta$ be the mismatch between $j R_P$ and the number of bits actually used to code the first $j$ P frames. Let $\Delta_{up}$ and $\Delta_{low}$ be upper and lower bounds on the allowable mismatch.
– If $\Delta > \Delta_{up}$, raise $D_P$ using the update algorithm.
– If $\Delta < \Delta_{low}$, lower $D_P$ using the update algorithm.

Update Algorithm
Code the current P frame $j$ at rate $R_j = R_{j-1} - \Delta/(N - j)$. If $\Delta$ is still outside the allowable range, we use the update algorithm again on the next frame. Once $\Delta$ is back within the acceptable range, we update the target distortion $D_P$.
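
A small numeric sketch of the mismatch test and the rate correction follows. The sign convention is an assumption: the mismatch is taken as bits used minus the planned $j R_P$, so a positive value means overspending, which is consistent with raising the target distortion when the mismatch exceeds its upper bound. The function names and all numbers are hypothetical.

```python
def mismatch(bits_used, j, r_p):
    """Bits spent on the first j P frames minus the planned j * r_p."""
    return bits_used - j * r_p

def needs_update(delta, delta_low, delta_up):
    """True when the mismatch has left the allowable range."""
    return delta > delta_up or delta < delta_low

def corrected_rate(prev_rate, delta, n_frames, j):
    """Spread the mismatch over the remaining frames: R_j = R_{j-1} - delta / (N - j)."""
    return prev_rate - delta / (n_frames - j)

# Hypothetical GOP of N = 10 frames with a planned P-frame rate of 2.0.
n_frames, r_p = 10, 2.0
delta = mismatch(bits_used=9.2, j=4, r_p=r_p)  # overspent by about 1.2
if needs_update(delta, delta_low=-1.0, delta_up=1.0):
    next_rate = corrected_rate(r_p, delta, n_frames, j=4)
    print("mismatch:", round(delta, 3), "next rate:", round(next_rate, 3))
```

Dividing by the number of remaining frames keeps each correction small, so a single expensive frame does not cause a visible jump in quality.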

Experimental Results

Adapting for GTV
Both algorithms optimize GOPs. Group Testing for Video (GTV) doesn't have GOPs in the traditional sense: there is just a single I frame at the beginning of the video, followed only by P frames. But GTV does have "pseudo"-I frames. When there is a scene change or violent motion, the frame is not predicted very well, and thus it behaves like an I frame. Pseudo-I frames are usually much less frequent than traditional I frames, though, and are not evenly distributed. We detect them by the magnitude of the residual.

Adapting Yang-Hemami '99
– Approach 1: Run the initial rate algorithm on the first two frames. Run the adaptive adjustment algorithm until we encounter a "pseudo"-I frame. Repeat.
– Approach 2: Create N-frame GOPs at the beginning and at every "pseudo"-I frame. Use constant bit-rate coding outside the GOPs.

Adapting Cheng-Li-Kuo '97
– Approach 1: Start a new GOP after N frames or at the next "pseudo"-I frame, whichever comes first. Even if the first frame of the new GOP is a P frame, it will behave like an I frame with a low β value, because all previous allocations are fixed.
– Approach 2: Create N-frame GOPs at the beginning and at every "pseudo"-I frame. Use constant bit-rate coding outside the GOPs.

New algorithm
Set targetD to a small value. Repeat:
– Encode all frames in the GOP to distortion targetD.
– If |bits_used – max_bits| < ε, break.
– Scale all allocations by max_bits/bits_used and encode the GOP.
– If the variance of the frame distortions is less than max_variance, break.
– Set targetD to the average distortion.
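
The loop above can be tried out against a toy encoder. The sketch below is only an illustration: the two `encode_*` helpers stand in for a real embedded coder and assume the exponential rate-distortion model with a linear frame dependency (`alpha`), and the iteration cap and the guard against a zero bit count are additions that are not part of the slide. All parameter values are hypothetical.

```python
import math

def encode_to_distortion(sigma_a2, alpha, beta, target_d):
    """Toy encoder: bits each frame needs to reach target_d."""
    rates, d_prev = [], 0.0
    for s2, a, b in zip(sigma_a2, alpha, beta):
        variance = s2 + a * d_prev
        r = max(0.0, math.log2(variance / target_d) / b)
        d_prev = variance * 2.0 ** (-b * r)
        rates.append(r)
    return rates

def encode_at_rates(sigma_a2, alpha, beta, rates):
    """Toy encoder: distortion of each frame for the given allocation."""
    dists, d_prev = [], 0.0
    for s2, a, b, r in zip(sigma_a2, alpha, beta, rates):
        d_prev = (s2 + a * d_prev) * 2.0 ** (-b * r)
        dists.append(d_prev)
    return dists

def minmax_allocate(sigma_a2, alpha, beta, max_bits, eps, max_variance,
                    target_d=1e-3, max_iters=50):
    rates = [0.0] * len(sigma_a2)
    for _ in range(max_iters):             # cap added for safety (not on the slide)
        rates = encode_to_distortion(sigma_a2, alpha, beta, target_d)
        bits_used = sum(rates)
        if bits_used == 0.0 or abs(bits_used - max_bits) < eps:
            break
        rates = [r * max_bits / bits_used for r in rates]
        dists = encode_at_rates(sigma_a2, alpha, beta, rates)
        mean_d = sum(dists) / len(dists)
        variance = sum((d - mean_d) ** 2 for d in dists) / len(dists)
        if variance < max_variance:
            break
        target_d = mean_d
    return rates, encode_at_rates(sigma_a2, alpha, beta, rates)

rates, dists = minmax_allocate(
    sigma_a2=[400.0, 60.0, 60.0, 80.0], alpha=[0.0, 0.7, 0.7, 0.7],
    beta=[1.65, 1.5, 1.5, 1.5], max_bits=12.0, eps=0.05, max_variance=0.5)
print([round(r, 2) for r in rates], [round(d, 2) for d in dists])
```

Whichever exit is taken, the returned allocation either matches the budget to within ε or has been rescaled to sum to max_bits, and the per-frame distortions end up close to one another, which is the constant-quality behavior the MINMAX metric rewards.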

My Project
– Implement Cheng-Li-Kuo, Yang-Hemami, and my new algorithm in the context of GTV.
– Compare the speed and quality of the three algorithms.