Presentation is loading. Please wait.

Presentation is loading. Please wait.

Global MINMAX Interframe Bit Allocation for Embedded Video Coding Michael Ringenburg Qualifying Project Presentation Advisors: Richard Ladner (CSE) and.

Similar presentations


Presentation on theme: "Global MINMAX Interframe Bit Allocation for Embedded Video Coding Michael Ringenburg Qualifying Project Presentation Advisors: Richard Ladner (CSE) and."— Presentation transcript:

1 Global MINMAX Interframe Bit Allocation for Embedded Video Coding Michael Ringenburg Qualifying Project Presentation Advisors: Richard Ladner (CSE) and Eve Riskin (EE)

2 Outline Quick Introduction to Video Compression Quick Introduction to Video Compression The Bit Allocation Problem The Bit Allocation Problem Algorithms Algorithms Our New Algorithms Our New Algorithms MultiStage MultiStage Ratio Ratio Enhancements to our algorithms Enhancements to our algorithms Buffer Control Buffer Control Pseudo I-Frames Pseudo I-Frames Results Results

3 The Quick Introduction to Video Compression 4 Steps: 4 Steps: Motion compensation. Motion compensation. Outputs motion vectors, which predict the frame from the previous frame. Outputs motion vectors, which predict the frame from the previous frame. Actual - Predicted = Residual Actual - Predicted = Residual Transform the residual. Transform the residual. Concentrates the energy (information). Concentrates the energy (information). Lossy encoding of the transformed residual. Lossy encoding of the transformed residual. Decoding. Decoding. Generates an approximation of the original frame. Generates an approximation of the original frame.

4 Motion Compensation T=1 (reference)T=2 (current) Thanks to Gidon Shavit

5 Residual Examples (Boosted) OriginalResidual Thanks to Gidon Shavit

6 The Quick Introduction to Video Compression 4 Steps: 4 Steps: Motion compensation. Motion compensation. Outputs motion vectors, which predict the frame from the previous frame. Outputs motion vectors, which predict the frame from the previous frame. Actual - Predicted = Residual Actual - Predicted = Residual Transform the residual. Transform the residual. Concentrates the energy (information). Concentrates the energy (information). Lossy encoding of the transformed residual. Lossy encoding of the transformed residual. Decoding. Decoding. Generates an approximation of the original frame. Generates an approximation of the original frame.

7 Transform Example (DCT) Thanks to Gidon Shavit

8 The Quick Introduction to Video Compression 4 Steps: 4 Steps: Motion compensation. Motion compensation. Outputs motion vectors, which predict the frame from the previous frame. Outputs motion vectors, which predict the frame from the previous frame. Actual - Predicted = Residual Actual - Predicted = Residual Transform the residual. Transform the residual. Concentrates the energy (information). Concentrates the energy (information). Lossy encoding of the transformed residual. Lossy encoding of the transformed residual. Quantization vs. Embedded Coding Quantization vs. Embedded Coding Decoding. Decoding. Generates an approximation of the original frame. Generates an approximation of the original frame.

9 Quantization-Based Coders Quantization-based coders round off (“quantize”) the coefficients Quantization-based coders round off (“quantize”) the coefficients Losslessly encode the resulting approximations Losslessly encode the resulting approximations Example: 101011 ---> 101000. Example: 101011 ---> 101000. Encoder can specify the number of significant digits to preserve Encoder can specify the number of significant digits to preserve But, can not know how large the frame will be after lossless encoding But, can not know how large the frame will be after lossless encoding

10 Embedded Coding Embedded coding allows the encoder to specify an exact size for each frame. Embedded coding allows the encoder to specify an exact size for each frame. Produces a bit stream which can be cut off at any point and still produce a valid approximation of the frame. Produces a bit stream which can be cut off at any point and still produce a valid approximation of the frame. More bits generally results in a better approximation (but not always). More bits generally results in a better approximation (but not always).

11 Embedded Bit Plane Coding The n-th bit plane consists of the n-th bit of each coefficient. The n-th bit plane consists of the n-th bit of each coefficient. Each bit plane is compressed with an embedded, lossless code. Each bit plane is compressed with an embedded, lossless code. Send the bit planes in order of significance. Send the bit planes in order of significance. The most important information is sent first. The most important information is sent first. Can terminate the stream between or within bit planes. Can terminate the stream between or within bit planes.

12 The Quick Introduction to Video Compression 4 Steps: 4 Steps: Motion compensation. Motion compensation. Outputs motion vectors, which predict the frame from the previous frame. Outputs motion vectors, which predict the frame from the previous frame. Actual - Predicted = Residual Actual - Predicted = Residual Transform the residual. Transform the residual. Concentrates the energy (information). Concentrates the energy (information). Lossy encoding of the transformed residual. Lossy encoding of the transformed residual. Decoding. Decoding. Generates an approximation of the original frame. Generates an approximation of the original frame.

13 The GTV Coder An embedded bit plane video coder developed at the University of Washington. An embedded bit plane video coder developed at the University of Washington. Written by Jeff West, Gidon Shavit and Michael Ringenburg, with Richard Ladner and Eve Riskin. Written by Jeff West, Gidon Shavit and Michael Ringenburg, with Richard Ladner and Eve Riskin. Motion Compensation on 16 x 16 blocks Motion Compensation on 16 x 16 blocks 8 x 8 Discrete Cosine Transform 8 x 8 Discrete Cosine Transform Lossy coding based on the GT-DCT image coder developed by Ed Hong with Richard Ladner and Eve Riskin Lossy coding based on the GT-DCT image coder developed by Ed Hong with Richard Ladner and Eve Riskin All results in this talk use the GTV coder All results in this talk use the GTV coder

14 Where we are Quick Introduction to Video Compression Quick Introduction to Video Compression The Bit Allocation Problem The Bit Allocation Problem Algorithms Algorithms Our New Algorithms Our New Algorithms MultiStage MultiStage Ratio Ratio Enhancements to our algorithms Enhancements to our algorithms Buffer Control Buffer Control Pseudo I-Frames Pseudo I-Frames Results Results

15 Interframe Bit Allocation Given : An embedded coder, a maximum bit budget B and a video with F frames. Given : An embedded coder, a maximum bit budget B and a video with F frames. How many bits b i should we allocate to each frame f i in order to maximize the “overall quality” of the video? How many bits b i should we allocate to each frame f i in order to maximize the “overall quality” of the video? Formally, our constraint is: Formally, our constraint is:

16 Why Is This Important? Some frames require more bits (are harder) to encode, because … Some frames require more bits (are harder) to encode, because … Scene changes prevent good motion compensation. Scene changes prevent good motion compensation. As do lighting changes. As do lighting changes. And, to a lesser extent, dramatic actions. And, to a lesser extent, dramatic actions. Also, some images are inherently more difficult to encode. Also, some images are inherently more difficult to encode.

17 Why Is This Difficult? Motion Compensation!!! Motion Compensation!!! The quality of frame n is a function of the allocation to frame n, and the quality of frame (n-1). The quality of frame n is a function of the allocation to frame n, and the quality of frame (n-1). Quality of frame (n-1) is a function of the allocation to frame (n-1) and the quality of frame (n-2). Et cetera. Quality of frame (n-1) is a function of the allocation to frame (n-1) and the quality of frame (n-2). Et cetera. Thus, the quality of frame n is a function of n variables. Thus, the quality of frame n is a function of n variables. The quality function (a.k.a. rate-distortion curve) is different for every frame. The quality function (a.k.a. rate-distortion curve) is different for every frame. The rate-distortion curves are not monotonic. The rate-distortion curves are not monotonic. Approximately monotonic, though. Approximately monotonic, though.

18 Video Quality Metrics Typically, measure the quality of individual frames with: Typically, measure the quality of individual frames with: Distortion: MSE (Mean Squared Error) Distortion: MSE (Mean Squared Error) Quality: PSNR (Peak Signal-to-Noise Ratio) Quality: PSNR (Peak Signal-to-Noise Ratio) How do we measure the overall quality of the whole video? How do we measure the overall quality of the whole video? Method 1 (MMSE): Minimize the Mean MSE Method 1 (MMSE): Minimize the Mean MSE Method 2 (MINMAX): Minimize the Maximum MSE Method 2 (MINMAX): Minimize the Maximum MSE Prevents low quality segments Prevents low quality segments Leads to more constant quality Leads to more constant quality We choose MINMAX for this work. We choose MINMAX for this work.

19 Where we are Quick Introduction to Video Compression Quick Introduction to Video Compression The Bit Allocation Problem The Bit Allocation Problem Algorithms Algorithms Our New Algorithms Our New Algorithms MultiStage MultiStage Ratio Ratio Enhancements to our algorithms Enhancements to our algorithms Buffer Control Buffer Control Pseudo I-Frames Pseudo I-Frames Results Results

20 Algorithms Many algorithms exist for allocation in a quantizer-based setting. Many algorithms exist for allocation in a quantizer-based setting. Major standards (MPEG-x, H.26x) are quantizer- based due to decoding efficiency. Major standards (MPEG-x, H.26x) are quantizer- based due to decoding efficiency. As computers become faster, though, embedded video coding becomes more practical. As computers become faster, though, embedded video coding becomes more practical. Embedded coders allow much more flexibility in the bit allocation process. Embedded coders allow much more flexibility in the bit allocation process. Embedded image coders often offer better performance (e.g., GT-DCT vs. JPEG) Embedded image coders often offer better performance (e.g., GT-DCT vs. JPEG) The obligatory Bold Prognostication: “Embedded coding is the future of video compression!” The obligatory Bold Prognostication: “Embedded coding is the future of video compression!”

21 Embedded Algorithms Existing algorithms: Existing algorithms: Cheng, Li, and Kuo ‘97 targets MMSE metric. Cheng, Li, and Kuo ‘97 targets MMSE metric. Yang and Hemami ‘99 targets MINMAX metric. Yang and Hemami ‘99 targets MINMAX metric. Both are local algorithms - i.e., they allocate over a single Group of Pictures (20 or 30 frame window). Both are local algorithms - i.e., they allocate over a single Group of Pictures (20 or 30 frame window). We propose: We propose: The MultiStage algorithm and the Ratio algorithm The MultiStage algorithm and the Ratio algorithm Global algorithms - allocate over the entire video. Global algorithms - allocate over the entire video. In practice, both have linear running times in the length of the video. In practice, both have linear running times in the length of the video.

22 MultiStage Algorithm A global allocation algorithm (allocates across the entire video). A global allocation algorithm (allocates across the entire video). Alternates between two stages which provide feedback to each other. Alternates between two stages which provide feedback to each other. The Constant Quality Stage The Constant Quality Stage The Target Allocation Stage The Target Allocation Stage After every stage, check if a termination condition has been met. After every stage, check if a termination condition has been met.

23 Constant Quality Stage Encode every frame of the video to the same constant quality (PSNR) currPsnr. Encode every frame of the video to the same constant quality (PSNR) currPsnr. First iteration: The value of currPsnr is a parameter to the algorithm. First iteration: The value of currPsnr is a parameter to the algorithm. We choose 40.00 dB for our experiments. We choose 40.00 dB for our experiments. Subsequent iterations: Obtain the value of currPsnr from the previous iteration of the Target Allocation Stage. Subsequent iterations: Obtain the value of currPsnr from the previous iteration of the Target Allocation Stage.

24 Target Allocation Stage Let v i be the allocation to frame i after the last Constant Quality Stage Let v i be the allocation to frame i after the last Constant Quality Stage Let T be the sum of the v i ‘s Let T be the sum of the v i ‘s Set the new allocations v i ‘ = (B/T)v i where B is the bit budget. Set the new allocations v i ‘ = (B/T)v i where B is the bit budget. This new allocation will be proportional to the Constant Quality allocation, but will hit the exact target bit budget. This new allocation will be proportional to the Constant Quality allocation, but will hit the exact target bit budget. Let q i be the PSNR of frame i after this new allocation. For the next Constant Quality Stage, let currPsnr be the average of the q i ’s. Let q i be the PSNR of frame i after this new allocation. For the next Constant Quality Stage, let currPsnr be the average of the q i ’s.

25 Termination After the Constant Quality Stage, the quality of every frame will be approximately equal. After the Constant Quality Stage, the quality of every frame will be approximately equal. Terminate if within  of bit budget. Terminate if within  of bit budget. After the Target Allocation Stage, our total allocation is exactly the bit budget. After the Target Allocation Stage, our total allocation is exactly the bit budget. Terminate if variance of PSNRs < . Terminate if variance of PSNRs < . We choose  = 0.01 (1%) and  = 0.1 We choose  = 0.01 (1%) and  = 0.1 In all experiments, terminated after the second Target Allocation Stage. In all experiments, terminated after the second Target Allocation Stage.

26 Example, Step 1 Constant Quality T=2000

27 Example, Step 2 Target Allocation T=1000

28 Example, Step 3 Constant Quality T=800

29 Example, Step 4 Target Allocation T=1000

30 The Ratio Algorithm Empirically: Number of bits to code a frame to a given PSNR is roughly proportional to the norm of the residual. Empirically: Number of bits to code a frame to a given PSNR is roughly proportional to the norm of the residual. Norm is the square root of the sum of the squares of the coefficients. Norm is the square root of the sum of the squares of the coefficients. Why? Why? A large residual means the predicted frame is a poor approximation of the actual frame. A large residual means the predicted frame is a poor approximation of the actual frame. A poor prediction requires more bits to “fix”. A poor prediction requires more bits to “fix”.

31 The Ratio Algorithm Step 1: Quick pass finds the norm of each residual assuming perfect coding of the reference frame. Step 1: Quick pass finds the norm of each residual assuming perfect coding of the reference frame. No transform, because norm is preserved under orthogonal transforms. No transform, because norm is preserved under orthogonal transforms. Don’t encode/decode - assuming perfect reference. Don’t encode/decode - assuming perfect reference. Step 2: “Smooth” by capping the max norm at five times the average, and the min at half the average. Step 2: “Smooth” by capping the max norm at five times the average, and the min at half the average. Step 3: Set allocations in proportion to the calculated approximate norms of the residuals. Step 3: Set allocations in proportion to the calculated approximate norms of the residuals.

32 Yang-Hemami ‘99 Algorithm From “MINMAX Frame Rate Control Using a Rate-Distortion Optimized Wavelet Coder”, by Yan Yang and Sheila Hemami. From “MINMAX Frame Rate Control Using a Rate-Distortion Optimized Wavelet Coder”, by Yan Yang and Sheila Hemami. Attempts to minimize the maximum distortion within a Group of Pictures (GOP). Attempts to minimize the maximum distortion within a Group of Pictures (GOP). A GOP is an I frame, followed by some P frames. A GOP is an I frame, followed by some P frames. An I frame is a frame which doesn’t use motion compensation. An I frame is a frame which doesn’t use motion compensation. A P frame uses motion compensation. A P frame uses motion compensation. A local algorithm (allocates within the GOP). A local algorithm (allocates within the GOP).

33 Outline of Algorithm Let N be the number of frames in a GOP. Let N be the number of frames in a GOP. The first frame is an I frame, the rest are P frames. The first frame is an I frame, the rest are P frames. Step 1: Find rates R I and R P such that: R I + (N-1)R P =R t and D 1 (R I ) = D 2 (R P )=D. Step 1: Find rates R I and R P such that: R I + (N-1)R P =R t and D 1 (R I ) = D 2 (R P )=D. Step 1 is assuming that all P frames have identical rate-distortion curves. To fix … Step 1 is assuming that all P frames have identical rate-distortion curves. To fix … Step 2: Code the rest of the frames to distortion D. If the rate starts to get too high, adaptively raise D. If the rate starts to get too low, adaptively lower D. Step 2: Code the rest of the frames to distortion D. If the rate starts to get too high, adaptively raise D. If the rate starts to get too low, adaptively lower D.

34 Buffer Control Ignored potential for buffer underflow due to time constraints. Ignored potential for buffer underflow due to time constraints. An issue with streaming video An issue with streaming video Underflow occurs when we can not decode the current frame because the data is not yet in the buffer Underflow occurs when we can not decode the current frame because the data is not yet in the buffer The “elevator chat” version: The “elevator chat” version: Encoder simulates decoder’s buffer Encoder simulates decoder’s buffer If the simulation indicates an underflow, the encoder must modify the allocations If the simulation indicates an underflow, the encoder must modify the allocations Move bits before the underflow to frames after the underflow Move bits before the underflow to frames after the underflow

35 Pseudo I-Frames If encoding time is an issue, it may be desirable to only run bit allocation algorithms on small portions of the video. If encoding time is an issue, it may be desirable to only run bit allocation algorithms on small portions of the video. PSNR graphs typically consist of periods of fairly constant quality, followed by large dips at scene change frames. PSNR graphs typically consist of periods of fairly constant quality, followed by large dips at scene change frames. These frames are where bit allocation is most useful. These frames are where bit allocation is most useful.

36 Pseudo I-frames Thus, this method allocates only over 30 frame windows starting at the scene changes. Thus, this method allocates only over 30 frame windows starting at the scene changes. Uses constant rate allocation elsewhere. Uses constant rate allocation elsewhere. Identify scene changes by the norm of the residual. Identify scene changes by the norm of the residual. If r i > z, where z is a user tunable threshold, switch to allocation mode. If r i > z, where z is a user tunable threshold, switch to allocation mode.

37 Where We Are Quick Introduction to Video Compression Quick Introduction to Video Compression The Bit Allocation Problem The Bit Allocation Problem Algorithms Algorithms Our New Algorithms Our New Algorithms MultiStage MultiStage Ratio Ratio Enhancements to our algorithms Enhancements to our algorithms Buffer Control Buffer Control Pseudo I-Frames Pseudo I-Frames Results Results

38 Results News.qcif News.qcif 100 kilobits per second 100 kilobits per second 30 frames per second 30 frames per second 3 mini scene changes 3 mini scene changes Trevor.qcif Trevor.qcif 100 kilobits per second 100 kilobits per second 30 frames per second 30 frames per second 1 major scene change 1 major scene change

39 Method Comparison

40

41 Pseudo I-Frames

42

43 Buffer Control

44 Conclusions and Future Work Presented two new global MINMAX interframe bit allocation algorithms for embedded video coders Presented two new global MINMAX interframe bit allocation algorithms for embedded video coders MultiStage algorithm MultiStage algorithm Ratio algorithm Ratio algorithm MultiStage has the lowest variance, and the highest minimum PSNR (lowest max MSE). MultiStage has the lowest variance, and the highest minimum PSNR (lowest max MSE). Ratio is second in both categories Ratio is second in both categories Can we improve the buffer control? Can we improve the buffer control? Can we eliminate passes from the multistage algorithm by combining it with the ratio algorithm to find a better starting point? Can we eliminate passes from the multistage algorithm by combining it with the ratio algorithm to find a better starting point?

45 Acknowledgements Thanks to my advisors Richard Ladner and Eve Riskin. Thanks to my advisors Richard Ladner and Eve Riskin. Thanks to Gidon Shavit and Jeff West, for work on the GTV code. Thanks to Gidon Shavit and Jeff West, for work on the GTV code. Thanks also to Gidon Shavit for three of the slides for this talk. Thanks also to Gidon Shavit for three of the slides for this talk.


Download ppt "Global MINMAX Interframe Bit Allocation for Embedded Video Coding Michael Ringenburg Qualifying Project Presentation Advisors: Richard Ladner (CSE) and."

Similar presentations


Ads by Google