ECEC 453 Image Processing Architecture
Lecture 5, 1/22/2004: Rate-Distortion Theory, Quantizers and DCT
Oleh Tretiak, Drexel University
Quality-Rate Tradeoff
Given: a 512x512 picture, 8 bits per pixel
- Bit reduction
  o Fewer bits per pixel
  o Fewer pixels
  o Both
Issues:
- How do we measure compression?
  o Bits/pixel: does not work when the number of pixels changes
  o Total bits: valid, but hard to interpret
- How do we measure quality?
  o RMS noise
  o Peak signal-to-noise ratio (PSNR) in dB
  o Subjective quality
Comparison: Bit and Pixel Reduction
(figure)
Quantizer Performance
Questions:
- How much error does the quantizer introduce (distortion)?
- How many bits are required for the quantized values (rate)?
Rate:
- 1. No compression: if there are N possible quantizer output values, it takes ceil(log2 N) bits per sample.
- 2(a). Compression: compute the histogram of the quantizer output, design a Huffman code for it, and find the average code length.
- 2(b). Find the entropy of the quantizer output distribution.
- 2(c). Preprocess the quantizer output, ...
Distortion: let x be the input to the quantizer and x* the de-quantized value. The quantization noise is n = x* - x, and the quantization noise power is D = Average(n^2).
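As a sketch of options 1 and 2(b), the following Python/NumPy snippet (the Gaussian test input and step size are illustrative assumptions) compares the fixed-length rate ceil(log2 N) with the entropy of the quantizer output:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 32.0, size=100_000)   # illustrative Gaussian source
S = 8                                     # quantizer step (assumed)
q = np.round(x / S).astype(int)           # quantizer output symbols

# Option 1 (no compression): fixed-length code over the N observed levels
N = q.max() - q.min() + 1
fixed_rate = np.ceil(np.log2(N))

# Option 2(b): entropy of the quantizer output distribution; this lower-bounds
# the average length of any symbol code, e.g. the Huffman code of option 2(a)
counts = np.bincount(q - q.min())
p = counts[counts > 0] / q.size
entropy = -np.sum(p * np.log2(p))

print(f"N = {N}, fixed rate = {fixed_rate:.0f} bits, entropy = {entropy:.2f} bits/sample")
```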
Quantizer: Practical Lossy Encoding
Symbols: x is the input to the quantizer, q the output, S the quantizer step.
- Quantizer: q = round(x/S)
- Dequantizer: x* = S*q
- Typical noise power added by the quantizer-dequantizer combination: D = S^2/12, so the noise standard deviation is sqrt(D) = 0.289*S.
Example: S = 8, D = 8^2/12 = 5.3, rms quantization noise = sqrt(D) = 2.3.
If the input is 8 bits, the maximum input is 255, so there are 255/8 ≈ 32 quantizer output values.
PSNR = 20*log10(255/2.3) ≈ 40.9 dB
(Figures: quantizer staircase characteristic and dequantizer characteristic, each with step S.)
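A minimal numerical check of these formulas (Python/NumPy; the uniformly distributed test input is an assumption under which D = S^2/12 holds well):

```python
import numpy as np

rng = np.random.default_rng(1)
S = 8
x = rng.uniform(0, 255, size=1_000_000)  # test signal spanning the 8-bit range

q = np.round(x / S)          # quantizer
x_star = S * q               # dequantizer
n = x_star - x               # quantization noise

D = np.mean(n ** 2)          # noise power, close to S**2/12 = 5.33
psnr = 20 * np.log10(255 / np.sqrt(D))   # about 40.9 dB for S = 8
print(f"D = {D:.2f} (theory {S**2/12:.2f}), PSNR = {psnr:.1f} dB")
```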
Rate-Distortion Theorem
When long sequences (blocks) are encoded, it is possible to construct a coder-decoder pair that achieves the specified distortion whenever the number of bits per sample exceeds R(D).
For a Gaussian source: R(D) = 0.5*log2(Q/D) for 0 < D <= Q (and 0 otherwise), where
- X ~ Gaussian random variable, Q = E[X^2] ~ signal power
- D = E[(X - Y)^2] ~ noise power (Y is the decoded output)
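The Gaussian R(D) is simple enough to evaluate directly; a small sketch (Python, function names illustrative):

```python
import numpy as np

def rate_gaussian(Q, D):
    """R(D) = max(0.5*log2(Q/D), 0) bits/sample for a Gaussian source."""
    return max(0.5 * np.log2(Q / D), 0.0)

def distortion_gaussian(Q, R):
    """Inverse: the distortion achievable at R bits/sample, D = Q * 2**(-2R)."""
    return Q * 2.0 ** (-2.0 * R)

print(rate_gaussian(Q=100.0, D=1.0))        # about 3.32 bits/sample
print(distortion_gaussian(Q=100.0, R=4.0))  # about 0.39
```

The inverse form shows the familiar rule of thumb: each additional bit per sample reduces the achievable noise power by a factor of 4 (about 6 dB).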
This Lecture
- Decorrelation and Bit Allocation
- Discrete Cosine Transform
- Video Coding
Coding Correlated Samples
How to code correlated samples:
- Decorrelate
- Code
Methods for decorrelation:
- Prediction
- Transformation
  o Block transform
  o Wavelet transform
Prediction Rules
Simplest: use the previous value as the prediction.
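A quick illustration of why this helps (Python/NumPy; the AR(1) test signal is an assumed stand-in for correlated image samples): the prediction error has far lower variance, and hence a lower coding rate, than the raw signal.

```python
import numpy as np

rng = np.random.default_rng(2)
# Correlated test signal: x[i] = 0.95*x[i-1] + noise (an AR(1) model)
n = 100_000
noise = rng.normal(size=n)
x = np.zeros(n)
for i in range(1, n):
    x[i] = 0.95 * x[i - 1] + noise[i]

e = x[1:] - x[:-1]   # error of the "previous value" predictor
print(f"signal variance:           {x.var():.2f}")   # about 10
print(f"prediction error variance: {e.var():.2f}")   # about 1
```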
General Predictive Coding
(Figures: general predictive coding system; example of a linear predictive image coder.)
Rate-Distortion Theory: Correlated Samples
Given: x = (x_1, x_2, ..., x_n), a sequence of correlated Gaussian samples.
Preprocess: convert to y = (y_1, y_2, ..., y_n), y = Ax, where A is an orthogonal matrix (A^-1 = A^T) that de-correlates the samples. This is called a Karhunen-Loeve transformation.
Perform lossy encoding of (y_1, y_2, ..., y_n); after decoding, get y* = (y_1*, y_2*, ..., y_n*).
Reconstruct: x* = A^-1 y*.
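A sketch of the K-L step on synthetic data (Python/NumPy; the exponential covariance model is an illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(3)
# Correlated Gaussian "blocks" with covariance C_ij = 0.9**|i-j| (illustrative)
n = 8
cov = 0.9 ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
x = rng.multivariate_normal(np.zeros(n), cov, size=50_000)

# K-L transform: the rows of A are the eigenvectors of the sample covariance
evals, evecs = np.linalg.eigh(np.cov(x, rowvar=False))
A = evecs.T                      # orthogonal: A @ A.T is the identity
y = x @ A.T                      # y = A x, applied to every sample block

print(np.round(np.cov(y, rowvar=False), 2))  # nearly diagonal: decorrelated
```

The printed covariance of y is close to diagonal, confirming the decorrelation; note that A had to be computed from the signal covariance, which is the expensive, data-dependent step.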
Block-Based Coding
The Discrete Cosine Transform (DCT) is used instead of the K-L transform.
Full-image DCT: one set of decorrelated coefficients for the whole image.
Block-based coding:
- The image is divided into 'small' blocks
- Each block is decorrelated separately
Block decorrelation performs almost as well as (better than?) full-image decorrelation.
Current standards (JPEG, MPEG) use 8x8 DCT blocks.
Rate-Distortion Theory: Non-Uniform Random Variables
Given (x_1, x_2, ..., x_n), use an orthogonal transform to obtain (y_1, y_2, ..., y_n), a sequence of independent Gaussian variables with Var[y_i] = Q_i.
Distortion allocation: allocate distortion D_i to the i-th variable.
The rate (bits) for the i-th variable is R_i = max[0.5*log2(Q_i/D_i), 0].
Total distortion: D = sum_i D_i. Total rate (bits): R = sum_i R_i.
We specify R. What values of D_i give the minimum total distortion D?
Bit Allocation Solution
Implicit solution (water-filling construction):
Choose a parameter θ and set D_i = min(Q_i, θ):
- If Q_i > θ then D_i = θ, else D_i = Q_i.
R_i = max[0.5*log2(Q_i/D_i), 0]:
- If Q_i > θ then R_i = 0.5*log2(Q_i/θ), else R_i = 0.
Find the value of θ that gives the specified total rate R.
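A sketch of the search for θ (Python/NumPy; the variances Q_i and the rate budget are illustrative). Since the total rate decreases monotonically as θ rises, simple bisection works:

```python
import numpy as np

def waterfill(Q, R_target, iters=60):
    """Find theta so that sum(max(0.5*log2(Q_i/theta), 0)) equals R_target."""
    Q = np.asarray(Q, dtype=float)
    lo, hi = 1e-12, Q.max()          # total rate is monotone decreasing in theta
    for _ in range(iters):
        theta = 0.5 * (lo + hi)
        rate = np.sum(np.maximum(0.5 * np.log2(Q / theta), 0.0))
        if rate > R_target:
            lo = theta               # too many bits: raise the water level
        else:
            hi = theta
    D = np.minimum(Q, theta)         # D_i = min(Q_i, theta)
    R = np.maximum(0.5 * np.log2(Q / D), 0.0)
    return theta, D, R

theta, D, R = waterfill(Q=[16.0, 4.0, 1.0, 0.25], R_target=4.0)
print(theta, D, R, R.sum())   # theta ~ 0.63; the 0.25-variance band gets 0 bits
```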
Wavelet Transform
- Filterbank and wavelets
- 2D wavelets
- Wavelet pyramid
Filterbank and Wavelets
Put the signal (sequence) through two filters:
- Low frequencies
- High frequencies
Downsample both outputs by a factor of 2.
Do it in such a way that the original signal can be reconstructed!
(Figure: a 100-sample signal split into two 50-sample subbands.)
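A minimal example with the Haar filter pair (Python/NumPy): this is an assumed, simplest-possible choice of filters, but it shows the two-band split, the downsampling by 2, and exact reconstruction:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=100)                 # even-length test signal

# Analysis: lowpass and highpass outputs, each downsampled by 2
lo = (x[0::2] + x[1::2]) / np.sqrt(2)    # 50 "smooth" samples
hi = (x[0::2] - x[1::2]) / np.sqrt(2)    # 50 "detail" samples

# Synthesis: interleave to recover the original exactly
y = np.empty_like(x)
y[0::2] = (lo + hi) / np.sqrt(2)
y[1::2] = (lo - hi) / np.sqrt(2)

print(np.allclose(x, y))                 # True: perfect reconstruction
```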
Filterbank Pyramid
(Figure: repeated two-band splits of the low band: 1000 → 500 → 250 → 125 samples.)
2D Wavelets
- Apply wavelet processing along the rows of the picture
- Apply wavelet processing along the columns of the picture
- Pyramid processing
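One level of this row/column processing, sketched with the same Haar split (Python/NumPy; the 256x256 random image is a placeholder):

```python
import numpy as np

def haar_split(a, axis):
    """One Haar low/high split with downsampling by 2 along the given axis."""
    ev = np.take(a, np.arange(0, a.shape[axis], 2), axis=axis)
    od = np.take(a, np.arange(1, a.shape[axis], 2), axis=axis)
    return (ev + od) / np.sqrt(2), (ev - od) / np.sqrt(2)

img = np.random.default_rng(5).normal(size=(256, 256))  # placeholder image
L, H = haar_split(img, axis=1)       # along the rows
LL, LH = haar_split(L, axis=0)       # then along the columns
HL, HH = haar_split(H, axis=0)
# LL (128x128) feeds the next pyramid level; LH, HL, HH hold the details
print(LL.shape, LH.shape, HL.shape, HH.shape)
```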
Lena: Top Level, Next Level
(Figure: wavelet subbands of the Lena image, annotated with per-subband statistics.)
Lena, More Levels
(Figure: multi-level wavelet pyramid of the Lena image.)
Decorrelation of Images
x = (x_1, x_2, ..., x_n): a sequence of image gray values.
Preprocess: convert to y = (y_1, y_2, ..., y_n), y = Ax, where A is an orthogonal matrix (A^-1 = A^T).
Theoretical best (for a Gaussian process): A is the Karhunen-Loeve transformation matrix.
- Images are not Gaussian processes
- The Karhunen-Loeve matrix is image-dependent and computationally expensive to find
- Evaluating y = Ax with the K-L transformation is computationally expensive
In practice, we use the DCT (discrete cosine transform) for decorrelation:
- Computationally efficient
- Almost as good as the K-L transformation
DPCM
Simple to implement (low complexity):
- Prediction: 3 multiplications and 2 additions
- Estimation: 1 addition
- Encoding: 1 addition + quantization
Performance for 2-D coding is not as good as block quantization.
- In theory, with a long enough past history the rate-distortion performance should be as good as other linear methods, but in that case there is no computational advantage.
Bottom line: useful when complexity is limited.
Important idea: lossy predictive encoding.
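A closed-loop DPCM sketch matching these operation counts (Python/NumPy; the predictor coefficients and step size are illustrative assumptions, not values from the lecture):

```python
import numpy as np

def dpcm(img, S=8, a=0.5, b=0.5, c=-0.25):
    """Closed-loop DPCM: predict each pixel from the *decoded* left, top and
    top-left neighbors (3 multiplications, 2 additions), then quantize and
    code the prediction error.  Coefficients here are illustrative."""
    h, w = img.shape
    rec = np.zeros((h, w))             # decoder-side reconstruction
    q = np.zeros((h, w), dtype=int)    # transmitted symbols
    for i in range(h):
        for j in range(w):
            W  = rec[i, j - 1]     if j > 0 else 0.0
            N  = rec[i - 1, j]     if i > 0 else 0.0
            NW = rec[i - 1, j - 1] if i > 0 and j > 0 else 0.0
            pred = a * W + b * N + c * NW      # prediction: 3 mults, 2 adds
            e = img[i, j] - pred               # encoding: 1 addition
            q[i, j] = int(np.round(e / S))     # ... plus quantization
            rec[i, j] = pred + S * q[i, j]     # estimation: 1 addition
    return q, rec

img = np.random.default_rng(6).uniform(0, 255, size=(64, 64))
q, rec = dpcm(img)
print(np.sqrt(np.mean((img - rec) ** 2)))  # rms error, roughly S/sqrt(12)
```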
Review: Image Decorrelation
(Recap of the "Decorrelation of Images" slide: orthogonal transform y = Ax; the K-L transform is optimal for Gaussian sources but image-dependent and expensive; in practice the DCT is used, which is computationally efficient and almost as good.)
Review: Rate-Distortion, 1D vs. 2D Coding
- The theory gives the tradeoff between distortion and the least number of bits
- The tradeoff is interesting only if the samples are correlated
- "Water-filling" construction to compute R(D)
Review: Block-Based Coding
(Recap: full-image DCT vs. block-based coding; block decorrelation performs almost as well as full-image decorrelation, and current standards (JPEG, MPEG) use 8x8 DCT blocks.)
What is the DCT?
One-dimensional 8-point DCT: input x_0, ..., x_7, output y_0, ..., y_7:
  y_k = (c_k/2) * sum_{n=0..7} x_n * cos((2n+1)*k*π/16), where c_0 = 1/sqrt(2) and c_k = 1 for k > 0.
One-dimensional inverse DCT: input y_0, ..., y_7, output x_0, ..., x_7:
  x_n = sum_{k=0..7} (c_k/2) * y_k * cos((2n+1)*k*π/16).
Matrix form: y = Cx and x = C^T y, where x and y are column vectors and C is the 8x8 DCT matrix.
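A sketch that builds this matrix and checks the claimed properties (Python/NumPy; the orthonormal DCT-II convention is assumed):

```python
import numpy as np

def dct_matrix(N=8):
    """Orthonormal DCT-II matrix: row k is c_k*sqrt(2/N)*cos((2n+1)*k*pi/(2N))."""
    n, k = np.meshgrid(np.arange(N), np.arange(N))
    C = np.sqrt(2.0 / N) * np.cos((2 * n + 1) * k * np.pi / (2 * N))
    C[0, :] /= np.sqrt(2)            # c_0 = 1/sqrt(2)
    return C

C = dct_matrix(8)
print(np.allclose(C @ C.T, np.eye(8)))   # orthogonal: the inverse is C.T

x = np.arange(8.0)
y = C @ x          # forward DCT
x_back = C.T @ y   # inverse DCT
print(np.allclose(x, x_back))
```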
Two-Dimensional DCT
Forward 2D DCT: input x_ij, i = 0, ..., 7, j = 0, ..., 7; output y_kl, k = 0, ..., 7, l = 0, ..., 7:
  y_kl = (c_k*c_l/4) * sum_{i=0..7} sum_{j=0..7} x_ij * cos((2i+1)*k*π/16) * cos((2j+1)*l*π/16).
Matrix form: Y = C X C^T (and X = C^T Y C), where X and Y are 8x8 matrices with entries x_ij and y_kl.
The 2D DCT is separable: apply the 1D DCT to each row, then to each column.
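A small check of the separability claim (Python/NumPy, reusing the same assumed orthonormal DCT matrix):

```python
import numpy as np

def dct_matrix(N=8):
    n, k = np.meshgrid(np.arange(N), np.arange(N))
    C = np.sqrt(2.0 / N) * np.cos((2 * n + 1) * k * np.pi / (2 * N))
    C[0, :] /= np.sqrt(2)
    return C

C = dct_matrix(8)
X = np.random.default_rng(7).uniform(0, 255, size=(8, 8))

Y_matrix = C @ X @ C.T                 # matrix form of the 2D DCT

# Separable form: 1D DCT on every row, then on every column
Y_rows = (C @ X.T).T                   # transform each row
Y_sep = C @ Y_rows                     # then each column
print(np.allclose(Y_matrix, Y_sep))    # True: the 2D DCT is separable
```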
General DCT
One dimension (N points):
  y_k = c_k * sqrt(2/N) * sum_{n=0..N-1} x_n * cos((2n+1)*k*π/(2N)), with c_0 = 1/sqrt(2) and c_k = 1 for k > 0.
Two dimensions (N x N block):
  y_kl = (2/N) * c_k * c_l * sum_{m=0..N-1} sum_{n=0..N-1} x_mn * cos((2m+1)*k*π/(2N)) * cos((2n+1)*l*π/(2N)).