ECEC 453 Image Processing Architecture
Lecture 5, 1/22/2004: Rate-Distortion Theory, Quantizers and DCT
Oleh Tretiak, Drexel University. © 2001-2004 Oleh Tretiak.

Presentation transcript:

2. Title slide: ECEC 453 Image Processing Architecture, Lecture 5 (1/22/2004), Rate-Distortion Theory, Quantizers and DCT. Oleh Tretiak, Drexel University.

3. Quality-Rate Tradeoff
Given: a 512x512 picture, 8 bits per pixel.
Bit reduction:
- Fewer bits per pixel
- Fewer pixels
- Both
Issues:
- How do we measure compression?
  - Bits/pixel: does not work when we change the number of pixels
  - Total bits: valid, but hard to interpret
- How do we measure quality? (See the sketch below.)
  - RMS noise
  - Peak signal-to-noise ratio (PSNR) in dB
  - Subjective quality
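
As a concrete reading of the two objective quality measures above, here is a minimal Python/NumPy sketch (the function names and the NumPy dependency are my own, not from the slides):

```python
import numpy as np

def rms_error(original, degraded):
    """Root-mean-square error between two images of the same shape."""
    diff = original.astype(np.float64) - degraded.astype(np.float64)
    return np.sqrt(np.mean(diff ** 2))

def psnr(original, degraded, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak = 255 for 8-bit images."""
    return 20.0 * np.log10(peak / rms_error(original, degraded))
```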

4. Comparison, Bit and Pixel Reduction
(Figure-only slide.)

5. Quantizer Performance
Questions:
- How much error does the quantizer introduce (distortion)?
- How many bits are required for the quantized values (rate)?
Rate:
1. No compression: if there are N possible quantizer output values, it takes ceil(log2(N)) bits per sample.
2(a). Compression: compute the histogram of the quantizer output, design a Huffman code for that histogram, and find the average code length.
2(b). Find the entropy of the quantizer output distribution.
2(c). Preprocess the quantizer output, ...
Distortion: let x be the input to the quantizer and x* the de-quantized value. The quantization noise is n = x* - x, and the quantization noise power is D = Average(n^2).
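
A short sketch of measuring both quantities empirically for a uniform quantizer, assuming NumPy. It reports the entropy of the output distribution (option 2(b)) as the rate; a Huffman code designed on the same histogram (option 2(a)) would have an average length within one bit of this entropy:

```python
import numpy as np

def quantizer_rate_and_distortion(x, S):
    """Empirical rate (entropy of the quantizer output, bits/sample)
    and distortion (mean squared quantization error)."""
    q = np.round(x / S)                       # quantizer output symbols
    x_star = S * q                            # de-quantized values
    # Rate 2(b): entropy of the quantizer output distribution
    _, counts = np.unique(q, return_counts=True)
    p = counts / counts.sum()
    entropy = -np.sum(p * np.log2(p))
    # Distortion: D = Average(n^2), with n = x* - x
    D = np.mean((x_star - x) ** 2)
    return entropy, D
```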

6. Quantizer: Practical Lossy Encoding
Quantizer:
- Symbols: x = input to the quantizer, q = output of the quantizer, S = quantizer step
- Quantizer: q = round(x/S)
- Dequantizer characteristic: x* = Sq
- Typical noise power added by the quantizer-dequantizer combination: D = S^2/12, so the noise standard deviation is sigma = sqrt(D) = 0.289S
Example: S = 8, D = 8^2/12 = 5.3, rms quantization noise = sqrt(D) = 2.3.
If the input is 8 bits, the maximum input is 255, so there are 255/8 ~ 32 quantizer output values.
PSNR = 20 log10(255/2.3) = 40.8 dB
(Figures: quantizer and dequantizer staircase characteristics with step size S.)
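
The example numbers are easy to check numerically. A throwaway simulation, assuming NumPy and a uniformly distributed test input (both choices are mine, not the slides'):

```python
import numpy as np

rng = np.random.default_rng(0)
S = 8.0
x = rng.uniform(0.0, 255.0, size=1_000_000)   # toy 8-bit-range input

q = np.round(x / S)              # quantizer: q = round(x/S)
x_star = S * q                   # dequantizer: x* = S*q

D = np.mean((x_star - x) ** 2)
print(D, S**2 / 12)                        # both close to 5.33
print(20 * np.log10(255 / np.sqrt(D)))     # about 40.8 dB
```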

7. Rate-Distortion Theorem
When long sequences (blocks) are encoded, it is possible to construct a coder-decoder pair that achieves the specified distortion whenever the number of bits per sample is at least R(D) + epsilon.
Formula for a Gaussian source: X ~ Gaussian random variable, Q = E[X^2] ~ signal power, D = E[(X - Y)^2] ~ noise power, and
R(D) = max[0.5 log2(Q/D), 0]
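
In code, the Gaussian rate-distortion function is a one-liner; this sketch just restates the formula above (the function name is hypothetical):

```python
import numpy as np

def rate_gaussian(Q, D):
    """R(D) for a Gaussian source: signal power Q, allowed distortion D."""
    return max(0.5 * np.log2(Q / D), 0.0)

# Example: a 12 dB SNR target (Q/D = 16) costs 2 bits per sample.
print(rate_gaussian(16.0, 1.0))   # 2.0
```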

8. This Lecture
- Decorrelation and Bit Allocation
- Discrete Cosine Transform
- Video Coding

9. Coding Correlated Samples
How to code correlated samples:
- Decorrelate
- Code
Methods for decorrelation:
- Prediction
- Transformation
  - Block transform
  - Wavelet transform

10. Prediction Rules
Simplest rule: predict each sample by the previous value. (See the sketch below.)
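
A minimal sketch of previous-value prediction, assuming NumPy and a synthetic correlated signal of my choosing; the point is that the residuals have far lower variance (hence lower entropy) than the raw samples:

```python
import numpy as np

def previous_value_residual(x):
    """Prediction error when each sample is predicted by the previous one."""
    e = np.empty_like(x, dtype=np.float64)
    e[0] = x[0]                 # first sample has no predictor
    e[1:] = x[1:] - x[:-1]      # residual = actual - predicted
    return e

# Correlated toy signal (random walk): residual variance drops sharply.
rng = np.random.default_rng(1)
x = np.cumsum(rng.normal(size=10_000))
e = previous_value_residual(x)
print(np.var(x), np.var(e))
```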

11. General Predictive Coding
(Figure slide: block diagram of a general predictive coding system, and an example of a linear predictive image coder.)

12. Rate-Distortion Theory: Correlated Samples
Given: x = (x_1, x_2, ..., x_n), a sequence of correlated Gaussian samples.
Preprocess: convert to y = (y_1, y_2, ..., y_n), y = Ax, where A is an orthogonal matrix (A^-1 = A^T) that de-correlates the samples. This is called a Karhunen-Loeve transformation.
Perform lossy encoding of (y_1, y_2, ..., y_n); after decoding, get y* = (y_1*, y_2*, ..., y_n*).
Reconstruct: x* = A^-1 y* = A^T y*.
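
A sketch of how such a de-correlating matrix can be estimated in practice, assuming NumPy: the eigenvectors of the sample covariance form the Karhunen-Loeve transformation (the function name is mine):

```python
import numpy as np

def klt_matrix(samples):
    """Karhunen-Loeve transform matrix estimated from sample vectors.

    samples: array of shape (num_vectors, n). Returns an orthogonal A
    (rows = eigenvectors of the covariance) so that y = A @ x has
    decorrelated components; reconstruction is x = A.T @ y.
    """
    cov = np.cov(samples, rowvar=False)      # n x n covariance estimate
    eigvals, eigvecs = np.linalg.eigh(cov)   # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]        # largest variance first
    return eigvecs[:, order].T               # orthogonal: A @ A.T = I
```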

13. Block-Based Coding
The Discrete Cosine Transform (DCT) is used instead of the K-L transform.
Full-image DCT: one set of decorrelated coefficients for the whole image.
Block-based coding:
- The image is divided into 'small' blocks
- Each block is decorrelated separately
Block decorrelation performs almost as well as (better than?) full-image decorrelation.
Current standards (JPEG, MPEG) use 8x8 DCT blocks.

14. Rate-Distortion Theory: Non-Uniform Random Variables
Given (x_1, x_2, ..., x_n), use an orthogonal transform to obtain (y_1, y_2, ..., y_n), a sequence of independent Gaussian variables with Var[y_i] = Q_i.
Distortion allocation: allocate distortion D_i to variable i.
Rate (bits) for the i-th variable: R_i = max[0.5 log2(Q_i/D_i), 0]
Total distortion: D = D_1 + D_2 + ... + D_n
Total rate (bits): R = R_1 + R_2 + ... + R_n
We specify R. What values of D_i give the minimum total distortion D?

15. Bit Allocation Solution
Implicit solution (water-filling construction): choose a parameter theta and set D_i = min(Q_i, theta):
- If Q_i > theta, then D_i = theta; else D_i = Q_i.
Then R_i = max[0.5 log2(Q_i/D_i), 0]:
- If Q_i > theta, then R_i = 0.5 log2(Q_i/theta); else R_i = 0.
Find the value of theta that gives the specified total rate R. (A numerical sketch follows.)
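
A numerical sketch of the water-filling construction, assuming NumPy. Since the total rate decreases monotonically in theta, a simple bisection finds the theta that meets the rate budget (the bisection approach and function name are my choices, not from the slides):

```python
import numpy as np

def water_fill(Q, R_target, tol=1e-9):
    """Water-filling bit allocation: find theta so the total rate
    hits R_target. Q: per-variable signal powers Q_i.
    Returns (theta, D_i, R_i)."""
    Q = np.asarray(Q, dtype=np.float64)

    def total_rate(theta):
        return np.sum(np.maximum(0.5 * np.log2(Q / theta), 0.0))

    lo, hi = tol, Q.max()          # rate is monotone decreasing in theta
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if total_rate(mid) > R_target:
            lo = mid               # too many bits: raise the water level
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    D = np.minimum(Q, theta)       # D_i = min(Q_i, theta)
    R = np.maximum(0.5 * np.log2(Q / D), 0.0)
    return theta, D, R

# Example: theta, D, R = water_fill([16.0, 4.0, 1.0, 0.25], R_target=3.0)
```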

16. (Figure-only slide.)

17. Wavelet Transform
- Filterbank and wavelets
- 2D wavelets
- Wavelet pyramid

18. Filterbank and Wavelets
Put the signal (sequence) through two filters:
- Low frequencies
- High frequencies
Downsample both by a factor of 2.
Do it in such a way that the original signal can be reconstructed!
(Diagram: a 100-sample input splits into two 50-sample subbands, which reconstruct the 100-sample signal.)
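
A minimal sketch of such a two-channel filterbank using the orthonormal Haar pair (the slides do not name a specific filter; Haar is my choice because perfect reconstruction is easy to verify):

```python
import numpy as np

def haar_analysis(x):
    """Split a signal into low and high subbands, each downsampled by 2.
    Assumes even length."""
    x = np.asarray(x, dtype=np.float64)
    low = (x[0::2] + x[1::2]) / np.sqrt(2.0)   # lowpass + downsample
    high = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # highpass + downsample
    return low, high

def haar_synthesis(low, high):
    """Perfectly reconstruct the original signal from the two subbands."""
    x = np.empty(2 * len(low))
    x[0::2] = (low + high) / np.sqrt(2.0)
    x[1::2] = (low - high) / np.sqrt(2.0)
    return x
```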

19. Filterbank Pyramid
(Diagram: the low band is split repeatedly; sequence lengths 1000, 500, 250, 125.)

20. 2D Wavelets
- Apply wavelet processing along the rows of the picture
- Apply wavelet processing along the columns of the picture
- Pyramid processing
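
One level of the row-column scheme above, sketched with the same Haar filters as before (the filter choice and function name are assumptions); the pyramid is obtained by recursing on the LL band:

```python
import numpy as np

def haar_2d_level(img):
    """One 2D wavelet level: filter along rows, then along columns.
    Returns the four subbands LL, LH, HL, HH; assumes even dimensions."""
    img = np.asarray(img, dtype=np.float64)
    # Along rows
    L = (img[:, 0::2] + img[:, 1::2]) / np.sqrt(2.0)
    H = (img[:, 0::2] - img[:, 1::2]) / np.sqrt(2.0)
    # Along columns
    LL = (L[0::2, :] + L[1::2, :]) / np.sqrt(2.0)
    LH = (L[0::2, :] - L[1::2, :]) / np.sqrt(2.0)
    HL = (H[0::2, :] + H[1::2, :]) / np.sqrt(2.0)
    HH = (H[0::2, :] - H[1::2, :]) / np.sqrt(2.0)
    return LL, LH, HL, HH
```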

21. Lena: Top Level, Next Level
(Figure slide: wavelet subbands of the Lena image with per-subband variance values.)

22. Lena, More Levels
(Figure-only slide.)

23. Decorrelation of Images
x = (x_1, x_2, ..., x_n): a sequence of image gray values.
Preprocess: convert to y = (y_1, y_2, ..., y_n), y = Ax, with A an orthogonal matrix (A^-1 = A^T).
Theoretical best (for a Gaussian process): A is the Karhunen-Loeve transformation matrix. However:
- Images are not Gaussian processes
- The Karhunen-Loeve matrix is image-dependent and computationally expensive to find
- Evaluating y = Ax with the K-L transformation is computationally expensive
In practice, we use the DCT (discrete cosine transform) for decorrelation:
- Computationally efficient
- Almost as good as the K-L transformation

24. DPCM
Simple to implement (low complexity):
- Prediction: 3 multiplications and 2 additions
- Estimation: 1 addition
- Encoding: 1 addition plus quantization
Performance for 2-D coding is not as good as block quantization.
- In theory, with a long past history the rate-distortion performance should be as good as that of other linear methods, but in that case there is no computational advantage.
Bottom line: useful when complexity is limited.
Important idea: lossy predictive encoding (sketched below).
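
A sketch of the lossy predictive loop, assuming NumPy; for brevity it uses the simplest previous-value predictor rather than the 3-multiplier predictor mentioned above. The key point is that the quantizer sits inside the prediction loop:

```python
import numpy as np

def dpcm_encode_decode(x, S):
    """Lossy DPCM with previous-value prediction and a uniform quantizer.

    The predictor runs on *reconstructed* samples, so encoder and decoder
    stay in sync and quantization errors do not accumulate."""
    x = np.asarray(x, dtype=np.float64)
    q = np.empty(len(x))          # transmitted symbols
    x_rec = np.empty(len(x))      # decoder reconstruction
    pred = 0.0                    # initial prediction
    for i in range(len(x)):
        e = x[i] - pred           # prediction error
        q[i] = np.round(e / S)    # quantize the error, not the sample
        x_rec[i] = pred + S * q[i]
        pred = x_rec[i]           # predict next sample from reconstruction
    return q, x_rec
```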

25. Review: Image Decorrelation
x = (x_1, x_2, ..., x_n): a sequence of image gray values.
Preprocess: convert to y = (y_1, y_2, ..., y_n), y = Ax, with A an orthogonal matrix (A^-1 = A^T).
Theoretical best (for a Gaussian process): A is the Karhunen-Loeve transformation matrix. However:
- Images are not Gaussian processes
- The Karhunen-Loeve matrix is image-dependent and computationally expensive to find
- Evaluating y = Ax with the K-L transformation is computationally expensive
In practice, we use the DCT (discrete cosine transform) for decorrelation:
- Computationally efficient
- Almost as good as the K-L transformation

26. Rate-Distortion: 1D vs. 2D Coding
- The theory gives the tradeoff between distortion and the least number of bits
- The tradeoff is interesting only if the samples are correlated
- The "water-filling" construction computes R(D)

27. Review: Block-Based Coding
Full-image DCT: one set of decorrelated coefficients for the whole image.
Block-based coding:
- The image is divided into 'small' blocks
- Each block is decorrelated separately
Block decorrelation performs almost as well as (better than?) full-image decorrelation.
Current standards (JPEG, MPEG) use 8x8 DCT blocks.

28. What is the DCT?
One-dimensional 8-point DCT: input x_0, ..., x_7, output y_0, ..., y_7:
y_k = (c_k/2) * sum_{n=0..7} x_n cos((2n+1) k pi / 16), where c_0 = 1/sqrt(2) and c_k = 1 for k > 0.
One-dimensional inverse DCT: input y_0, ..., y_7, output x_0, ..., x_7:
x_n = sum_{k=0..7} (c_k/2) y_k cos((2n+1) k pi / 16)
Matrix form: y = Cx and x = C^T y, where x and y are column vectors and C is the orthogonal 8x8 DCT matrix.
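
The 8-point transform pair above, sketched in NumPy as an explicit matrix (this follows the standard orthonormal DCT-II; the function names are mine):

```python
import numpy as np

def dct_1d(x):
    """8-point DCT-II: y_k = (c_k/2) sum_n x_n cos((2n+1) k pi / 16)."""
    n = np.arange(8)
    C = np.cos((2 * n[None, :] + 1) * n[:, None] * np.pi / 16) / 2.0
    C[0, :] /= np.sqrt(2.0)        # c_0 = 1/sqrt(2), c_k = 1 otherwise
    return C @ x

def idct_1d(y):
    """Inverse via the transpose: C is orthogonal, so x = C.T @ y."""
    n = np.arange(8)
    C = np.cos((2 * n[None, :] + 1) * n[:, None] * np.pi / 16) / 2.0
    C[0, :] /= np.sqrt(2.0)
    return C.T @ y
```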

29. Two-Dimensional DCT
Forward 2D DCT: input x_ij, i = 0, ..., 7, j = 0, ..., 7; output y_kl, k = 0, ..., 7, l = 0, ..., 7.
Matrix form: Y = C X C^T, where X and Y are 8x8 matrices with coefficients x_ij and y_kl, and C is the 1D DCT matrix.
The 2D DCT is separable: a 1D DCT of each row followed by a 1D DCT of each column.
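
Because the 2D DCT is separable, the whole 8x8 transform is two matrix products; a sketch assuming NumPy and the same DCT matrix as above:

```python
import numpy as np

def dct_matrix():
    """8x8 orthonormal DCT matrix C with C @ C.T = I."""
    n = np.arange(8)
    C = np.cos((2 * n[None, :] + 1) * n[:, None] * np.pi / 16) / 2.0
    C[0, :] /= np.sqrt(2.0)
    return C

def dct_2d(X):
    """Separable 2D DCT of an 8x8 block: Y = C X C^T
    (a 1D DCT of every row, then a 1D DCT of every column)."""
    C = dct_matrix()
    return C @ X @ C.T

def idct_2d(Y):
    """Inverse 2D DCT: X = C^T Y C."""
    C = dct_matrix()
    return C.T @ Y @ C
```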

30. General DCT
One dimension (N points): y_k = a_k * sum_{n=0..N-1} x_n cos((2n+1) k pi / (2N)), with a_0 = sqrt(1/N) and a_k = sqrt(2/N) for k > 0.
Two dimensions: y_kl = a_k a_l * sum_{i,j} x_ij cos((2i+1) k pi / (2N)) cos((2j+1) l pi / (2N)).

31. (Figure-only slide.)

