1
Still Image Compression
EE Video Processing, Chapter 8: Still Image Compression
2
8.1 Basics of Image Compression
Purposes: 1. To remove image redundancy. 2. To increase storage and/or transmission efficiency. Why? A still picture of the ISO JPEG standard test size, 720 (pels) × 576 (lines) × 1.5 bytes ≈ 4.977 Mbits, takes roughly 78 seconds to send over a 64 kbit/s channel.
3
8.1 Basics of Image Compression (cont.)
How? 1. Use characteristics of images (statistical): (a) General: statistical models from Shannon's rate-distortion theory. (b) Particular: nonstationary properties of images (wavelets, fractals, ...). 2. Use characteristics of human perception (psychological): (a) Color representation. (b) Weber-Fechner law. (c) Spatial/temporal masking.
4
8.1.1 Elements of an Image Compression System
A source encoder consists of the following blocks. Fig. 8.1: Block diagram of an image compression system.
5
8.1.2 Information Theory
- Entropy: the average uncertainty (randomness) of a stationary, ergodic signal source X, i.e., the number of bits needed to resolve its uncertainty.
Discrete-Amplitude Memoryless Source (DMS) X: a sequence {x_n} whose samples take values in a finite alphabet A_X containing K symbols.
6
8.1.2 Information Theory (cont.)
Assume that {x_n} is an independent, identically distributed (i.i.d.) process; that is, p(x_1, x_2, ..., x_N) = p(x_1) p(x_2) ... p(x_N). The entropy is H(X) = -Σ_{k=1}^{K} p_k log2 p_k bits/letter, where p_k = P(x_n = a_k).
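As a small illustration (not part of the original slides), the entropy of a DMS can be computed directly from its symbol probabilities; the function below is a minimal Python sketch with illustrative names.

import math

def entropy(probs):
    """H(X) = -sum p_k * log2(p_k), ignoring zero-probability symbols."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Example used in the later slides: p(a) = 0.8, p(b) = 0.2
print(entropy([0.8, 0.2]))   # ~0.722 bits/letter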
7
8.1.2 Information Theory (cont.)
- Sources with Memory. Discrete-amplitude source with memory X = {x_n}: assume that {x_n} is stationary in the strict sense, i.e., the joint distribution p(x_{1+m}, ..., x_{N+m}) is the same for every shift m and every block length N.
8
8.1.2 Information Theory (cont.)
The Nth-order entropy (N-tuple, N-block) is H_N(X) = -(1/N) Σ p(x_1, ..., x_N) log2 p(x_1, ..., x_N) bits/letter, where the sum runs over all N-tuples. The lower bound for source coding is H_∞(X) = lim_{N→∞} H_N(X).
9
8.1.2 Information Theory (cont.)
- Variable-Length Source Coding Theorem: Let X be a discrete-amplitude stationary and ergodic source and H_N(X) its Nth-order entropy. Then there exists a binary prefix code with an average bit rate R satisfying H_N(X) ≤ R < H_N(X) + 1/N, where the prefix condition means that no codeword is a prefix (initial part) of another codeword.
10
8.1.2 Information Theory (cont.)
Example: source X with alphabet {a, b}, p(a) = 0.8, p(b) = 0.2. Entropy: H(X) = -0.8 log2 0.8 - 0.2 log2 0.2 ≈ 0.72 bits/letter.
11
8.1.2 Information Theory (cont.)
- How many bits are needed to remove the uncertainty?
Single-letter block, N = 1: average rate = 1×0.8 + 1×0.2 = 1 bit/letter.
Source word a: codeword 0 (1 bit), probability 0.8
Source word b: codeword 1 (1 bit), probability 0.2
12
8.1.2 Information Theory (cont.)
- How many bits are needed to remove the uncertainty?
Two-letter block, N = 2: average rate = (1×0.64 + 2×0.16 + 3×0.16 + 3×0.04)/2 = 0.78 bits/letter.
Source word aa: codeword 1 (1 bit), probability 0.64
Source word ab: codeword 01 (2 bits), probability 0.16
Source word ba: codeword 001 (3 bits), probability 0.16
Source word bb: codeword 000 (3 bits), probability 0.04
13
8.2 Entropy Coding How to construct a real code that achieves the theoretical limits? (a) Huffman coding (1952) (b) Arithmetic coding (1976) (c) Ziv-Lempel coding (1977)
14
8.2.1 Huffman coding
- Variable-Length Coding (VLC) with the following characteristics:
  - Lossless entropy coding for digital signals
  - Fewer bits for highly probable events
  - Prefix codes
15
8.2.1 Huffman coding (cont.)
- Procedure (given the probability distribution of X), in two stages:
Stage 1: Construct a binary tree from the events. Select the two least probable events a and b and replace them by a single node whose probability is the sum of the probabilities of a and b; repeat until one root node remains.
Stage 2: Assign code bits sequentially from the root down to the leaves.
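The two-stage procedure can be sketched in a few lines of Python using a heap to repeatedly merge the two least probable nodes; this is an illustrative sketch (names are mine), not the slides' own implementation.

import heapq, itertools

def huffman_code(prob):
    """prob: dict symbol -> probability. Returns dict symbol -> codeword."""
    counter = itertools.count()          # tie-breaker so dicts are never compared
    heap = [[p, next(counter), {s: ""}] for s, p in prob.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # least probable node
        p1, _, c1 = heapq.heappop(heap)  # second least probable node
        # Stage 1: merge into one node; Stage 2 happens implicitly by
        # prefixing '0'/'1' while walking back toward the root.
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, [p0 + p1, next(counter), merged])
    return heap[0][2]

# Alphabet from the example on the next slides: code lengths 1, 2, 3, 3.
print(huffman_code({"a1": 0.5, "a2": 0.25, "a3": 0.125, "a4": 0.125}))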
16
8.2.1 Huffman coding (cont.) Example: Let the alphabet A_X consist of four symbols as shown in the following table.
Symbol a1: probability 0.5; a2: 0.25; a3: 0.125; a4: 0.125.
The entropy of the source is H = -0.5 log2 0.5 - 0.25 log2 0.25 - 0.125 log2 0.125 - 0.125 log2 0.125 = 1.75 bits/symbol.
17
8.2.1 Huffman coding (cont.) The tree-diagram for Huffman coding is:
18
8.2.1 Huffman coding (cont.) which yields the Huffman code:
Symbol a1: code 0, probability 0.5
Symbol a2: code 10, probability 0.25
Symbol a3: code 110, probability 0.125
Symbol a4: code 111, probability 0.125
The average bit rate = 1×0.5 + 2×0.25 + 3×0.125 + 3×0.125 = 1.75 bits/symbol = H.
19
8.2.1 Huffman coding (cont.)
- Performance and implementation:
Step 1: Estimate the probability distribution from samples.
Step 2: Design the Huffman code using the probabilities obtained in Step 1.
20
8.2.1 Huffman coding (cont.) Advantages:
- Approaches H(X) (with or without memory) as the block size N → ∞.
- Relatively simple procedure, easy to follow.
21
8.2.1 Huffman coding (cont.) Disadvantages:
- A large N or preprocessing is needed for sources with memory.
- Hard to adjust the codes in real time.
22
8.2.1 Huffman coding (cont.) Variations:
- Modified Huffman code: codewords longer than L are replaced by fixed-length codes.
- Adaptive Huffman codes.
23
from Compression II
Arithmetic coding: Unlike the variable-length codes described previously, arithmetic coding generates non-block codes. In arithmetic coding there is no one-to-one correspondence between source symbols and code words. Instead, an entire sequence of source symbols (a message) is assigned a single arithmetic code word. The code word itself defines an interval of real numbers between 0 and 1. As the number of symbols in the message increases, the interval used to represent it becomes smaller and the number of information units (say, bits) required to represent the interval becomes larger. Each symbol of the message reduces the size of the interval in accordance with its probability of occurrence. The resulting rate approaches the limit set by the entropy.
24
from Compression II Let the message to be encoded be a1a2a3a3a4
25
from Compression II
26
from Compression II So, any number in the interval [0.06752, 0.0688), for example 0.068, can be used to represent the message. Here 3 decimal digits are used to represent the 5-symbol source message. This translates into 3/5 = 0.6 decimal digits per source symbol and compares favorably with the entropy of -(3×0.2 log10 0.2 + 0.4 log10 0.4) ≈ 0.58 decimal digits per symbol.
27
from Compression II As the length of the sequence increases, the resulting arithmetic code approaches the bound set by the entropy. In practice, the rate fails to reach this lower bound because of: (1) the addition of the end-of-message indicator needed to separate one message from another, and (2) the use of finite-precision arithmetic.
28
from Compression II Decoding: Decode the received code word. Since 0.8 > code word > 0.4, the first symbol must be a3.
29
from Compression II Therefore, the message is a3a3a1a2a4.
[Figure: successive subdivision of the decoding interval, [0, 1.0) → [0.4, 0.8) → [0.56, 0.72) → [0.56, 0.592) → [0.5664, 0.5728), as the symbols a3, a3, a1, a2 are decoded.]
31
8.2.3 Arithmetic Coding Variable-length to Variable-length
- Lossless entropy coding for digital signals.
- One source symbol may produce several bits; several source symbols (letters) may produce a single bit.
- The source model (probability distribution) can be derived in real time.
- Similar to Huffman prefix codes in special cases.
32
8.2.3 Arithmetic Coding (cont.)
Principle: A message (source string) is represented by an interval of real numbers between 0 and 1. More frequent messages have larger intervals, which require fewer bits to specify.
33
8.2.3 Arithmetic Coding (cont.)
Example: A_X = {a, b, c, d} with
Source symbol a: probability 0.5 (binary 0.1), cumulative probability 0.0
Source symbol b: probability 0.25 (binary 0.01), cumulative probability 0.5
Source symbol c: probability 0.125 (binary 0.001), cumulative probability 0.75
Source symbol d: probability 0.125 (binary 0.001), cumulative probability 0.875
34
8.2.3 Arithmetic Coding (cont.)
The length of each interval is proportional to the symbol's probability. Any point in the interval [0.0, 0.5) represents "a"; say, 0.25 (binary 0.01) or 0.0 (binary 0.00). Any point in the interval [0.75, 0.875) represents "c"; say, 0.75 (binary 0.110).
35
8.2.3 Arithmetic Coding (cont.)
Transmitting 3 letters:
- Any point in the interval [0.001, 0.0011) in binary, i.e., [0.125, 0.1875), identifies "aab".
- A model (probability distribution) is needed.
36
8.2.3 Arithmetic Coding (cont.)
Procedure: Recursive computation of the key values of an interval: C (code point, the leftmost point) and A (interval width). On receiving a symbol s:
New C = Current C + (Current A × P_c(s))
New A = Current A × p(s)
where P_c(s) is the cumulative probability of s and p(s) is the probability of s.
37
8.2.3 Arithmetic Coding (cont.)
【Encoder】 Step 0: Initial C = 0; initial A = 1. Step 1: Receive a source symbol (if there are no more symbols, it is EOF); compute New C and New A. Step 2: If EOF, send the code string that identifies the current interval and stop; else, send the code-string bits that have been uniquely determined so far and go to Step 1.
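A minimal floating-point Python sketch of this encoder recursion follows; real coders use integer arithmetic with incremental bit output (as in the Witten et al. paper cited later), and the function and model names here are illustrative.

def arithmetic_encode(message, prob):
    """Return the final interval [C, C + A) that identifies `message`."""
    # Cumulative probabilities P_c(s): total mass of the symbols preceding s.
    cum, acc = {}, 0.0
    for s in prob:
        cum[s] = acc
        acc += prob[s]
    C, A = 0.0, 1.0                      # Step 0: initial code point and width
    for s in message:                    # Step 1: per received symbol
        C = C + A * cum[s]               # New C = C + A * P_c(s)
        A = A * prob[s]                  # New A = A * p(s)
    return C, C + A                      # Step 2: any number in [C, C+A) works

# Example from the slides: p(a)=0.5, p(b)=0.25, p(c)=p(d)=0.125
print(arithmetic_encode("aab", {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}))
# -> (0.125, 0.1875), i.e., binary [0.001, 0.0011)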
38
8.2.3 Arithmetic Coding (cont.)
【Decoder】 Step 0: Initial C = 0; initial A = 1. Step 1: Examine the code string received so far and search for the interval in which it lies. Step 2: If a symbol can be decided, decode it; else go to Step 1. Step 3: If this symbol is EOF, stop; else adjust C and A and go to Step 2.
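A matching floating-point decoder sketch, again illustrative only: it locates the sub-interval containing the code value, emits the corresponding symbol, and rescales.

def arithmetic_decode(value, prob, n_symbols):
    cum, acc = {}, 0.0
    for s in prob:
        cum[s] = acc
        acc += prob[s]
    out = []
    for _ in range(n_symbols):            # in practice an EOF symbol stops decoding
        for s in prob:                     # find the sub-interval containing `value`
            if cum[s] <= value < cum[s] + prob[s]:
                out.append(s)
                value = (value - cum[s]) / prob[s]   # adjust C and A
                break
    return "".join(out)

print(arithmetic_decode(0.13, {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}, 3))  # "aab"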
39
8.2.3 Arithmetic Coding (cont.)
More details (I. H. Witten, R. M. Neal, and J. G. Cleary, "Arithmetic Coding for Data Compression," Communications of the ACM, vol. 30, no. 6, pp. 520-540, June 1987):
- Integer arithmetic (scale the intervals up)
- Bits to follow (undecided symbols)
- Updating the model
40
8.2.3 Arithmetic Coding (cont.)
Performance. Advantages: (1) Approaches the entropy limit when arbitrarily long coding delay and arbitrarily high data precision are possible. (2) Adapts to the local statistics. (3) Inter-letter correlation can be reduced by using conditional probabilities (a model with context). (4) Simple procedures without multiplication and division have been developed (IBM Q-coder, AT&T Minimax coder). Disadvantages: sensitive to channel errors.
41
8.3 Lossless Compression Methods
(1) Lossless predictive coding (2) Run-length coding of bit planes.
42
8.3.1 Lossless Predictive Coding
Figure: Block diagram of (a) an encoder and (b) a decoder using a simple predictor.
43
8.3.1 Lossless Predictive Coding (cont.)
Example: integer prediction from the causal neighbors a (the pixel to the left of the current pixel) and b, c, d (the three neighbors on the previous line), e.g., the value of a alone or an integer-rounded combination of the neighbors.
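A minimal sketch of lossless predictive coding, assuming for simplicity the previous-pixel predictor ŝ(n) = s(n-1) rather than the slide's multi-neighbor integer predictor; names are illustrative.

def predict_encode(row):
    """Return prediction errors e(n) = s(n) - s(n-1) for one image row."""
    errors = [row[0]]                       # first pixel is sent as-is
    errors += [row[n] - row[n - 1] for n in range(1, len(row))]
    return errors

def predict_decode(errors):
    """Invert the encoder: s(n) = e(n) + s(n-1)."""
    row = [errors[0]]
    for e in errors[1:]:
        row.append(row[-1] + e)
    return row

pixels = [100, 102, 101, 101, 110]
assert predict_decode(predict_encode(pixels)) == pixels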
44
8.3.1 Lossless Predictive Coding (cont.)
Histogram of (a) the original image intensity and (b) the integer prediction error.
45
8.3.2 Run-Length Coding. Source model: a first-order Markov sequence, i.e., the probability distribution of the current state depends only on the previous state. Procedure: Run = k means (k-1) non-transitions followed by a transition.
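A minimal Python sketch of extracting run lengths from a binary sequence, following the definition above (names are illustrative).

def run_lengths(bits):
    """Return the list of run lengths of a binary (0/1) sequence."""
    runs, count = [], 1
    for prev, cur in zip(bits, bits[1:]):
        if cur == prev:
            count += 1                 # a non-transition extends the current run
        else:
            runs.append(count)         # a transition closes the run
            count = 1
    runs.append(count)
    return runs

print(run_lengths([0, 0, 0, 1, 1, 0, 1, 1, 1, 1]))   # [3, 2, 1, 4]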
46
8.3.2 Run-Length Coding (cont.)
- Remarks:
  - All runs are independent (if runs are allowed to be arbitrarily long).
  - The entropy of the run-length representation equals that of the original source (the mapping is one-to-one).
  - Modified run-length codes: runs are limited to a maximum length L.
47
8.3.2.1 Run-Length Coding of Bit-Plane
Figure: Bit-plane decomposition of an 8-bit image. Options: Gray code; 1-D RLC; 2-D RLC.
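A minimal sketch of bit-plane decomposition with optional Gray coding, as named in the figure caption; the helper names are illustrative.

def to_gray(value):
    """Binary-to-Gray mapping: g = b XOR (b >> 1)."""
    return value ^ (value >> 1)

def bit_planes(pixels, use_gray=False, bits=8):
    """Split 8-bit pixels into `bits` binary planes (MSB plane first)."""
    vals = [to_gray(p) for p in pixels] if use_gray else pixels
    return [[(v >> k) & 1 for v in vals] for k in range(bits - 1, -1, -1)]

# With Gray coding, the adjacent levels 127 and 128 differ in only one bit plane.
print(bit_planes([127, 128], use_gray=True))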
48
8.4 Rate-Distortion Theory
Distortion measure: d(x, x̂) ≥ 0, e.g., the squared error (x - x̂)².
49
8.4 Rate-Distortion Theory (cont.)
Random variables X and Y are related by the mutual information I(X;Y) = H(X) - H(X|Y), where H(X) represents the uncertainty about X before knowing Y, H(X|Y) represents the uncertainty about X after knowing Y, and I(X;Y) represents the average mutual information, i.e., the information provided about X by Y.
50
8.4.1 Rate-Distortion Function
For a discrete-amplitude memoryless source (DMS) X with alphabet A_X and given probabilities p(x), a reproduction alphabet A_Y, and a single-letter distortion measure d(x, y) ≥ 0, the average distortion is D = E[d(X, Y)] = Σ_x Σ_y p(x) q(y|x) d(x, y), where q(y|x) describes the mapping from source letters to reproduction letters.
51
8.4.1 Rate-Distortion Function (cont.)
- Rate-Distortion Function R(D) (or distortion-rate function D(R)): the minimum average number of bits needed to represent a source symbol if an average distortion D is allowed, R(D) = min I(X;Y), where the minimum is over all mappings q(y|x) whose average distortion does not exceed D.
52
8.4.2 Source Coding Theorem
- A code (codebook) B of size M and block length N is a set of reproducing vectors (code words) B = {y_1, ..., y_M}, where each code word y_i has N components.
- Coding rule: a mapping from all N-tuple source words to B. Each source word x = (x_1, ..., x_N) is mapped to the codeword y in B that minimizes the block distortion d_N(x, y); that is, Q(x) = argmin_{y in B} d_N(x, y).
53
8.4.2 Source Coding Theorem (cont.)
Average distortion of code B: d(B) = E[(1/N) d_N(X, Q(X))].
Source Coding Theorem: For a DMS X with alphabet A_X, probability p(x), and a single-letter distortion measure d(·,·), and for a given average distortion D, there exists a sufficiently large block length N and a code B of size M and block length N such that (1/N) log2 M ≤ R(D) + ε and d(B) ≤ D + ε.
54
8.4.2 Source Coding Theorem (cont.)
In other words, there exists a mapping from the source symbols to codewords such that, for a given distortion D, R(D) bits/symbol are sufficient to enable source reconstruction with an average distortion that is arbitrarily close to D. The function R(D) is called the rate-distortion function. Note that R(D) ≤ H(X). The actual rate R must obey R ≥ R(D) for the fidelity level D.
55
8.4.2 Source Coding Theorem (cont.)
[Figure: a typical rate-distortion function R(D) plotted versus the distortion D.]
56
8.5 Scalar Quantization Block size = 1
-- Approximate a continuous-amplitude source by a finite number of levels. A scalar quantizer Q(·) is a function defined in terms of a finite set of decision levels d_k and reconstruction levels r_k: Q(x) = r_k if d_{k-1} < x ≤ d_k, k = 1, ..., L, where L is the number of output states.
57
8.5.1 Lloyd-Max Quantizer (Nonuniform Quantizer)
To minimize the mean square quantization error E[(x - Q(x))²] with respect to the d_k and r_k, it can be shown that the necessary conditions are: each decision level is the midpoint of the two adjacent reconstruction levels, d_k = (r_k + r_{k+1})/2, and each reconstruction level is the centroid (conditional mean) of the source pdf over its decision interval, r_k = E[x | d_{k-1} < x ≤ d_k].
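These two conditions suggest the usual iterative (Lloyd) design. The sketch below runs that iteration on a finite sample set; it is an illustrative approximation with made-up names, not the slides' algorithm.

def lloyd_max(samples, L, iters=100):
    """Return (decision_levels, reconstruction_levels) for L output states."""
    lo, hi = min(samples), max(samples)
    r = [lo + (k + 0.5) * (hi - lo) / L for k in range(L)]   # initial levels
    for _ in range(iters):
        # d_k = midpoint of adjacent reconstruction levels
        d = [(r[k] + r[k + 1]) / 2 for k in range(L - 1)]
        bounds = [float("-inf")] + d + [float("inf")]
        # r_k = centroid (sample mean) of the samples in each decision interval
        for k in range(L):
            cell = [s for s in samples if bounds[k] < s <= bounds[k + 1]]
            if cell:
                r[k] = sum(cell) / len(cell)
    return d, r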
58
8.5.1 Lloyd-Max Quantizer (Nonuniform Quantizer) (cont.)
Example 1: Gaussian DMS with squared-error distortion. A uniform scalar quantizer at high rates gives D ≈ Δ²/12, where Δ is the quantization step size.
59
8.5.1 Lloyd-Max Quantizer (Nonuniform Quantizer) (cont.)
Example 2: Lloyd-Max quantization of a Laplacian-distributed signal with unit variance.
[Table: decision and reconstruction levels for Lloyd-Max quantizers with 2, 4, and 8 levels.]
60
8.5.1 Lloyd-Max Quantizer (Nonuniform Quantizer) (cont.)
Equally spaced decision levels define the uniform quantizers; the nonuniform (Lloyd-Max) case is illustrated for L = 4.
Example 3: Quantizer noise. For a memoryless Gaussian signal s with zero mean and variance σ_s², we express the mean square quantization noise as σ_q² = E[(s - ŝ)²].
61
8.5.1 Lloyd-Max Quantizer (Nonuniform Quantizer) (cont.)
Then the signal-to-noise ratio in dB is given by SNR = 10 log10(σ_s²/σ_q²). It can be seen that SNR ≈ 6 dB implies σ_s²/σ_q² ≈ 4. Substituting this result into the rate-distortion function for a memoryless Gaussian source, given by R(D) = (1/2) log2(σ_s²/D), we have R = 1 bit/sample. Likewise, we can show that quantization with 8 bits/sample yields approximately 48 dB SNR.
62
8.6 Differential Pulse Code Modulation (DPCM)
The prediction error e(n) = s(n) - ŝ(n) is quantized, with s'(n) = ŝ(n) + e_q(n) used for reconstruction. Figure: Block diagram of a DPCM (a) encoder and (b) decoder. Prediction: ŝ(n) is formed from previously reconstructed samples.
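A minimal 1-D DPCM sketch with the quantizer inside the prediction loop, assuming the simple previous-sample predictor ŝ(n) = s'(n-1) and a uniform quantizer with step delta (both illustrative choices, not the slides' design).

def dpcm_encode(samples, delta=4):
    recon_prev, codes = 0, []
    for s in samples:
        e = s - recon_prev                      # prediction error
        q = int(round(e / delta))               # quantizer index (transmitted)
        codes.append(q)
        recon_prev = recon_prev + q * delta     # decoder-side reconstruction s'
    return codes

def dpcm_decode(codes, delta=4):
    recon_prev, out = 0, []
    for q in codes:
        recon_prev = recon_prev + q * delta
        out.append(recon_prev)
    return out

print(dpcm_decode(dpcm_encode([10, 12, 15, 30, 31])))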
63
8.6.1 Optimal Prediction For 1-D case in the source model
64
Image Modeling. Modeling the source image as a stationary random field, a linear minimum mean square error (LMMSE) predictor of the form ŝ(m, n) = Σ_{(i,j) in S} a_{ij} s(m-i, n-j), with S a causal neighborhood, can be designed to minimize the mean square prediction error E[(s(m, n) - ŝ(m, n))²].
65
Image Modeling (cont.) The optimal coefficient vector is given by the normal equations a = R⁻¹ r, where R is the autocorrelation matrix of the neighborhood pixels used for prediction and r is the vector of correlations between the current pixel and those neighborhood pixels.
66
Image Modeling (cont.) Although this coefficient vector is optimal in the LMMSE sense, it is not necessarily optimal in the sense of minimizing the entropy of the prediction error. Furthermore, images rarely obey the stationarity assumption. As a result, most DPCM schemes employ a fixed predictor.
67
Image Modeling (cont.) The analysis and design of the optimum predictor is difficult because the quantizer is inside the feedback loop. A heuristic idea is to add a quantization-noise rejection filter before the predictor. To avoid channel-error propagation, a leaky predictor may be useful. Variations of DPCM: adaptive prediction and quantization.
68
8.6.2 Adaptive Quantization
Adjusting the decision and reconstruction levels according to the local statistics of the prediction error.
69
8.7 Delta Modulation Fig. The quantizer for delta modulation
70
8.7 Delta Modulation (cont.)
Fig. Illustration of granular noise and slope overload
71
8.8 Transform Coding Motivation
- Rate-Distortion Theory: insert distortion in the frequency domain following the rate-distortion formula.
- Decorrelation: transform coefficients are (almost) independent.
- Energy Concentration: transform coefficients are ordered according to the importance of their information content.
72
8.8.1 Linear Transforms Discrete-Space Linear Orthogonal Transforms
Separable transforms (fixed basis functions): DFT (Discrete Fourier Transform), DCT (Discrete Cosine Transform), DST (Discrete Sine Transform), WHT (Walsh-Hadamard Transform), ...
73
8.8.1 Linear Transforms (cont.)
Non-separable transform: KLT (Karhunen-Loeve). The basis functions are derived from the autocorrelation matrix R of the source signal by solving the eigenvalue problem R φ_k = λ_k φ_k.
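A minimal KLT sketch (illustrative, using NumPy): estimate the autocorrelation matrix from sample vectors and take its eigenvectors, sorted by decreasing eigenvalue, as the transform basis.

import numpy as np

def klt_basis(vectors):
    """vectors: array of shape (num_samples, N). Returns the N x N basis (rows)."""
    X = np.asarray(vectors, dtype=float)
    R = X.T @ X / X.shape[0]                 # autocorrelation matrix of the source
    eigvals, eigvecs = np.linalg.eigh(R)     # R phi_k = lambda_k phi_k
    order = np.argsort(eigvals)[::-1]        # sort by decreasing energy
    return eigvecs[:, order].T               # each row is a basis function

coeffs = klt_basis(np.random.randn(1000, 8)) @ np.random.randn(8)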
74
8.8.1 Linear Transforms (cont.)
KL Transform: The KLT coefficients are uncorrelated (if the source is Gaussian, the KLT coefficients are independent), and the KLT offers the best energy compaction. If the source is first-order Markov with correlation coefficient ρ, the DCT approaches the KLT of such a source for ρ close to 1, while the DST approaches the KLT for small ρ. Performance on typical still images: DCT ≈ KLT.
75
8.8.2 Optimum Transform Coder
For a stationary random vector source x with covariance matrix R_x. Goal: minimize the mean square coding error E[||x - x̂||²].
76
8.8.2 Optimum Transform Coder (cont.)
Question: What are the optimum A, B, and Q? Answer: A is the KLT of the source x, B is the inverse transform (B = A⁻¹, equal to Aᵀ for an orthonormal KLT), and Q is the optimum (entropy-constrained) quantizer for each transform coefficient.
77
8.8.3 Bit Allocation. How many bits should be assigned to each transform coefficient? Total rate R = (1/N) Σ_{k=1}^{N} R_k, where N is the block size and R_k is the number of bits assigned to the k-th coefficient.
78
8.8.3 Bit Allocation (cont.) Distortion: the MSE in the transform domain, D = (1/N) Σ_{k=1}^{N} D_k.
Recall that the distortion-rate performance of the optimal scalar quantizer is D_k = ε² σ_k² 2^{-2R_k},
79
8.8.3 Bit Allocation (cont.) where σ_k² is the variance of the k-th coefficient and ε² is a constant depending on the source probability distribution (≈ 2.71 for the Gaussian distribution). Hence, D = (ε²/N) Σ_k σ_k² 2^{-2R_k}.
80
8.8.3 Bit Allocation (cont.) For a given total rate R, assuming all the per-coefficient distortions D_k take the same value (= θ), the result is R_k = max(0, (1/2) log2(ε² σ_k² / θ)). If R is given, θ is obtained by solving R = (1/N) Σ_k R_k with the last equation.
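A minimal sketch of this allocation rule: for a trial θ the per-coefficient rates follow from the formula above, and θ is found by bisection so that the average rate meets the target R. The bisection search and all names are illustrative choices.

import math

def allocate_bits(variances, target_rate, eps2=2.71, iters=60):
    def rates(theta):
        return [max(0.0, 0.5 * math.log2(eps2 * v / theta)) for v in variances]
    lo, hi = 1e-12, eps2 * max(variances)
    for _ in range(iters):                 # average rate is decreasing in theta
        theta = (lo + hi) / 2
        if sum(rates(theta)) / len(variances) > target_rate:
            lo = theta                     # rate too high: allow more distortion
        else:
            hi = theta
    return rates(theta)

print(allocate_bits([16.0, 4.0, 1.0, 0.25], target_rate=1.0))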
81
8.8.3 Bit Allocation (cont.) Except for the constant ε² (due to the scalar quantizer), the above results are identical to the rate-distortion function of a stationary Gaussian source. That is, transform coefficients whose variances fall below the threshold θ receive zero bits and are not transmitted.
82
8.8.4 Practical Transform Coding
[Figures: block diagrams of the practical transform coding encoder and decoder.]
83
8.8.4 Practical Transform Coding (cont.)
Block size: 8×8. Transform: DCT (type-2, 2-D),
F(u, v) = (1/4) C(u) C(v) Σ_{x=0}^{7} Σ_{y=0}^{7} f(x, y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16],
where C(0) = 1/√2 and C(u) = 1 for u ≠ 0.
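A direct (slow) implementation of this 8×8 type-2 DCT formula, for illustration only; practical codecs use fast factorizations.

import math

def dct_8x8(block):
    N = 8
    C = lambda u: 1 / math.sqrt(2) if u == 0 else 1.0
    F = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = sum(block[x][y]
                    * math.cos((2 * x + 1) * u * math.pi / 16)
                    * math.cos((2 * y + 1) * v * math.pi / 16)
                    for x in range(N) for y in range(N))
            F[u][v] = 0.25 * C(u) * C(v) * s
    return F

coeffs = dct_8x8([[128] * 8 for _ in range(8)])   # only the DC term is nonzero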
84
8.8.4 Practical Transform Coding (cont.)
Threshold Coding: only the quantized coefficients whose magnitudes exceed a threshold are retained; the remaining coefficients are set to zero.
85
8.8.4 Practical Transform Coding (cont.)
Zig-zag Scanning: the 8×8 block of quantized coefficients is reordered into a 1-D sequence, scanning diagonally from low to high spatial frequency.
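A minimal sketch of zig-zag scanning an 8×8 block into a 1-D sequence; the ordering rule below reproduces the conventional JPEG-style scan and the names are illustrative.

def zigzag(block, N=8):
    d = lambda p: p[0] + p[1]                       # diagonal index u + v
    order = sorted(((u, v) for u in range(N) for v in range(N)),
                   key=lambda p: (d(p), p[0] if d(p) % 2 else p[1]))
    return [block[u][v] for u, v in order]

# Scanning a block holding its own linear indices shows the scan order.
print(zigzag([[8 * u + v for v in range(8)] for u in range(8)])[:10])
# [0, 1, 8, 16, 9, 2, 3, 10, 17, 24]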
86
8.8.4 Practical Transform Coding (cont.)
Entropy Coding: Huffman or arithmetic coding; the DC and AC coefficients are treated separately.
87
8.8.5 Performance. For typical CCIR 601 pictures:
- Excellent quality at about 2 bits/pel.
- Good quality at lower bit rates.
- Blocking artifacts appear on reconstructed pictures at very low bit rates (< 0.5 bits/pel).
- Close to the best known algorithms at around 0.75 to 2.0 bits/pel.
- Complexity is acceptable.