1
Multimedia Processing
Chapter 6. Overview of Compression
2
Topics
Introduction
Need for Compression
Data Sources and Information Representation
Overview of Lossless Compression Techniques
Overview of Lossy Compression Techniques
3
6.1 The Need for Compression
Multimedia information has to be stored and transported efficiently, but multimedia information is bulky.
Example: interlaced HDTV at 1080 x 1920 pixels, 30 frames/s, 12 bits/pixel gives 1080 x 1920 x 30 x 12 = 745 Mb/s!
One option is to use technologies with more bandwidth, but that is expensive.
The research alternative: find ways to reduce the number of bits to transmit without compromising the "information content", i.e., compression technology combined with reliable transmission.
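A quick sanity check of the HDTV bit-rate figure quoted above (a minimal Python sketch; the numbers are exactly those on the slide):

```python
# Back-of-the-envelope check of the HDTV bit-rate figure quoted above.
height, width = 1080, 1920          # pixels per frame
frames_per_second = 30
bits_per_pixel = 12                 # as assumed on the slide

bits_per_second = height * width * frames_per_second * bits_per_pixel
print(f"{bits_per_second / 1e6:.0f} Mb/s")   # -> 746 Mb/s, consistent with the ~745 Mb/s above
```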
4
6.2 The Basics of Information Theory
Pioneered primarily by Claude Shannon. Information theory deals with two subtopics:
Source Coding: how can a communication system efficiently transmit the information that a source produces? (Relates to compression.)
Channel Coding: how can a communication system achieve reliable communication over a noisy channel? (Relates to minimizing transmission error.)
5
6.2 The Basics of Information Theory
Shannon's Communication Model (block diagram): Information Source → Encoder → Channel → Decoder → Receiver
6
Note: Claude E. Shannon (1916-2001)
1916: born in Petoskey, Michigan, U.S.A.
1936: earned bachelor's degrees in mathematics and electrical engineering, University of Michigan
1940: earned master's degree in electrical engineering and Ph.D. in mathematics, MIT
1941: joined the mathematics department, Bell Labs
1948: published "A Mathematical Theory of Communication"
1956: visiting professor, MIT
1958: permanent member of the faculty, MIT
1978: professor emeritus, MIT
7
A Mathematical Theory of Communication
C. E. Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal, vol. 27, pp. 379-423 (Part I) and 623-656 (Part II), 1948. (shannonday/work.html, shannonday/shannon1948.pdf)
8
6.2 The Basics of Information Theory
6.2.1 Information Theory Definitions
Symbols: information can be thought of as an organization of individual elements called symbols.
Alphabet: a distinct and nonempty set of symbols.
Vocabulary: the number of symbols in the set. If your vocabulary has N symbols, each symbol is represented with log2(N) bits.
Speech, 16 bits/sample: N = 2^16 = 65,536 symbols
Color image, 3 x 8 bits/sample: N = 2^24, about 17 x 10^6 symbols
8x8 image blocks, 8 x 64 = 512 bits/block: N = 2^512, about 10^154 symbols
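A quick numerical check of the bits-per-symbol relation above (a minimal Python sketch; the three cases are the ones listed on the slide):

```python
import math

# Bits needed to index a vocabulary of N equally likely symbols = log2(N).
for name, bits in [("speech sample", 16), ("RGB pixel", 24), ("8x8 block of 8-bit pixels", 512)]:
    N = 2 ** bits
    print(f"{name}: N = 2^{bits} symbols -> {math.log2(N):.0f} bits/symbol")
```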
9
6.2 The Basics of Information Theory
A code is a set of vectors called codewords.
Fixed Length Code (FLC): used when symbols are equally probable.
Variable Length Code (VLC): used when symbols are not equally probable.
FLC and VLC for the alphabet A-H:
FLC: A 000, B 001, C 010, D 011, E 100, F 101, G 110, H 111
VLC: A 00, B 010, C 011, D 100, E 101, F 110, G 1110, H 1111
10
6.2 The Basics of Information Theory
A Prefix Code is one in which no codeword forms the prefix of any other codeword. Such codes are also called Uniquely Decodable or Instantaneous Codes.
Example: encode ABADCAB with VLC1 and VLC2.
VLC1 (a prefix code): A 00, B 010, C 011, D 100, E 101, F 110, G 1110, H 1111 -> 18 bits, uniquely decodable.
VLC2 (not a prefix code): A 0, B 1, C 00, D 01, E 10, F 11, G 000, H 111 -> 9 bits, but not uniquely decodable.
11
6.2 The Basics of Information Theory
Types of compression
Lossless (entropy coding): does not lose information; the signal is perfectly reconstructed after decompression. Produces a variable bit rate and is not guaranteed to actually reduce the data size.
Lossy: loses some information; the signal is not perfectly reconstructed after decompression. Can produce any desired constant bit rate.
12
6.2 The Basics of Information Theory
Lossless Compression
Lossless compression techniques ensure no loss of data after compression/decompression.
Coding: "translate" each symbol in the vocabulary into a binary codeword. Codewords may have different binary lengths.
Example: you have 4 symbols (a, b, c, d). Each may be represented in binary using 2 bits, but coded using a different number of bits:
a (00) => 000
b (01) => 001
c (10) => 01
d (11) => 1
Goal of coding: minimize the average symbol length.
13
6.2 The Basics of Information Theory
Example: a 100-symbol message over symbols s1-s4.
Fixed-length 2-bit code:
Symbol  Prob   Occurrences x length = bits used
s1      0.7    70 x 2 = 140
s2      0.05    5 x 2 = 10
s3      0.2    20 x 2 = 40
s4      0.05    5 x 2 = 10
Total: 200 bits
Variable-length prefix code (uniquely decodable), codeword lengths 1, 3, 2, 3:
s1      0.7    70 x 1 = 70
s2      0.05    5 x 3 = 15
s3      0.2    20 x 2 = 40
s4      0.05    5 x 3 = 15
Total: 140 bits
Variable-length code with codeword lengths 1, 2, 1, 2:
s1      0.7    70 x 1 = 70
s2      0.05    5 x 2 = 10
s3      0.2    20 x 1 = 20
s4      0.05    5 x 2 = 10
Total: 110 bits, but this is not a prefix code!
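The bit counts above can be reproduced with a small sketch. The probabilities are from the slide; the specific codewords below are hypothetical, chosen only to have the stated lengths (and, for the last code, to violate the prefix property):

```python
import math

# Reproduce the bit counts above for a 100-symbol message.
probs = {"s1": 0.70, "s2": 0.05, "s3": 0.20, "s4": 0.05}
codes = {
    "fixed 2-bit":         {"s1": "00", "s2": "01",  "s3": "10", "s4": "11"},
    "prefix VLC":          {"s1": "0",  "s2": "110", "s3": "10", "s4": "111"},
    "shorter, non-prefix": {"s1": "0",  "s2": "01",  "s3": "1",  "s4": "10"},
}

message_length = 100
for name, code in codes.items():
    total_bits = sum(message_length * p * len(code[s]) for s, p in probs.items())
    print(f"{name:20s}: {total_bits:.0f} bits, avg {total_bits / message_length:.2f} bits/symbol")

entropy = -sum(p * math.log2(p) for p in probs.values())
print(f"entropy: {entropy:.3f} bits/symbol")   # lower bound for uniquely decodable codes
```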
14
6.2 The Basics of Information Theory
Symbol probability
The probability P_i of a symbol s_i is the relative frequency with which it occurs in the transmission:
P_i = m_i / M
where M is the message length and m_i is the number of times symbol s_i occurs.
From probability theory we know that Σ_i P_i = 1.
The average codeword length is defined as
λ = Σ_i P_i · l_i
where l_i is the codeword length for symbol s_i.
15
6.2 The Basics of Information Theory
Minimum average symbol length
The main goal is to minimize the number of bits being transmitted, i.e., to minimize the average symbol length.
Basic idea for reducing the average symbol length: assign shorter codewords to symbols that appear more frequently and longer codewords to symbols that appear less frequently.
Theorem: the average binary length of the encoded symbols is always greater than or equal to the entropy H of the source.
What is the entropy of a source of symbols and how is it computed?
16
6.2 The Basics of Information Theory
6.2.2 Information Representation
A message of length M contains m_i instances of symbol s_i. The total number of bits that the source has produced for the message is
B = Σ_i m_i · l_i
The average per-symbol length is
λ = B / M = (1/M) Σ_i m_i · l_i
Additionally, in terms of symbol probabilities, λ may be expressed as
λ = Σ_i P_i · l_i
where P_i is the probability of occurrence of symbol s_i.
17
6.2 The Basics of Information Theory
6.2.3 Entropy
Self-information of a symbol s_i:  I(s_i) = log(1/P_i) = -log(P_i)
The unit is determined by the base of the logarithm:
base 2: bits
base e: nats (natural units)
base 10: Hartleys (introduced in 1928)
18
6.2 The Basics of Information Theory
Entropy
Entropy is defined as
H = Σ_i P_i · log2(1/P_i) = -Σ_i P_i · log2(P_i)
It is the average self-information and is characteristic of a given source of symbols.
H is highest (equal to log2(N)) when all N symbols are equally probable.
Entropy is small (but always ≥ 0) when some symbols are much more likely to appear than other symbols.
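A minimal Python sketch of the entropy formula above:

```python
import math

def entropy(probs):
    """Shannon entropy in bits; zero-probability symbols contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

print(entropy([0.25, 0.25, 0.25, 0.25]))   # uniform N=4 source: maximum log2(4) = 2.0
print(entropy([0.7, 0.05, 0.2, 0.05]))     # skewed source: ~1.257, well below 2.0
```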
19
6.2 The Basics of Information Theory
Example (figure): a probability distribution over the symbols s1, s2, s3, s4, with values between 0 and 1.0.
20
6.2 The Basics of Information Theory
Entropy of example images (figures).
21
6.2 The Basics of Information Theory
22
6.3 A Taxonomy of Compression
Compression Techniques
Lossless
  Repetitive Sequence Compression: Run-Length Encoding, Repetition Suppression
  Statistical Compression: Huffman Coding, Arithmetic Coding, Golomb Coding
  Dictionary Based: Lempel-Ziv-Welch
Lossy
  Prediction Based: DPCM, Motion Compensation
  Frequency Based: Block Transforms (e.g., DCT), Subband (e.g., Wavelets, MP3), Filtering
  Signal Importance: Quantization, Subsampling
  Hybrid: JPEG, CCITT Fax, MPEG
23
6.3 A Taxonomy of Compression
Examples of entropy coding. The techniques vary depending on how much you statistically analyze the signal.
Run-Length Encoding: a sequence of elements c1, c2, ... is mapped to runs (c_i, l_i), where c_i is the code and l_i the length of the i-th run.
Repetition Suppression: repetitive occurrences of a specific character are replaced by a special flag and a count (e.g., "ab" followed by a run of nine identical characters becomes "ab", a flag, and the count 9).
Pattern Substitution: a pattern which occurs frequently is replaced by a specific symbol (e.g., LZW).
Huffman Coding
Arithmetic Coding
24
6.3 A Taxonomy of Compression
6.3.1 Compression Metric
Compression rate: an absolute term, simply the rate of the compressed data, expressed in bits/symbol (or bits/sample, bits/pixel).
Compression ratio: a relative term, defined as the ratio of the size of the original data to the size of the compressed data.
25
6.3 A Taxonomy of Compression
6.3.2 Rate Distortion
The original signal y is compressed, then reconstructed as ŷ.
The distortion or error is measured as error = f(y - ŷ), where f measures the difference.
One popular function is the Mean Squared Error; others, such as the Signal-to-Noise Ratio (SNR) and Peak Signal-to-Noise Ratio (PSNR), can also be used.
26
6.3 A Taxonomy of Compression
The rate distortion function R(D) gives the rate R needed to code the signal when a distortion D between y and ŷ is allowed (figure: R(D) plotted against distortion).
MSE (Mean Squared Error): MSE = (1/M) Σ_n (y(n) - ŷ(n))^2
SNR = 10 log10(σ_y^2 / MSE), where σ_y^2 is the average squared value of the signal.
PSNR = 10 log10(σ_peak^2 / MSE), where σ_peak^2 is the squared value of the peak signal.
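A minimal sketch of these distortion measures (assuming NumPy arrays of samples; the peak value 255 is just the usual choice for 8-bit images):

```python
import numpy as np

def mse(y, y_hat):
    return np.mean((y - y_hat) ** 2)

def snr_db(y, y_hat):
    return 10 * np.log10(np.mean(y ** 2) / mse(y, y_hat))

def psnr_db(y, y_hat, peak=255.0):           # peak = 255 for 8-bit images
    return 10 * np.log10(peak ** 2 / mse(y, y_hat))

y = np.array([10.0, 20.0, 30.0, 40.0])       # original samples
y_hat = y + np.array([1.0, -1.0, 0.5, -0.5]) # reconstructed samples with some error
print(mse(y, y_hat), snr_db(y, y_hat), psnr_db(y, y_hat))
```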
27
6.4 Lossless Compression 6.4.1 Run Length Encoding
The simplest form of compression, but widely used in bitmap file formats such as RIFF, BMP and PCX.
Concept: replace repeating strings of characters by indicating them with special flags.
Consider a bit stream of 15 ones, followed by 19 zeros, followed by 4 ones. This can be represented as (15,1) (19,0) (4,1); the coded version is (01111,1) (10011,0) (00100,1).
The PCX format reserves flag-byte values, so only 192 colors (not 256) can be stored directly. (Example 1.17)
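A minimal run-length encoder/decoder sketch producing the (count, value) pairs used above:

```python
from itertools import groupby

def rle_encode(bits: str):
    # group consecutive identical symbols into (count, value) runs
    return [(len(list(group)), value) for value, group in groupby(bits)]

def rle_decode(runs):
    return "".join(value * count for count, value in runs)

stream = "1" * 15 + "0" * 19 + "1" * 4
runs = rle_encode(stream)
print(runs)                    # [(15, '1'), (19, '0'), (4, '1')]
assert rle_decode(runs) == stream
```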
28
6.4 Lossless Compression Application to FAX
29
6.4 Lossless Compression Advantages and Disadvantages
30
6.4 Lossless Compression
6.4.3 Pattern Substitution: LZW
Algorithm:
1. Initialize the dictionary to contain all blocks of length one (D = {x, y} in the example below).
2. Search for the longest block W which has appeared in the dictionary.
3. Encode W by its index in the dictionary.
4. Add W followed by the first symbol of the next block to the dictionary.
5. Go to step 2.
31
6.4 Lossless Compression
LZW example: encode the string x y y x x y y x x y x y y x x x x y x x y y x
Initial dictionary: 0 = x, 1 = y

Iteration  Current block  Code output  Next symbol  New entry  New index
1          x              0            y            xy         2
2          y              1            y            yy         3
3          y              1            x            yx         4
4          x              0            x            xx         5
5          xy             2            y            xyy        6
6          yx             4            x            yxx        7
7          xy             2            x            xyx        8
8          xyy            6            x            xyyx       9
9          xx             5            x            xxx        10
10         xx             5            y            xxy        11
11         yxx            7            y            yxxy       12
12         yy             3            x            yyx        13
13         x              0

Final dictionary:
Index  Entry    Index  Entry
0      x        7      yxx
1      y        8      xyx
2      xy       9      xyyx
3      yy       10     xxx
4      yx       11     xxy
5      xx       12     yxxy
6      xyy      13     yyx
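A compact sketch of the LZW encoder traced above; it reproduces the code sequence 0 1 1 0 2 4 2 6 5 5 7 3 0:

```python
def lzw_encode(message, alphabet=("x", "y")):
    """Sketch of the LZW encoding traced above: returns codes and final dictionary."""
    dictionary = {symbol: index for index, symbol in enumerate(alphabet)}
    w, codes = "", []
    for symbol in message:
        if w + symbol in dictionary:
            w += symbol                                # keep extending the current block
        else:
            codes.append(dictionary[w])                # emit index of longest known block
            dictionary[w + symbol] = len(dictionary)   # register the new block
            w = symbol
    codes.append(dictionary[w])                        # flush the last block
    return codes, dictionary

codes, dictionary = lzw_encode("xyyxxyyxxyxyyxxxxyxxyyx")
print(codes)   # [0, 1, 1, 0, 2, 4, 2, 6, 5, 5, 7, 3, 0]
```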
32
6.4 Lossless Compression
cf.) Concept of LZ coding (flowchart): Start → fetch a new phrase → is the phrase already registered in the dictionary? If no, register it and output the phrase; if yes, output the phrase's position in the dictionary → repeat until the end of the input.
33
6.4 Lossless Compression
Example: consider the string 101011011010101011.
Phrase the string as follows: 1, 0, 10, 11, 01, 101, 010, 1011.
These phrases can be described by (000,1) (000,0) (001,0) (001,1) (010,1) (011,1) (101,0) (110,1), where each pair gives the 3-bit dictionary index of the longest previously seen phrase and the new bit.
The coded version of the string is 0001 0000 0010 0011 0101 0111 1010 1101.
34
6.4 Lossless Compression
6.4.4 Huffman Coding
Huffman code assignment is based on the frequency of occurrence of each symbol. It uses a binary tree structure, and the algorithm is as follows:
1. The leaves of the binary tree are associated with the list of probabilities.
2. Take the two smallest probabilities in the list and make the corresponding nodes siblings. Generate an intermediate node as their parent and label the branch from parent to one child 1 and the branch from parent to the other child 0.
3. Replace the two probabilities and associated nodes in the list by the single new intermediate node with the sum of the two probabilities.
4. If the list contains only one element, quit. Otherwise, go to step 2.
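A minimal sketch of this procedure, run on the probability table of the example that follows. The exact bit patterns depend on how ties are broken, but the code lengths and the average length (2.63 bits/symbol) match the tables below:

```python
import heapq

def huffman_codes(probabilities):
    """Repeatedly merge the two least probable nodes, prefixing 0/1 branch labels."""
    heap = [(p, i, {symbol: ""}) for i, (symbol, p) in enumerate(probabilities.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)          # two smallest probabilities
        p1, i, codes1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, i, merged))   # parent node with summed probability
    return heap[0][2]

probs = {"A": 0.28, "B": 0.2, "C": 0.17, "D": 0.17,
         "E": 0.1, "F": 0.05, "G": 0.02, "H": 0.01}
codes = huffman_codes(probs)
avg_len = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, round(avg_len, 2))                      # average length 2.63
```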
35
6.4 Lossless Compression
Example: fixed-length binary code
Symbol  Probability  Binary code
A       0.28         000
B       0.2          001
C       0.17         010
D       0.17         011
E       0.1          100
F       0.05         101
G       0.02         110
H       0.01         111
Average symbol length: 3   Entropy: 2.574   Efficiency: 0.858
36
6.4 Lossless Compression
Huffman tree for the example (figure): the leaves carry P(A)=0.28, P(B)=0.2, P(C)=0.17, P(D)=0.17, P(E)=0.1, P(F)=0.05, P(G)=0.02, P(H)=0.01; successive merges create the intermediate nodes 0.03, 0.08, 0.18, 0.34, 0.38, 0.62 and the root 1.0. Reading the 0/1 branch labels from the root to each leaf gives the codewords 00, 10, 010, 110, 011, 1110, 11110, 11111 listed in the table below.
37
6.4 Lossless Compression
Symbol  Huffman code  Code length
A       00            2
B       10            2
C       010           3
D       110           3
E       011           3
F       1110          4
G       11110         5
H       11111         5
Average symbol length: 2.63   Entropy: 2.574   Efficiency: 0.979
38
6.4 Lossless Compression Advantages and Disadvantages of Huffman coding
39
6.4 Lossless Compression
Fax encoding
Facsimile: a bilevel image, an 8.5" x 11" page scanned (left-to-right) at 200 dpi: about 3.74 Mb (1728 pixels on each line).
Encoding process:
Count each run of white/black pixels.
Encode the run lengths with Huffman coding; the probabilities of run lengths are not the same for black and white pixels.
Decompose each run length n as n = k x 64 + m (with m < 64), and create Huffman tables for both k and m.
Special codewords mark end-of-line, end-of-page, and synchronization.
Average compression ratio on text documents: 20-to-1.
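A tiny sketch of the run-length decomposition n = k x 64 + m described above (the sample run lengths are arbitrary):

```python
# Each run of n pixels is split into a multiple-of-64 part (k) and a remainder
# (m < 64); each part is then coded with its own Huffman table, as described above.
def decompose_run(n):
    k, m = divmod(n, 64)
    return k, m

for n in (7, 64, 200, 1728):
    k, m = decompose_run(n)
    print(f"run of {n} = {k} x 64 + {m}")
```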
40
6.4 Lossless Compression
6.4.5 Arithmetic Coding
Map a message of symbols to a real-number interval. If you have a vocabulary of M symbols:
1. Divide the interval [0,1) into segments corresponding to the M symbols; the segment of each symbol has a length proportional to its probability.
2. Choose the segment of the first symbol in the message.
3. Divide the segment of this symbol again into M new segments with lengths proportional to the symbol probabilities.
4. From these new segments, choose the one corresponding to the next symbol in the message.
5. Continue steps 3 and 4 until the whole message is coded.
6. Represent the segment's value by a binary fraction.
41
6.4 Lossless Compression
Example: the vocabulary has two symbols a, b with probabilities P(a) = 2/3, P(b) = 1/3, and we want to encode "abba".
The final interval [48/81, 52/81) represents the message "abba".
Choose a number inside it, e.g., 50/81 = 0.617... = 0.100111... in binary; the truncation 0.10011 = 0.59375 still lies inside the interval, so "abba" can be sent as the bits 10011.
42
6.4 Lossless Compression
Example: interval subdivision (figure):
Level 1: a → [0, 2/3), b → [2/3, 1)
Level 2: aa → [0, 4/9), ab → [4/9, 6/9), ba → [6/9, 8/9), bb → [8/9, 1)
Level 3: aaa → [0, 8/27), aab → [8/27, 12/27), aba → [12/27, 16/27), abb → [16/27, 18/27), baa → [18/27, 22/27), bab → [22/27, 24/27), bba → [24/27, 26/27), bbb → [26/27, 1)
Level 4 (inside abb): abba → [48/81, 52/81), abbb → [52/81, 54/81)
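A minimal sketch of the interval narrowing for this example, using exact fractions:

```python
from fractions import Fraction

# Per-symbol sub-ranges of [0, 1): a -> [0, 2/3), b -> [2/3, 1).
ranges = {"a": (Fraction(0), Fraction(2, 3)), "b": (Fraction(2, 3), Fraction(1))}

def arithmetic_interval(message):
    low, high = Fraction(0), Fraction(1)
    for symbol in message:
        width = high - low
        s_low, s_high = ranges[symbol]
        low, high = low + width * s_low, low + width * s_high   # narrow the interval
    return low, high

low, high = arithmetic_interval("abba")
print(low, high)   # 16/27 52/81, i.e. [48/81, 52/81) as on the slide
```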
43
6.5 Lossy Compression
The decompressed signal is not identical to the original signal: there is data loss.
Objective: minimize the distortion for a given compression ratio.
Ideally, we would optimize the system based on perceptual distortion, but that is difficult to compute.
We'll need a few more concepts from statistics.
44
6.5 Lossy Compression
Definitions
y = the original value of a sample
ŷ = the sample value after compression/decompression
e = y - ŷ = error (or noise, or distortion)
σ_y^2 = power (variance) of the signal
σ_e^2 = power (variance) of the error
Signal-to-noise ratio: SNR = 10 log10(σ_y^2 / σ_e^2) dB
45
6.5 Lossy Compression
6.5.1 Differential PCM
If the (n-1)-th sample has value y(n-1), what is the "most probable" value for y(n)?
In the simplest case, the predicted value for y(n) is y(n-1).
Differential PCM (DPCM) encoding: don't quantize and transmit y(n); quantize and transmit the residual d(n) = y(n) - y(n-1).
Decoding: ŷ(n) = y(n-1) + d'(n), where d'(n) is the quantized version of d(n).
It can be shown that SNR_DPCM > SNR_PCM.
46
6.5 Lossy Compression
Differential pulse code modulation (figure): the left plot shows the original signal y(n); the right plot shows the differences between successive samples. The first difference d(1) is the same as y(1); all successive differences are computed as d(n) = y(n) - y(n-1).
47
6.5 Lossy Compression
Open-loop differential pulse code modulation (block diagram).
Encoder (top): a differencer forms d(n) = y(n) - y(n-1) and a quantizer produces đ(n).
Decoder: an adder recovers ŷ(n) = y(n-1) + đ(n).
The encoder works by forming the prediction difference d(n) and quantizing it; the decoder recovers the quantized signal ŷ(n) from đ(n).
48
6.5 Lossy Compression
Closed-loop differential pulse code modulation
The decoder computes ŷ(n) = y(n-1) + đ(n), but the signal y(n-1) is not available at the decoder. The decoder only has ŷ(n-1), which equals y(n-1) up to a quantization error e(n-1). This error accumulates at every stage, which is not desirable.
Instead, closed-loop DPCM can be performed, where at the encoder the difference d(n) is computed not as y(n) - y(n-1) but as y(n) - ŷ(n-1).
49
6.5 Lossy Compression
Closed-loop differential pulse code modulation (block diagram).
Encoder: a differencer forms d(n) = y(n) - ŷ(n-1), a quantizer produces đ(n), and a predictor (a simulated decoder) supplies ŷ(n-1).
Decoder: an adder recovers ŷ(n) = ŷ(n-1) + đ(n).
At the encoder the difference d(n) is computed as y(n) - ŷ(n-1). This reduces the error that accumulates in the open-loop case: a decoder is simulated at the encoder, and the decoded values are used for computing the difference.
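A minimal closed-loop DPCM sketch under these definitions; the coarse uniform quantizer (step size q) and the sample values are illustrative assumptions:

```python
import numpy as np

def dpcm_encode_decode(y, q=4.0):
    """Closed-loop DPCM: the encoder simulates the decoder, so the residual is
    taken against the reconstructed previous sample ŷ(n-1), not y(n-1)."""
    y_hat = np.zeros(len(y))
    prev = 0.0                                # initial predictor state
    for n, sample in enumerate(y):
        d = sample - prev                     # d(n) = y(n) - ŷ(n-1)
        d_q = q * np.round(d / q)             # đ(n): uniformly quantized residual
        y_hat[n] = prev + d_q                 # decoder output ŷ(n)
        prev = y_hat[n]                       # predictor uses the reconstructed value
    return y_hat

y = np.array([0.0, 3.1, 6.4, 9.2, 11.0, 10.5, 8.0])
y_hat = dpcm_encode_decode(y)
print(np.max(np.abs(y - y_hat)))              # error stays bounded (no accumulation)
```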
50
6.5 Lossy Compression
6.5.2 Vector Quantization
Input data is organized as vectors.
In the encoder, a search engine finds the closest vector in the codebook and outputs a codebook index for each input vector.
The decoding process is an index lookup into the codebook, which outputs the appropriate code vector.
Along with the coded indexes, the encoder needs to communicate the codebook to the decoder.
51
6.5 Lossy Compression
Vector quantization (block diagram): input vectors enter the encoder, where a code-vector search against the codebook (entries 1, 2, 3, ..., n) produces an index for each vector; the decoder performs an index lookup into the same codebook to produce the output vectors.
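A minimal VQ sketch (nearest-codevector search at the encoder, table lookup at the decoder); the 4-entry codebook is a made-up example rather than a trained one:

```python
import numpy as np

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])  # example codebook

def vq_encode(vectors):
    # index of the closest codebook entry (Euclidean distance) for each input vector
    distances = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return np.argmin(distances, axis=1)

def vq_decode(indices):
    return codebook[indices]                  # simple index lookup

x = np.array([[0.1, 0.2], [0.9, 0.8], [0.2, 0.9]])
idx = vq_encode(x)
print(idx, vq_decode(idx))                    # transmitted indices and their reconstructions
```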
52
6.5 Lossy Compression
6.5.3 Transform Coding
In transform coding, a segment of information undergoes an invertible mathematical transformation.
Segments of information can be groups of samples (say M samples), such as an 8x8 pixel block of an image frame (M = 64) or a segment of speech.
Transform coding works as follows:
Apply a suitable invertible transformation (typically a matrix multiplication): x(1), x(2), ..., x(M) => y(1), y(2), ..., y(M) (the channels).
Quantize and transmit y(1), ..., y(M).
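A minimal sketch of transform coding on one block of M = 8 samples, using an orthonormal DCT-II matrix as the invertible transform; the step size q and the sample values are illustrative assumptions:

```python
import numpy as np

M, q = 8, 10.0
n = np.arange(M)
# Orthonormal DCT-II basis matrix: T @ T.T = identity.
T = np.sqrt(2.0 / M) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * M))
T[0, :] = np.sqrt(1.0 / M)

x = np.array([52.0, 55, 61, 66, 70, 61, 64, 73])   # one block of samples
y = T @ x                                          # transform coefficients ("channels")
y_q = q * np.round(y / q)                          # quantize each channel
x_hat = T.T @ y_q                                  # inverse transform (T is orthonormal)
print(np.round(x_hat, 1))                          # reconstruction close to x
```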
53
6.5 Lossy Compression Transform Coding II
54
6.5 Lossy Compression Transform Coding III
55
6.5 Lossy Compression
Transform Coding IV (block diagram): the signal Y is split into blocks Y1...Y6; each block is transformed (T) into YT1...YT6 and quantized (Q1...Q6) into ŶT1...ŶT6; the quantized blocks are combined for transmission, then inverse transformed (T^-1) into Ŷ1...Ŷ6 and the blocks are combined back into the reconstructed signal Ŷ.
56
6.5 Lossy Compression
6.5.4 Subband Coding
It is a kind of transform coding (although it operates on the whole signal).
It is not implemented as a matrix multiplication but as a bank of filters followed by decimation (i.e., subsampling).
Again, each channel represents the signal energy content along different frequencies.
Bits can then be allocated to the coding of an individual subband signal depending upon the energy within the band.
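A minimal two-band sketch of the filter-bank-plus-decimation idea, using Haar-like low-pass/high-pass filters (an assumption for illustration; practical codecs use longer filters and more bands):

```python
import numpy as np

def analysis(x):
    x = np.asarray(x, dtype=float)
    low = (x[0::2] + x[1::2]) / np.sqrt(2)    # low band: smooth content, decimated by 2
    high = (x[0::2] - x[1::2]) / np.sqrt(2)   # high band: detail content, decimated by 2
    return low, high

def synthesis(low, high):
    x = np.empty(2 * len(low))
    x[0::2] = (low + high) / np.sqrt(2)
    x[1::2] = (low - high) / np.sqrt(2)
    return x

x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
low, high = analysis(x)
print(low, high)                              # most energy sits in the low band
print(np.allclose(synthesis(low, high), x))   # True: perfect reconstruction
```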
57
6.5 Lossy Compression Subband Coding II
58
Homework: Read Text Chapter 7. Prepare Presentation.