Lossless Compression - I Hao Jiang Computer Science Department Sept. 13, 2007.


1 Lossless Compression - I Hao Jiang Computer Science Department Sept. 13, 2007

2 Introduction  Compression methods are key enabling techniques for multimedia applications.  Raw media takes a great deal of storage and bandwidth. –A raw video at 30 frames/sec, resolution 640x480, 24-bit color: one second of video is 30 * 640 * 480 * 3 = 27,648,000 bytes ≈ 27.648 MB; one hour of video is about 100 GB.
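
A quick sanity check of the arithmetic above, sketched in Python:

    # Raw video size: 30 frames/sec, 640x480 pixels, 3 bytes (24 bits) per pixel.
    fps, width, height, bytes_per_pixel = 30, 640, 480, 3

    bytes_per_second = fps * width * height * bytes_per_pixel
    print(bytes_per_second / 1e6)         # 27.648 MB per second
    print(bytes_per_second * 3600 / 1e9)  # ~99.5 GB per hour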

3 Some Terms  The compression pipeline: an information source produces the input data (a sequence of symbols from an alphabet); the encoder (compression) maps it to a code (a sequence of codewords); the code goes to storage or networks; the decoder (decompression) produces the recovered data sequence.  Lossless compression: the recovered data is exactly the same as the input.  Lossy compression: the recovered data approximates the input data.  Compression ratio = (bits used to represent the input data) / (bits of the code).

4 Entropy  The number of bits needed to encode a media source is lower-bounded by its “Entropy”.  The self-information of an event A is defined as -log_b P(A), where P(A) is the probability of event A. If b equals 2, the unit is “bits”; if b equals e, the unit is “nats”; if b is 10, the unit is “hartleys”.

5 Example  A source outputs two symbols (the alphabet has 2 symbols), 0 or 1, with P(0) = 0.25 and P(1) = 0.75. The information we get when receiving a 0 is log_2(1/0.25) = 2 bits; when receiving a 1 it is log_2(1/0.75) = 0.415 bits.
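
The numbers on this slide can be reproduced with a few lines of Python (a minimal sketch; the function name is just for illustration):

    import math

    def self_information(p, b=2):
        # Self-information -log_b P(A) of an event with probability p.
        return -math.log(p, b)

    print(self_information(0.25))  # 2.0 bits
    print(self_information(0.75))  # ~0.415 bits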

6 Properties of Self-Information  A letter with smaller probability has higher self-information.  The information we get when receiving two independent letters is the sum of their self-informations: -log_2 P(s_a, s_b) = -log_2 [P(s_a)P(s_b)] = [-log_2 P(s_a)] + [-log_2 P(s_b)]

7 Entropy  If a source has symbols {s1, s2, …, sn} and the symbols are independent, the average self-information is H = Σ_{i=1}^{n} P(s_i) log_2(1/P(s_i)) bits.  H is called the entropy of the source.  The number of bits per symbol needed to encode a media source is lower-bounded by its entropy.

8 Entropy (cont)  Example: a source outputs two symbols (the alphabet has 2 letters), 0 or 1, with P(0) = 0.25 and P(1) = 0.75. H = 0.25 * log_2(1/0.25) + 0.75 * log_2(1/0.75) = 0.8113 bits. We need at least 0.8113 bits per symbol to encode this source.
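
The entropy formula from the previous slide, applied to this binary source in a short Python sketch:

    import math

    def entropy(probs):
        # H = sum_i P(s_i) * log2(1 / P(s_i)), skipping zero-probability symbols.
        return sum(p * math.log2(1 / p) for p in probs if p > 0)

    print(entropy([0.25, 0.75]))  # ~0.8113 bits per symbol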

9 The Entropy of an Image  A grayscale image has 256 possible levels, A = {0, 1, 2, …, 255}. Assuming the pixels are independent and the gray levels have equal probabilities, H = 256 * (1/256) * log_2(256) = 8 bits.  What about an image with only 2 levels, 0 and 255? Assuming P(0) = 0.5 and P(255) = 0.5, H = 1 bit.
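
The same calculation for images, sketched with NumPy: the entropy is computed from the gray-level histogram under the slide's independent-pixel assumption. The two test images below are hypothetical stand-ins for the cases described.

    import numpy as np

    def image_entropy(img):
        # Entropy in bits/pixel from the gray-level histogram (independent-pixel model).
        counts = np.bincount(img.ravel(), minlength=256)
        p = counts / counts.sum()
        p = p[p > 0]
        return float(-(p * np.log2(p)).sum())

    # All 256 levels equally likely -> 8 bits; two equally likely levels -> 1 bit.
    uniform = np.arange(256, dtype=np.uint8).repeat(256).reshape(256, 256)
    binary = np.zeros((256, 256), dtype=np.uint8)
    binary[:, 128:] = 255
    print(image_entropy(uniform))  # 8.0
    print(image_entropy(binary))   # 1.0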

10 Estimate the Entropy  Given the sequence a a a b b b b c c c c d d, the relative frequencies are P(a) = 3/13, P(b) = 4/13, P(c) = 4/13, P(d) = 2/13. Assuming the symbols are independent: H = [-P(a)log_2 P(a)] + [-P(b)log_2 P(b)] + [-P(c)log_2 P(c)] + [-P(d)log_2 P(d)] = 1.95 bits
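
A sketch of the same estimate in Python, counting symbol frequencies in the observed sequence:

    import math
    from collections import Counter

    def estimate_entropy(symbols):
        # Estimate entropy (bits/symbol) from observed relative frequencies.
        counts = Counter(symbols)
        n = len(symbols)
        return sum((c / n) * math.log2(n / c) for c in counts.values())

    print(estimate_entropy("aaabbbbccccdd"))  # ~1.95 bits per symbol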

11 Coding Schemes  A = {s1, s2, s3, s4} with P(s1) = 0.125, P(s2) = 0.125, P(s3) = 0.25, P(s4) = 0.5; its entropy is H = 1.75 bits. Three candidate codes:
Code 1: s1 → 0, s2 → 1, s3 → 11, s4 → 01 (not uniquely decodable).
Code 2: s1 → 0, s2 → 0, s3 → 11, s4 → 10 (not uniquely decodable).
Code 3: s1 → 110, s2 → 111, s3 → 10, s4 → 0 (good codewords: a prefix code whose average length of 1.75 bits achieves the entropy lower bound).
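
A small Python sketch that checks the prefix property (which guarantees unique decodability) and the average codeword length; a full unique-decodability test such as Sardinas-Patterson is not shown. The dictionaries below simply restate two of the slide's codes.

    def is_prefix_free(code):
        # True if no codeword is a prefix of (or equal to) another codeword.
        words = list(code.values())
        return all(not words[j].startswith(words[i])
                   for i in range(len(words))
                   for j in range(len(words)) if i != j)

    def average_length(code, probs):
        # Expected codeword length in bits per symbol.
        return sum(probs[s] * len(c) for s, c in code.items())

    probs = {"s1": 0.125, "s2": 0.125, "s3": 0.25, "s4": 0.5}
    code1 = {"s1": "0", "s2": "1", "s3": "11", "s4": "01"}
    code3 = {"s1": "110", "s2": "111", "s3": "10", "s4": "0"}
    print(is_prefix_free(code1))         # False: "1" is a prefix of "11"
    print(is_prefix_free(code3))         # True
    print(average_length(code3, probs))  # 1.75 bits, equal to the entropy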

12 Huffman Coding  Huffman coding of the source above (P(s1) = 0.125, P(s2) = 0.125, P(s3) = 0.25, P(s4) = 0.5): repeatedly merge the two least probable nodes (0.125 + 0.125 = 0.25, then 0.25 + 0.25 = 0.5, then 0.5 + 0.5 = 1) and label the two branches of each merge with 0 and 1. Reading the bits from the root to each leaf gives the codewords s1 → 000, s2 → 001, s3 → 01, s4 → 1.
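
A minimal sketch of this merge procedure using Python's heapq; the exact 0/1 labels on the branches (and hence the bit patterns) may differ from the slide, but the codeword lengths are the same.

    import heapq

    def huffman_code(probs):
        # Build a Huffman code by repeatedly merging the two least probable nodes.
        # Heap entries: (probability, tie-breaker, {symbol: partial codeword}).
        heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
        heapq.heapify(heap)
        tie = len(heap)
        while len(heap) > 1:
            p0, _, code0 = heapq.heappop(heap)   # least probable subtree
            p1, _, code1 = heapq.heappop(heap)   # next least probable subtree
            merged = {s: "0" + c for s, c in code0.items()}
            merged.update({s: "1" + c for s, c in code1.items()})
            heapq.heappush(heap, (p0 + p1, tie, merged))
            tie += 1
        return heap[0][2]

    print(huffman_code({"s1": 0.125, "s2": 0.125, "s3": 0.25, "s4": 0.5}))
    # {'s4': '0', 's3': '10', 's1': '110', 's2': '111'} -- lengths 1, 2, 3, 3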

13 Another Example  Huffman coding of a source with P(a1) = 0.4, P(a2) = 0.2, P(a3) = 0.2, P(a4) = 0.1, P(a5) = 0.1. Merging the two least probable nodes at each step (0.1 + 0.1 = 0.2, then 0.2 + 0.2 = 0.4, then 0.4 + 0.2 = 0.6, then 0.6 + 0.4 = 1) gives the codewords a1 → 0, a2 → 10, a3 → 111, a4 → 1101, a5 → 1100.
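
A quick check, using the probabilities on this slide, that the resulting code is close to (but not below) the entropy bound:

    import math

    probs   = {"a1": 0.4, "a2": 0.2, "a3": 0.2, "a4": 0.1, "a5": 0.1}
    lengths = {"a1": 1, "a2": 2, "a3": 3, "a4": 4, "a5": 4}  # read off the tree above

    avg_len = sum(probs[s] * lengths[s] for s in probs)
    H = sum(p * math.log2(1 / p) for p in probs.values())
    print(avg_len)  # 2.2 bits per symbol
    print(H)        # ~2.122 bits per symbol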

