
1 Dr.-Ing. Khaled Shawky Hassan Email: khaled.shawky@guc.edu.eg
Lecture 2 (Week 2) Source Coding and Compression Dr.-Ing. Khaled Shawky Hassan Room: C3-222, ext: 1204

2 Variable-Length Codes
Non-singular codes: A code is non-singular if each source symbol is mapped to a different non-empty bit string, i.e., the mapping from source symbols to bit strings is injective.
Uniquely decodable codes: A code is clearly not uniquely decodable if two symbols share the same codeword, i.e., c(ai) = c(aj) for some i ≠ j, or if the concatenation of two codewords can be read as another codeword.
Prefix codes: A code is a prefix code if no codeword is a prefix of any other codeword. Symbols can then be decoded instantaneously as soon as their entire codeword has been received.

3 Source Coding Theorem Code Word and Code Length:
Let C(sn) be the codeword corresponding to sn and let l(sn) denote the length of C(sn). To proceed, let us focus on codes that can be decoded "instantaneously", e.g., prefix codes.
Definition 1.3: A code is called a prefix code or an instantaneous code if no codeword is a prefix of any other codeword.
Example: Non-prefix: {s1=0, s2=01, s3=011, s4=0111}; Prefix: {s1=0, s2=10, s3=110, s4=111}
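A quick programmatic check of the prefix property; a minimal Python sketch applied to the two example code sets above:

def is_prefix_free(codewords):
    # Sorting puts any prefix immediately before one of its extensions, so checking neighbours suffices.
    words = sorted(codewords)
    return all(not words[i + 1].startswith(words[i]) for i in range(len(words) - 1))

non_prefix = ["0", "01", "011", "0111"]   # s1..s4, the non-prefix example
prefix_code = ["0", "10", "110", "111"]   # s1..s4, the prefix (instantaneous) example

print(is_prefix_free(non_prefix))    # False: "0" is a prefix of every other codeword
print(is_prefix_free(prefix_code))   # True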

4 Four different classes
Singular vs. non-singular codes (codewords for a1…a4 in the slide table; the ambiguity below implies a1=0, a2=010, a3=01, a4=10).
Decoding problem: 010 could mean a1a4 or a2 or a3a1. Then it is not uniquely decodable; see the sketch below.
(Figure: nested sets: non-singular codes inside all codes.)
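The ambiguity can also be checked by brute force; a minimal Python sketch using the codewords implied by the example (an assumption recovered from the three readings of 010):

def parses(bits, code, prefix=()):
    # Enumerate every way of splitting `bits` into a sequence of codewords.
    if not bits:
        yield prefix
    for sym, word in code.items():
        if bits.startswith(word):
            yield from parses(bits[len(word):], code, prefix + (sym,))

code = {"a1": "0", "a2": "010", "a3": "01", "a4": "10"}
print(list(parses("010", code)))   # [('a1', 'a4'), ('a2',), ('a3', 'a1')] -> three readings, not uniquely decodable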

5 Four different classes
Singular, non-singular, and uniquely decodable codes (codewords for a1…a4 in the slide table).
Decoding problem: … is uniquely decodable, but the first symbol (a3 or a4) cannot be decoded until the third ’1’ arrives, so decoding is not instantaneous.
(Figure: nested sets: uniquely decodable codes inside non-singular codes inside all codes.)

6 Four different classes
Singular, non-singular, uniquely decodable, and instantaneous codes (codewords for a1…a4 in the slide table).
(Figure: nested sets: instantaneous codes inside uniquely decodable codes inside non-singular codes inside all codes.)

7 Examples: code classes

8 Code Trees and Tree Codes
Consider, again, our favourite example code {a1, …, a4} = {0, 10, 110, 111}. The codewords are the leaves in a code tree: branching on 0 and 1 from the root, a1 sits at depth 1, a2 at depth 2, and a3, a4 at depth 3. Tree codes are instantaneous: no codeword is a prefix of another!
(Figure: the code tree with leaves a1, a2, a3, a4.)

9 Binary trees as prefix decoders
Symbol/code table for {a, b, c} and the decoding tree (the codeword values appear only in the slide figure).
Decoding loop:
Repeat
  curr = root
  repeat
    if get_bit(input) = 1
      curr = curr.right
    else
      curr = curr.left
    endif
  until isleaf(curr)
  output curr.symbol
Until eof(input)
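A runnable version of this decoding loop; a minimal Python sketch, assuming a hypothetical three-symbol code a=0, b=10, c=11 (the slide's actual codeword table is only visible in the figure):

class Node:
    def __init__(self):
        self.symbol, self.left, self.right = None, None, None    # leaf iff symbol is set

def build_tree(code):
    # Build a decoding tree from a {symbol: codeword} dict (prefix code assumed).
    root = Node()
    for symbol, word in code.items():
        curr = root
        for bit in word:
            attr = "right" if bit == "1" else "left"
            if getattr(curr, attr) is None:
                setattr(curr, attr, Node())
            curr = getattr(curr, attr)
        curr.symbol = symbol                                      # codeword ends at a leaf
    return root

def decode(bits, root):
    out, curr = [], root
    for bit in bits:                                              # mirrors the repeat/until loops above
        curr = curr.right if bit == "1" else curr.left
        if curr.symbol is not None:                               # reached a leaf: emit symbol, restart at root
            out.append(curr.symbol)
            curr = root
    return "".join(out)

code = {"a": "0", "b": "10", "c": "11"}                           # hypothetical code, not from the slide
print(decode("010011", build_tree(code)))                         # -> "abac"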

10–23 Binary trees as prefix decoders: decoding “abracadabra” step by step
These slides walk the decoder through the encoded bit stream for “abracadabra” using a five-symbol code over {a, b, c, d, r}. Each input bit moves the decoder from the current node to the left (0) or right (1) child; when a leaf is reached, its symbol is emitted and decoding restarts at the root. The output grows one symbol at a time (Output = a, ab, abr, abra, and so on) until the whole word is recovered. The codeword table and the highlighted bit stream appear only in the slide figures; a runnable sketch with an assumed codeword table follows below.
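To make the walkthrough concrete, here is a minimal Python sketch. The codeword table is an assumption (a valid prefix code built from the letter frequencies of “abracadabra”), not necessarily the one on the slides; the printout reproduces the slide-by-slide progression Output = a, ab, abr, abra, ...

codes = {"a": "0", "b": "100", "c": "1010", "d": "1011", "r": "11"}   # hypothetical prefix code

def encode(text):
    return "".join(codes[ch] for ch in text)

def decode(bits):
    inverse = {w: s for s, w in codes.items()}    # the prefix property makes greedy matching unambiguous
    out, word = [], ""
    for bit in bits:
        word += bit
        if word in inverse:                       # a complete codeword: emit its symbol, start the next one
            out.append(inverse[word])
            print("Output =", "".join(out))       # mirrors the slide-by-slide progression
            word = ""
    return "".join(out)

bits = encode("abracadabra")
assert decode(bits) == "abracadabra"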

24 Source Coding Theorem Theorem 1.2: (Kraft-McMillan Inequality)
Any prefix (prefix-free) code satisfies K(C) = Σ_{s∈S} 2^(−l(s)) ≤ 1. Conversely, given a set of codeword lengths satisfying this inequality, one can construct an instantaneous code with these word lengths!! (Note the “can”: the lengths admit such a code, but a given code with those lengths need not be prefix-free.)
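A quick numerical check of the inequality; a minimal Python sketch using the codeword lengths of the running example {0, 10, 110, 111}:

code = ["0", "10", "110", "111"]
kraft = sum(2 ** -len(w) for w in code)   # K(C) = sum of 2^(-l(s)) over all codewords
print(kraft)                              # 0.5 + 0.25 + 0.125 + 0.125 = 1.0 <= 1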

25 Source Coding Theorem Theorem 1.3: (Shannon, 1948)
Any prefix code satisfies l̄ ≥ H(S), i.e., the average codeword length can never be smaller than the source entropy. Can you say why?

26 Do It Yourself!! Source Coding Theorem
Theorem 1.2: (Kraft-McMillan Inequality) Any prefix code satisfies K(C) ≤ 1. Do It Yourself!!
Proof idea: try to show that [K(C)]^n does not grow without bound as n grows. Expanding the n-th power gives a sum over all concatenations of n codewords, grouped by total length; since the number of concatenations of a given length cannot exceed the number of distinct bit strings of that length, the sum grows at most linearly in n, which is only possible if K(C) ≤ 1. Can you follow? (Self study: K. Sayood, Chapter 2, page 32.)

27 Do It Yourself!! Optimal Codes!! Minimize the average length under the Kraft constraint
The average codeword length l̄ = Σ_i p_i l_i [bits/codeword] should be minimized subject to Kraft's inequality Σ_i 2^(−l_i) ≤ 1. Disregarding the integer constraints and differentiating (Lagrange multipliers, sketched below), the optimal codeword lengths are l_i = −log2 p_i, for which Kraft's inequality holds with equality and l̄ = H(S): the entropy limit is reached!
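The omitted differentiation step, written out as a short Lagrange-multiplier sketch (real-valued lengths assumed):

J = \sum_i p_i \ell_i + \lambda \Big( \sum_i 2^{-\ell_i} - 1 \Big)
\frac{\partial J}{\partial \ell_i} = p_i - \lambda \ln 2 \cdot 2^{-\ell_i} = 0
    \;\Rightarrow\; 2^{-\ell_i} = \frac{p_i}{\lambda \ln 2}
\sum_i 2^{-\ell_i} = 1 \;\Rightarrow\; \lambda \ln 2 = 1
    \;\Rightarrow\; \ell_i = -\log_2 p_i
    \;\Rightarrow\; \bar{\ell} = \sum_i p_i \ell_i = -\sum_i p_i \log_2 p_i = H(S)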

28 Optimal Codes!! But what about the integer constraints? l_i = −log2 p_i is not always an integer! Choose l_i = ⌈−log2 p_i⌉, i.e., assume a round-up amount s_i such that l_i = −log2 p_i + s_i with 0 ≤ s_i < 1. After the ceiling, Kraft's inequality still holds and the average length satisfies H(S) ≤ l̄ < H(S) + 1.

29 Optimal Codes and Theorem 1
Assume that the source X is memory-free (memoryless), and create the tree code for the extended source, i.e., blocks of n symbols. Instead of a single symbol we now encode n of them, so H(X^n) = nH(X) and nH(X) ≤ l̄_n < nH(X) + 1, i.e., H(X) ≤ l̄_n / n < H(X) + 1/n. We can come arbitrarily close to the entropy for large n!

30 Shannon Coding: Two practical problems need to be solved:
Bit assignment
The integer constraint (the number of bits must be an integer!)
Theoretically: choose l_i = ⌈−log2 p_i⌉; then H ≤ l̄ < H + 1. Rounding up is not always the best! (Shannon coding!) OR JUST SELECT NICE VALUES :-)
Example (a bad one): binary source with p1 = 0.25, p2 = 0.75:
l1 = −log2 0.25 = log2 4 = 2
l2 = −log2 0.75 = log2 (4/3) ≈ 0.415 ... ?! Then what is l2?

31 Shannon Coding: Example: Find the binary codewords for a source with probabilities p = (0.49, 0.26, 0.12, 0.04, 0.04, 0.03, 0.02):
l1 = ⌈−log2(0.49)⌉ = ⌈1.03⌉ = 2
l2 = ⌈−log2(0.26)⌉ = ⌈1.94⌉ = 2
l3 = ⌈−log2(0.12)⌉ = ⌈3.06⌉ = 4
l4 = l5 = ⌈−log2(0.04)⌉ = ⌈4.64⌉ = 5
l6 = ⌈−log2(0.03)⌉ = ⌈5.06⌉ = 6
l7 = ⌈−log2(0.02)⌉ = ⌈5.64⌉ = 6
Find the value of “s” (the difference between the integer length and the self-information) for each symbol. (DO IT YOURSELF!)

32 Shannon Coding: Example: The average codeword length
With l1 = 2, l2 = 2, l3 = 4, l4 = l5 = 5, l6 = 6, and l7 = 6:
l̄ = Σ p_i l_i = 2.68 bits/symbol
The entropy: H = −Σ p_i log2 p_i = 2.0127 bits/symbol
so indeed H ≤ l̄ < H + 1.
OR, instead, use, e.g., the Huffman algorithm (developed by D. Huffman, 1952) to create an optimal tree code!
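A small numerical check of slides 31 and 32; a minimal Python sketch using the probabilities of the example:

import math

p = [0.49, 0.26, 0.12, 0.04, 0.04, 0.03, 0.02]
lengths = [math.ceil(-math.log2(pi)) for pi in p]       # Shannon code lengths l_i = ceil(-log2 p_i)
avg_len = sum(pi * li for pi, li in zip(p, lengths))    # average codeword length
entropy = -sum(pi * math.log2(pi) for pi in p)          # source entropy H
kraft = sum(2 ** -li for li in lengths)                 # Kraft sum, must be <= 1

print(lengths)               # [2, 2, 4, 5, 5, 6, 6]
print(round(avg_len, 2))     # 2.68
print(round(entropy, 4))     # 2.0128 (the slide shows 2.0127)
print(kraft)                 # 0.65625 <= 1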

33 Ex.: Kraft’s inequality and Optimal codes
Example: (the worked example appears only in the slide figure)

34 Modeling & Coding Developing compression algorithms
Phase I: Modeling. Develop the means to extract the redundancy information (redundancy → predictability).
Phase II: Coding. Binary representation of the modeled data; the representation depends on the “Modeling” phase.

35 Modeling Example 1 Let us consider this arbitrary sequence:
Sn = 9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21
Plain binary encoding requires 5 bits/sample. WHY? (The largest value, 21, needs 5 bits.)
Now, let us consider the model Ŝn = n + 8, thus: Ŝn = 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
Transmit only the prediction residual en = Sn − Ŝn: 0, 1, 0, −1, 1, −1, 0, 1, −1, −1, 1, 1
How many bits are required now? Coding: ‘00’ <=> −1, ‘01’ <=> 0, ‘10’ <=> 1, i.e., 2 bits/sample.
Therefore, the model also needs to be designed and encoded in the algorithm; see the sketch below.
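A minimal Python sketch of this predictive scheme (the model Ŝn = n + 8 and the 2-bit residual code are the ones on the slide):

S = [9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21]
model = [n + 8 for n in range(1, len(S) + 1)]           # Ŝn = n + 8 for n = 1..12
residual = [s - m for s, m in zip(S, model)]            # en = Sn - Ŝn, values in {-1, 0, 1}

to_bits = {-1: "00", 0: "01", 1: "10"}                  # fixed 2-bit code for the residual
bits = "".join(to_bits[e] for e in residual)            # 24 bits instead of 12 * 5 = 60

# The decoder knows the model and simply adds the residual back.
from_bits = {v: k for k, v in to_bits.items()}
decoded = [m + from_bits[bits[i:i + 2]] for i, m in zip(range(0, len(bits), 2), model)]
assert decoded == S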

36 Modeling: Yet another example
Let us consider the 16-sample sequence Sn shown on the slide, and assume it correctly describes the probabilities generated by the source; then
P(1) = P(6) = P(7) = P(10) = 1/16 and P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = 2/16.
Assuming the samples are independent and identically distributed (i.i.d.), then H = 4·(1/16)·4 + 6·(2/16)·3 = 3.25 bits/symbol.
However, if we somehow find a correlation between neighbouring samples, then instead of coding the samples we can code only the differences. Now P(1) = 13/16 and P(−1) = 3/16, so H = 0.70 bits/symbol. (See the sketch below.)
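A minimal Python sketch comparing the two entropy estimates (the probabilities are those given on the slide; the raw 16-sample sequence itself appears only there):

import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs)        # first-order entropy in bits/symbol

iid_probs = [1/16] * 4 + [2/16] * 6                     # the ten sample values and their probabilities
diff_probs = [13/16, 3/16]                              # the difference symbols +1 and -1

print(entropy(iid_probs))               # 3.25 bits/symbol with the i.i.d. model
print(round(entropy(diff_probs), 2))    # 0.7 bits/symbol after differencing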

37 Markov Models Assume that each output symbol depends on previous k ones. Formally: Let {xn} be a sequence of observations We call {xn} a kth-order discrete Markov chain (DMC) if Usually, we use a first order DMC (the knowledge of 1 symbol in the past is enough) The State of the Process 37 37

38 Non-Linear Markov Models
Consider a B&W image as a string of black and white pixels (e.g., row by row).
Define two states Sb and Sw for the current pixel.
Define probabilities: P(Sb) = probability of being in Sb, P(Sw) = probability of being in Sw.
Transition probabilities: P(b|b), P(w|b), P(b|w), P(w|w).

39 Markov Models Example: Assume P(Sw) = 30/31 and P(Sb) = 1/31
Transition probabilities: P(w|w) = 0.99, P(b|w) = 0.01, P(b|b) = 0.7, P(w|b) = 0.3
For the Markov model:
H(Sb) = −0.3 log2(0.3) − 0.7 log2(0.7) = 0.881
H(Sw) = −0.01 log2(0.01) − 0.99 log2(0.99) = 0.081
HMarkov = (30/31)·0.081 + (1/31)·0.881 ≈ 0.107
How does this compare with the entropy under an i.i.d. model?
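A minimal Python sketch reproducing these numbers; the final i.i.d. value is computed here only for comparison (the slide merely poses the question):

import math

def h(probs):
    return -sum(p * math.log2(p) for p in probs)

P_Sw, P_Sb = 30 / 31, 1 / 31                  # state probabilities
H_Sb = h([0.3, 0.7])                          # entropy of the next pixel given a black pixel
H_Sw = h([0.01, 0.99])                        # entropy of the next pixel given a white pixel
H_markov = P_Sw * H_Sw + P_Sb * H_Sb

print(round(H_Sb, 3), round(H_Sw, 3))         # 0.881 0.081
print(round(H_markov, 3))                     # 0.107
print(round(h([P_Sb, P_Sw]), 3))              # 0.206: i.i.d. entropy of the same pixel statistics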

40 Markov Models in Text Compression
In written English, the probability of the next letter is heavily influenced by the previous ones, e.g., “u” after “q”.
Shannon's work: using a 2nd-order Markov model over 26 letters + space, he found H = 3.1 bits/letter; with a word-based model, H = 2.4 bits/letter; from human prediction based on 100 previous letters, he found the limits 0.6 ≤ H ≤ 1.3 bits/letter.
Longer context => better prediction.

41 Composite Source Model
In many applications, it is not easy to use a single model to describe the source. In such cases, we can define a composite source, which can be viewed as a combination or composition of several sources, with only one source being active at any given time. E.g.: an executable file contains several very different kinds of content. For very complicated sources (text, images, …), the solution is a composite model: different sources, each with its own model, are active in sequence.

42 Let Us Decorate This: Knowing something about the source itself can help us to ‘reduce’ the entropy. This is called entropy encoding. Note that we cannot actually reduce the entropy of the source as long as our coding is lossless; strictly speaking, we are only reducing our estimate of the entropy.

43 Examples: Modelling!

