Dr.-Ing. Khaled Shawky Hassan


Source Coding and Compression, Lecture 2 (Week 2). Dr.-Ing. Khaled Shawky Hassan, Room: C3-222, ext: 1204, Email: khaled.shawky@guc.edu.eg

Variable-Length Codes
Non-singular codes: A code is non-singular if each source symbol is mapped to a different non-empty bit string, i.e., the mapping from source symbols to bit strings is injective.
Uniquely decodable codes: A code is clearly not uniquely decodable if two symbols have the same codeword, i.e., if c(ai) = c(aj) for some i ≠ j, or if the concatenation of two codewords yields a third codeword.
Prefix codes: A code is a prefix code if no codeword is a prefix of any other codeword. This means that each symbol can be decoded instantaneously, as soon as its entire codeword has been received.

Source Coding Theorem
Code word and code length: Let C(sn) be the codeword corresponding to sn and let l(sn) denote the length of C(sn).
To proceed, let us focus on codes that can be decoded instantaneously, e.g., prefix codes.
Definition 1.3: A code is called a prefix code, or an instantaneous code, if no codeword is a prefix of any other codeword.
Example:
Non-prefix: {s1 = 0, s2 = 01, s3 = 011, s4 = 0111}
Prefix: {s1 = 0, s2 = 10, s3 = 110, s4 = 111}
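As a small illustration (not from the original slides), the prefix condition can be checked mechanically on the two example codes above; the helper name is_prefix_free is my own.

def is_prefix_free(codewords):
    # True if no codeword is a prefix of another codeword.
    for c1 in codewords:
        for c2 in codewords:
            if c1 != c2 and c2.startswith(c1):
                return False
    return True

print(is_prefix_free(["0", "01", "011", "0111"]))  # False: "0" is a prefix of every other codeword
print(is_prefix_free(["0", "10", "110", "111"]))   # True: this is the prefix (instantaneous) code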

Four different classes
                a1   a2   a3   a4
Singular:       0    0    0    0
Non-singular:   0    010  01   10
Decoding problem: 010 could mean a1a4, a2, or a3a1; hence the non-singular code is not uniquely decodable.
(Nested classes: all codes ⊃ non-singular codes.)

Four different classes
                     a1   a2   a3   a4
Singular:            0    0    0    0
Non-singular:        0    010  01   10
Uniquely decodable:  10   00   11   110
Decoding problem: 1100000000000000001… is uniquely decodable, but the first symbol (a3 or a4) cannot be decoded until the third '1' arrives.
(Nested classes: all codes ⊃ non-singular ⊃ uniquely decodable.)

Four different classes
                     a1   a2   a3   a4
Singular:            0    0    0    0
Non-singular:        0    010  01   10
Uniquely decodable:  10   00   11   110
Instantaneous:       0    10   110  111
(Nested classes: all codes ⊃ non-singular ⊃ uniquely decodable ⊃ instantaneous.)

Examples: code classes

Code Trees and Tree Codes
Consider, again, our favourite example code {a1, …, a4} = {0, 10, 110, 111}.
The codewords are the leaves of a code tree (a1, a2, a3, a4 sit at the leaves).
Tree codes are instantaneous: no codeword is a prefix of another!

Binary trees as prefix decoders
| Symbol | Code |
|--------|------|
| a      | 00   |
| b      | 01   |
| c      | 1    |
Decoding loop (walk the tree from the root, one bit per edge, and emit a symbol at each leaf):
repeat
    curr = root
    repeat
        if get_bit(input) = 1
            curr = curr.right
        else
            curr = curr.left
        endif
    until isleaf(curr)
    output curr.symbol
until eof(input)
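As a runnable counterpart to the pseudocode (an illustrative sketch, not the lecture's code), the following Python version represents the code tree with nested dicts instead of explicit left/right nodes; it assumes the bit string ends exactly on a codeword boundary.

def build_tree(code):
    # Inner nodes are dicts keyed by the bits '0'/'1'; leaves hold the decoded symbol.
    root = {}
    for symbol, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = symbol
    return root

def decode(bits, root):
    out, node = [], root
    for bit in bits:
        node = node[bit]            # follow one edge per input bit
        if not isinstance(node, dict):
            out.append(node)        # reached a leaf: emit its symbol ...
            node = root             # ... and restart at the root (curr = root)
    return "".join(out)

print(decode("0100101", build_tree({"a": "00", "b": "01", "c": "1"})))
# -> "bacb"  (the 3-symbol code of this slide)
print(decode("010111101100111001011110",
             build_tree({"a": "0", "b": "10", "c": "110", "d": "1110", "r": "1111"})))
# -> "abracadabra"  (the 5-symbol code used on the next slides)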

Binary trees as prefix decoders
| Symbol | Code |
|--------|------|
| a      | 0    |
| b      | 10   |
| c      | 110  |
| d      | 1110 |
| r      | 1111 |
abracadabra = 010111101100111001011110

Decoding, step by step (each '–' marks a consumed input bit):

abracadabra = 010111101100111001011110    Output = -----------
abracadabra = –10111101100111001011110    Output = a----------   (0 -> a)
abracadabra = –––111101100111001011110    Output = ab---------   (10 -> b)
abracadabra = –––––––01100111001011110    Output = abr--------   (1111 -> r)
abracadabra = ––––––––1100111001011110    Output = abra-------   (0 -> a)
… and so on, until the whole input is consumed: Output = abracadabra

Source Coding Theorem
Theorem 1.2 (Kraft-McMillan Inequality): Any prefix (prefix-free) code satisfies
K(C) = Σ_{s ∈ S} 2^(-l(s)) ≤ 1
Conversely, given a set of codeword lengths satisfying the above inequality, one can construct an instantaneous code with these word lengths! (Note the word "can": the inequality guarantees that such a code exists, not that every code with these lengths is instantaneous.)
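A quick numeric check of the inequality for the codes used in these slides (a small sketch; the helper name kraft_sum is mine):

def kraft_sum(lengths, D=2):
    # K(C) = sum over codewords of D**(-l)
    return sum(D ** (-l) for l in lengths)

print(kraft_sum([1, 2, 3, 3]))  # 0.5 + 0.25 + 0.125 + 0.125 = 1.0, the prefix code {0, 10, 110, 111}
print(kraft_sum([1, 2, 3, 4]))  # 0.9375 <= 1, so a prefix code with these lengths exists
print(kraft_sum([1, 1, 2]))     # 1.25 > 1, so no uniquely decodable code can have these lengths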

Source Coding Theorem
Theorem 1.3 (Shannon): For any prefix code, the average codeword length satisfies
l̄ = Σ_s p(s) l(s) ≥ H(S)
Can you say why?

Source Coding Theorem
Theorem 1.2 (Kraft-McMillan Inequality): Any prefix code satisfies K(C) = Σ_s 2^(-l(s)) ≤ 1. Do it yourself!!
Proof idea: show that [K(C)]^n does not blow up as n grows; the exponentiated sum runs over the total lengths of all sequences of n codewords, and it can be bounded by a quantity that grows only linearly in n. This is possible only if K(C) ≤ 1. Can you follow?
(Self study: K. Sayood, Chapter 2, page 32.)

Optimal Codes!!
Minimize the average codeword length l̄ = Σ_i p_i l_i [bits/codeword] under the constraint of Kraft's inequality Σ_i 2^(-l_i) ≤ 1.
Disregarding the integer constraints and differentiating (a Lagrange-multiplier argument), we get the optimal codeword lengths l_i = -log2(p_i). Do it yourself!
With these lengths, l̄ = -Σ_i p_i log2(p_i) = H(S): the entropy limit is reached!
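In case the missing differentiation step is wanted, here is the standard Lagrange-multiplier argument the slide alludes to (my reconstruction, written in LaTeX notation):

\text{Minimize } \bar{l} = \sum_i p_i l_i \ \text{ subject to } \ \sum_i 2^{-l_i} \le 1 .
J(l,\lambda) = \sum_i p_i l_i + \lambda\Bigl(\sum_i 2^{-l_i} - 1\Bigr), \qquad
\frac{\partial J}{\partial l_i} = p_i - \lambda \ln 2 \, 2^{-l_i} = 0
\;\Rightarrow\; 2^{-l_i} = \frac{p_i}{\lambda \ln 2}.
\text{Summing over } i \text{ (with } \sum_i p_i = 1 \text{ and the constraint met with equality) gives } \lambda \ln 2 = 1,
\text{ hence } 2^{-l_i} = p_i, \quad l_i = -\log_2 p_i, \quad \bar{l} = -\sum_i p_i \log_2 p_i = H(S).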

Optimal Codes!!
But what about the integer constraints? l_i = -log2(p_i) is not always an integer!
Choose l_i = ceil(-log2(p_i)) and assume a truncation factor s_i such that l_i = -log2(p_i) + s_i, with 0 ≤ s_i < 1.
After ceiling: H(S) ≤ l̄ < H(S) + 1.

Optimal Codes and Theorem 1
Assume that the source X is memory-free, and create the tree code for the extended source, i.e., for blocks of n symbols.
Instead of a single symbol we now code n of them, so n·H(X) ≤ l̄_n < n·H(X) + 1, or, per source symbol, H(X) ≤ l̄_n / n < H(X) + 1/n.
We can come arbitrarily close to the entropy for large n!

Shannon Coding:
Two practical problems need to be solved:
Bit assignment
The integer constraint (numbers of bits are integers!)
Theoretically: choose l_i = ceil(-log2(p_i)); then H(S) ≤ l̄ < H(S) + 1.
Rounding up is not always the best! (Shannon coding!) Or just select nice values :-)
Example (a bad one): binary source with p1 = 0.25, p2 = 0.75.
l1 = -log2(0.25) = log2(4) = 2
l2 = -log2(0.75) = log2(4/3) ≈ 0.415 ?! Then what is l2? (The ceiling gives l2 = 1.)

Shannon Coding:
Example: Find the binary codeword lengths for a source with probabilities p = {0.49, 0.26, 0.12, 0.04, 0.04, 0.03, 0.02}:
l1 = ceil(log2(1/0.49)) = ceil(1.0291) = 2
l2 = ceil(log2(1/0.26)) = ceil(1.9434) = 2
l3 = ceil(log2(1/0.12)) = ceil(3.0589) = 4
l4 = l5 = ceil(log2(1/0.04)) = ceil(4.6439) = 5
l6 = ceil(log2(1/0.03)) = ceil(5.0589) = 6
l7 = ceil(log2(1/0.02)) = ceil(5.6439) = 6
Find the values of s_i (the difference between the integer length and the self-information). (DO IT YOURSELF!)

Shannon Coding:
Example: The average codeword length, with l1 = 2, l2 = 2, l3 = 4, l4 = l5 = 5, l6 = 6, and l7 = 6:
l̄ = Σ_i p_i l_i = 2.68 bits/symbol
The entropy: H = -Σ_i p_i log2(p_i) = 2.0127 bits/symbol
Then: H ≤ l̄ < H + 1
Or, instead, use, e.g., the Huffman algorithm (developed by D. Huffman, 1952) to create an optimal tree code!
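These numbers can be verified with a few lines of Python (the probability vector is the one implied by the slide's logarithms; variable names are mine):

import math

p = [0.49, 0.26, 0.12, 0.04, 0.04, 0.03, 0.02]        # source distribution from the example
lengths = [math.ceil(-math.log2(pi)) for pi in p]      # Shannon code lengths ceil(-log2 p_i)
avg_len = sum(pi * li for pi, li in zip(p, lengths))   # average codeword length
H = -sum(pi * math.log2(pi) for pi in p)               # source entropy

print(lengths)                # [2, 2, 4, 5, 5, 6, 6]
print(round(avg_len, 2))      # 2.68
print(round(H, 3))            # about 2.013 (the slide quotes H = 2.0127)
print(H <= avg_len < H + 1)   # True: the Shannon-code bound H <= l_bar < H + 1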

Ex.: Kraft’s inequality and Optimal codes Example:

Modeling & Coding
Developing compression algorithms:
Phase I: Modeling. Develop the means to extract the redundancy in the data (redundancy → predictability).
Phase II: Coding. Binary representation of the modeled data; the representation depends on the "Modeling".

Modeling Example 1
Let us consider this arbitrary sequence: Sn = 9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21
Direct binary encoding requires 5 bits/sample. Why? (The largest value, 21, needs 5 bits.)
Now, let us consider the model Ŝn = n + 8, thus: Ŝn = 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
The residual is en = Sn - Ŝn: 0, 1, 0, -1, 1, -1, 0, 1, -1, -1, 1, 1
How many bits are required now? Coding: '00' <=> -1, '01' <=> 0, '10' <=> 1, i.e., 2 bits/sample.
Therefore, the model also needs to be designed and encoded in the algorithm.
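A minimal Python sketch of this example (the names s, s_hat, and residual are my own, assuming the model Ŝn = n + 8 from the slide):

s = [9, 11, 11, 11, 14, 13, 15, 17, 16, 17, 20, 21]    # original samples, 5 bits each (max 21 < 32)
s_hat = [n + 8 for n in range(1, len(s) + 1)]           # model prediction S_hat_n = n + 8
residual = [x - xh for x, xh in zip(s, s_hat)]          # e_n = S_n - S_hat_n

print(residual)   # [0, 1, 0, -1, 1, -1, 0, 1, -1, -1, 1, 1] -> only 3 values, so 2 bits/sample
# Decoder side: knowing the model, recover S_n = S_hat_n + e_n
print([xh + e for xh, e in zip(s_hat, residual)] == s)  # True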

Modeling: Yet another example
Let us consider this arbitrary sequence: Sn = 1 2 3 2 3 4 5 4 5 6 7 8 9 8 9 10
Assume it correctly describes the probabilities generated by the source; then
P(1) = P(6) = P(7) = P(10) = 1/16
P(2) = P(3) = P(4) = P(5) = P(8) = P(9) = 2/16
Assuming the sequence is independent and identically distributed (i.i.d.), then H = 3.25 bits/symbol.
However, the samples are clearly correlated, so instead of coding the samples we code only the differences, i.e.: 1 1 1 -1 1 1 1 -1 1 1 1 1 1 -1 1 1
Now, P(1) = 13/16 and P(-1) = 3/16
Then: H = 0.70 bits (per symbol)
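The two entropy figures can be reproduced with a short script (the helper entropy() is an assumed name, not from the lecture):

from collections import Counter
import math

def entropy(seq):
    # First-order (i.i.d.) entropy estimate of a sequence, in bits/symbol.
    counts = Counter(seq)
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

s = [1, 2, 3, 2, 3, 4, 5, 4, 5, 6, 7, 8, 9, 8, 9, 10]
diffs = [s[0]] + [b - a for a, b in zip(s, s[1:])]   # first sample, then successive differences

print(round(entropy(s), 2))      # 3.25 bits/symbol if the samples are treated as i.i.d.
print(diffs)                     # [1, 1, 1, -1, 1, 1, 1, -1, 1, 1, 1, 1, 1, -1, 1, 1]
print(round(entropy(diffs), 2))  # 0.70 bits/symbol for the difference sequence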

Markov Models
Assume that each output symbol depends on the previous k symbols. Formally:
Let {xn} be a sequence of observations.
We call {xn} a kth-order discrete Markov chain (DMC) if P(xn | xn-1, …, xn-k) = P(xn | xn-1, xn-2, …), i.e., the last k symbols carry all the relevant information about the past.
Usually we use a first-order DMC (knowledge of 1 symbol in the past is enough).
The last k symbols are called the state of the process.
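As an illustration only (not from the slides), first-order transition probabilities can be estimated from observed data along these lines (function and variable names are my own):

from collections import Counter, defaultdict

def transition_probs(seq):
    # Estimate P(next | current) for a first-order Markov chain from an observed sequence.
    pair_counts = Counter(zip(seq, seq[1:]))
    state_counts = Counter(seq[:-1])
    probs = defaultdict(dict)
    for (cur, nxt), c in pair_counts.items():
        probs[cur][nxt] = c / state_counts[cur]
    return dict(probs)

print(transition_probs("wwwwwbbw"))
# e.g. {'w': {'w': 0.8, 'b': 0.2}, 'b': {'b': 0.5, 'w': 0.5}}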

Non-Linear Markov Models
Consider a B&W image as a string of black and white pixels (e.g., row by row).
Define two states, Sb and Sw, for the current pixel.
Define the probabilities P(Sb) = probability of being in Sb and P(Sw) = probability of being in Sw.
Transition probabilities: P(b|b), P(w|b), P(b|w), P(w|w).

Markov Models Example
Assume P(Sw) = 30/31, P(Sb) = 1/31, P(w|w) = 0.99, P(b|w) = 0.01, P(b|b) = 0.7, P(w|b) = 0.3.
For the Markov model:
H(Sb) = -0.3 log2(0.3) - 0.7 log2(0.7) = 0.881
H(Sw) = -0.01 log2(0.01) - 0.99 log2(0.99) = 0.081
HMarkov = (30/31)·0.081 + (1/31)·0.881 ≈ 0.107
How different is this from the i.i.d. assumption? (Treating the pixels as i.i.d. gives H ≈ 0.206 bits/pixel, almost twice as much; see the sketch below.)
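Checking these numbers, including the i.i.d. comparison (a small sketch; the helper H() is my own naming):

import math

def H(probs):
    # Entropy in bits of a probability distribution given as a list.
    return -sum(p * math.log2(p) for p in probs if p > 0)

H_b = H([0.3, 0.7])       # entropy of the next pixel when the current pixel is black
H_w = H([0.01, 0.99])     # entropy of the next pixel when the current pixel is white
H_markov = (30/31) * H_w + (1/31) * H_b
H_iid = H([30/31, 1/31])  # entropy if the pixels were modelled as i.i.d.

print(round(H_b, 3), round(H_w, 3))  # 0.881 0.081
print(round(H_markov, 3))            # 0.107
print(round(H_iid, 3))               # 0.206, almost twice the Markov-model value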

Markov Models in Text Compression
In written English, the probability of the next letter is heavily influenced by the previous ones, e.g., "u" after "q".
Shannon's work: using a 2nd-order Markov model over 26 letters + space, he found H = 3.1 bits/letter.
With a word-based model, H = 2.4 bits/letter.
From human prediction based on 100 previous letters, he found the limits 0.6 ≤ H ≤ 1.3 bits/letter.
Longer context => better prediction.

Composite Source Model
In many applications it is not easy to describe the source with a single model.
In such cases we can define a composite source, which can be viewed as a combination or composition of several sources, with only one source being active at any given time.
E.g., an executable file contains several very different kinds of data (code, text, images, …).
For such complicated sources, the solution is a composite model: different sources, each with its own model, working in sequence.

Let Us Decorate This:
Knowing something about the source itself can help us to "reduce" the entropy.
This is called entropy encoding.
Note that we cannot actually reduce the entropy of the source, as long as our coding is lossless.
Strictly speaking, we are only reducing our estimate of the entropy.

Examples: Modelling!