
Lossless data compression Lecture 1

Data Compression Lossless data compression: Store/Transmit big files using few bytes so that the original files can be perfectly retrieved. Example: zip. Lossy data compression: Store/Transmit big files using few bytes so that the original files can be approximately retrieved. Example: mp3. Motivation: Save storage space and/or bandwidth.

Definition of Codec Let Σ be an alphabet and let S ⊆ Σ* be a set of possible messages. A lossless codec (c,d) consists of – a coder c : S → {0,1}* – a decoder d : {0,1}* → Σ* so that – ∀ x ∈ S: d(c(x)) = x
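
As a minimal sketch (the names CODEBOOK, encode and decode are illustrative, not from the lecture), the following Python fragment shows a coder/decoder pair with d(c(x)) = x for every message x:

```python
# A small prefix code over the alphabet {a, b, c}; prefix codes are defined
# on a later slide -- chosen here so that greedy decoding works.
CODEBOOK = {"a": "0", "b": "10", "c": "11"}
REVERSE = {w: s for s, w in CODEBOOK.items()}

def encode(message: str) -> str:
    """Coder c : S -> {0,1}*: concatenate the code word of each symbol."""
    return "".join(CODEBOOK[sym] for sym in message)

def decode(bits: str) -> str:
    """Decoder d : {0,1}* -> Sigma*: greedily match complete code words."""
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in REVERSE:
            out.append(REVERSE[buf])
            buf = ""
    return "".join(out)

assert decode(encode("abcab")) == "abcab"   # d(c(x)) = x
```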

Remarks It is necessary for c to be an injective map. If we do not worry about efficiency, we don’t have to specify d if we have specified c. Terminology: sometimes we just say “code” rather than “codec”. Terminology: the set c(S) is called the set of code words of the codec. In examples to follow, we often just state the set of code words.

Proposition Let S = {0,1}ⁿ. Then, for any codec (c,d) there is some x ∈ S so that |c(x)| ≥ n. (There are 2ⁿ messages but only 2ⁿ − 1 binary strings of length less than n, so an injective coder cannot give every message a shorter code word.) “Compression is impossible”

Proposition For any message x, there is a codec (c,d) so that |c(x)|=1. “The Encyclopedia Britannica can be compressed to 1 bit”.

Remarks We cannot compress all data. Thus, we must concentrate on compressing “relevant” data. It is trivial to compress data known in advance. We should concentrate on compressing data about which there is uncertainty. We will use probability theory as a tool to model uncertainty about relevant data.

Can random data be compressed? Suppose Σ = {0,1} and S = {0,1}². We know we cannot compress all data, but can we do well on the average? Let us assume the uniform distribution on S and look at the expected length of the code words.

Random data can be compressed well on the average! … There is something fishy going on.
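
The worked example behind this claim does not appear in the transcript; as a hedged guess at the kind of code meant, the sketch below evaluates an injective but not uniquely decodable assignment for S = {0,1}² that appears to beat 2 bits per message on average:

```python
# Hypothetical "too good to be true" code for S = {0,1}^2 (an assumed example,
# not necessarily the table from the lecture): injective on single messages,
# but concatenations of code words become ambiguous.
code = {"00": "0", "01": "1", "10": "00", "11": "01"}

# Expected code word length under the uniform distribution: 1.5 bits < 2 bits.
print(sum(len(w) for w in code.values()) / len(code))   # 1.5

# The catch: the bit string "00" is both c("00")c("00") and c("10"),
# so the receiver cannot split a stream back into messages.
print(code["00"] + code["00"] == code["10"])             # True
```

The apparent saving vanishes once we insist on codes whose concatenations can be decoded unambiguously, which is what the prefix codes on the next slide provide.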

Definition of prefix codes A prefix code c is a code with the property that for all different messages x and y, c(x) is not a prefix of c(y). Example: fixed-length codes (such as ASCII). Example: {0,11,10}. All codes in this course will be prefix codes.
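
A quick check of the defining property, as a sketch (is_prefix_code is an illustrative name, not from the lecture):

```python
from itertools import combinations

def is_prefix_code(codewords) -> bool:
    """True if no code word is a prefix of another code word."""
    return not any(a.startswith(b) or b.startswith(a)
                   for a, b in combinations(codewords, 2))

print(is_prefix_code({"0", "11", "10"}))   # True
print(is_prefix_code({"0", "01"}))         # False: "0" is a prefix of "01"
```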

Proposition If c is a prefix code for S = Σ¹, then cⁿ is a prefix code for S = Σⁿ, where cⁿ(x₁x₂…xₙ) = c(x₁)·c(x₂)·…·c(xₙ).

Prefix codes and trees Set of code words of a prefix code: {0,11,10}

Alternative view of prefix codes A prefix code is an assignment of the messages of S to the leaves of a rooted binary tree. The codeword of a message x is found by reading the labels on the edges on the path from the root of the tree to the leaf corresponding to x.

Binary trees and the interval [0,1): the leaves of the code tree for {0, 10, 11} correspond to the subintervals [0, 1/2), [1/2, 3/4), and [3/4, 1).

Alternative view of prefix codes A prefix code is an assignment of the messages of S to disjoint dyadic intervals. A dyadic interval is a real interval of the form [k·2⁻ᵐ, (k+1)·2⁻ᵐ) with k+1 ≤ 2ᵐ. The corresponding code word is the m-bit binary representation of k.
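
A small sketch of this correspondence (dyadic_interval is an illustrative helper, not from the lecture), applied to the code words of {0, 10, 11}:

```python
from fractions import Fraction

def dyadic_interval(codeword: str):
    """Map an m-bit code word (the binary digits of k) to [k/2^m, (k+1)/2^m)."""
    m, k = len(codeword), int(codeword, 2)
    return Fraction(k, 2**m), Fraction(k + 1, 2**m)

# The prefix code {0, 10, 11} occupies three disjoint dyadic intervals:
# "0" -> [0, 1/2), "10" -> [1/2, 3/4), "11" -> [3/4, 1).
for w in ("0", "10", "11"):
    print(w, dyadic_interval(w))
```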

Kraft-McMillan Inequality Let m₁, m₂, … be the lengths of the code words of a prefix code. Then ∑ᵢ 2^(−mᵢ) ≤ 1. Conversely, let m₁, m₂, … be integers with ∑ᵢ 2^(−mᵢ) ≤ 1. Then there is a prefix code c so that the mᵢ are the lengths of its code words.
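
Both directions can be sketched in a few lines (kraft_sum and code_from_lengths are illustrative names; the converse direction below assigns dyadic intervals greedily, shortest lengths first):

```python
def kraft_sum(lengths):
    """Sum of 2^(-m_i); a prefix code with these lengths exists iff this is <= 1."""
    return sum(2.0 ** -m for m in lengths)

def code_from_lengths(lengths):
    """Build a prefix code with the given code word lengths (a sketch)."""
    assert kraft_sum(lengths) <= 1.0
    code, k, prev_m = [], 0, 0    # k / 2^m is the left endpoint of the next interval
    for m in sorted(lengths):
        k <<= (m - prev_m)        # rescale the endpoint to denominator 2^m
        code.append(format(k, f"0{m}b"))
        k, prev_m = k + 1, m
    return code

print(kraft_sum([1, 2, 2]))           # 1.0 <= 1
print(code_from_lengths([1, 2, 2]))   # ['0', '10', '11']
```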

Probability A probability distribution p on S is a map p : S → [0,1] so that ∑_{x ∈ S} p(x) = 1. A U-valued stochastic variable is a map Y : S → U. If Y : S → ℝ is a stochastic variable, its expected value E[Y] is ∑_{x ∈ S} p(x) Y(x).

Self-entropy Given a probability distribution p on S, the self-entropy of x ∈ S is defined as H(x) = −log₂ p(x). The self-entropy of a message with probability 1 is 0 bits, the self-entropy of a message with probability 0 is +∞, and the self-entropy of a message with probability ½ is 1 bit. We measure entropy in the unit “bits”.

Entropy Given a probability distribution p on S, its entropy H[p] is defined as E[H], i.e. H[p] = −∑_{x ∈ S} p(x) log₂ p(x). For a stochastic variable X, its entropy H[X] is the entropy of its underlying distribution: H[X] = −∑ᵢ Pr[X=i] log₂ Pr[X=i].
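
A direct computation of H[p], as a small sketch (entropy is an illustrative name, not from the lecture):

```python
from math import log2

def entropy(p) -> float:
    """Entropy in bits of a distribution given as a dict of probabilities.
    Terms with p(x) = 0 contribute 0, by the convention 0 * log 0 = 0."""
    return -sum(px * log2(px) for px in p.values() if px > 0)

print(entropy({"a": 0.5, "b": 0.25, "c": 0.25}))   # 1.5 bits
print(entropy({0: 0.5, 1: 0.5}))                   # 1.0 bit (uniform on one bit)
```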

Facts The entropy of the uniform distribution on {0,1}ⁿ is n bits. Any other distribution on {0,1}ⁿ has strictly smaller entropy. If X₁ and X₂ are independent stochastic variables, then H(X₁, X₂) = H(X₁) + H(X₂). For any function f, H(f(X)) ≤ H(X).
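
The additivity fact can be checked numerically with the entropy sketch above (p1, p2 and joint are made-up example distributions):

```python
from itertools import product

p1 = {"a": 0.5, "b": 0.5}
p2 = {0: 0.25, 1: 0.75}

# The joint distribution of two independent variables is the product distribution.
joint = {(x, y): p1[x] * p2[y] for x, y in product(p1, p2)}

# H(X1, X2) equals H(X1) + H(X2), up to floating-point rounding.
print(entropy(joint), entropy(p1) + entropy(p2))
```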

Shannon’s theorem Let S be a set of messages and let X be an S-valued stochastic variable. For all prefix codes c on S, E[|c(X)|] ≥ H[X]. There is a prefix code c on S so that E[|c(X)|] < H[X] + 1. In fact, for all x in S, |c(x)| < H(x) + 1.
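
One way to meet the upper bound, sketched under the assumption that every message x receives a code word of length ⌈−log₂ p(x)⌉ (these lengths satisfy the Kraft-McMillan inequality, so the code_from_lengths helper sketched above applies; shannon_code is an illustrative name, not necessarily the construction intended in the lecture):

```python
from math import ceil, log2

def shannon_code(p):
    """Assign each message a code word of length ceil(-log2 p(x)) -- a sketch."""
    messages = sorted(p, key=p.get, reverse=True)    # most likely messages first,
    lengths = [ceil(-log2(p[x])) for x in messages]  # so the lengths come out ascending
    return dict(zip(messages, code_from_lengths(lengths)))

p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
c = shannon_code(p)
print(c)                                  # e.g. {'a': '0', 'b': '10', 'c': '110', 'd': '111'}
print(sum(p[x] * len(c[x]) for x in p))   # 1.75, which equals H[p] for this p
```

For a dyadic distribution like this one the expected length meets the entropy exactly; in general it lies in the interval [H[X], H[X] + 1).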