Huffman Coding

Main properties:
– Uses a variable-length code for encoding each source symbol.
– Shorter codes are assigned to the most frequently used symbols, and longer codes to the symbols which appear less frequently.
– The code is uniquely decodable and instantaneous (prefix-free).
– It has been shown that Huffman coding cannot be improved upon by any other coding scheme that assigns an integral number of bits to each symbol.
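To illustrate the "uniquely decodable and instantaneous" property, here is a minimal sketch with a hypothetical prefix-free codebook (not one of the codes derived in these slides): because no codeword is a prefix of another, a symbol can be emitted the moment its last bit arrives, with no look-ahead.

```python
code = {"a": "0", "b": "10", "c": "110", "d": "111"}   # hypothetical codebook
decode_table = {bits: sym for sym, bits in code.items()}

def decode(bitstring: str) -> str:
    out, buf = [], ""
    for bit in bitstring:
        buf += bit
        if buf in decode_table:        # codeword complete: emit the symbol immediately
            out.append(decode_table[buf])
            buf = ""
    return "".join(out)

encoded = "".join(code[s] for s in "abacad")
print(encoded)          # 01001100111
print(decode(encoded))  # abacad
```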

Example
a. Step 1: Histogram – count the number of pixels at each gray level. [The slide shows the gray-level vs. number-of-pixels table; its numeric values are not reproduced in this transcript.]

b. Step 2: Order – sort the gray-level probabilities from largest to smallest.
c. Step 3: Add – combine the two smallest probabilities into one.
d. Step 4: Reorder and add until only two values remain.

a. Assign 0 and 1 to the two remaining (rightmost) probabilities.
b. Bring the 0 and 1 back along the tree.
c. Append 0 and 1 to the branches that were combined in the previous step.
d. Repeat the process until every original branch is labeled.
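The whole procedure (order, add, reorder, then assign bits) can be realized compactly with a heap, as in the sketch below. This is a standard construction shown for illustration, not the slide's worked example; the histogram values are placeholders.

```python
import heapq

def huffman_codes(freqs):
    """Repeatedly merge the two least-frequent subtrees (steps 2-4 above) and
    prepend '0'/'1' to the codewords on each side (the bit-assignment steps)."""
    # Heap entries: (frequency, tie-breaker, {symbol: code_so_far})
    heap = [(f, i, {sym: ""}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)          # two smallest probabilities
        f2, i, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, i, merged))
    return heap[0][2]

# Placeholder histogram (assumed values, not necessarily the slide's data):
hist = {"g0": 40, "g1": 30, "g2": 20, "g3": 10}
print(huffman_codes(hist))
# {'g0': '0', 'g1': '10', 'g3': '110', 'g2': '111'} -- shorter codes for frequent levels
```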

Entropy = 1.846 bits/pixel; average Huffman code length L_avg = Σ l_i p_i = 1.9 bits/pixel.
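These two figures are consistent with a four-level histogram having probabilities 0.4, 0.3, 0.2 and 0.1 (my assumption; the slide's actual table is not reproduced above). A quick check of both quantities:

```python
import math

# Assumed gray-level probabilities; they reproduce the 1.846 / 1.9 figures,
# but the original assignment of counts to gray levels is not shown above.
p = [0.4, 0.3, 0.2, 0.1]
code_lengths = [1, 2, 3, 3]              # Huffman codeword lengths for these probabilities

entropy = -sum(pi * math.log2(pi) for pi in p)
avg_len = sum(pi * li for pi, li in zip(p, code_lengths))
print(f"entropy         = {entropy:.3f} bits/pixel")   # 1.846
print(f"avg code length = {avg_len:.3f} bits/pixel")   # 1.900
```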

Arithmetic Coding

Huffman coding has been proven optimal among methods that assign each symbol its own whole-bit codeword. Yet, since Huffman codewords must be an integral number of bits long, while the entropy (information content) of a symbol is almost always a fractional number of bits, the theoretically possible compression cannot in general be achieved.

Arithmetic Coding

For example, if a statistical model assigns a 90% probability to a given character, the optimal code size would be about 0.15 bits. A Huffman coder would still assign a 1-bit code to that symbol, more than six times longer than necessary.
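The 0.15-bit figure is just the information content of the symbol, -log2(p). A one-line check (my own, using the standard library):

```python
import math

p = 0.9
print(f"{-math.log2(p):.3f} bits")      # 0.152 bits: optimal code size for p = 0.9
print(f"{1 / -math.log2(p):.1f}x")      # 6.6x: a 1-bit Huffman code is over six times longer
```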

Arithmetic Coding

Arithmetic coding bypasses the idea of replacing each input symbol with a specific code. Instead, it replaces a whole stream of input symbols with a single floating-point output number.

Suppose that we want to encode the message "BILL GATES".

Character   Probability   Range
^(space)    1/10          0.00 – 0.10
A           1/10          0.10 – 0.20
B           1/10          0.20 – 0.30
E           1/10          0.30 – 0.40
G           1/10          0.40 – 0.50
I           1/10          0.50 – 0.60
L           2/10          0.60 – 0.80
S           1/10          0.80 – 0.90
T           1/10          0.90 – 1.00
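For the sketches that follow, this table can be written as a dictionary mapping each character to its (low, high) interval. The name RANGES is mine, not from the slides; " " stands for the space character shown as ^(space) above.

```python
# Cumulative probability intervals for the "BILL GATES" example:
# each character owns a slice of [0, 1) proportional to its probability.
RANGES = {
    " ": (0.00, 0.10),   # ^(space)
    "A": (0.10, 0.20),
    "B": (0.20, 0.30),
    "E": (0.30, 0.40),
    "G": (0.40, 0.50),
    "I": (0.50, 0.60),
    "L": (0.60, 0.80),   # probability 2/10
    "S": (0.80, 0.90),
    "T": (0.90, 1.00),
}
```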

Arithmetic Coding

Encoding algorithm for arithmetic coding:

low = 0.0;
high = 1.0;
while not EOF do
    range = high - low;
    read(c);
    high = low + range × high_range(c);
    low  = low + range × low_range(c);
end do
output(low);
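A direct, runnable transcription of this pseudocode (my own sketch; RANGES repeats the interval table above so the snippet is self-contained):

```python
RANGES = {" ": (0.00, 0.10), "A": (0.10, 0.20), "B": (0.20, 0.30),
          "E": (0.30, 0.40), "G": (0.40, 0.50), "I": (0.50, 0.60),
          "L": (0.60, 0.80), "S": (0.80, 0.90), "T": (0.90, 1.00)}

def arithmetic_encode(message: str) -> float:
    low, high = 0.0, 1.0
    for c in message:                    # "while not EOF do ... read(c)"
        rng = high - low
        lo_c, hi_c = RANGES[c]
        high = low + rng * hi_c          # narrow the interval to the slice
        low  = low + rng * lo_c          # owned by character c
    return low                           # "output(low)"

print(f"{arithmetic_encode('BILL GATES'):.10f}")   # 0.2572167752
```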

Arithmetic Coding

To encode the first character, B, properly, the final coded message has to be a number greater than or equal to 0.20 and less than 0.30.
– range = 1.0 - 0.0 = 1.0
– high = 0.0 + 1.0 × 0.3 = 0.3
– low = 0.0 + 1.0 × 0.2 = 0.2
After the first character is encoded, the low end of the range changes from 0.00 to 0.20 and the high end changes from 1.00 to 0.30.

Arithmetic Coding

The next character to be encoded, the letter I, owns the range 0.50 to 0.60, which is now applied within the new subrange of 0.20 to 0.30. So the encoded number will fall somewhere in the 50th to 60th percentile of the currently established range. Thus, the number is further restricted to between 0.25 and 0.26.
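The same two update formulas give exactly these bounds (a quick check, my own):

```python
low, high = 0.2, 0.3                     # interval after encoding 'B'
rng = high - low                         # 0.1
lo_I, hi_I = 0.50, 0.60                  # range owned by 'I'
print(f"{low + rng * lo_I:.2f}", f"{low + rng * hi_I:.2f}")   # 0.25 0.26
```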

Arithmetic Coding

Note that any number between 0.25 and 0.26 is a legal encoding of 'BI'. Thus, the number best suited for a compact binary representation can be selected. (Condition: the length of the encoded message is known at the decoder, or an EOF symbol is used.)

[Figure: the interval (0, 1) is repeatedly subdivided over the alphabet ^(space), A, B, E, G, I, L, S, T; each character of "BILL GATES" narrows the working interval to the slice owned by that character.]

Arithmetic Coding

Character   Low            High
B           0.2            0.3
I           0.25           0.26
L           0.256          0.258
L           0.2572         0.2576
^(space)    0.25720        0.25724
G           0.257216       0.257220
A           0.2572164      0.2572168
T           0.25721676     0.2572168
E           0.257216772    0.257216776
S           0.2572167752   0.2572167756
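These intermediate bounds can be reproduced by printing low and high inside the encoding loop (my own sketch; each output line corresponds to a row of the table above):

```python
RANGES = {" ": (0.00, 0.10), "A": (0.10, 0.20), "B": (0.20, 0.30),
          "E": (0.30, 0.40), "G": (0.40, 0.50), "I": (0.50, 0.60),
          "L": (0.60, 0.80), "S": (0.80, 0.90), "T": (0.90, 1.00)}

low, high = 0.0, 1.0
for c in "BILL GATES":
    rng = high - low
    lo_c, hi_c = RANGES[c]
    high = low + rng * hi_c
    low  = low + rng * lo_c
    print(f"{c!r}  low={low:.10f}  high={high:.10f}")   # one line per table row
```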

Arithmetic Coding

So, the final value 0.2572167752 (or any value between 0.2572167752 and 0.2572167756, if the length of the encoded message is known at the decoding end) uniquely encodes the message 'BILL GATES'.

Arithmetic Coding

Decoding is the inverse process. Since 0.2572167752 falls between 0.2 and 0.3, the first character must be 'B'. Remove the effect of 'B' from 0.2572167752 by first subtracting the low value of B, 0.2, giving 0.0572167752. Then divide by the width of B's range, 0.1. This gives a value of 0.572167752.

Arithmetic Coding

Then calculate where 0.572167752 lands, which is in the range 0.50 to 0.60 of the next letter, 'I'. The process repeats until the remaining value reaches 0 or the known length of the message is exhausted.

Arithmetic Coding

Decoding algorithm:

r = input_number;
repeat
    search c such that r falls in its range;
    output(c);
    r = r - low_range(c);
    r = r ÷ (high_range(c) - low_range(c));
until EOF or the length of the message is reached
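A runnable transcription of this loop (my own sketch). Exact fractions are used instead of floats so that the repeated divisions do not accumulate rounding error in this toy example; the per-step values correspond to the table on the next slide.

```python
from fractions import Fraction as F

# Same intervals as before, but as exact fractions.
RANGES = {" ": (F(0), F(1, 10)), "A": (F(1, 10), F(2, 10)), "B": (F(2, 10), F(3, 10)),
          "E": (F(3, 10), F(4, 10)), "G": (F(4, 10), F(5, 10)), "I": (F(5, 10), F(6, 10)),
          "L": (F(6, 10), F(8, 10)), "S": (F(8, 10), F(9, 10)), "T": (F(9, 10), F(1))}

def arithmetic_decode(r, length):
    out = []
    for _ in range(length):              # the message length is assumed known
        # "search c such that r falls in its range"
        c = next(ch for ch, (lo, hi) in RANGES.items() if lo <= r < hi)
        out.append(c)
        lo, hi = RANGES[c]
        r = (r - lo) / (hi - lo)         # remove the effect of c, then rescale
    return "".join(out)

print(arithmetic_decode(F("0.2572167752"), 10))   # BILL GATES
```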

r              c          Low   High   Range
0.2572167752   B          0.2   0.3    0.1
0.572167752    I          0.5   0.6    0.1
0.72167752     L          0.6   0.8    0.2
0.6083876      L          0.6   0.8    0.2
0.041938       ^(space)   0.0   0.1    0.1
0.41938        G          0.4   0.5    0.1
0.1938         A          0.1   0.2    0.1
0.938          T          0.9   1.0    0.1
0.38           E          0.3   0.4    0.1
0.8            S          0.8   0.9    0.1
0.0

Arithmetic Coding

In summary, the encoding process is simply one of narrowing the range of possible numbers with every new symbol. The new range is proportional to the predefined probability attached to that symbol. Decoding is the inverse procedure, in which the range is expanded in proportion to the probability of each symbol as it is extracted.

Arithmetic Coding

In theory, the coding rate approaches the (high-order) entropy of the source. It is not as popular as Huffman coding because multiplications (×) and divisions (÷) are needed.