CSE 326 Huffman coding Richard Anderson. Coding theory Conversion, Encryption, Compression Binary coding Variable length coding A B C D E F.

Presentation transcript:

CSE 326 Huffman coding Richard Anderson

Coding theory
Conversion, encryption, compression
Binary coding
Variable length coding
(Table of codewords for A, B, C, D, E, F; the codewords are not preserved in this transcript.)

Decode the following
Code 1: E = 0, T = 11, N = 100, I = 1010, S (codeword not preserved)
Code 2: E = 0, T = 10, N = 100, I = 0111, S (codeword not preserved)

Prefix code
No codeword is a prefix of another codeword.
A prefix code is uniquely decodable.
A001 B C D E F
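The prefix property can be checked mechanically. The following is a minimal Python sketch, assuming a code is given as a dictionary from symbol to bit string; the helper name `is_prefix_code` is illustrative and does not come from the slides. It is applied to the two codes from the decoding exercise, leaving out S because its codeword is not given in this transcript.

```python
def is_prefix_code(code):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(code.values())
    # After lexicographic sorting, a codeword that is a prefix of some other
    # codeword is also a prefix of its immediate successor, so it is enough
    # to compare neighbours.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


# Codes from the decoding exercise (S omitted: its codeword is missing).
code1 = {"E": "0", "T": "11", "N": "100", "I": "1010"}
code2 = {"E": "0", "T": "10", "N": "100", "I": "0111"}

print(is_prefix_code(code1))  # True
print(is_prefix_code(code2))  # False: "0" is a prefix of "0111"
```

In the second code, T = 10 is also a prefix of N = 100, which is why a bit string encoded with it can be ambiguous.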

Prefix codes and binary trees
Tree representation of prefix codes
A = 00, B = 010, C = 0110, D = 0111, E = 10, F = 11
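To illustrate the tree representation, here is a small Python sketch that builds the tree for the code on this slide and uses it to decode a bit string. The nested-dictionary layout and the function names `build_tree`, `encode`, and `decode` are assumptions made for this example, not part of the lecture.

```python
CODE = {"A": "00", "B": "010", "C": "0110", "D": "0111", "E": "10", "F": "11"}


def build_tree(code):
    """Build the code tree as nested dicts.

    Internal nodes map '0'/'1' to children; leaves are the symbols.
    """
    root = {}
    for symbol, bits in code.items():
        node = root
        for bit in bits[:-1]:
            node = node.setdefault(bit, {})
        node[bits[-1]] = symbol
    return root


def encode(text, code):
    """Concatenate the codewords of the characters in text."""
    return "".join(code[ch] for ch in text)


def decode(bits, root):
    """Walk the tree bit by bit; emit a symbol at each leaf and restart."""
    out = []
    node = root
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):  # reached a leaf
            out.append(node)
            node = root
    if node is not root:
        raise ValueError("input ends in the middle of a codeword")
    return "".join(out)


tree = build_tree(CODE)
bits = encode("FACE", CODE)
print(bits)                # 1100011010
print(decode(bits, tree))  # FACE
```

Because every symbol sits at a leaf, the decoder never needs to look ahead: reaching a leaf always ends the current codeword.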

Construct the tree for the following code
E = 0, T = 11, N = 100, I = 1010, S = 1011

Minimum length code
Average cost
Average leaf depth
Huffman tree: tree with minimum weighted path length
C(T): weighted path length
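Written out, and assuming notation the slide does not spell out (leaf i has weight w_i and depth d_i in T, with n leaves in total), the weighted path length is:

```latex
C(T) = \sum_{i=1}^{n} w_i \, d_i
```

When the weights are probabilities that sum to 1, C(T) is the average leaf depth, which is the average number of bits per encoded symbol.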

Compute average leaf depth
A = 00 (1/4), B = 010 (1/8), C = 0110 (1/16), D = 0111 (1/16), E = 1 (1/2)
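The computation can be sketched in a few lines of Python. This reads the slide's table as A = 00 with probability 1/4, B = 010 with 1/8, C = 0110 with 1/16, D = 0111 with 1/16, and E = 1 with 1/2; that reading is an assumption about the flattened table, and the function name `average_leaf_depth` is illustrative.

```python
from fractions import Fraction

# Codewords and probabilities as read from the slide (assumed reading).
code = {"A": "00", "B": "010", "C": "0110", "D": "0111", "E": "1"}
prob = {
    "A": Fraction(1, 4),
    "B": Fraction(1, 8),
    "C": Fraction(1, 16),
    "D": Fraction(1, 16),
    "E": Fraction(1, 2),
}


def average_leaf_depth(code, prob):
    """Sum of probability times codeword length (the leaf depth)."""
    return sum(prob[s] * len(bits) for s, bits in code.items())


avg = average_leaf_depth(code, prob)
print(avg)  # 15/8
```

Each leaf's depth equals the length of its codeword, so the result, 15/8 = 1.875, is the expected number of bits per symbol.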

Huffman code algorithm: derivation
The two rarest items will have the longest codewords.
The codewords for the two rarest items differ only in the last bit.
Idea: suppose the weights are w_1, w_2, ..., w_n, with w_1 and w_2 the smallest weights.
Start with an optimal code for w_3, ..., w_n and w_1 + w_2.
Extend the codeword for w_1 + w_2 to get codewords for w_1 and w_2.

Huffman code
H = new Heap()
for each w_i
    T = new Tree(w_i)
    H.Insert(T)
while H.Size() > 1
    T_1 = H.DeleteMin()
    T_2 = H.DeleteMin()
    T_3 = Merge(T_1, T_2)
    H.Insert(T_3)
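The pseudocode translates fairly directly into Python using the standard `heapq` module. The sketch below is one possible rendering, not the lecture's own implementation: the tuple-based tree layout, the tie-breaking counter, and the function name `huffman` are assumptions, and the input is assumed to be a non-empty list of weights.

```python
import heapq
from itertools import count


def huffman(weights):
    """Build a Huffman tree for a non-empty list of weights.

    Returns (tree, internal_weights). A leaf is ("leaf", w) and an
    internal node is ("node", left, right).
    """
    tiebreak = count()  # avoids comparing trees when weights are equal
    heap = [(w, next(tiebreak), ("leaf", w)) for w in weights]
    heapq.heapify(heap)
    internal = []
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # DeleteMin
        w2, _, t2 = heapq.heappop(heap)  # DeleteMin
        merged = w1 + w2                 # weight of the merged tree
        internal.append(merged)
        heapq.heappush(heap, (merged, next(tiebreak), ("node", t1, t2)))
    return heap[0][2], internal


# Weights used in the example slide.
tree, internal = huffman([4, 5, 6, 7, 11, 14, 21])
print(internal)       # [9, 13, 20, 27, 41, 68]
print(sum(internal))  # 178
```

The sum of the internal node weights equals the weighted path length C(T), because each leaf weight is counted once for every internal node above it, which is once per level of depth.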

Example: Weights 4, 5, 6, 7, 11, 14, 21

Draw a Huffman tree for the following data values and show internal weights: 3, 5, 9, 14, 16, 35

Correctness proof
The most amazing induction proof
Induction on the number of code words
The Huffman algorithm finds an optimal code for n = 1.
Suppose that the Huffman algorithm finds an optimal code for codes of size n; now consider a code of size n + 1.

Key lemma
Given a tree T, we can find a tree T’ with the two minimum cost leaves as siblings and C(T’) <= C(T).
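The slide does not spell out the proof of the lemma. One standard route is an exchange argument, sketched here under assumed notation: x is a minimum cost leaf at depth d_x with weight w_x, and a is a leaf at maximum depth d_a >= d_x with weight w_a >= w_x. Swapping x and a changes the cost by:

```latex
\begin{aligned}
C(T') - C(T) &= (w_x d_a + w_a d_x) - (w_x d_x + w_a d_a) \\
             &= (w_x - w_a)(d_a - d_x) \le 0
\end{aligned}
```

Applying the same swap to the second minimum cost leaf and the sibling of a places the two minimum cost leaves as siblings at maximum depth without increasing the cost.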

Modify the following tree to reduce the WPL

Finish the induction proof
T: tree constructed by Huffman
X: any code tree
Show C(T) <= C(X)
T’ and X’: trees from the lemma
C(T’) = C(T)
C(X’) <= C(X)
T’’ and X’’: trees with minimum cost leaves x and y removed

X: any tree, X’: modified, X’’: two smallest leaves removed
C(X’’) = C(X’) - x - y
C(T’’) = C(T’) - x - y
C(T’’) <= C(X’’) (by the induction hypothesis)
C(T) = C(T’) = C(T’’) + x + y <= C(X’’) + x + y = C(X’) <= C(X)