1 Huffman Codes
Message consisting of five characters: a, b, c, d, e
Probabilities: .12, .40, .15, .08, .25
Encode each character into a sequence of 0's and 1's so that no code for a character is the prefix of the code for any other character (the prefix property).
A string of 0's and 1's can then be decoded by repeatedly deleting prefixes of the string that are codes for characters.
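Not part of the slides: a minimal Python sketch of the prefix-property check just described, applied to the two codes tabulated on the next slide. The helper name has_prefix_property is my own.

    def has_prefix_property(code):
        # True if no codeword is a prefix of another symbol's codeword.
        words = list(code.values())
        return not any(i != j and w2.startswith(w1)
                       for i, w1 in enumerate(words)
                       for j, w2 in enumerate(words))

    code1 = {'a': '000', 'b': '001', 'c': '010', 'd': '011', 'e': '100'}
    code2 = {'a': '000', 'b': '11', 'c': '01', 'd': '001', 'e': '10'}
    print(has_prefix_property(code1), has_prefix_property(code2))  # True True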

2 Example
Symbol  Probability  Code 1  Code 2
a       .12          000     000
b       .40          001     11
c       .15          010     01
d       .08          011     001
e       .25          100     10
Both codes have the prefix property.
Decode Code 1: "grab" 3 bits at a time and translate each group into a character.
Ex.: 001 010 011 → bcd
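A small Python sketch (not from the slides) of this fixed-width decoding; decode_fixed is a hypothetical helper name.

    code1 = {'a': '000', 'b': '001', 'c': '010', 'd': '011', 'e': '100'}
    to_char = {bits: ch for ch, bits in code1.items()}

    def decode_fixed(bits, width=3):
        # "Grab" 3 bits at a time and translate each group into a character.
        return ''.join(to_char[bits[i:i + width]]
                       for i in range(0, len(bits), width))

    print(decode_fixed('001010011'))  # bcd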

3 Example Cont'd
Symbol  Probability  Code 1  Code 2
a       .12          000     000
b       .40          001     11
c       .15          010     01
d       .08          011     001
e       .25          100     10
Decode Code 2: repeatedly "grab" prefixes that are codes for characters and remove them from the input.
The only difference is that the input cannot be "sliced" up all at once: how many bits to grab depends on the encoded character.
Ex.: 11 01 001 → bcd
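The corresponding sketch for Code 2 (again Python, helper names my own): repeatedly grab the prefix that is a codeword and strip it from the input.

    code2 = {'a': '000', 'b': '11', 'c': '01', 'd': '001', 'e': '10'}
    to_char = {bits: ch for ch, bits in code2.items()}

    def decode_prefix(bits):
        out, buf = [], ''
        for bit in bits:
            buf += bit
            if buf in to_char:      # prefix property: this match is unambiguous
                out.append(to_char[buf])
                buf = ''
        return ''.join(out)

    print(decode_prefix('1101001'))  # bcd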

4 Big Deal?
Huffman coding results in a shorter average length of the compressed (encoded) message.
Average length: multiply the length of the code for each symbol by the probability of occurrence of that symbol, then sum.
Code 1 has an average length of 3.
Code 2 has an average length of 2.2: (3*.12) + (2*.40) + (2*.15) + (3*.08) + (2*.25)
Can we do better?
Problem: Given a set of characters and their probabilities, find a code with the prefix property such that the average length of a code for a character is minimum.
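A quick check of the two averages in Python (not part of the slides):

    prob  = {'a': .12, 'b': .40, 'c': .15, 'd': .08, 'e': .25}
    code1 = {'a': '000', 'b': '001', 'c': '010', 'd': '011', 'e': '100'}
    code2 = {'a': '000', 'b': '11',  'c': '01',  'd': '001', 'e': '10'}

    def avg_length(code):
        # Sum of (codeword length * probability of that symbol).
        return sum(len(code[ch]) * p for ch, p in prob.items())

    print(round(avg_length(code1), 2), round(avg_length(code2), 2))  # 3.0 2.2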

5 Representation
Think of prefix codes as paths in binary trees: following a path from a node to its left child corresponds to appending a 0 to the code, and proceeding from a node to its right child corresponds to appending a 1.
Label the leaves in the tree by the characters represented.
Any prefix code can be represented as a binary tree.
The prefix property guarantees that no character's code can correspond to an interior node.
Conversely, labeling the leaves of a binary tree with characters gives us a code with the prefix property.

6 Sample Binary Trees
[Tree diagrams for Code 1 and Code 2: edges labeled 0 and 1, leaves labeled a, b, c, d, e]
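Since the tree drawings do not survive in this transcript, here is a Python sketch (my own representation, not from the slides) of the Code 2 tree as nested pairs, with codewords read off by the left-is-0, right-is-1 rule from the previous slide.

    # Leaves are characters; interior nodes are (left, right) pairs.
    tree_code2 = ((('a', 'd'), 'c'), ('e', 'b'))

    def codes_from_tree(node, path=''):
        if isinstance(node, str):                         # leaf: emit path as its code
            return {node: path}
        left, right = node
        codes = codes_from_tree(left, path + '0')         # left child appends 0
        codes.update(codes_from_tree(right, path + '1'))  # right child appends 1
        return codes

    print(codes_from_tree(tree_code2))
    # {'a': '000', 'd': '001', 'c': '01', 'e': '10', 'b': '11'}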

7 Huffman's Algorithm
Select the two characters a and b having the lowest probabilities and replace them with a single (imaginary) character, say x.
x's probability of occurrence is the sum of the probabilities for a and b.
Now find an optimal prefix code for this smaller set of characters, using the above procedure recursively.
The code for the original character set is obtained by using the code for x with a 0 appended for a and with a 1 appended for b.
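A minimal recursive sketch of this procedure in Python (helper names are mine; ties are broken arbitrarily):

    def huffman(prob):
        if len(prob) == 1:
            return {next(iter(prob)): ''}            # one character: empty code
        lo1, lo2 = sorted(prob, key=prob.get)[:2]    # two lowest probabilities
        smaller = {ch: p for ch, p in prob.items() if ch not in (lo1, lo2)}
        smaller[lo1 + lo2] = prob[lo1] + prob[lo2]   # imaginary character x
        code = huffman(smaller)                      # solve the smaller problem
        prefix = code.pop(lo1 + lo2)
        code[lo1] = prefix + '0'                     # x's code with 0 appended
        code[lo2] = prefix + '1'                     # x's code with 1 appended
        return code

    print(huffman({'a': .12, 'b': .40, 'c': .15, 'd': .08, 'e': .25}))
    # {'b': '0', 'e': '10', 'c': '110', 'd': '1110', 'a': '1111'}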

8 Steps in the Construction of a Huffman Tree
Sort the input characters by frequency: d (.08), a (.12), c (.15), e (.25), b (.40)

9 Merge a and d
The two lowest-frequency characters, d (.08) and a (.12), are merged into a subtree with combined probability .20.
[tree diagram]

10 Merge a, d with c
The subtree {a, d} (.20) is merged with c (.15), giving a subtree with combined probability .35.
[tree diagram]

11 Merge a, c, d with e
The subtree {a, c, d} (.35) is merged with e (.25), giving a subtree with probability .60; b remains alone with .40.
[tree diagram]

12 Final Tree
The root has probability 1.00.
Codes: a - 1111, b - 0, c - 110, d - 1110, e - 10
Average code length: 2.15
[tree diagram: b at depth 1, e at depth 2, c at depth 3, d and a at depth 4]
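The same construction written iteratively with a min-heap (a Python sketch of mine, not from the slides), mirroring the merge steps on the preceding slides and confirming the 2.15 average:

    import heapq

    prob = {'a': .12, 'b': .40, 'c': .15, 'd': .08, 'e': .25}

    # Heap entries: (probability, tie-breaker, {char: code-so-far}).
    heap = [(p, i, {ch: ''}) for i, (ch, p) in enumerate(prob.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, low = heapq.heappop(heap)     # lowest-probability subtree
        p2, _, high = heapq.heappop(heap)    # second lowest
        merged = {ch: '0' + c for ch, c in low.items()}          # low side gets 0
        merged.update({ch: '1' + c for ch, c in high.items()})   # high side gets 1
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1

    codes = heap[0][2]
    print(codes)   # b: 0, e: 10, c: 110, d: 1110, a: 1111
    print(round(sum(len(codes[ch]) * p for ch, p in prob.items()), 2))  # 2.15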

13 Huffman Algorithm
Huffman's algorithm is an example of a greedy algorithm: it merges the two lowest-probability nodes at each step without considering potential drawbacks inherent in making such a move, i.e., at every individual stage it selects the option that is "locally optimal."
Recall the vertex coloring problem: a greedy strategy does not always yield an optimal solution; however, Huffman coding is optimal (see textbook for proof).

14 Finishing Remarks
Huffman coding works well in theory, but under several restrictive assumptions:
(1) The frequency of a letter is independent of the context of that letter in the message. (Not true of the English language.)
(2) Huffman coding works better when there is a large variation in the frequency of letters, and the actual frequencies must match the expected ones.
Examples: DEED → 8 bits (12 bits ASCII); FUZZ → 20 bits (12 bits ASCII)

