1 Lossless Decomposition and Huffman Codes Sophia Soohoo CS 157B

2 Lossless Data Compression
o Any compression algorithm can be viewed as a function that maps sequences of units into other sequences of units.
o Lossless: the original data can be reconstructed exactly from the compressed data.
o Lossless is in contrast to lossy data compression, which only allows an approximation of the original data to be reconstructed in exchange for better compression rates.

3 David A. Huffman
o BS in Electrical Engineering at Ohio State University
o Worked as a radar maintenance officer for the US Navy
o PhD student in Electrical Engineering at MIT, 1952
o Was given the choice of writing a term paper or taking a final exam
o Paper topic: the most efficient method for representing numbers, letters, or other symbols as binary code

4 Huffman Coding
o Uses the minimum number of bits for the given symbol frequencies
o Variable-length coding – good for data transfer
o Different symbols have codewords of different lengths
o Symbols with higher frequency get shorter codewords
o Symbols with lower frequency get longer codewords
o "Z" will have a longer code representation than "E" if looking at the frequency of character occurrences in the alphabet
o No codeword is a prefix of another codeword!

5 Decoding

Symbol  Code
E       0
T       11
N       100
I       1010
S       1011

To determine the original message, read the string of bits from left to right and use the table to determine the individual symbols.
Decode the following: 11010010010101011

6 Decoding
Using the table from the previous slide, the bit string splits into codewords as follows:
Original string: 11010010010101011
Segmented:       11 | 0 | 100 | 100 | 1010 | 1011
Decoded:         T    E   N     N     I      S    -> TENNIS
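A minimal decoding sketch in Python, assuming the code table above; the function name and structure are illustrative, not from the slides. The greedy left-to-right match is unambiguous precisely because no codeword is a prefix of another.

    # Prefix-code decoding for the table above (illustrative sketch).
    CODE = {"E": "0", "T": "11", "N": "100", "I": "1010", "S": "1011"}

    def decode(bits, code=CODE):
        """Read bits left to right, emitting a symbol as soon as a codeword matches."""
        reverse = {w: s for s, w in code.items()}   # codeword -> symbol
        out, buffer = [], ""
        for b in bits:
            buffer += b
            if buffer in reverse:                   # safe: the code is prefix-free
                out.append(reverse[buffer])
                buffer = ""
        if buffer:
            raise ValueError("bit string does not end on a codeword boundary")
        return "".join(out)

    print(decode("11010010010101011"))              # -> TENNIS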

7 Representing a Huffman Table as a Binary Tree
o Codewords are represented by a binary tree
o Each leaf stores a character
o Each internal node has two children
o Left = 0
o Right = 1
o The codeword for a character is the path from the root to the leaf storing that character
o The code represented by the leaves of the tree is a prefix code
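A small sketch of this tree representation, assuming a Node class of my own design (not from the slides): each codeword is the sequence of edge labels on the root-to-leaf path, with left = 0 and right = 1.

    # Illustrative binary-tree representation of a prefix code.
    class Node:
        def __init__(self, symbol=None, left=None, right=None):
            self.symbol = symbol          # set only on leaves
            self.left = left              # edge labeled 0
            self.right = right            # edge labeled 1

    def codewords(node, prefix=""):
        """Collect codewords by walking every root-to-leaf path."""
        if node.symbol is not None:       # leaf: the accumulated path is the codeword
            return {node.symbol: prefix}
        table = {}
        table.update(codewords(node.left, prefix + "0"))
        table.update(codewords(node.right, prefix + "1"))
        return table

    # The tree behind the decoding example (E=0, T=11, N=100, I=1010, S=1011):
    tree = Node(left=Node("E"),
                right=Node(left=Node(left=Node("N"),
                                     right=Node(left=Node("I"), right=Node("S"))),
                           right=Node("T")))
    print(codewords(tree))   # {'E': '0', 'N': '100', 'I': '1010', 'S': '1011', 'T': '11'}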

8 Constructing Huffman Codes
o Goal: construct a prefix code for the alphabet Σ: associate each letter i with a codeword w_i so as to minimize the average codeword length L = Σ_i p_i · |w_i|, where p_i is the probability of letter i and |w_i| is the length of its codeword.

9 Example

Letter  p_i  w_i
A       0.1  000
B       0.1  001
C       0.2  01
D       0.3  10
E       0.3  11

where p_i = probability of letter i (with codeword w_i)
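A quick arithmetic check (my own, not on the slide): with these probabilities and codeword lengths, the average codeword length from the previous slide works out to 0.1·3 + 0.1·3 + 0.2·2 + 0.3·2 + 0.3·2 = 2.2 bits per letter, e.g.:

    # Average codeword length for the example table (illustrative check).
    p = {"A": 0.1, "B": 0.1, "C": 0.2, "D": 0.3, "E": 0.3}
    w = {"A": "000", "B": "001", "C": "01", "D": "10", "E": "11"}
    L = sum(p[s] * len(w[s]) for s in p)
    print(round(L, 2))   # 2.2 bits per letter on average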

10 Algorithm
1. Make a leaf node for each symbol and attach the symbol's occurrence probability to it.
2. Take the two nodes with the smallest probabilities (p_i) and connect them to a new node, which becomes the parent of those nodes.
   o Label the right edge 1 and the left edge 0.
   o The probability of the new node is the sum of the probabilities of the two connected nodes.
3. If there is only one node left, the code construction is complete. If not, go back to step 2.
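A compact sketch of this procedure in Python, using a heap to pull out the two smallest-probability nodes each round. The function name, the tie-breaking counter, and the choice of placing the first node popped on the left (0) are my own; with that convention the sketch happens to reproduce the table shown on slide 17 for the running example.

    import heapq
    import itertools

    def huffman_code(probabilities):
        """Build a code table from {symbol: probability} by repeatedly merging
        the two smallest-probability nodes (step 2 above)."""
        counter = itertools.count()        # tie-breaker so the heap never compares dicts
        # Step 1: one leaf per symbol, carrying its probability and an empty codeword.
        heap = [(p, next(counter), {s: ""}) for s, p in probabilities.items()]
        heapq.heapify(heap)
        while len(heap) > 1:               # step 3: stop when a single node remains
            p_left, _, left = heapq.heappop(heap)
            p_right, _, right = heapq.heappop(heap)
            # Step 2: prepend 0 on the left branch, 1 on the right branch.
            merged = {s: "0" + w for s, w in left.items()}
            merged.update({s: "1" + w for s, w in right.items()})
            heapq.heappush(heap, (p_left + p_right, next(counter), merged))
        return heap[0][2]

    print(huffman_code({"A": 0.387, "B": 0.194, "C": 0.161, "D": 0.129, "E": 0.129}))
    # -> {'A': '0', 'D': '100', 'E': '101', 'C': '110', 'B': '111'}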

11 Example

Symbol  Probability
A       0.387
B       0.194
C       0.161
D       0.129
E       0.129

12 Example – Creating the tree
Create one leaf node per symbol: A (0.387), B (0.194), C (0.161), D (0.129), E (0.129).

13 Example – Iterate Step 2
Take the two leaf nodes with the smallest probability (p_i) and connect them into a new node (which becomes the parent of those nodes).
o Green nodes – nodes still to be evaluated
o White nodes – nodes which have already been evaluated
o Blue nodes – nodes added in this iteration
[Tree: D (0.129) and E (0.129), the two smallest, are merged under a new node with probability 0.258; A, B, and C are untouched.]

14 Example – Iterate Step 2
[Tree: the next two smallest nodes, C (0.161) and B (0.194), are merged under a new node with probability 0.355; A (0.387) and the D–E node (0.258) remain.]
Note: when two nodes are connected by a parent, the parent should be evaluated in the next iteration.

15 Example – Iterate Step 2
[Tree: the D–E node (0.258) and the C–B node (0.355) are merged under a new node with probability 0.613; only A (0.387) and this new node remain.]

16 Example: Completed Tree
[Tree: A (0.387) and the 0.613 node are merged under the root; each left edge is labeled 0 and each right edge 1.]

17 Example: Table for Huffman Code

Symbol  Code
A       0
B       111
C       110
D       100
E       101

Generate the table by reading the edge labels from the root to the leaf for each symbol.
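As a quick check (my own arithmetic, not on the slide), this code achieves an average length of 0.387·1 + (0.194 + 0.161 + 0.129 + 0.129)·3 = 2.226 bits per symbol:

    # Average codeword length for the code just constructed (illustrative check).
    p = {"A": 0.387, "B": 0.194, "C": 0.161, "D": 0.129, "E": 0.129}
    code = {"A": "0", "B": "111", "C": "110", "D": "100", "E": "101"}
    print(round(sum(p[s] * len(code[s]) for s in p), 3))   # 2.226 bits per symbol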

18 Practice

Symbol  Occurrences  Huffman Code
A       0.45         ?
B       0.13         ?
C       0.12         ?
D       0.16         ?
E       0.09         ?
F       0.05         ?

19 Practice Solution
[Tree: F (0.05) and E (0.09) are merged into a node with probability 0.14; C (0.12) and B (0.13) are merged into 0.25; the 0.14 node and D (0.16) are merged into 0.30; the 0.25 and 0.30 nodes are merged into 0.55; finally A (0.45) and the 0.55 node are merged under the root, with left edges labeled 0 and right edges 1. A therefore gets a 1-bit codeword, B, C, and D get 3-bit codewords, and E and F get 4-bit codewords.]
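To check an answer programmatically, the illustrative huffman_code sketch from the slide-10 notes can be run on the practice probabilities; the exact bit patterns depend on which child ends up on the left, but the codeword lengths are fixed by the tree structure:

    # Check the practice solution with the illustrative huffman_code() sketch above.
    practice = {"A": 0.45, "B": 0.13, "C": 0.12, "D": 0.16, "E": 0.09, "F": 0.05}
    lengths = {s: len(w) for s, w in huffman_code(practice).items()}
    print(lengths)   # codeword lengths: A=1, B=3, C=3, D=3, E=4, F=4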

20 Questions?

21 References
o http://www.cstutoringcenter.com/tutorials/algorithms/huffman.php
o http://en.wikipedia.org/wiki/Huffman_coding
o http://michael.dipperstein.com/huffman/index.html
o http://en.wikipedia.org/wiki/David_A._Huffman
o http://www.binaryessence.com/dct/en000080.htm

