Huffman Coding CSE 373 Data Structures.

Presentation on theme: "Huffman Coding CSE 373 Data Structures."— Presentation transcript:

1 Huffman Coding CSE 373 Data Structures

2 CSE 373 AU 04 -- Huffman Coding
Reading Reading Goodrich and Tamassia, Chapter 11, section 11.4, pp 12/8/2018 CSE 373 AU Huffman Coding

3 Outline
Motivation
Character occurrence in documents: Frequency Tables
Huffman trees
Huffman codes
Decoding

4 Motivation
Documents typically have a certain amount of redundancy in their representation in a computer.
ASCII and other conventional character codes use the same number of bits for every character.
If certain symbols (characters) occur much more frequently than others, we can use shorter bit sequences for them, at the cost of longer bit sequences for the less frequent characters.
Huffman coding provides the optimal assignment of numbers of bits to the different characters, among prefix codes that encode one character at a time.

5 Frequency Tables
Document: “For years an oven course ages”
Symbol:     a  c  e  F  g  n  o  r  s  u  v  y  Blank
Frequency:  3  1  4  1  1  2  3  3  3  1  1  1  5
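A frequency table like this one can be produced mechanically. The following is a minimal sketch using Python's standard `collections.Counter`; the variable names `document` and `frequencies` are illustrative choices, not part of the slides.

```python
from collections import Counter

document = "For years an oven course ages"

# Count how often each character occurs; the blank is counted like any other symbol.
frequencies = Counter(document)

# Print the table in sorted symbol order, showing the blank by name.
for symbol, count in sorted(frequencies.items()):
    label = "Blank" if symbol == " " else symbol
    print(f"{label}: {count}")
```

The counts sum to the document length (29 characters), which is a quick sanity check on any frequency table.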

6 Huffman Trees
A Huffman Tree is a binary tree having the following properties:
Each leaf node contains a symbol (a character) and a frequency value.
Each internal node contains a frequency (the sum of its children’s frequencies).
[Figure: example tree whose leaves are labeled with a symbol and its frequency.]

7 Huffman Tree Construction
To build a Huffman tree, do the following:
Step 1: Initialize a forest of binary trees, with one one-node tree per character occurring in the document. Each of these is a leaf node, so it gets labeled with the character and its frequency.
Step 2:
(a) Select a minimum-frequency Huffman tree from the forest. Call it A. Select a second minimum-frequency Huffman tree from those remaining. Call it B. Remove A and B from the forest.
(b) Combine A and B by creating a new node C, making A the left subtree of C and B the right subtree of C. Label C with the sum of the frequencies of A and B.
(c) Put the new tree, rooted at C, back into the forest.
Step 3: Repeat Step 2 until only a single tree remains. Output this tree.
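The steps above can be followed almost literally in code. This is a minimal sketch, not an efficient implementation: the forest is a plain list that is scanned for its minimum each time. The representation is an assumption made for illustration: a tree is either a symbol (a leaf) or a `(left, right)` pair, and the names `build_huffman_tree` and `coded_length` are hypothetical. The input is assumed to be non-empty.

```python
def build_huffman_tree(frequencies):
    """Return (total_frequency, tree) for a non-empty {symbol: frequency} dict."""
    # Step 1: a forest with one single-leaf tree per symbol.
    forest = [(freq, symbol) for symbol, freq in frequencies.items()]
    while len(forest) > 1:  # Step 3: repeat until one tree remains.
        # Step 2a: remove two minimum-frequency trees, A and then B.
        a = min(forest, key=lambda entry: entry[0])
        forest.remove(a)
        b = min(forest, key=lambda entry: entry[0])
        forest.remove(b)
        # Step 2b: new root C with A as left subtree and B as right subtree.
        c = (a[0] + b[0], (a[1], b[1]))
        # Step 2c: put the combined tree back into the forest.
        forest.append(c)
    return forest[0]


def coded_length(tree, frequencies, depth=0):
    """Total bits needed: each symbol's frequency times its depth in the tree."""
    if isinstance(tree, tuple):
        left, right = tree
        return (coded_length(left, frequencies, depth + 1)
                + coded_length(right, frequencies, depth + 1))
    return frequencies[tree] * depth


frequencies = {"a": 3, "c": 1, "e": 4, "F": 1, "g": 1, "n": 2, "o": 3,
               "r": 3, "s": 3, "u": 1, "v": 1, "y": 1, " ": 5}
total, tree = build_huffman_tree(frequencies)
print(total)  # 29, the number of characters in the document
print(coded_length(tree, frequencies))
```

Ties between equal frequencies may be broken differently than in the slides, so the tree shape can differ from the one shown in the example, but the total coded length is the same for any valid tie-breaking.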

8 Example, Step 1:
The initial forest contains 13 trees of one leaf node each.
Forest: a:3, c:1, e:4, F:1, g:1, n:2, o:3, r:3, s:3, u:1, v:1, y:1, Blank:5

9 Example, Step 2a:
Removal of two minimum-frequency trees from the forest.
Removed: c:1, F:1
Forest: a:3, e:4, g:1, n:2, o:3, r:3, s:3, u:1, v:1, y:1, Blank:5

10 Example, Step 2b:
Combining the two, using a new root node.
New tree: (c F):2, a root of frequency 2 with leaves c:1 and F:1
Forest: a:3, e:4, g:1, n:2, o:3, r:3, s:3, u:1, v:1, y:1, Blank:5

11 Example, Step 2c:
Adding the new tree to the forest. There are now only 12 trees in the forest.
Forest (parentheses group the leaves of a merged tree): a:3, (c F):2, e:4, g:1, n:2, o:3, r:3, s:3, u:1, v:1, y:1, Blank:5

12 Example, Next iteration:
Two more trees have been merged.
Forest: a:3, (c F):2, e:4, (g u):2, n:2, o:3, r:3, s:3, v:1, y:1, Blank:5

13 Example, Third iteration:
There are now 10 trees in the forest.
Forest: a:3, (c F):2, e:4, (g u):2, n:2, o:3, r:3, s:3, (v y):2, Blank:5

14 Example, Fourth iteration:
There are now 9 trees in the forest.
Forest: ((c F)(g u)):4, a:3, e:4, n:2, o:3, r:3, s:3, (v y):2, Blank:5

15 Example, 5th iteration:
There are now 8 trees in the forest.
Forest: ((c F)(g u)):4, a:3, e:4, (n (v y)):4, o:3, r:3, s:3, Blank:5

16 Example, 6th iteration:
There are now 7 trees in the forest.
Forest: ((c F)(g u)):4, (a o):6, e:4, (n (v y)):4, r:3, s:3, Blank:5

17 Example, 7th iteration:
There are now 6 trees in the forest.
Forest: ((c F)(g u)):4, (a o):6, e:4, (n (v y)):4, (r s):6, Blank:5

18 Example, 8th iteration:
There are now 5 trees in the forest.
Forest: (((c F)(g u))(n (v y))):8, (a o):6, e:4, (r s):6, Blank:5

19 Example, 9th iteration:
There are now 4 trees in the forest.
Forest: (((c F)(g u))(n (v y))):8, (e Blank):9, (a o):6, (r s):6

20 Example, 10th iteration:
There are now 3 trees in the forest.
Forest: (((c F)(g u))(n (v y))):8, (e Blank):9, ((a o)(r s)):12

21 Example, 11th iteration:
There are now 2 trees in the forest.
Forest: ((((c F)(g u))(n (v y)))(e Blank)):17, ((a o)(r s)):12

22 Example, 12th (last) iteration:
There is now only one tree in the forest.
Forest: (((((c F)(g u))(n (v y)))(e Blank))((a o)(r s))):29

23 Example, Assign Edge Labels:
0 on left edges, 1 on right edges.
[Figure: the finished Huffman tree (root frequency 29) with each left edge labeled 0 and each right edge labeled 1.]
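The labeling rule turns into codes by walking the tree and recording the edge labels along each root-to-leaf path. The sketch below assumes the finished example tree has the shape implied by the coding table on the Coding Table slide, written as nested `(left, right)` pairs with the blank as `" "`; the names `TREE` and `assign_codes` are hypothetical.

```python
# The example tree as nested (left, right) pairs; leaves are single-character strings.
# This shape is reconstructed from the coding table, so treat it as an assumption.
TREE = (
    (((("c", "F"), ("g", "u")), ("n", ("v", "y"))), ("e", " ")),
    (("a", "o"), ("r", "s")),
)


def assign_codes(tree, prefix=""):
    """Map each leaf symbol to its path from the root: 0 = left edge, 1 = right edge."""
    if isinstance(tree, str):
        # A leaf: the labels collected on the way down are its code.
        return {tree: prefix}
    left, right = tree
    codes = assign_codes(left, prefix + "0")
    codes.update(assign_codes(right, prefix + "1"))
    return codes


codes = assign_codes(TREE)
for symbol in sorted(codes):
    label = "Blank" if symbol == " " else symbol
    print(f"{label}: {codes[symbol]}")
```

Note that a tree consisting of a single leaf would get the empty string as its code under this sketch; a real encoder needs to special-case a one-symbol alphabet.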

24 Example, Coding Table:
a:     100
c:     00000
e:     010
F:     00001
g:     00010
n:     0010
o:     101
r:     110
s:     111
u:     00011
v:     00110
y:     00111
Blank: 011

25 Example, The Document, Coded:
Formatted with spaces after each letter and breaks after each blank:
00001 101 110 011 (For)
00111 010 100 110 111 011 (years)
100 0010 011 (an)
101 00110 010 0010 011 (oven)
00000 101 00011 110 111 010 011 (course)
100 00010 010 111 (ages)
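Coding the document is a table lookup per character followed by concatenation. A minimal sketch, assuming the coding table from the example with the blank written as `" "`; `CODES` and `encode` are hypothetical names.

```python
# Coding table from the example; " " stands for Blank.
CODES = {
    "a": "100", "c": "00000", "e": "010", "F": "00001", "g": "00010",
    "n": "0010", "o": "101", "r": "110", "s": "111", "u": "00011",
    "v": "00110", "y": "00111", " ": "011",
}


def encode(text, codes):
    """Concatenate the code of every character; raises KeyError on unknown symbols."""
    return "".join(codes[ch] for ch in text)


bits = encode("For years an oven course ages", CODES)
print(bits)
print(len(bits), "bits")
```

No separators are stored between codes. The bit string stays decodable only because the table is a prefix code.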

26 Example, Decoding:
Coding table: a: 100, c: 00000, e: 010, F: 00001, g: 00010, n: 0010, o: 101, r: 110, s: 111, u: 00011, v: 00110, y: 00111, Blank: 011
Use the Huffman tree as a decoding aid: start at the root and follow the left or right edge depending on whether the next bit of the coded document is a 0 or a 1.
When you reach a leaf, output the character there, and continue with the next bit, starting from the root again.
Huffman codes are “prefix codes”: no character's code is a prefix of another character's code, so there is never any ambiguity about where one code ends and the next begins.

27 Compression Ratio
The coded document uses 101 bits*.
The 8-bit ASCII version requires 29*8 = 232 bits.
The compression ratio is 101/232 ≈ 0.435.
(*Not including the coding table)
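The figures on this slide can be rechecked from the frequency table and the coding table: each symbol contributes its frequency times the length of its code. A minimal sketch with hypothetical variable names:

```python
# Frequency table and coding table from the example; " " stands for Blank.
frequencies = {"a": 3, "c": 1, "e": 4, "F": 1, "g": 1, "n": 2, "o": 3,
               "r": 3, "s": 3, "u": 1, "v": 1, "y": 1, " ": 5}
codes = {
    "a": "100", "c": "00000", "e": "010", "F": "00001", "g": "00010",
    "n": "0010", "o": "101", "r": "110", "s": "111", "u": "00011",
    "v": "00110", "y": "00111", " ": "011",
}

# Coded size: frequency times code length, summed over all symbols.
coded_bits = sum(frequencies[s] * len(codes[s]) for s in frequencies)

# Uncoded size: 8 bits per character.
ascii_bits = 8 * sum(frequencies.values())

ratio = coded_bits / ascii_bits
print(coded_bits, ascii_bits, round(ratio, 3))
```

As the slide's footnote says, this leaves out the cost of storing the coding table, which matters a great deal for a document this short.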

28 Efficient Implementation
While constructing the Huffman tree, maintain a priority queue that holds the forest of trees, keyed on frequency.
This makes it easy to obtain the minimum-frequency tree using FINDMIN and DELETEMIN.
With a binary heap, each DELETEMIN and INSERT takes O(log n) time, so building the tree for n distinct symbols takes O(n log n) time.
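One way to realize the priority queue is Python's standard `heapq` module, where `heap[0]` plays the role of FINDMIN and `heapq.heappop` the role of DELETEMIN. This is a sketch under assumptions, not the course's implementation: trees are a symbol or a `(left, right)` pair, the name `build_huffman_tree_pq` is hypothetical, and a running counter is added to each entry so that ties in frequency never cause two trees to be compared directly.

```python
import heapq
from itertools import count


def build_huffman_tree_pq(frequencies):
    """Return (total_frequency, tree) for a non-empty {symbol: frequency} dict."""
    tiebreak = count()  # unique sequence numbers; avoids comparing trees on ties
    heap = [(freq, next(tiebreak), symbol) for symbol, freq in frequencies.items()]
    heapq.heapify(heap)  # build the priority queue in O(n)
    while len(heap) > 1:
        freq_a, _, a = heapq.heappop(heap)  # DELETEMIN: smallest tree, A
        freq_b, _, b = heapq.heappop(heap)  # DELETEMIN: next smallest, B
        # INSERT the combined tree with A on the left and B on the right.
        heapq.heappush(heap, (freq_a + freq_b, next(tiebreak), (a, b)))
    total, _, tree = heap[0]  # FINDMIN: the single remaining tree
    return total, tree


frequencies = {"a": 3, "c": 1, "e": 4, "F": 1, "g": 1, "n": 2, "o": 3,
               "r": 3, "s": 3, "u": 1, "v": 1, "y": 1, " ": 5}
total, tree = build_huffman_tree_pq(frequencies)
print(total)  # 29
```

Compared with scanning a list for the minimum on every iteration, the heap replaces an O(n) search with an O(log n) operation.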

29 Closing Remarks
Huffman Coding is an important data compression method.
It can be applied to text, images or any data that can be described as a sequence of symbols from a fixed set of symbols.
It is often used as part of other systems, such as the JPEG image compression method.

