Huffman encoding.


1 Huffman encoding

2 Fixed Length Codes
Represent data as a sequence of 0's and 1's.
Sequence: BACADAEAFABBAAAGAH
A fixed length code: A 000, B 001, C 010, D 011, E 100, F 101, G 110, H 111
Encoding of sequence: 001 000 010 000 011 000 100 000 101 000 001 001 000 000 000 110 000 111
The encoding is 18 x 3 = 54 bits long. Can we make the encoding shorter?
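The 54-bit figure can be checked with a minimal sketch. The helper names `fixed_length_code` and `encode` are illustrative, and the assignment of codewords in alphabetical order is an assumption that agrees with the D = 011 and H = 111 entries on the slide.

```python
def fixed_length_code(symbols):
    """Give every symbol a codeword of the same width, in sorted order."""
    symbols = sorted(symbols)
    # Smallest width that can distinguish all symbols (3 bits for 8 symbols).
    width = max(1, (len(symbols) - 1).bit_length())
    return {s: format(i, f"0{width}b") for i, s in enumerate(symbols)}


def encode(message, code):
    """Concatenate the codeword of each symbol in the message."""
    return "".join(code[ch] for ch in message)


message = "BACADAEAFABBAAAGAH"
code = fixed_length_code("ABCDEFGH")
bits = encode(message, code)

print(code["D"], code["H"])  # 011 111
print(len(bits))             # 54
```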

3 Variable Length Code
Make use of frequencies. Frequency of A=8, B=3, others 1.
A 0, B 100, C 1010, D 1011, E 1100, F 1101, G 1110, H 1111
Example: BACADAEAFABBAAAGAH
42 bits (more than 20% shorter). But how do we decode?
Morse code is a variable length code. We can encode more efficiently if we give frequent symbols shorter codewords: in Morse code, E is the most frequent letter and so is represented by a single dot.
To decode, we can use a special separator code, as in Morse code.
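A short sketch of where the 42 bits come from. The full codeword table below is an assumption: only D = 1011 and H = 1111 are legible on the slide, and the remaining entries are chosen to agree with those, with the decoding example later in the deck, and with the 42-bit total. Note that the example sequence itself contains nine A's.

```python
from collections import Counter

# Assumed table, consistent with the codewords visible on the slide.
VARIABLE_CODE = {
    "A": "0", "B": "100", "C": "1010", "D": "1011",
    "E": "1100", "F": "1101", "G": "1110", "H": "1111",
}

message = "BACADAEAFABBAAAGAH"
counts = Counter(message)

# Total length = sum over symbols of (occurrences x codeword length).
total_bits = sum(n * len(VARIABLE_CODE[s]) for s, n in counts.items())
encoded = "".join(VARIABLE_CODE[ch] for ch in message)

print(counts["A"], counts["B"])  # 9 3
print(total_bits)                # 42
print(round(1 - total_bits / 54, 3))  # 0.222
```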

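4 Prefix code and binary tree
Prefix code: No codeword is a prefix of any other codeword.
A 0, B 100, C 1010, D 1011, E 1100, F 1101, G 1110, H 1111
A prefix code corresponds to a binary tree: each symbol is a leaf, and its codeword spells the path from the root to that leaf.
[Figure: binary code tree with leaves A through H and edges labeled 0 and 1]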
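The prefix property can be tested mechanically. The following is a minimal sketch (the function name `is_prefix_free` is illustrative): after sorting the codewords lexicographically, any codeword that is a prefix of another is also a prefix of its immediate successor, so only neighbours need to be compared.

```python
def is_prefix_free(codewords):
    """Return True if no codeword is a prefix of another codeword."""
    words = sorted(codewords)
    # In sorted order, a prefix is always directly followed by a word
    # that starts with it, so checking adjacent pairs is sufficient.
    return all(not b.startswith(a) for a, b in zip(words, words[1:]))


variable = ["0", "100", "1010", "1011", "1100", "1101", "1110", "1111"]
print(is_prefix_free(variable))           # True
print(is_prefix_free(["0", "01", "11"]))  # False: "0" is a prefix of "01"
```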

5 Decoding Example
Input: 10001010
100 -> B
100 0 -> BA
100 0 1010 -> BAC
[Figure: code tree with leaves A through H; each codeword is traced from the root to a leaf]
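The decoding walk can be sketched as follows: start at the root, follow one edge per input bit, emit a symbol on reaching a leaf, and return to the root. The codeword table and the nested-dict tree representation are assumptions made for illustration; they agree with the BAC example above.

```python
# Assumed codeword table, consistent with 10001010 decoding to BAC.
CODE = {
    "A": "0", "B": "100", "C": "1010", "D": "1011",
    "E": "1100", "F": "1101", "G": "1110", "H": "1111",
}


def build_tree(code):
    """Build a binary tree as nested dicts keyed by '0'/'1'; leaves are symbols."""
    root = {}
    for symbol, word in code.items():
        node = root
        for bit in word[:-1]:
            node = node.setdefault(bit, {})
        node[word[-1]] = symbol
    return root


def decode(bits, code):
    """Decode a bit string by walking the code tree from the root."""
    root = build_tree(code)
    node, out = root, []
    for bit in bits:
        node = node[bit]
        if isinstance(node, str):  # reached a leaf: emit and restart
            out.append(node)
            node = root
    if node is not root:
        raise ValueError("input ends in the middle of a codeword")
    return "".join(out)


print(decode("10001010", CODE))  # BAC
```

No separator is needed because, in a prefix code, reaching a leaf unambiguously ends the current codeword.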

6 Huffman Tree = Optimal Length Code
[Figure: two code trees over the leaves A through H, with weights A=8, B=3 and all others 1]
Before we can describe the algorithm for generating optimal codes, let's talk a little bit about our representation.
Optimal: no code has a better weighted average length.
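To make "weighted average length" concrete, here is a small sketch that computes it for two codes under the weights A=8, B=3, others 1. Both codeword tables are assumptions reconstructed from the fragments visible earlier in the deck, and `weighted_average_length` is an illustrative name.

```python
from fractions import Fraction

WEIGHTS = {"A": 8, "B": 3, "C": 1, "D": 1, "E": 1, "F": 1, "G": 1, "H": 1}

# Assumed tables: 3-bit alphabetical code and the variable length code.
FIXED = {s: format(i, "03b") for i, s in enumerate("ABCDEFGH")}
VARIABLE = {
    "A": "0", "B": "100", "C": "1010", "D": "1011",
    "E": "1100", "F": "1101", "G": "1110", "H": "1111",
}


def weighted_average_length(code, weights):
    """Sum of weight x codeword length, divided by the total weight."""
    total = sum(weights.values())
    cost = sum(w * len(code[s]) for s, w in weights.items())
    return Fraction(cost, total)


fixed_avg = weighted_average_length(FIXED, WEIGHTS)
variable_avg = weighted_average_length(VARIABLE, WEIGHTS)

print(fixed_avg)     # 3
print(variable_avg)  # 41/17
```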

7 Huffman's Algorithm
Build the tree bottom-up, so that the lowest weight leaves are farthest from the root.
Repeatedly:
Find the two trees of lowest weight.
Merge them to form a new tree whose weight is the sum of their weights.

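8 Construction of Huffman tree
[Figure: Huffman tree for leaf weights A=8, B=3 and C through H=1; internal nodes have weights 2, 2, 2, 4, 5 and 9, and the root has weight 17]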
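The internal node weights in the figure can be reproduced with a deliberately simple sketch that tracks only the weights and re-sorts after each merge. The function name `merge_trace` is illustrative, and this version favours clarity over efficiency.

```python
def merge_trace(weights):
    """Return the weight of each new tree created by the bottom-up merging."""
    trees = sorted(weights)
    merged = []
    while len(trees) > 1:
        a, b = trees[0], trees[1]          # two trees of lowest weight
        trees = sorted(trees[2:] + [a + b])  # replace them by their merge
        merged.append(a + b)
    return merged


trace = merge_trace([8, 3, 1, 1, 1, 1, 1, 1])
print(trace)       # [2, 2, 2, 4, 5, 9, 17]
print(sum(trace))  # 41
```

The sum of the merged weights, 41, equals the total weighted code length, since each leaf contributes its weight once for every internal node above it.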

9 Two questions
Why does the algorithm produce the best tree?
How do you implement it efficiently?

10 Huffman(C)
n ← |C|
Q ← C
for i ← 1 to n-1 do
    new(z)
    left(z) ← x ← delete-min(Q)
    right(z) ← y ← delete-min(Q)
    f(z) ← f(x) + f(y)
    insert(z, Q)
return delete-min(Q)
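One possible way to realise this pseudocode, sketched with Python's `heapq` module as the priority queue Q. The tuple representation of internal nodes, the tie-breaking counter, and the helper `codewords` are assumptions of this sketch rather than part of the slide; the input is assumed to be a non-empty dict of symbol frequencies. With a binary heap, each delete-min and insert costs O(log n), giving O(n log n) overall.

```python
import heapq
from itertools import count


def huffman(freq):
    """Return the root of a Huffman tree for a non-empty {symbol: frequency} dict.

    Leaves are symbols (strings); internal nodes are (left, right) tuples.
    """
    tiebreak = count()  # keeps heap entries comparable when weights tie
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)
    for _step in range(len(freq) - 1):
        fx, _, x = heapq.heappop(heap)  # x <- delete-min(Q)
        fy, _, y = heapq.heappop(heap)  # y <- delete-min(Q)
        heapq.heappush(heap, (fx + fy, next(tiebreak), (x, y)))
    return heap[0][2]


def codewords(tree, prefix=""):
    """Read codewords off the tree: 0 for the left edge, 1 for the right."""
    if isinstance(tree, str):
        return {tree: prefix or "0"}
    left, right = tree
    return {**codewords(left, prefix + "0"), **codewords(right, prefix + "1")}


freq = {"A": 8, "B": 3, "C": 1, "D": 1, "E": 1, "F": 1, "G": 1, "H": 1}
code = codewords(huffman(freq))
cost = sum(freq[s] * len(code[s]) for s in freq)
print(cost)  # 41
```

Ties between equal weights may be broken differently from the slides, so individual codewords can differ while the total cost stays the same.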

11 Correctness
Let x and y be the characters with the lowest frequencies.
We prove that there is an optimal tree in which x and y are siblings and are deepest leaves.
The tree without x and y is an optimal tree for the set in which we replace x and y with a single character whose frequency is f(x) + f(y).
Correctness then follows by induction.
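The induction step rests on a simple cost identity, sketched here. The notation is an assumption of this note and not taken from the slides: B(T) is the weighted cost of tree T, d_T(c) is the depth of leaf c in T, and T' is T with the sibling leaves x and y replaced by a single leaf z at their parent, with f(z) = f(x) + f(y).

```latex
\begin{align*}
B(T) &= \sum_{c \in C} f(c)\, d_T(c), \qquad d_T(x) = d_T(y) = d_{T'}(z) + 1, \\
f(x)\, d_T(x) + f(y)\, d_T(y) &= \bigl(f(x) + f(y)\bigr)\bigl(d_{T'}(z) + 1\bigr)
  = f(z)\, d_{T'}(z) + f(x) + f(y), \\
B(T) &= B(T') + f(x) + f(y).
\end{align*}
```

Since f(x) + f(y) does not depend on the shape of the tree, T is optimal among trees with x and y as siblings exactly when T' is optimal for the reduced character set.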

