# Synchronization of Huffman codes Marek Biskup Warsaw University Phd-Open, 2007-05-26.


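## Slide 2: Huffman Codes

(Slide footer, repeated on every slide: 2007-05-27, Marek Biskup - Synchronization of Huffman Codes)

- Each letter has a corresponding binary string (its code).
- The codes form a complete binary tree.
- The depth of a letter depends on its probability in the source.
- The code is decodable.

[Figure: Huffman tree with leaves a, b, c, d, e and edges labelled 0 and 1; height h = 3, number of letters N = 5.]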

## Slide 3: Coding and decoding

Source sequence: bbeaaebcec

Encoded text: 01 01 111 00 00 111 01 10 111 10 (b b e a a e b c e c)

- Encoding: for each input letter, print out its code.
- Decoding: use the Huffman tree as a finite automaton. Start in the root; when you reach a leaf, print out its letter and start again from the root.

[Figure: the Huffman tree with leaves a, b, c, d, e; the codes used here are a = 00, b = 01, c = 10, d = 110, e = 111.]
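The encoding and decoding rules can be tried out with a minimal Python sketch. The code table is read off the slide's example (a = 00, b = 01, c = 10, d = 110, e = 111), and the function names `encode` and `decode` are illustrative choices, not part of the presentation. The input is chosen so that it reproduces the slide's encoded text.

```python
# Code table read off the slide's example (an assumption of this sketch).
CODE = {"a": "00", "b": "01", "c": "10", "d": "110", "e": "111"}


def encode(text, code=CODE):
    """Concatenate the codeword of every input letter."""
    return "".join(code[ch] for ch in text)


def decode(bits, code=CODE):
    """Walk the code tree bit by bit; emit a letter at each leaf and restart."""
    leaves = {word: letter for letter, word in code.items()}
    out, node = [], ""           # node is the path from the root; "" is the root
    for bit in bits:
        node += bit
        if node in leaves:       # reached a leaf
            out.append(leaves[node])
            node = ""            # jump back to the root
    return "".join(out), node    # a non-empty node means the input ended mid-codeword


encoded = encode("bbeaaebcec")
print(encoded)          # 01011110000111011011110
print(decode(encoded))  # ('bbeaaebcec', '')
```

Representing a tree node by the bit path that leads to it keeps the sketch short; a real decoder would more likely use an explicit tree or a lookup table.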

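## Slide 4: Parallel decoding

- Use two processors to decode a string: CPU1 starts from the beginning, CPU2 starts in the middle.
- Encoded text: 01 01 111 00 00 111 01 10 111 10, that is 01011110000111011011110 (bbeaaebcec).
- Where is the middle? Splitting after bit 12 gives 010111100001 for CPU1 and 11011011110 for CPU2.
- CPU2 reads 110 110 111 10 and prints d d e c. Wrong! The split falls inside the codeword 111, so CPU2 starts out of step with the codeword boundaries.

[Figure: the encoded text split between CPU1 and CPU2, next to the Huffman tree with leaves a, b, c, d, e.]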

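## Slide 5: Parallel decoding

Correct: 01 01 111 00 00 111 01 10 111 10 decodes to b b e a a e b c e c

Incorrect (two decoders, split after bit 12): 01 01 111 00 00 1 | 110 110 111 10 decodes to b b e a a, then d d e c

Synchronization! From bit 18 onwards both readings share the same codeword boundaries, so the final e c is decoded correctly.

[Figure: the two bit-by-bit readings aligned above each other, next to the Huffman tree.]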
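The point at which the late decoder falls back into step can be computed directly. The sketch below assumes the example code (00, 01, 10, 110, 111) and the 23-bit text from the slides; `boundaries` and `first_sync` are hypothetical helper names. It relies on the observation that once two decoders sit in the root at the same bit position, they behave identically from then on.

```python
CODEWORDS = {"00", "01", "10", "110", "111"}   # assumed example code
BITS = "01011110000111011011110"               # encoded text from the slides


def boundaries(bits, start, codewords=CODEWORDS):
    """Positions (bits consumed from the beginning of `bits`) at which a
    decoder that starts reading at offset `start` completes a codeword."""
    ends, node = [], ""
    for pos in range(start, len(bits)):
        node += bits[pos]
        if node in codewords:
            ends.append(pos + 1)
            node = ""
    return ends


def first_sync(bits, start):
    """First boundary the late decoder shares with the true parse, or None."""
    true_ends = set(boundaries(bits, 0))
    for end in boundaries(bits, start):
        if end in true_ends:
            return end
    return None


print(boundaries(BITS, 0))    # [2, 4, 7, 9, 11, 14, 16, 18, 21, 23]
print(boundaries(BITS, 12))   # [15, 18, 21, 23]
print(first_sync(BITS, 12))   # 18
```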

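## Slide 6: Bit corruption

Correct: 01 01 111 00 00 111 01 10 111 10 decodes to b b e a a e b c e c

Bit error (first bit flipped): 110 111 10 00 01 110 110 111 10 decodes to d e c a b d d e c

Synchronization! After bit 18 the corrupted reading is back on the correct codeword boundaries, so the final e c is decoded correctly.

[Figure: the correct and the corrupted bit sequences aligned above each other, next to the Huffman tree.]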
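The effect of a single flipped bit can be reproduced with a few lines of Python. This is only an illustration under the assumption that the code table is a = 00, b = 01, c = 10, d = 110, e = 111; the helpers `flip_bit` and `common_suffix` are made up for this sketch.

```python
CODE = {"a": "00", "b": "01", "c": "10", "d": "110", "e": "111"}  # assumed example code
LEAVES = {word: letter for letter, word in CODE.items()}


def decode(bits):
    """Decode greedily from the first bit; a trailing partial codeword is dropped."""
    out, node = [], ""
    for bit in bits:
        node += bit
        if node in LEAVES:
            out.append(LEAVES[node])
            node = ""
    return "".join(out)


def flip_bit(bits, index):
    """Return a copy of `bits` with the bit at `index` inverted."""
    flipped = "1" if bits[index] == "0" else "0"
    return bits[:index] + flipped + bits[index + 1:]


def common_suffix(x, y):
    """Longest string that both x and y end with."""
    n = 0
    while n < min(len(x), len(y)) and x[-1 - n] == y[-1 - n]:
        n += 1
    return x[len(x) - n:]


good = "01011110000111011011110"
bad = flip_bit(good, 0)
print(decode(good))                               # bbeaaebcec
print(decode(bad))                                # decabddec
print(common_suffix(decode(good), decode(bad)))   # ec
```

A single corrupted bit changes the number of decoded symbols as well as their values, which is why the comparison is made on the tails of the two outputs.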

## Slide 7: Huffman code automaton

- Huffman tree = finite automaton, with ε-transitions from the leaves to the root.
- Synchronization: the automaton is in the root when on a codeword boundary.
- Lack of synchronization: the automaton is in the root when inside a codeword, and in an inner node when on a codeword boundary.

Synchronized (\$ marks a return to the root): 1 1 1 \$ 0 1 \$ 1 0 \$ 1 1 1 \$ 1 0 decodes to e b c e c

Not synchronized (the same bits, read by a decoder that starts one bit late): 1 \$ 1 1 0 \$ 1 1 0 \$ 1 1 1 \$ 1 0 decodes to d d e c

[Figure: the Huffman tree with leaves a, b, c, d, e drawn as an automaton.]

## Slide 8: Synchronization

- A Huffman code is self-synchronizing if for any inner node there is a sequence of codewords such that the automaton reaches the root.
- Every self-synchronizing Huffman code will eventually resynchronize (for an ε-guaranteed source).
- Almost all Huffman codes are self-synchronizing.
- Definition: a synchronizing string is a sequence of bits that moves any node to the root.
- Theorem: a Huffman code is self-synchronizing iff it has a synchronizing string.

Synchronizing string for the example tree: 0110

[Figure: the Huffman tree with leaves a, b, c, d, e.]
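The definition can be checked mechanically. The following sketch assumes the example code (00, 01, 10, 110, 111), represents each tree node by its bit path, and looks for a shortest synchronizing string with a breadth-first search over sets of possible states. The search is a generic subset construction chosen for this illustration, not a method taken from the presentation, and it assumes a complete code tree.

```python
from collections import deque

CODEWORDS = ["00", "01", "10", "110", "111"]   # assumed example code


def inner_nodes(codewords):
    """All proper prefixes of codewords; the empty string is the root."""
    return {w[:i] for w in codewords for i in range(len(w))}


def step(node, bit, codewords):
    """One automaton transition; reaching a leaf jumps back to the root."""
    nxt = node + bit
    return "" if nxt in codewords else nxt


def is_synchronizing(bits, codewords=CODEWORDS):
    """True if `bits` moves every inner node (and the root) to the root."""
    for node in inner_nodes(codewords):
        for bit in bits:
            node = step(node, bit, codewords)
        if node != "":
            return False
    return True


def shortest_synchronizing_string(codewords=CODEWORDS):
    """Breadth-first search over sets of states the decoder might be in."""
    words = set(codewords)
    start = frozenset(inner_nodes(words))
    seen, queue = {start}, deque([(start, "")])
    while queue:
        states, bits = queue.popleft()
        if states == {""}:
            return bits
        for bit in "01":
            nxt = frozenset(step(n, bit, words) for n in states)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, bits + bit))
    return None                  # the code has no synchronizing string


print(is_synchronizing("0110"))           # True
print(shortest_synchronizing_string())    # 0110
```

The subset search is exponential in the worst case for general automata, so it is only meant for small codes like this one.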

## Slide 9: Synchronizing codewords

Can a synchronizing string be a codeword? Yes!

[Figure: a Huffman tree with leaves a, b, c, d, e in which the codewords 010 and 011 are highlighted.]

## Slide 10: Optimal codes

Minimum redundancy codes are not unique.

[Figure: two Huffman trees over the letters a, b, c, d, e, one with no synchronizing codeword and one with 2 synchronizing codewords.]
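A small Python check illustrates how two codes with the same codeword lengths can differ in this respect. `CODE_A` is the example code used on the earlier slides. `CODE_B` is an assumed reconstruction: a code with lengths 2, 2, 2, 3, 3 that contains the codewords 010 and 011 from the previous slide; the exact tree drawn in the presentation is not recoverable from the transcript.

```python
CODE_A = ["00", "01", "10", "110", "111"]   # example code from the earlier slides
CODE_B = ["00", "010", "011", "10", "11"]   # assumed reconstruction, same lengths


def synchronizing_codewords(codewords):
    """Codewords that move every tree node to the root."""
    words = set(codewords)
    nodes = {w[:i] for w in words for i in range(len(w))}   # "" is the root

    def lands_in_root(node, bits):
        for bit in bits:
            node += bit
            if node in words:    # leaf reached: back to the root
                node = ""
        return node == ""

    return sorted(w for w in words if all(lands_in_root(n, w) for n in nodes))


print(synchronizing_codewords(CODE_A))   # []
print(synchronizing_codewords(CODE_B))   # ['010', '011']
```

Both codes have the same average codeword length for any source, since the multiset of lengths is identical, so the choice between them can be made on synchronization properties alone.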

## Slide 11: Code characteristics

Open problems: choose the best Huffman code with respect to:

- the average number of bits to synchronization,
- the length of the synchronizing string,
- the existence and length of synchronizing codewords.

Open problem? The limit on the number of bits in a synchronizing string:

- $O(N^3)$: known result for all automata
- $O(h N \log N)$: my result for Huffman automata
- $O(N^2)$: Černý conjecture for all automata

## Slide 12: Detecting synchronization

Can a decoder find out that it has synchronized? Yes! For example, if it receives a synchronizing string.

A more general algorithm:

- Try to start decoding the text from h consecutive positions (h "decoders").
- Synchronization takes place if all decoders reach the same word boundary.
- This can be done without increasing the complexity of decoding (no dependence on h).
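The idea of running h decoders can be sketched naively in Python. Since no codeword is longer than h bits, one of any h consecutive starting positions is a true codeword boundary, so once all h decoders sit in the root together they all agree with the true parse. The sketch below assumes the example code with h = 3 and simulates every decoder separately, which costs a factor of h per bit; it does not attempt the h-independent method mentioned on the slide. The name `detect_sync` is hypothetical.

```python
CODEWORDS = {"00", "01", "10", "110", "111"}   # assumed example code
H = 3                                           # height of its code tree
BITS = "01011110000111011011110"               # encoded text from the slides


def detect_sync(bits, start, codewords=CODEWORDS, h=H):
    """Run h decoders started at offsets start, start+1, ..., start+h-1.
    Return the first position (bits consumed from the beginning of `bits`)
    at which all of them are on a codeword boundary together, else None."""
    nodes = {}                        # decoder start offset -> current tree node
    for pos in range(start, len(bits)):
        if pos < start + h:
            nodes[pos] = ""           # a new decoder starts in the root here
        for s in nodes:
            nxt = nodes[s] + bits[pos]
            nodes[s] = "" if nxt in codewords else nxt
        all_started = pos + 1 >= start + h
        if all_started and all(node == "" for node in nodes.values()):
            return pos + 1
    return None


print(detect_sync(BITS, 12))   # 18
```

For the example text, a decoder dropped in after bit 12 can therefore certify at bit 18 that everything it decodes from there on is correct.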

## Slide 13: Guaranteed synchronization

- Self-synchronizing Huffman codes: no upper bound on the number of bits before synchronization.
- My work (together with Prof. Wojciech Plandowski): an extension to Huffman coding.
  - No redundancy if the code would synchronize anyway.
  - Small redundancy if it wouldn't: $O(1/N)$ per bit, where N is the number of bits before guaranteed synchronization (not the number of letters, as on the earlier slides).
  - Linear time in the number of coded bits.
- Coder:
  - Analyze each possible starting position of a decoder.
  - Add a synchronization string whenever there is a decoder with the number of lost bits above the threshold.
- Decoder:
  - Just decode.
  - Skip synchronization strings inserted by the coder.

## Slide 14: Summary

- Huffman codes can be decompressed in parallel.
- After some bits (on average) a decoder which starts in the middle will synchronize.
- There is no upper bound on the number of incorrectly decoded symbols.
- With a small additional redundancy one may impose such a bound.
