© 2004 SDU Uniquely Decodable Code: 1. Related Notions 2. Determining UDC 3. Kraft Inequality


1 © 2004 SDU Uniquely Decodable Code: 1. Related Notions 2. Determining UDC 3. Kraft Inequality

2 © 2004, 2009 SDU A Resulting Problem
Given a coding scheme for the source symbols, how can we verify whether it is uniquely decodable?

3 Related Notions
- alphabet: $\Sigma = \{0, 1, \dots, \sigma - 1\}$
- symbol or letter: an element of the alphabet $\Sigma$
- word: a finite-length sequence of symbols
- code: a collection of words over a specified alphabet
- codeword: a word in a code
- message: a sequence of codewords
- uniquely decodable code (UDC) $C$: every message can be decomposed into codewords of $C$ in exactly one way
- Example: {0, 10, 01} vs {0, 10, 11}. The first is not uniquely decodable, since 010 = 0·10 = 01·0; the second is uniquely decodable.
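The difference between the two example codes can be checked by brute force. The following is a minimal sketch, not part of the original slides: the function name `count_decompositions` is hypothetical, and it assumes codewords are non-empty strings. It counts the ways a given message splits into codewords.

```python
from functools import lru_cache

def count_decompositions(message, code):
    """Count the ways `message` splits into codewords of `code`.

    Assumes every codeword is a non-empty string (an empty codeword
    would make the recursion loop forever).
    """
    @lru_cache(maxsize=None)
    def ways(start):
        if start == len(message):
            return 1  # nothing left to split: exactly one way
        # Try every codeword that matches at position `start`.
        return sum(
            ways(start + len(cw))
            for cw in code
            if message.startswith(cw, start)
        )
    return ways(0)

ambiguous = count_decompositions("010", ("0", "10", "01"))
unique = count_decompositions("010", ("0", "10", "11"))
print(ambiguous, unique)
```

A count above 1 for any message shows that a code is not uniquely decodable. A count of 1 for one message proves nothing about the code as a whole, which is why a systematic test is needed.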

4 Related Notions
- prefix and suffix: if $w = ps$, then $p$ is a prefix of $w$ and $s$ is a suffix of $w$
- empty word: a word of length 0
- suffix word: a non-empty word $t$ is called a suffix word if there exist two messages $C_1 C_2 \dots C_m$ and $C_1' C_2' \dots C_n'$ such that
  - $C_i$ and $C_j'$ are codewords for all $1 \le i \le m$, $1 \le j \le n$, and $C_1 \ne C_1'$,
  - $t$ is a suffix of $C_n'$,
  - $C_1 C_2 \dots C_m t = C_1' C_2' \dots C_n'$.

5 A Key Lemma for Determining UDC
Lemma. A code $C$ is uniquely decodable if and only if no suffix word is a codeword in $C$.
Proof. ($\Rightarrow$) Let $C$ be uniquely decodable, and suppose that some suffix word $t$ is a codeword in $C$. By the definition of a suffix word, there exist two messages $C_1 C_2 \dots C_m$ and $C_1' C_2' \dots C_n'$ such that $C_1 \ne C_1'$ and $C_1 C_2 \dots C_m t = C_1' C_2' \dots C_n'$. Hence the message $C_1' C_2' \dots C_n'$ can be decomposed in two ways, which contradicts the assumption that $C$ is a UDC.

6 Proof
($\Leftarrow$) Suppose that no suffix word is a codeword in $C$, but $C$ is not uniquely decodable. Then some message can be decomposed in more than one way. Let $\omega$ be such a message of least length, $\omega = C_1 C_2 \dots C_k = C_1' C_2' \dots C_n'$, where $C_i$ ($1 \le i \le k$) and $C_j'$ ($1 \le j \le n$) are all codewords, and $C_1 \ne C_1'$ (otherwise removing the common first codeword would give a shorter such message). Without loss of generality, assume that $C_k$ is a suffix of $C_n'$. Then $C_k$ is a suffix word, which contradicts the assumption that no suffix word is a codeword in $C$.

7 UDC Verification
By the key lemma, if we can generate all the suffix words of a code $C$, then:
- If no suffix word is a codeword in $C$, then $C$ is uniquely decodable.
- If some suffix word is a codeword, then $C$ is not uniquely decodable.
The following algorithm follows directly from the key lemma.

8 The Determining Algorithm
UDC-Verification(C)
    T ← ∅
    Step 1:   for each pair of codewords C_i, C_j ∈ C (i ≠ j) do
    Step 1.1:     if C_i = C_j, then return NO (C is not uniquely decodable)
    Step 1.2:     if there exists a word s such that C_i s = C_j or C_i = C_j s, then T ← T ∪ {s}
              endfor
    Step 2:   for each pair of suffix word t ∈ T and codeword C_k ∈ C do (including words added to T during this loop)
    Step 2.1:     if t = C_k, then return NO (C is not uniquely decodable)
    Step 2.2:     if there exists a word s such that t s = C_k or C_k s = t, then T ← T ∪ {s}
              endfor
    Step 3:   return YES (C is uniquely decodable)
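The pseudocode can be sketched in Python as follows. This is one possible reading rather than the authors' implementation: the function names `is_uniquely_decodable` and `dangling` are hypothetical, codewords are assumed to be non-empty strings, and a worklist is used so that suffix words added during the second loop are also processed.

```python
def is_uniquely_decodable(code):
    """Return True if `code` (an iterable of non-empty strings) is a UDC."""
    words = list(code)
    if len(set(words)) != len(words):
        return False  # two identical codewords
    codewords = set(words)

    def dangling(a, b):
        """Yield the non-empty word s with a + s == b, if there is one."""
        if len(b) > len(a) and b.startswith(a):
            yield b[len(a):]

    # First loop: ordered pairs cover both C_i s = C_j and C_i = C_j s.
    suffix_words = set()
    for ci in codewords:
        for cj in codewords:
            if ci != cj:
                suffix_words.update(dangling(ci, cj))

    # Second loop: each suffix word is compared with every codeword once.
    pending = list(suffix_words)
    while pending:
        t = pending.pop()
        if t in codewords:
            return False  # a suffix word is a codeword
        for ck in codewords:
            for s in (*dangling(t, ck), *dangling(ck, t)):
                if s not in suffix_words:
                    suffix_words.add(s)
                    pending.append(s)
    return True

print(is_uniquely_decodable(["0", "10", "01"]))
print(is_uniquely_decodable(["0", "10", "11"]))
```

The loop terminates because every suffix word is a suffix of some codeword, so only finitely many can ever be added to the set.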

9 Correctness of Algorithm
Theorem. The algorithm UDC-Verification correctly decides whether a code $C$ is uniquely decodable.
Proof. We need to prove:
(1) Each word $s$ put into $T$ in Step 1.2 or Step 2.2 is a suffix word.
(2) If the algorithm stops at Step 3, then it has computed all the suffix words of $C$ and ensured that none of them is a codeword.

10 Proof
(1) A word $s$ put into $T$ in Step 1.2 is obviously a suffix word. Next consider a word $s$ put into $T$ in Step 2.2. As $t$ is a suffix word, there exist codewords $C_1, C_2, \dots, C_m$ and $C_1', C_2', \dots, C_n'$ such that $C_1 \ne C_1'$ and $C_1 C_2 \dots C_m t = C_1' C_2' \dots C_n'$.
If $ts = C_k$, then $C_1 C_2 \dots C_m C_k = C_1' C_2' \dots C_n' s$, so $s$ is a suffix word.
If $C_k s = t$, then $C_1 C_2 \dots C_m C_k s = C_1' C_2' \dots C_n'$, so $s$ is a suffix word.

11 Proof
(2) For each suffix word $t$ of $C$, let $m(t) = C_1 C_2 \dots C_m$ be the shortest message satisfying $C_1 C_2 \dots C_m t = C_1' C_2' \dots C_n'$, where $t$ is a suffix of $C_n'$. We prove by induction on the length of $m(t)$ that $t$ is generated by the algorithm.
Basic Step: $|m(t)| = 1$. Then $n = m = 1$, so $t$ is generated in Step 1.2.
Inductive Step: Suppose every suffix word $p$ with $|m(p)| < |m(t)|$ has been generated by the algorithm. We now prove that $t$ is also generated by the algorithm. Because $t$ is a suffix of $C_n'$, we have $pt = C_n'$ for some word $p$, and then $C_1 C_2 \dots C_m = C_1' C_2' \dots C_{n-1}' p$.

12 Proof
(i) If $p = C_m$, then $C_m t = C_n'$, so $t$ is generated in Step 1.2.
(ii) If $p$ is a suffix of $C_m$, then by $C_1 C_2 \dots C_m = C_1' C_2' \dots C_{n-1}' p$, $p$ is a suffix word. Since $|m(p)| < |m(t)|$, the inductive hypothesis implies that $p$ has been generated by the algorithm. So when Step 2 is applied to the suffix word $p$ and the codeword $C_n'$, Step 2.2 puts $t$ into $T$, since $pt = C_n'$.
(iii) If $C_m$ is a suffix of $p$, then $C_m t$ is a suffix of $C_n'$, and $C_m t$ is a suffix word because $C_1 C_2 \dots C_{m-1} (C_m t) = C_1' C_2' \dots C_n'$. Since $|m(C_m t)| \le |C_1 C_2 \dots C_{m-1}|$, the inductive hypothesis implies that $C_m t$ has been generated by the algorithm. So when Step 2 is applied to the suffix word $C_m t$ and the codeword $C_m$, Step 2.2 puts $t$ into $T$, since the suffix word $C_m t$ is the codeword $C_m$ followed by $t$.

13 Time Complexity Analysis
Suppose there are $n$ codewords in $C$ and the longest codeword has length $l$. Then:
- Step 1: $O(n^2 l)$ comparisons.
- Step 2: the number of suffix words is at most $O(nl)$, so there are $O(n^2 l^2)$ comparisons and $O(n^2 l^2)$ insertions of suffix words into $T$.
- In total: $O(n^2 l^2)$.

14 Property of UDC: Kraft Inequality
1. Let $C = \{C_1, C_2, \dots, C_n\}$ be a uniquely decodable code over an alphabet of cardinality $\sigma$, and let $l_i = |C_i|$ for $1 \le i \le n$. Then the Kraft inequality holds:
$$\sum_{i=1}^{n} \sigma^{-l_i} \le 1.$$
2. Conversely, if a set of integers $\{l_1, l_2, \dots, l_n\}$ satisfies the Kraft inequality, then a prefix code $C = \{C_1, C_2, \dots, C_n\}$ can be found with codeword lengths $\{l_1, l_2, \dots, l_n\}$.
Note:
- A prefix code $C = \{C_1, C_2, \dots, C_n\}$ means that for each pair of codewords $C_i$ and $C_j$ ($i \ne j$), neither is a prefix of the other. Strictly speaking, it should be called a prefix-free code.
- Every prefix-free code is a UDC.
- Example: {00, 10, 11, 100, 111} vs {00, 10, 11, 010, 011}. The first is not prefix-free, since 10 is a prefix of 100; the second is prefix-free.
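The two example codes can be compared numerically. The sketch below is illustrative only: the helper names `kraft_sum` and `is_prefix_free` are hypothetical, and a binary alphabet is assumed by default. Exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction
from itertools import permutations

def kraft_sum(lengths, alphabet_size=2):
    """Sum of alphabet_size ** (-length) over all codeword lengths."""
    return sum(Fraction(1, alphabet_size ** length) for length in lengths)

def is_prefix_free(code):
    """True if no codeword is a prefix of a different codeword."""
    return not any(b.startswith(a) for a, b in permutations(code, 2))

first = ["00", "10", "11", "100", "111"]
second = ["00", "10", "11", "010", "011"]

for code in (first, second):
    lengths = [len(cw) for cw in code]
    print(code, kraft_sum(lengths), is_prefix_free(code))
```

Both codes have the same codeword lengths, so they have the same Kraft sum, yet only the second is prefix-free. Satisfying the inequality is a property of the lengths alone: it guarantees that some prefix code with those lengths exists, not that a particular code is one.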

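15 Proof of Property 1 (in textbook, page 246)
- Let $m$ be an arbitrary positive integer. Then
$$\left(\sum_{i=1}^{n} \sigma^{-l_i}\right)^{m} = \sum_{i_1=1}^{n} \sum_{i_2=1}^{n} \cdots \sum_{i_m=1}^{n} \sigma^{-(l_{i_1} + l_{i_2} + \cdots + l_{i_m})}.$$
- For each of the $n^m$ messages consisting of $m$ codewords, there is a unique corresponding term in the above formula. Let $N(m, j)$ be the number of messages of length $j$ consisting of $m$ codewords. Then
$$\left(\sum_{i=1}^{n} \sigma^{-l_i}\right)^{m} = \sum_{j=1}^{ml} N(m, j)\, \sigma^{-j},$$
where $l$ is the length of the longest codeword in $C$.
- Since $C$ is uniquely decodable, no two of these messages are identical, so $N(m, j) \le \sigma^{j}$. We have
$$\left(\sum_{i=1}^{n} \sigma^{-l_i}\right)^{m} \le \sum_{j=1}^{ml} \sigma^{j} \sigma^{-j} = ml.$$
- So, for any positive integer $m$,
$$\sum_{i=1}^{n} \sigma^{-l_i} \le (ml)^{1/m}.$$
- Since $(ml)^{1/m} \to 1$ as $m \to \infty$, the Kraft inequality holds.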
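The regrouping step of this proof can be checked on a small case. The following sketch is only an illustration: the code {0, 10, 110} and the value m = 3 are hypothetical choices, and the function names are not from the slides. It groups all messages of m codewords by length and compares the result with the m-th power of the Kraft sum.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def kraft_power(code, m, alphabet_size=2):
    """(sum over codewords of alphabet_size ** -length) raised to the power m."""
    total = sum(Fraction(1, alphabet_size ** len(cw)) for cw in code)
    return total ** m

def count_messages_by_length(code, m):
    """N(m, j): number of distinct messages of length j built from m codewords."""
    messages = {"".join(seq) for seq in product(code, repeat=m)}
    return Counter(len(msg) for msg in messages)

code = ["0", "10", "110"]  # hypothetical binary prefix-free (hence UD) code
m = 3
counts = count_messages_by_length(code, m)
regrouped = sum(Fraction(n, 2 ** j) for j, n in counts.items())
longest = max(len(cw) for cw in code)

print(kraft_power(code, m) == regrouped)             # regrouping by length
print(all(n <= 2 ** j for j, n in counts.items()))   # N(m, j) bounded by 2^j
print(regrouped <= m * longest)                      # bound by m * l
```

For a code that is not uniquely decodable, distinct codeword sequences can collapse to the same message, and the first comparison can then fail. That is exactly where the proof uses unique decodability.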

16 Proof of Property 2
Let $\lambda_1 < \lambda_2 < \dots < \lambda_m$ be $m$ integers such that $\{l_1, l_2, \dots, l_n\} = \{\lambda_1, \lambda_2, \dots, \lambda_m\}$ when repeats are ignored. Let $k_j$ be the number of $l_i$'s equal to $\lambda_j$. We need to prove that there exists a prefix code $C$ such that the number of codewords in $C$ with length $\lambda_j$ is $k_j$. The Kraft inequality becomes
$$\sum_{j=1}^{m} k_j\, \sigma^{-\lambda_j} \le 1.$$
We prove by induction that for each $1 \le r \le m$, there exists a prefix code $C_r$ such that for every $1 \le j \le r$, the number of codewords in $C_r$ with length $\lambda_j$ is $k_j$.

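17 Proof of Property 2
- Basic Step: $r = 1$. The above inequality implies $k_1 \sigma^{-\lambda_1} \le 1$, that is, $k_1 \le \sigma^{\lambda_1}$. There are $\sigma^{\lambda_1}$ different words of length $\lambda_1$, so we can arbitrarily select $k_1$ of them to form $C_1$.
- Inductive Step: Suppose that $C_r$ exists for some $r < m$. We prove that $C_{r+1}$ exists. From $\sum_{j=1}^{m} k_j \sigma^{-\lambda_j} \le 1$ we have $\sum_{j=1}^{r+1} k_j \sigma^{-\lambda_j} \le 1$, which means
$$k_{r+1} \le \sigma^{\lambda_{r+1}} - \sum_{j=1}^{r} k_j\, \sigma^{\lambda_{r+1} - \lambda_j}.$$
Among the $\sigma^{\lambda_{r+1}}$ different words of length $\lambda_{r+1}$, there are $k_j \sigma^{\lambda_{r+1} - \lambda_j}$ words that have a codeword of length $\lambda_j$ in $C_r$ as a prefix, for each $1 \le j \le r$. So we can select $k_{r+1}$ different words of length $\lambda_{r+1}$ such that no codeword in $C_r$ is a prefix of any of them. Adding these words extends $C_r$ to $C_{r+1}$.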
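The inductive construction can be sketched as a greedy procedure: handle the lengths in increasing order and, for each one, take the first word of that length that has no earlier codeword as a prefix. This is an illustrative sketch under stated assumptions, not the slides' own algorithm. The name `build_prefix_code` is hypothetical, the lengths are assumed to be positive integers, and the brute-force enumeration is only practical for short lengths.

```python
from fractions import Fraction
from itertools import product

def build_prefix_code(lengths, alphabet="01"):
    """Build a prefix-free code with the given codeword lengths.

    Raises ValueError if the lengths violate the Kraft inequality.
    """
    size = len(alphabet)
    if sum(Fraction(1, size ** length) for length in lengths) > 1:
        raise ValueError("lengths violate the Kraft inequality")
    code = []
    for length in sorted(lengths):
        # Enumerate words of this length in lexicographic order and keep
        # the first one that no existing codeword is a prefix of.
        for letters in product(alphabet, repeat=length):
            word = "".join(letters)
            if not any(word.startswith(cw) for cw in code):
                code.append(word)
                break
    return code

print(build_prefix_code([2, 2, 2, 3, 3]))
```

The counting argument in the inductive step is what guarantees that the inner loop always finds a free word when the lengths satisfy the Kraft inequality, whichever words were chosen earlier.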

18 Thanks for your attention!

