© Neeraj Suri EU-NSF ICT March 2006 Dependable Embedded Systems & SW Group www.deeds.informatik.tu-darmstadt.de Introduction to Computer Science 2 Lecture.


1 Introduction to Computer Science 2, Lecture 7: Extended binary trees
Prof. Neeraj Suri, Brahim Ayari

2 ICS-II Lecture 7: Extended binary trees. In advance: Search in binary trees
- Binary trees can be considered as decision trees.
- Each node represents a decision; the edges represent the possible outcomes.
- In such a tree, searching means walking from the root to a leaf.
[Figure: decision tree whose internal nodes test A < 2, B < 5 and C > 7, with edges labeled TRUE and FALSE leading to the leaves]

3 Extended binary trees
- Replace NULL pointers with special (external) nodes.
- A binary tree to which external nodes have been added is called an extended binary tree.
- The data can be stored either in the internal or in the external nodes.
- The length of the path to a node represents the cost of searching for it.

4 External and internal path length
- The cost of a search in an extended binary tree depends on the following parameters:
- External path length = the sum of the path lengths from the root to all external nodes $S_i$ ($1 \le i \le n+1$): $\mathrm{Ext}_n = \sum_{i=1}^{n+1} \mathrm{depth}(S_i)$
- Internal path length = the sum of the path lengths from the root to all internal nodes $K_i$ ($1 \le i \le n$): $\mathrm{Int}_n = \sum_{i=1}^{n} \mathrm{depth}(K_i)$
- $\mathrm{Ext}_n = \mathrm{Int}_n + 2n$ (proof by induction)
- Extended binary trees with a minimal external path length therefore have a minimal internal path length too.
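The relation between the two path lengths can be checked on a small tree. The following is a minimal sketch, assuming a tree representation that is not part of the lecture: an internal node is a tuple `(left, right)` and `None` stands for an external node. The example tree is hypothetical.

```python
# An internal node is a tuple (left, right); None stands for an external node.
def path_lengths(tree, depth=0):
    """Return (Int, Ext, n) for the subtree rooted at `tree`."""
    if tree is None:
        return 0, depth, 0          # external node: contributes its depth to Ext
    left, right = tree
    l_int, l_ext, l_n = path_lengths(left, depth + 1)
    r_int, r_ext, r_n = path_lengths(right, depth + 1)
    # internal node: contributes its depth to Int and counts as one node
    return depth + l_int + r_int, l_ext + r_ext, 1 + l_n + r_n

# Hypothetical tree with n = 3 internal nodes (not the tree from the slides).
example = ((None, (None, None)), None)
int_len, ext_len, n = path_lengths(example)
print(int_len, ext_len, n)          # 3 9 3
print(ext_len == int_len + 2 * n)   # True
```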

5 Example
- External path length: $\mathrm{Ext}_n = 25$
- Internal path length: $\mathrm{Int}_n = 11$
- Check: $\mathrm{Ext}_n = \mathrm{Int}_n + 2n = 11 + 2 \cdot 7 = 25$, so $n = 7$

6 Minimal and maximal length
- For a given n, a balanced tree has the minimal internal path length.
- Example: in a complete tree of height h (with $n = 2^h - 1$ internal nodes and the root at depth 0), the internal path length is $\mathrm{Int}_n = \sum_{i=1}^{h-1} i \cdot 2^i$
- The internal path length becomes maximal if the tree degenerates to a linear list: $\mathrm{Int}_n = \sum_{i=1}^{n-1} i = n(n-1)/2$
- Example: h = 4, n = 15: Int = 34, Ext = $16 \cdot 4$ = 64
- For comparison: a list with n = 15 nodes has Int = 105, Ext = 105 + 30 = 135
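The numbers in this example can be recomputed directly. The sketch below assumes the root sits at depth 0, so that depth i of a complete tree holds 2^i internal nodes; the function names are illustrative only.

```python
def int_complete(h):
    """Internal path length of a complete tree with h levels (root at depth 0)."""
    return sum(i * 2**i for i in range(1, h))

def int_list(n):
    """Internal path length of a tree degenerated to a linear list of n nodes."""
    return n * (n - 1) // 2

def ext_from_int(int_len, n):
    """External path length via Ext = Int + 2n."""
    return int_len + 2 * n

h = 4
n = 2**h - 1                                                 # 15 internal nodes
print(int_complete(h), ext_from_int(int_complete(h), n))     # 34 64
print(int_list(n), ext_from_int(int_list(n), n))             # 105 135
```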

7 Weighted binary trees
- Often weights $q_i$ are assigned to the external nodes ($1 \le i \le n+1$).
- The weighted external path length is defined as $\mathrm{Ext}_w = \sum_{i=1}^{n+1} \mathrm{depth}(S_i) \cdot q_i$
- For weighted binary trees the properties of minimal and maximal path lengths stated before no longer apply.
- Determining the minimal weighted external path length is an important practical problem.
[Figure: two weighted trees over the same weights; one has Ext_w = 102, the linear list has Ext_w = 88 (less than 102 although it is a linear list)]
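A small computation illustrates why a degenerate tree can beat a balanced one once weights are involved. This is a sketch with hypothetical weights and depths; they are not the values behind the figures 102 and 88 on the slide.

```python
def weighted_ext(depths, weights):
    """Weighted external path length: sum of depth(S_i) * q_i."""
    return sum(d * q for d, q in zip(depths, weights))

weights = [10, 5, 2, 1]     # hypothetical weights of four external nodes
balanced = [2, 2, 2, 2]     # depths in a balanced tree with 3 internal nodes
linear = [1, 2, 3, 3]       # depths in a tree degenerated to a list

print(weighted_ext(balanced, weights))   # 36
print(weighted_ext(linear, weights))     # 29
```

With these weights the list-shaped tree is cheaper because the heavy external node sits directly below the root.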

8 Application example: optimal codes
- To convert a text file efficiently to bit strings, there are two alternatives:
  - Fixed-length coding: each character has the same number of bits (e.g., ASCII)
  - Variable-length coding: some characters are represented with fewer bits than others
- Example of fixed-length coding: 3-bit code for the alphabet A, B, C, D:
  - A = 001, B = 010, C = 011, D = 100
  - The message ABBAABCDADA is converted to 001010010001001010011100001100001 (length 33 bits)
- With a 2-bit code the same message can be coded with only 22 bits.
- To decode the message, split it into groups of 3 bits (or 2 bits, respectively) and look each group up in a table that maps codes to characters.
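Fixed-length coding and table-based decoding can be sketched as follows. The 3-bit table is the one from the slide; the 2-bit table `CODE2` is an assumption, since the slide only states that a 2-bit code exists.

```python
CODE3 = {"A": "001", "B": "010", "C": "011", "D": "100"}   # from the slide
CODE2 = {"A": "00", "B": "01", "C": "10", "D": "11"}       # assumed 2-bit code

def encode_fixed(message, table):
    """Concatenate the fixed-length code of every character."""
    return "".join(table[ch] for ch in message)

def decode_fixed(bits, table):
    """Split the bit string into equal-width groups and look each one up."""
    width = len(next(iter(table.values())))
    reverse = {code: ch for ch, code in table.items()}
    return "".join(reverse[bits[i:i + width]] for i in range(0, len(bits), width))

message = "ABBAABCDADA"
bits3 = encode_fixed(message, CODE3)
bits2 = encode_fixed(message, CODE2)
print(len(bits3), len(bits2))               # 33 22
print(decode_fixed(bits3, CODE3))           # ABBAABCDADA
```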

9 Application example: optimal codes (2)
- Idea: more frequently used characters are coded with fewer bits.
- Message: ABBAABCDADA
- Coding: length 20 bits!
- Variable-length coding can reduce the memory space needed to store the file.
- How can such a code be found, and why is the decoding unique?

Character:  A  B  C  D
Frequency:  5  3  1  2

10 Application example: optimal codes (3)
- The frequencies and the coding are represented as a weighted binary tree.
- First of all, decoding: given a bit string,
  - use the successive bits to traverse the tree, starting from the root;
  - when you arrive at an external node, output the character stored there and continue at the root.
- Example:
  1. Bit = 0: external node, A
  2. Bit = 1: from the root to the right
  3. Bit = 0: left, external node, B
  4. Bit = 1: from the root to the right
  5. Bit = 1: right
[Figure: weighted binary tree with the external nodes A, B, D, C]
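The traversal can be sketched in a few lines. The tree shape used here is an assumption: it agrees with the steps listed above (0 leads to A, 1 then 0 leads to B), but the placement of D and C below the right branch is guessed from the order of the labels in the figure.

```python
# Nested tuples (left, right); a string is an external node holding a character.
# Assumed shape: A = 0, B = 10, D = 110, C = 111.
TREE = ("A", ("B", ("D", "C")))

def decode(bits, tree):
    """Walk the tree bit by bit: 0 goes left, 1 goes right."""
    out = []
    node = tree
    for bit in bits:
        node = node[0] if bit == "0" else node[1]
        if isinstance(node, str):     # external node reached
            out.append(node)
            node = tree               # continue at the root
    if node is not tree:
        raise ValueError("bit string ends in the middle of a code word")
    return "".join(out)

print(decode("010110", TREE))         # ABD
```

Because every character sits in an external node, the walk never has to look ahead or backtrack.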

11 Correctness condition
- Observation: in a variable-length code, the code of one character must not be a prefix of the code of any other character.
- If the code is represented as an extended binary tree, uniqueness is guaranteed (only one character per external node).
- If the frequency of each character in the original text is taken as the weight of its external node, then a tree with minimal weighted external path length yields an optimal code.
- How is a tree with minimal external path length generated?
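The prefix condition is easy to test for a given code table. The following is a simple quadratic sketch; the function name is illustrative, and the two example code sets are chosen only to show one passing and one failing case.

```python
from itertools import permutations

def is_prefix_free(codes):
    """True if no code word is a prefix of another code word in the list."""
    return not any(b.startswith(a) for a, b in permutations(codes, 2))

print(is_prefix_free(["0", "10", "110", "111"]))   # True
print(is_prefix_free(["0", "01", "11"]))           # False: "0" is a prefix of "01"
```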

12 Huffman Code
- Idea: characters are weighted and sorted according to their frequency.
- This also works independently of a particular text, e.g., for English (characters with relative weights):

E 1231   T 959   A 805   O 794
N 719    I 718   S 659   R 603
H 514    L 403   D 365   C 320
U 310    P 229   F 228   M 225
W 203    Y 188   B 162   G 161
V 93     K 52    Q 20    X
J 10     Z 9

- A binary tree with minimal external path length is constructed as follows:
  - Each character is represented by a tree of its own with the corresponding weight (a single external node).
  - The two trees with the smallest weights are merged into a new tree.
  - The root of the new tree is marked with the sum of the weights of the original roots.
  - Continue until only one tree remains.

13 Example 1: Huffman
- Alphabet and frequency: E (29), T (10), N (9), I (5), S (4)
- Step 1: (4, 5, 9, 10, 29), new weight: 9
- Step 2: (9, 9, 10, 29), new weight: 18

14 Example 1: Huffman (2)
- Step 3: (18, 10, 29), sorted: (10, 18, 29), new weight: 28
- Step 4: (28, 29), finished!
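The four steps can be replayed with a priority queue. This sketch only tracks the weights, not the tree itself; `merge_trace` is an illustrative name, and Python's `heapq` module stands in for the sorted list used on the slides.

```python
import heapq

def merge_trace(weights):
    """Repeatedly merge the two smallest weights; return the new weights in order."""
    heap = list(weights)
    heapq.heapify(heap)
    merged = []
    while len(heap) > 1:
        a = heapq.heappop(heap)       # smallest weight
        b = heapq.heappop(heap)       # second smallest weight
        merged.append(a + b)
        heapq.heappush(heap, a + b)   # the merged tree goes back into the queue
    return merged

trace = merge_trace([4, 5, 9, 10, 29])
print(trace)        # [9, 18, 28, 57]
print(sum(trace))   # 112
```

The sum of the merged weights is the weighted external path length of the finished tree, because every external node contributes its weight once to each internal node above it.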

15 Resulting tree
- Coding:

Character  Code  Weight
E          1     29
T          00    10
N          011   9
I          0101  5
S          0100  4

- Ext_w = 112
- Using this coding, the codes are, e.g.:
  - TENNIS = 00101101101010100
  - SET = 0100100
  - NET = 011100
- Decoding as described before.
[Figure: the resulting Huffman tree with edges labeled 0 and 1 and the external nodes S, I, N, T, E]
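The code table on this slide can be applied mechanically. The sketch below uses the codes and weights from the table (E = 1, T = 00, N = 011, I = 0101, S = 0100); the helper name `encode` is illustrative.

```python
CODE = {"E": "1", "T": "00", "N": "011", "I": "0101", "S": "0100"}
WEIGHT = {"E": 29, "T": 10, "N": 9, "I": 5, "S": 4}

def encode(word):
    """Concatenate the variable-length code of every character."""
    return "".join(CODE[ch] for ch in word)

for word in ("TENNIS", "SET", "NET"):
    print(word, encode(word))
# TENNIS 00101101101010100
# SET 0100100
# NET 011100

# Weighted external path length: code length (= depth) times weight.
ext_w = sum(len(CODE[ch]) * WEIGHT[ch] for ch in CODE)
print(ext_w)   # 112
```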

16 Some remarks
- The resulting tree is not regular.
- Regular trees are not always optimal.
- Example: the best nearly complete tree has Ext_w = 123.
- For the message ABBAABCDADA, 20 bits is optimal (see previous slides).

17 Example 2: Huffman
- Average number of bits without Huffman: 3 (because $2^3 = 8$)
- Average number of bits using the Huffman code:
- There are other "valid" solutions! But the average number of bits is the same for all of them (equal to Huffman).

Character  p (%)  Code
A          25     00
B          4      1110
C          13     100
D          7      110
E          35     01
F          11     101
G
H          3      11111
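The average code length can be estimated from the table. The row for G did not survive in the transcript, so the values used for it below are assumptions: p = 2 % (so that the percentages sum to 100) and the code 11110 (the only code word that completes the tree next to H = 11111).

```python
from fractions import Fraction

# Codes and percentages as listed in the table above.
CODES = {"A": "00", "B": "1110", "C": "100", "D": "110",
         "E": "01", "F": "101", "H": "11111"}
P = {"A": 25, "B": 4, "C": 13, "D": 7, "E": 35, "F": 11, "H": 3}

# Assumed values for the missing row G (not taken from the transcript).
CODES["G"] = "11110"
P["G"] = 2

average = sum(P[s] * len(CODES[s]) for s in P) / sum(P.values())
print(average)   # 2.54

# Kraft sum: equals 1 exactly when the code tree has no unused external node.
kraft = sum(Fraction(1, 2 ** len(code)) for code in CODES.values())
print(kraft)     # 1
```

Under these assumptions the variable-length code needs about 2.54 bits per character on average, compared with 3 bits for the fixed-length code.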

18 Analysis

/* Algorithm Huffman */
for (int i = 1; i <= n-1; i++) {
    p1 = smallest element in list L
    remove p1 from L
    p2 = smallest element in L
    remove p2 from L
    create node p
    add p1 and p2 as left and right subtrees to p
    weight(p) = weight(p1) + weight(p2)
    insert p into L
}

- The run-time behavior depends in particular on the implementation of the list:
  - time required to find the node with the smallest weight
  - time required to insert a new node
- "Naive" implementations give $O(n^2)$, "smarter" ones result in $O(n \log_2 n)$.
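A direct translation of the pseudocode into Python might look like the sketch below. It is the "naive" variant: the list is unsorted, so each search for the minimum is linear and the whole construction is quadratic. The `Node` class and the function names are illustrative and not taken from the lecture.

```python
class Node:
    def __init__(self, weight, symbol=None, left=None, right=None):
        self.weight = weight
        self.symbol = symbol      # set only for external nodes
        self.left = left
        self.right = right

def huffman_tree(weights):
    """Build a Huffman tree from a non-empty dict {symbol: weight}."""
    L = [Node(w, s) for s, w in weights.items()]
    for _ in range(len(L) - 1):                   # n - 1 merge steps
        p1 = min(L, key=lambda node: node.weight)
        L.remove(p1)
        p2 = min(L, key=lambda node: node.weight)
        L.remove(p2)
        L.append(Node(p1.weight + p2.weight, left=p1, right=p2))
    return L[0]

def code_table(node, prefix=""):
    """Read the codes off the tree: 0 for a left edge, 1 for a right edge."""
    if node.symbol is not None:
        return {node.symbol: prefix or "0"}
    table = code_table(node.left, prefix + "0")
    table.update(code_table(node.right, prefix + "1"))
    return table

weights = {"E": 29, "T": 10, "N": 9, "I": 5, "S": 4}
codes = code_table(huffman_tree(weights))
print(sorted((s, len(c)) for s, c in codes.items()))
# [('E', 1), ('I', 4), ('N', 3), ('S', 4), ('T', 2)]
```

The individual 0/1 assignments can differ from the table on the slides, depending on how ties and left/right placement are resolved, but the code lengths and therefore Ext_w stay the same. Replacing the list with a binary heap (for example Python's `heapq`) is one way to reach the O(n log n) bound.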

19 Optimality
- Observation: The weight of a node K in the Huffman tree equals the sum of the weights of the external nodes in the subtree rooted at K. Hence Ext_w of the whole tree equals the sum of the weights of all its internal nodes.
- Theorem: A Huffman tree is an extended binary tree with minimal weighted external path length Ext_w.
- Proof outline (by induction over n, the number of characters in the alphabet):
  - The statement to prove is A(n) = "A Huffman tree with n nodes has minimal external path length Ext_w".
  - Consider first n = 2: prove A(2) = "A Huffman tree with 2 nodes has minimal external path length".

20 Optimality (2)
- Proof:
  - n = 2: Only two characters with weights $q_1$ and $q_2$ result in a tree with $\mathrm{Ext}_w = q_1 + q_2$. This is minimal, because there are no other trees.
  - Induction hypothesis: A(i) is true for all $i \le k$.
  - To prove: A(k+1) is true.
[Figure: root V with the subtrees T_1 and T_2]

21 Optimality (3)
- Proof:
  - Consider a Huffman tree T with k+1 nodes. This tree has a root V and two subtrees $T_1$ and $T_2$ with the weights $q_1$ and $q_2$, respectively.
  - From the construction method we can deduce that the weights $q_i$ of all internal nodes $n_i$ of $T_1$ and $T_2$ satisfy $q_i \le \min(q_1, q_2)$.
  - Therefore $q_1 + q_2 > q_i$ for these weights. So if V is exchanged with any node in $T_1$ or $T_2$, the resulting tree will have a greater weight.
  - Exchanging nodes within $T_1$ and $T_2$ does not help either, because $T_1$ and $T_2$ are already optimal (both are trees with k nodes or fewer, and the induction hypothesis holds for them).
  - So T is an optimal tree with k+1 nodes.
[Figure: root V with weight q_1 + q_2 and the subtrees T_1 (weight q_1) and T_2 (weight q_2)]
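For small alphabets the theorem can be checked by exhaustive search. The sketch below is not part of the proof: it tries every way of splitting the weights into a left and a right subtree and uses the fact that putting two subtrees under a new root moves every external node one level down, which adds the sum of all weights to Ext_w. The function name is illustrative.

```python
from itertools import combinations

def brute_force_cost(weights):
    """Minimal weighted external path length over all extended binary trees."""
    weights = list(weights)
    if len(weights) == 1:
        return 0                                  # a single external node
    rest = range(1, len(weights))                 # weights[0] always goes left,
    best = None                                   # so mirrored splits are skipped
    for r in range(len(weights) - 1):             # right part stays non-empty
        for extra in combinations(rest, r):
            left = [weights[0]] + [weights[i] for i in extra]
            right = [weights[i] for i in rest if i not in extra]
            cost = brute_force_cost(left) + brute_force_cost(right) + sum(weights)
            if best is None or cost < best:
                best = cost
    return best

print(brute_force_cost([4, 5, 9, 10, 29]))   # 112, the Ext_w of the Huffman tree
print(brute_force_cost([5, 3, 1, 2]))        # 20, the length of the coded message
```

The search is exponential in the number of weights, so it is only meant as a sanity check for examples of the size used in the slides.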

22 Huffman Code: Applications
- Fax machine

23 Huffman: Other applications
- ZIP coding (at least a similar technique)
- In principle: most coding techniques with data reduction (lossless compression)
- NOT Huffman alone: lossy compression techniques like JPEG, MP3, MPEG, … (their lossy steps discard information; Huffman or a similar entropy coder typically serves only as the final lossless stage)

