Published by Jade Boyington. Modified about 1 year ago.

1
Introduction to Computer Science 2
Lecture 7: Extended binary trees
Prof. Neeraj Suri, Brahim Ayari
Dependable Embedded Systems & SW Group
© Neeraj Suri, EU-NSF ICT March 2006

2
ICS-II Lecture 7: Extended binary trees

In advance: Search in binary trees

Binary trees can be considered as decision trees. Each node represents a decision, and the edges represent the possible outcomes.
In such a tree, searching means going from the root to a leaf.

[Figure: decision tree with the internal nodes A < 2, B < 5, and C > 7, edges labeled TRUE and FALSE, and leaves holding the results.]

3
Extended binary trees

Replace NULL pointers with special (external) nodes.
A binary tree to which external nodes have been added is called an extended binary tree.
The data can be stored either in the internal or in the external nodes.
The length of the path from the root to a node reflects the cost of searching for that node.

4
External and internal path length

The cost of a search in an extended binary tree depends on the following parameters:

External path length = the sum of the path lengths from the root to all external nodes $S_i$ ($1 \le i \le n+1$):
$\mathrm{Ext}_n = \sum_{i=1}^{n+1} \mathrm{depth}(S_i)$

Internal path length = the sum of the path lengths from the root to all internal nodes $K_i$ ($1 \le i \le n$):
$\mathrm{Int}_n = \sum_{i=1}^{n} \mathrm{depth}(K_i)$

$\mathrm{Ext}_n = \mathrm{Int}_n + 2n$ (proof by induction)

Extended binary trees with a minimal external path length therefore have a minimal internal path length too.
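The two path lengths can be computed in one traversal. The following is a minimal sketch, not taken from the slides: the `Node` class and the function `path_lengths` are hypothetical names, and a missing child (`None`) is assumed to play the role of an external node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Internal node; a missing child (None) stands for an external node."""
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def path_lengths(root: Optional[Node]) -> tuple[int, int, int]:
    """Return (n, Int_n, Ext_n) for the extended tree rooted at `root`."""
    n = internal = external = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        if node is None:              # external node replacing a NULL pointer
            external += depth
        else:                         # internal node
            n += 1
            internal += depth
            stack.append((node.left, depth + 1))
            stack.append((node.right, depth + 1))
    return n, internal, external


# Hypothetical example: a root with two children (n = 3 internal nodes).
n, int_n, ext_n = path_lengths(Node(Node(), Node()))
print(n, int_n, ext_n)                # 3 2 8
assert ext_n == int_n + 2 * n         # Ext_n = Int_n + 2n
```

The final assertion checks the relation between external and internal path length on this one tree; it illustrates the identity but does not replace the induction proof.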

5
Example

External path length: $\mathrm{Ext}_n = 25$
Internal path length: $\mathrm{Int}_n = 11$
Check: $\mathrm{Ext}_n = \mathrm{Int}_n + 2n = 11 + 14 = 25$, so $n = 7$.

6
Minimal and maximal length

For a given n, a balanced tree has the minimal internal path length.
Example: within a complete tree of height h (with $n = 2^h - 1$ nodes, root at depth 0), the internal path length is
$\mathrm{Int}_n = \sum_{i=1}^{h-1} i \cdot 2^i$

The internal path length becomes maximal if the tree degenerates to a linear list:
$\mathrm{Int}_n = \sum_{i=1}^{n-1} i = n(n-1)/2$

Example: h = 4, n = 15, Int = 34, Ext = 16 · 4 = 64
For comparison: a list with n = 15 nodes has Int = 105, Ext = 105 + 30 = 135
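The numbers in this example can be reproduced with a few lines. This is a small sketch under the assumption that the root has depth 0, so a complete tree of height h has levels 0 to h-1; the function names are hypothetical.

```python
def int_complete(h: int) -> int:
    """Internal path length of a complete tree with n = 2**h - 1 nodes."""
    # Level i (root = level 0) holds 2**i nodes, each at depth i.
    return sum(i * 2**i for i in range(1, h))


def int_list(n: int) -> int:
    """Internal path length of a tree degenerated to a linear list."""
    return n * (n - 1) // 2


def ext_from_int(int_n: int, n: int) -> int:
    """External path length via Ext_n = Int_n + 2n."""
    return int_n + 2 * n


h = 4
n = 2**h - 1
print(int_complete(h), ext_from_int(int_complete(h), n))   # 34 64
print(int_list(n), ext_from_int(int_list(n), n))           # 105 135
```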

7
Weighted binary trees

Often weights $q_i$ are assigned to the external nodes ($1 \le i \le n+1$).
The weighted external path length is defined as
$\mathrm{Ext}_w = \sum_{i=1}^{n+1} \mathrm{depth}(S_i) \cdot q_i$

Within weighted binary trees, the properties of minimal and maximal path lengths no longer apply.
Determining the minimal weighted external path length is an important practical problem.

[Figure: two weighted trees over the same weights. The first has $\mathrm{Ext}_w = 102$; the second has $\mathrm{Ext}_w = 88$, which is less than 102 although the tree is a linear list.]
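A short sketch of how the weighted external path length can be evaluated. The weights of the trees in the figure are not given here, so the weights below are made up for illustration; the nested-tuple representation and the name `weighted_ext` are assumptions as well.

```python
def weighted_ext(tree, depth: int = 0) -> int:
    """Weighted external path length.

    `tree` is either a number (the weight q_i of an external node)
    or a pair (left, right) standing for an internal node.
    """
    if isinstance(tree, tuple):
        left, right = tree
        return weighted_ext(left, depth + 1) + weighted_ext(right, depth + 1)
    return depth * tree


# Hypothetical weights 1, 2, 4, 20 arranged in two different shapes.
balanced = ((1, 2), (4, 20))
linear = (20, (4, (2, 1)))

print(weighted_ext(balanced))   # 54
print(weighted_ext(linear))     # 37
```

With these weights the degenerate tree is cheaper than the balanced one, because the heavy external node sits directly below the root.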

8
Application example: optimal codes

To convert a text file efficiently to bit strings, there are two alternatives:
Fixed-length coding: each character has the same number of bits (e.g., ASCII)
Variable-length coding: some characters are represented using fewer bits than others

Example of fixed-length coding: 3-bit code for the alphabet A, B, C, D:
A = 001, B = 010, C = 011, D = 100
The message ABBAABCDADA is converted to 001010010001001010011100001100001 (length 33 bits).

Using a 2-bit code, the same message can be coded with only 22 bits.
To decode the message, split the bit string into groups of 3 bits (or 2 bits, respectively) and look up each group in a table that maps codes to characters.
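Fixed-length encoding and table-based decoding can be sketched as follows. The 3-bit table is the one from the slide; the 2-bit assignment in `CODE2` is an assumption, since the slide does not say which 2-bit word belongs to which character.

```python
CODE3 = {"A": "001", "B": "010", "C": "011", "D": "100"}   # from the slide
CODE2 = {"A": "00", "B": "01", "C": "10", "D": "11"}       # assumed assignment


def encode(message: str, code: dict[str, str]) -> str:
    """Concatenate the code words of all characters."""
    return "".join(code[ch] for ch in message)


def decode_fixed(bits: str, code: dict[str, str], width: int) -> str:
    """Split the bit string into groups of `width` bits and look each one up."""
    table = {word: ch for ch, word in code.items()}
    return "".join(table[bits[i:i + width]] for i in range(0, len(bits), width))


message = "ABBAABCDADA"
bits3 = encode(message, CODE3)
bits2 = encode(message, CODE2)
print(bits3, len(bits3))        # 001010010001001010011100001100001 33
print(len(bits2))               # 22
assert decode_fixed(bits3, CODE3, 3) == message
assert decode_fixed(bits2, CODE2, 2) == message
```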

9
Application example: optimal codes (2)

Idea: more frequently used characters are coded using fewer bits.

Message: ABBAABCDADA

Character   A   B   C   D
Frequency   5   3   1   2

Length of the coded message: 20 bits!

Variable-length coding can reduce the memory space needed for storing the file.
How can this special coding be found, and why is the decoding unique?

10
Application example: optimal codes (3)

The frequencies and the coding are represented as a weighted binary tree.
First of all, decoding: given a bit string, use the successive bits to traverse the tree, starting from the root. When you arrive at an external node, output the character stored there and start again from the root.

Example:
1. Bit = 0: external node, A
2. Bit = 1: from the root to the right
3. Bit = 0: left, external node, B
4. Bit = 1: from the root to the right
5. Bit = 1: right

[Figure: weighted binary tree with the external nodes A, B, D, C from left to right.]
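The decoding procedure can be sketched as a walk over a nested-tuple tree. The slide's example fixes A = 0 and B = 10; placing D at 110 and C at 111 is an assumption read off the left-to-right leaf order in the figure. The names `TREE` and `decode` are hypothetical.

```python
# (left, right) pairs are internal nodes, strings are external nodes.
# A = 0 and B = 10 follow the slide's example; D = 110, C = 111 is assumed.
TREE = ("A", ("B", ("D", "C")))


def decode(bits: str, tree=TREE) -> str:
    """Follow the bits from the root; emit a character at each external node."""
    out = []
    node = tree
    for bit in bits:
        node = node[int(bit)]          # 0 = left child, 1 = right child
        if isinstance(node, str):      # arrived at an external node
            out.append(node)
            node = tree                # restart at the root
    if node is not tree:
        raise ValueError("bit string ends inside a code word")
    return "".join(out)


bits = "01010001011111001100"          # 20 bits
print(decode(bits))                    # ABBAABCDADA
```

Under this assumed assignment the 11-character message takes 5·1 + 3·2 + 2·3 + 1·3 = 20 bits, which agrees with the length stated for the variable-length code.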

11
Correctness condition

Observation: within variable-length coding, the code of one character must not be a prefix of the code of any other character.
If the code is represented as an extended binary tree, uniqueness is guaranteed (only one character per external node).
If the frequency of each character in the original text is taken as the weight of its external node, then a tree with minimal weighted external path length yields an optimal code.
How is a tree with minimal external path length generated?
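The prefix condition can be checked mechanically. A minimal sketch (the function name is hypothetical): after sorting the code words lexicographically, a word that is a prefix of any other word is also a prefix of its direct successor, so comparing neighbours is enough.

```python
def is_prefix_free(words) -> bool:
    """True if no code word is a prefix of another code word."""
    ordered = sorted(words)
    return not any(b.startswith(a) for a, b in zip(ordered, ordered[1:]))


print(is_prefix_free(["0", "10", "110", "111"]))   # True
print(is_prefix_free(["0", "01", "11"]))           # False: "0" is a prefix of "01"
```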

12
Huffman Code

Idea: characters are weighted and sorted according to their frequency.
This also works independently of a particular text, e.g., for English (characters with relative weights):

E 1231   T 959   A 805   O 794
N 719    I 718   S 659   R 603
H 514    L 403   D 365   C 320
U 310    P 229   F 228   M 225
W 203    Y 188   B 162   G 161
V 93     K 52    Q 20    X
J 10     Z 9

A binary tree with minimal external path length is constructed as follows:
Each character is represented by a tree of its own with the corresponding weight (only one external node).
The two trees with the smallest weights are merged into a new tree.
The root of the new tree is marked with the sum of the weights of the original roots.
Continue until only one tree remains.

13
Example 1: Huffman

Alphabet and frequency:

Character   E    T    N   I   S
Frequency   29   10   9   5   4

Step 1: (4, 5, 9, 10, 29), new weight: 9
Step 2: (9, 9, 10, 29), new weight: 18

14
Example 1: Huffman (2)

Step 3: (18, 10, 29), sorted: (10, 18, 29), new weight: 28
Step 4: (28, 29), finished!
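The merge steps of this example can be traced with a naive list-based sketch. It only tracks the root weights, not the trees themselves, and `merge_trace` is a hypothetical name.

```python
def merge_trace(weights):
    """Repeatedly merge the two smallest weights; record each step."""
    forest = sorted(weights)
    trace = []
    while len(forest) > 1:
        a, b, *rest = forest                 # the two smallest root weights
        trace.append((tuple(forest), a + b))
        forest = sorted(rest + [a + b])      # re-sort after inserting the sum
    return trace


for step, (forest, new) in enumerate(merge_trace([29, 10, 9, 5, 4]), start=1):
    print(f"Step {step}: {forest} new weight: {new}")
# Step 1: (4, 5, 9, 10, 29) new weight: 9
# Step 2: (9, 9, 10, 29) new weight: 18
# Step 3: (10, 18, 29) new weight: 28
# Step 4: (28, 29) new weight: 57
```

The last merge produces the root of the finished tree, whose weight 57 is the sum of all character weights.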

15
Resulting tree

[Figure: resulting Huffman tree with edges labeled 0 and 1 and the external nodes S, I, N, T, E.]

Coding:

Character   Code   Weight
E           1      29
T           00     10
N           011    9
I           0101   5
S           0100   4

$\mathrm{Ext}_w = 112$

Using this coding, the codes are, e.g.:
TENNIS = 00101101101010100
SET = 0100100
NET = 011100

Decoding as described before.
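The code table above can be applied and checked directly. A small sketch with hypothetical names; since the depth of an external node equals the length of its code word, the weighted external path length is the sum of code length times weight.

```python
CODE = {"E": "1", "T": "00", "N": "011", "I": "0101", "S": "0100"}
WEIGHT = {"E": 29, "T": 10, "N": 9, "I": 5, "S": 4}


def encode(word: str) -> str:
    """Concatenate the code words of all characters."""
    return "".join(CODE[ch] for ch in word)


# Depth of an external node = length of its code word.
ext_w = sum(len(CODE[ch]) * WEIGHT[ch] for ch in CODE)
print(ext_w)                 # 112

for word in ("TENNIS", "SET", "NET"):
    print(word, "=", encode(word))
# TENNIS = 00101101101010100
# SET = 0100100
# NET = 011100
```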

16
Some remarks

The resulting tree is not regular.
Regular trees are not always optimal.
Example: the best nearly complete tree has $\mathrm{Ext}_w = 123$.
For the message ABBAABCDADA, 20 bits is optimal (see previous slides).

17
Example 2: Huffman

Character   p (%)   Code
A           25      00
B           4       1110
C           13      100
D           7       110
E           35      01
F           11      101
G
H           3       11111

Average number of bits without Huffman: 3 (because $2^3 = 8$)
Average number of bits using Huffman code:

There are other "valid" solutions! But the average number of bits remains the same for all these solutions (equal to Huffman).
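The average code length for this table can be computed as the probability-weighted sum of the code lengths. The entry for G is not legible in the table, so the sketch below assumes G = 2 % with code 11110: these are the only values for which the probabilities sum to 100 % and the code tree has no unused branch. Treat the result as conditional on that assumption.

```python
# (probability in %, code word); the G entry is an assumption, see above.
TABLE = {
    "A": (25, "00"),
    "B": (4, "1110"),
    "C": (13, "100"),
    "D": (7, "110"),
    "E": (35, "01"),
    "F": (11, "101"),
    "G": (2, "11110"),     # assumed
    "H": (3, "11111"),
}

assert sum(p for p, _ in TABLE.values()) == 100

average_bits = sum(p * len(code) for p, code in TABLE.values()) / 100
print(average_bits)        # 2.54
```

Under this assumption the variable-length code needs about 2.54 bits per character on average, compared with 3 bits for the fixed-length code.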

18
Analysis

    /* Algorithm Huffman */
    for (int i = 1; i <= n-1; i++) {
        p1 = smallest element in list L
        remove p1 from L
        p2 = smallest element in L
        remove p2 from L
        create node p
        add p1 and p2 as left and right subtrees to p
        weight(p) = weight(p1) + weight(p2)
        insert p into L
    }

The run-time behavior depends in particular on the implementation of the list:
Time required to find the node with the smallest weight
Time required to insert a new node

"Naive" implementations give $O(n^2)$, "smarter" ones result in $O(n \log_2 n)$.
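One way to reach the O(n log n) bound is to keep the list L in a binary heap, so that both finding the smallest element and inserting a new node cost O(log n). The following is a sketch of that idea using Python's `heapq`, not the lecture's own implementation; `huffman_codes` is a hypothetical name, and at least one symbol is assumed.

```python
import heapq
from itertools import count


def huffman_codes(weights: dict[str, int]) -> dict[str, str]:
    """Build a Huffman code for the given symbol weights (non-empty dict)."""
    tie = count()   # tie-breaker so the heap never has to compare trees
    heap = [(w, next(tie), sym) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)       # smallest weight
        w2, _, t2 = heapq.heappop(heap)       # second smallest weight
        heapq.heappush(heap, (w1 + w2, next(tie), (t1, t2)))

    codes: dict[str, str] = {}

    def walk(tree, prefix: str) -> None:
        if isinstance(tree, tuple):           # internal node
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                                 # external node (a symbol)
            codes[tree] = prefix or "0"       # single-symbol alphabet

    walk(heap[0][2], "")
    return codes


weights = {"E": 29, "T": 10, "N": 9, "I": 5, "S": 4}
codes = huffman_codes(weights)
print(sum(len(codes[s]) * w for s, w in weights.items()))   # 112
```

The individual code words depend on how ties are broken and on which subtree goes left, so they may differ from a hand-built tree; the weighted external path length is the same.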

19
Optimality

Observation: the weight of a node K in the Huffman tree is the sum of the weights of the external nodes in the subtree rooted at K. The weighted external path length of the tree equals the sum of the weights of all its internal nodes (in Example 1: 9 + 18 + 28 + 57 = 112).

Theorem: a Huffman tree is an extended binary tree with minimal external path length $\mathrm{Ext}_w$.

Proof outline (by induction over n, the number of characters in the alphabet):
The statement to prove is A(n) = "A Huffman tree with n external nodes has minimal external path length $\mathrm{Ext}_w$".
Consider first n = 2: prove A(2) = "A Huffman tree with 2 external nodes has minimal external path length".

20
Optimality (2)

Proof:
n = 2: only two characters with the weights $q_1$ and $q_2$ result in a tree with $\mathrm{Ext}_w = q_1 + q_2$. This is minimal, because there are no other trees.
Induction hypothesis: for all $i \le k$, A(i) is true.
To prove: A(k+1) is true.

[Figure: root V with the two subtrees $T_1$ and $T_2$.]

21
Optimality (3)

Proof:
Consider a Huffman tree T with k+1 external nodes. This tree has a root V and two subtrees $T_1$ and $T_2$, which have the weights $q_1$ and $q_2$, respectively.
From the construction method we can deduce that the weights $q_i$ of all internal nodes $n_i$ of $T_1$ and $T_2$ satisfy $q_i \le \min(q_1, q_2)$.
Therefore, for these weights $q_i$: $q_1 + q_2 > q_i$.
So if V is replaced by any node in $T_1$ or $T_2$, the resulting tree will have a greater weight.
Replacing nodes within $T_1$ and $T_2$ does not help, because $T_1$ and $T_2$ are already optimal (both are trees with k external nodes or fewer, and the induction hypothesis holds for them).
So T is an optimal tree with k+1 external nodes.

[Figure: root V with weight $q_1 + q_2$ and the subtrees $T_1$ (weight $q_1$) and $T_2$ (weight $q_2$).]
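For small alphabets the theorem can be checked empirically by comparing the Huffman result with an exhaustive search over all merge orders. This is a sanity check under stated assumptions, not a proof: it relies on the fact that the weighted external path length equals the sum of the weights created by the merges, and both function names are hypothetical. The brute-force search grows very quickly and is only practical for a handful of weights.

```python
import heapq
from itertools import combinations


def huffman_cost(weights) -> int:
    """Ext_w of the Huffman tree: sum of all merged (internal node) weights."""
    heap = list(weights)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        merged = heapq.heappop(heap) + heapq.heappop(heap)
        total += merged
        heapq.heappush(heap, merged)
    return total


def brute_force_min(weights) -> int:
    """Minimal Ext_w over every possible sequence of pairwise merges."""
    weights = list(weights)
    if len(weights) <= 1:
        return 0
    best = None
    for i, j in combinations(range(len(weights)), 2):
        rest = [w for k, w in enumerate(weights) if k not in (i, j)]
        merged = weights[i] + weights[j]
        cost = merged + brute_force_min(rest + [merged])
        if best is None or cost < best:
            best = cost
    return best


print(huffman_cost([29, 10, 9, 5, 4]), brute_force_min([29, 10, 9, 5, 4]))   # 112 112
print(huffman_cost([5, 3, 1, 2]), brute_force_min([5, 3, 1, 2]))             # 20 20
```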

22
Huffman Code: Applications

Fax machine

23
Huffman: Other applications

ZIP coding (at least a similar technique)
In principle: most coding techniques with data reduction (lossless compression)
NOT Huffman: lossy compression techniques like JPEG, MP3, MPEG, …
