1 Bushy Binary Search Tree from Ordered List

2 Behavior of the Algorithm Binary Search Tree Recall that tree_search is based closely on binary search. If we apply binary search to an ordered list and draw its comparison tree, then we see that binary search does exactly the same comparisons as tree_search will do if it is applied to this same tree. We already know from Section 7.4 that binary search performs O(log n) comparisons for a list of length n. This performance is excellent in comparison to other methods, since log n grows very slowly as n increases.

3 an example Suppose, as an example, that we apply binary search to the list of seven letters a, b, c, d, e, f, and g. The resulting tree is shown in part (a) of Figure 10.8. If tree_search is applied to this tree, it will do the same number of comparisons as binary search.
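As a rough illustration of this claim, the sketch below builds the comparison tree of binary search for the seven letters and checks that a tree search visits exactly the keys that binary search probes. Python is used only for brevity. The names `Node`, `build_comparison_tree`, `binary_search_probes`, and `tree_search_probes` are hypothetical, and the binary search variant (one that tests for equality at every probe) is an assumption, not necessarily the version the presentation has in mind.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def build_comparison_tree(items, lo=0, hi=None):
    """The root of each subtree is the entry binary search probes first."""
    if hi is None:
        hi = len(items) - 1
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    return Node(items[mid],
                build_comparison_tree(items, lo, mid - 1),
                build_comparison_tree(items, mid + 1, hi))

def binary_search_probes(items, target):
    """Keys that binary search compares against target, in order."""
    probes, lo, hi = [], 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        probes.append(items[mid])
        if items[mid] == target:
            break
        if target < items[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return probes

def tree_search_probes(root, target):
    """Keys that a search of the tree compares against target, in order."""
    probes, node = [], root
    while node is not None:
        probes.append(node.key)
        if target == node.key:
            break
        node = node.left if target < node.key else node.right
    return probes

letters = list("abcdefg")
root = build_comparison_tree(letters)
for t in letters:
    assert binary_search_probes(letters, t) == tree_search_probes(root, t)
print(tree_search_probes(root, "c"))  # ['d', 'b', 'c']
```

With seven entries no search visits more than three nodes, which is the O(log n) behavior described above.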

4 The Binary Search Tree

5 It is quite possible, however, that the same letters may be built into a binary search tree of a quite different shape, such as any of those shown in the remaining parts of Figure 10.8.

6 The Binary Search Tree Class The tree shown as part (a) of Figure 10.8 is the best possible for searching. It is as “bushy” as possible: It has the smallest possible height for a given number of vertices.

7 The Binary Search Tree Class In part (c) of Figure 10.8, however, the tree has degenerated quite badly, so that a search for target c requires six comparisons. In parts (d) and (e), the tree reduces to a single chain. tree_search, when applied to such a chain, degenerates to sequential search.
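The contrast between the bushy tree and a chain can be sketched with ordinary, unbalanced insertion. The insertion orders used here ("dbfaceg" for a bushy tree, "abcdefg" for a chain) are assumptions chosen for illustration and are not taken from Figure 10.8. The helper names `insert`, `build`, and `nodes_visited` are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Ordinary binary search tree insertion with no rebalancing."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right

def build(order):
    root = None
    for key in order:
        root = insert(root, key)
    return root

def nodes_visited(root, target):
    """Number of nodes examined while searching for target."""
    count, node = 0, root
    while node is not None:
        count += 1
        if target == node.key:
            break
        node = node.left if target < node.key else node.right
    return count

bushy = build("dbfaceg")   # middle letter first, then the middles of each half
chain = build("abcdefg")   # sorted input: every node has only a right child

for target in "abcdefg":
    print(target, nodes_visited(bushy, target), nodes_visited(chain, target))
```

In the chain, the k-th letter costs k visits, exactly as sequential search would, while the bushy tree never needs more than three.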

8 Goal: Start with an ordered list and build its entries into a binary search tree that is nearly balanced (“as bushy as possible”).

9 Building A Binary Search Tree When the number of entries, n, is 31, for example, we wish to build the tree of Figure 10.12.

10 Building A Binary Search Tree In Figure 10.12 the entries are numbered in their natural order, that is, in inorder sequence, which is the order in which they will be received and built into the tree, since they are received in sorted order. We will also use this numbering to label the nodes of the tree.

11 An important property of the labels If you examine the diagram for a moment, you may notice an important property of the labels. The labels of the leaves are all odd numbers; that is, they are not divisible by 2. The labels of the nodes one level above the leaves are 2, 6, 10, 14, 18, 22, 26, and 30. These numbers are all double an odd number; that is, they are all even (divisible by 2 = 2^1), but are not divisible by 4. On the next level up, the labels are 4, 12, 20, and 28, numbers that are divisible by 4 = 2^2, but not by 8. Finally, the nodes just below the root are labeled 8 and 24 (divisible by 8 = 2^3), and the root itself is 16 (divisible by 16 = 2^4). The crucial observation is: If the nodes of a complete binary tree are labeled in inorder sequence, starting with 1, then each node is exactly as many levels above the leaves as the exponent of the highest power of 2 that divides its label.
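The observation can be checked mechanically for the 31-node tree. This is a small sketch: the function name `levels_above_leaves` is hypothetical, and it simply counts how many times 2 divides a label.

```python
def levels_above_leaves(label):
    """Number of times 2 divides label, for label >= 1."""
    level = 0
    while label % 2 == 0:
        label //= 2
        level += 1
    return level

# Group the labels 1..31 by how far above the leaves they sit.
by_level = {}
for label in range(1, 32):
    by_level.setdefault(levels_above_leaves(label), []).append(label)

for level in sorted(by_level):
    print(level, by_level[level])
```

The groups come out as the odd numbers at level 0, then 2, 6, ..., 30, then 4, 12, 20, 28, then 8 and 24, and finally 16 alone, matching the levels described for Figure 10.12.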

12 one more constraint Let us now put one more constraint on our problem: Let us suppose that we do not know in advance how many entries will be built into the tree. If the entries are coming from a file or a linked list, then this assumption is quite reasonable, since we may not have any convenient way to count the entries before receiving them.

13 one more constraint This assumption also has the advantage that it will stop us from worrying about the fact that, when the number of entries is not exactly one less than a power of 2, the resulting tree will not be complete and cannot be as symmetrical as the one in Figure 10.12. Instead, we shall design our algorithm as though it were completely symmetrical, and after receiving all entries we shall determine how to tidy up the tree.

14 Getting Started There is no doubt what to do with entry number 1 when it arrives. It will be placed in a leaf node whose left and right pointers should both be set to NULL. Node number 2 goes above node 1, as shown in Figure 10.13. Since node 2 links to node 1, we obviously must keep some way to remember where node 1 is until entry 2 arrives. Node 3 is again a leaf, but it is in the right subtree of node 2, so we must remember a pointer to node 2.

15

16 keep a list of pointers Does this mean that we must keep a list of pointers to all nodes previously processed, to determine how to link in the next one? The answer is no, since when node 2 is added, all connections for node 1 are complete. Node 2 must be remembered until node 4 is added, to establish the left link from node 4, but then a pointer to node 2 is no longer needed. Similarly, node 4 must be remembered until node 8 has been processed. In Figure 10.13, colored arrows point to each node that must be remembered as the tree grows.

17 keep a list of pointers It should now be clear that to establish future links, we need only remember pointers to one node on each level, the last node processed on that level. We keep these pointers in a List called last_node that will be quite small. For example, a tree with 20 levels (hence 20 entries in last_node) can accommodate 2^20 - 1 > 1,000,000 nodes.
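To see how small last_node stays, the sketch below tracks only the labels of the remembered nodes rather than real pointers. It assumes that position 0 of the list is an unused placeholder so that position k corresponds to level k, counting leaves as level 1. That convention and the names `level_of` and `remembered_labels` are illustrative assumptions.

```python
def level_of(count):
    """Level of the count-th entry, with leaves on level 1."""
    level = 1
    while count % 2 == 0:
        count //= 2
        level += 1
    return level

def remembered_labels(n):
    """Labels held in last_node after n entries have been processed."""
    last_node = [None]              # position 0 is a placeholder (assumption)
    for count in range(1, n + 1):
        level = level_of(count)
        if level == len(last_node):
            last_node.append(count)     # first node ever seen on this level
        else:
            last_node[level] = count    # replaces the older node on this level
    return last_node

print(remembered_labels(21))            # [None, 21, 18, 20, 8, 16]
print(len(remembered_labels(1000)) - 1) # 10
print(2**20 - 1)                        # 1048575
```

After 1000 entries only 10 positions are in use, one per level.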

18 Finishing the Task Finally, we must determine how to tie in any subtrees that may not yet be connected properly after all the nodes have been received.

19 Finishing the Task For example, if n = 21, we must connect the three components shown in Figure 10.13 into a single tree.

20 Finishing the Task Some nodes in the upper part of the tree may still have their right links set to NULL, even though further nodes have been inserted that now belong in their right subtrees. Any one of these nodes (a node, not a leaf, for which the right child is still NULL) will be one of the nodes in the list last_node. For n = 21, these will be nodes 16 and 20 (in positions 5 and 3 of last_node, respectively), as shown in Figure 10.14.

21 determine the highest node in last_node that is not already in the left subtree The pointer lower_node can be determined as the highest node in last_node that is not already in the left subtree of high_node. To determine whether a node is in the left subtree, we need only compare its key with that of high_node.
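Putting the pieces together, one possible sketch of the whole method follows. It is written in Python with `None` standing in for NULL, and it is not the presentation's own code. The function names `build_insert`, `connect_trees`, and `build_tree` are assumptions, as is the convention that position 0 of last_node holds an empty link so that leaves sit at position 1. The names last_node, high_node, and lower_node follow the text.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def build_insert(count, key, last_node):
    """Link in the count-th entry (counting from 1) of a sorted stream."""
    level = 1
    while count % 2 == 0:           # level from the highest power of 2 dividing count
        count //= 2
        level += 1
    node = Node(key)
    node.left = last_node[level - 1]    # last node one level down (None below a leaf)
    if level == len(last_node):
        last_node.append(node)
    else:
        last_node[level] = node
    # A parent one level up that still lacks a right child takes this node.
    if level + 1 < len(last_node) and last_node[level + 1].right is None:
        last_node[level + 1].right = node

def connect_trees(last_node):
    """Tie in components left dangling after the final entry."""
    high_level = len(last_node) - 1
    while high_level > 2:
        high_node = last_node[high_level]
        if high_node.right is not None:
            high_level -= 1
            continue
        # Walk down to the highest remembered node that is not in
        # high_node's left subtree, i.e. whose key is larger.
        low_level = high_level - 1
        lower_node = last_node[low_level]
        while lower_node is not None and lower_node.key < high_node.key:
            low_level -= 1
            lower_node = last_node[low_level]
        high_node.right = lower_node
        high_level = low_level

def build_tree(sorted_keys):
    last_node = [None]
    for count, key in enumerate(sorted_keys, start=1):
        build_insert(count, key, last_node)
    connect_trees(last_node)
    return last_node[-1]            # the highest node seen is the root

def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.key] + inorder(node.right)

root = build_tree(range(1, 22))
print(root.key, root.right.key, root.right.right.key)  # 16 20 21
```

For 21 entries, node 16 receives node 20 as its right child and node 20 receives node 21, which joins the three components into one tree whose inorder traversal is still 1 through 21.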

22 Evaluation The algorithm of this section produces a binary search tree that is not always completely balanced. If the tree has 31 nodes, then it will be completely balanced, but if 32 nodes come in, then node 32 will become the root of the tree, and all 31 remaining nodes will be in its left subtree. In this case, the leaves are five steps away from the root. If the root were chosen optimally, then most of the leaves would be four steps from it, and only one would be five steps. Hence one comparison more than necessary will usually be done in the tree with 32 nodes.
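The size of this penalty can be estimated with a little arithmetic. The sketch below assumes that a search costs one comparison per node visited and that every key is searched for equally often. Both are simplifying assumptions, and the helper name `complete_tree_total` is hypothetical.

```python
def complete_tree_total(levels):
    """Total nodes visited when each key of a complete tree is searched once."""
    return sum((depth + 1) * 2**depth for depth in range(levels))

full_31 = complete_tree_total(5)     # complete tree with 31 nodes

# Tree built by the method for 32 entries: node 32 is the root (1 visit),
# and each of the other 31 keys sits one step deeper than in the 31-node tree.
built_32 = 1 + (full_31 + 31)

# Best possible shape: 31 keys fill five levels and one key sits on a sixth.
best_32 = full_31 + 6

print(built_32 / 32, best_32 / 32)   # 5.03125 4.21875
```

Under these assumptions the average search in the 32-node tree costs a little under one extra comparison.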

23 Evaluation One extra comparison in a binary search is not really a very high price, and it is easy to see that a tree produced by our method is never more than one level away from optimality. There are sophisticated methods for building a binary search tree that is as balanced as possible, but much remains to recommend a simpler method, one that does not need to know in advance how many nodes are in the tree.

24 Random Search Trees and Optimality To conclude this section, let us ask whether it is worthwhile, on average, to keep a binary search tree balanced or to rebalance it. If we assume that the keys have arrived in random order, then, on average, how many more comparisons are needed in a search of the resulting tree than would be needed in a completely balanced tree?

25 The average number of nodes visited

26 Evaluation In other words, the average cost of not balancing a binary search tree is approximately 39 percent more comparisons. In applications where optimality is important, this cost must be weighed against the extra cost of balancing the tree, or of maintaining it in balance. Note especially that these latter tasks involve not only the cost of computer time, but the cost of the extra programming effort that will be required.
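The 39 percent figure is consistent with the standard result for binary search trees built from keys in random order, given here as background rather than as a reconstruction of the slide's own formula. The average number of nodes visited in a search is approximately 2 ln n, while a completely balanced tree needs about lg n:

```latex
\[
  2\ln n \;=\; (2\ln 2)\,\lg n \;\approx\; 1.39\,\lg n ,
  \qquad\text{so}\qquad
  \frac{2\ln n}{\lg n} \;=\; 2\ln 2 \;\approx\; 1.39 .
\]
```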

