
1 Tree Data Structures

2 Heaps for searching
Search in a heap? We would have to look at the root. If the search item is smaller than the root, we have to look at both the left and the right child; in general, whenever the search item is smaller than a node, we have to look at both of its children. Search in a heap is therefore upper bounded by O(n) – we may have to look at every node (if the value is smaller than every node in the tree).
Total time: n nodes inserted into the heap at O(log2 n) per insertion, plus n searches at O(n) each: O(n log2 n + n^2).
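To illustrate the O(n) bound above, here is a minimal sketch (not from the slides) of searching an array-based max-heap for an arbitrary key; the function name HeapSearch and the 0-based array layout (children of index i at 2i+1 and 2i+2) are assumptions for this example.

#include <cstddef>
#include <vector>

// Returns the index of key in the max-heap h, or -1 if it is not present.
// Initial call: HeapSearch(h, 0, key).
int HeapSearch(const std::vector<int>& h, std::size_t i, int key) {
    if (i >= h.size() || key > h[i]) return -1;   // everything below h[i] is <= h[i], so key cannot be here
    if (key == h[i]) return static_cast<int>(i);
    int found = HeapSearch(h, 2 * i + 1, key);    // key < h[i]: may have to look at both children
    if (found != -1) return found;
    return HeapSearch(h, 2 * i + 2, key);
}

Because the key may be smaller than every node, the recursion can visit all n nodes, which is the O(n) worst case quoted above.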

3 Heaps for Searching
A heap is well suited for problems where we have to remove specific elements (largest, smallest). To allow O(log2 n) search for arbitrary elements, we need to exploit the binary tree properties (max height log2 n) better than the naive heap search does.
But if we pull out the largest element every time, the result is a reverse sorted list – Heapsort. Cost: ~O(n log2 n) to build the heap, O(n log2 n) to extract the items back out.

4 A Better Way to Search: Binary Search Trees
A binary search tree is a binary tree (2 children, left and right) that is either empty (zero nodes) or, if it has > 0 nodes:
Every element has a key, and no two elements have the same key (unique keys).
All keys (if any) in the left subtree are smaller than the key in the root.
All keys (if any) in the right subtree are larger than the key in the root.
The left and right subtrees are also binary search trees.

5 Binary Search Trees
[Figure: two example binary search trees – one with root 30, left child 5 (whose left child is 2) and right child 40; one containing 60, 80, 65, 70.]
Unique keys; left nodes < root; right nodes > root; the left and right subtrees are also binary search trees.

6 Binary Search Trees
[Figure: a tree containing 20, 15, 12, 18, 25, 22 that is not a binary search tree.]
Unique keys, left nodes < root, and right nodes > root hold at the root, but the left and right subtrees are not both binary search trees: the right child of 25 is not larger than 25, so this is not a binary search tree.

7 Binary Search Trees
Note that there is no constraint to be a complete binary tree – just an arbitrary binary tree. This suggests that a linked-node implementation may be more useful, and it may affect the properties of searching. The recursive definition of a binary search tree leads to recursive algorithms.

8 Binary Search Trees: Search
Search: take advantage of the binary search tree properties.
Begin at the root. If root == 0, return – the tree is empty.
Otherwise, compare x to the root key:
If x == root key, return the root node.
Else if x < root key, recursively search the left child.
Else, recursively search the right child.

9 Binary Search Trees: BSTNode Definition

template <class Type>
class Element {
private:
    Type key;
    // ??? OTHER DATA
};

template <class Type>
class BSTNode {
private:   // (BST<Type> is presumably a friend so the tree code can reach these members)
    BSTNode<Type>* leftChild;
    BSTNode<Type>* rightChild;
    Element<Type> data;
};

10 Binary Search Tree: Search Implementation

template <class Type>
BSTNode<Type>* BST<Type>::Search(const Element<Type>& x) {
    return Search(root, x);
}

template <class Type>
BSTNode<Type>* BST<Type>::Search(BSTNode<Type>* b, const Element<Type>& x) {
    if (b == 0) return 0;                                       // empty subtree: not found
    if (x.key == b->data.key) return b;                         // match
    if (x.key < b->data.key) return Search(b->leftChild, x);    // smaller keys are on the left
    return Search(b->rightChild, x);                            // larger keys are on the right
}

11 Binary Search Trees: Search Example
[Figure: BST with root 30, left child 5 (children 2 and 15), right child 40.]
Find 15:
Is root == 0? No.
Compare 15 to the root (30): 15 < 30, so recurse on the left child.
Compare 15 to 5: 15 > 5, so recurse on the right child.
Compare 15 to 15: 15 == 15, so return the node with 15.

12 Binary Search Trees: Big Oh Analysis
At the root, we do one comparison: x > root or x < root. Depending on the result, we move to one child of the root (moving down a level) and do one comparison there. The maximum number of times we can do this is the height of the tree (the maximum number of levels) – O(h).
Thus the cost of search depends on the shape of the tree:
Skewed – expensive: O(n)
Balanced – cheap: O(log2 n)

13 Binary Search Trees: Insertion
Rules: insertion must preserve
Unique keys
Right child > parent
Left child < parent
Self-similarity (the subtrees are also binary search trees)
How do we check for uniqueness? Look at all the nodes?

14 Binary Search Trees: Insertion
We don't need to look at all the nodes: take advantage of the fact that before adding, the tree was already a binary search tree. To see if a value is in the tree, search for it.
[Figure: BST with root 30, left child 5 (children 2 and 15), right child 40.]
Add 15: search for 15.
15 ? 30: 15 < 30 => go left.
15 ? 5: 15 > 5 => go right.
15 ? 15: 15 == 15 => not unique.

15 Binary Search Trees: Insertion
Search not only performs the test for uniqueness, it also puts us in the right place to insert – where the input value should be in the tree.
[Figure: BST with root 30, left child 5 (left child 2), right child 40.]
Add 15: search for 15.
15 ? 30: 15 < 30 => go left.
15 ? 5: 15 > 5 => go right.
No right child, so 15 is not present. Add 15 as the right child of 5.

16 Binary Search Trees: Insertion Implementation

template <class Type>
bool BST<Type>::Insert(const Element<Type>& x) {
    // search for x, remembering the parent of the current node
    BSTNode<Type>* current = root;
    BSTNode<Type>* parent = 0;
    while (current) {
        parent = current;
        if (x.key == current->data.key) return false;             // duplicate key: reject
        if (x.key < current->data.key) current = current->leftChild;
        else current = current->rightChild;
    }
    // fell off the tree: attach a new node under parent
    current = new BSTNode<Type>;
    current->leftChild = 0;
    current->rightChild = 0;
    current->data = x;
    if (!root) root = current;
    else if (x.key < parent->data.key) parent->leftChild = current;
    else parent->rightChild = current;
    return true;
}

17 Binary Search Trees: Insertion Big Oh Analysis
The core of the insertion function is the search implementation, which is dependent on the shape and size of the tree. The actual insertion is constant time. Cost is bounded by the search cost, which we have said is:
O(n) worst case
~O(log2 n) average case with a well balanced tree

18 Binary Search Trees: Deletion
Rules: deletion must preserve unique keys – no work to do here; if the keys were unique before the delete, they are unique afterwards, since deletes can't change the values in the tree.
We do need to ensure: right child > parent, left child < parent, and self-similarity (the subtrees are also binary search trees).

19 Binary Search Trees: Deletion
Three cases:
1) Leaf node (e.g. 15): remove the leaf node and set the parent's pointer that referenced it to zero.
[Figure: deleting leaf 15 from the tree 30, 5, 2, 40, 15 leaves the tree 30, 5, 2, 40.]

20 Binary Search Trees: Deletion
2) Non-leaf with one child (e.g. 5): set the parent's link to the current node's single child link, then remove the current node.
[Figure: deleting 5, whose only child is 2, from the tree 30, 5, 2, 40; afterwards 2 takes 5's place as the left child of 30.]

21 Binary Search Trees: Deletion
3) Non-leaf with two children (e.g. 30): replace its value with the largest element of the left subtree or the smallest element of the right subtree, then delete the node from which you swapped. That deletion then becomes case 1 or case 2.
[Figure: deleting 30 from the tree 30, 5, 2, 40 by copying 5 (the largest element of the left subtree) into the root and then deleting the old 5 node (marked toDelete), leaving 5 as the root with children 2 and 40.]

22 Binary Search Trees: Deletion
The rule was: "Replace the value with the largest element of the left subtree or the smallest element of the right subtree." Is this guaranteed to work?
Yes, because of the binary search tree properties. The largest element of the left side is bigger than everything else in the left side and smaller than anything in the right side. The smallest element of the right side is bigger than anything in the left side and smaller than everything else in the right side. These are exactly the roles that must be fulfilled when moving up to become the root of that subtree.
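The slides give no code for deletion, so the following is only a minimal sketch of the three cases, assuming the BSTNode/Element layout from slide 9 and the same access to node members the other BST functions rely on; the function name Delete, its return-the-new-subtree-root style, and the choice of the largest element of the left subtree are illustrative assumptions, not from the slides.

template <class Type>
BSTNode<Type>* BST<Type>::Delete(BSTNode<Type>* b, const Element<Type>& x) {
    if (b == 0) return 0;                               // key not found: nothing to do
    if (x.key < b->data.key)
        b->leftChild = Delete(b->leftChild, x);
    else if (x.key > b->data.key)
        b->rightChild = Delete(b->rightChild, x);
    else if (b->leftChild && b->rightChild) {
        // Case 3: two children. Copy in the largest element of the left subtree,
        // then delete that element from the left subtree (it has at most one child).
        BSTNode<Type>* pred = b->leftChild;
        while (pred->rightChild) pred = pred->rightChild;
        Element<Type> predData = pred->data;
        b->data = predData;
        b->leftChild = Delete(b->leftChild, predData);
    } else {
        // Cases 1 and 2: zero or one child. Splice the node out and return its child
        // (0 for a leaf), which becomes the parent's new link.
        BSTNode<Type>* child = b->leftChild ? b->leftChild : b->rightChild;
        delete b;
        return child;
    }
    return b;
}

A wrapper such as root = Delete(root, x); would keep the tree's root pointer up to date.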

23 Binary Search Trees: Height
The worst-case height of a binary tree is the number of elements in the tree – a skewed tree.
[Figure: a skewed tree in which the nodes 2, 5, 30, 40 form a single chain.]
Binary tree operation costs are bounded by the height of the tree, so in these cases they become O(n). How easy is it to get a skewed tree? Sorted or nearly sorted data.
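For reference, the height used in these bounds can be computed with a short recursive walk; this sketch (not from the slides) assumes the BSTNode<Type> layout from slide 9 and a member function named Height.

template <class Type>
int BST<Type>::Height(BSTNode<Type>* b) {
    if (b == 0) return 0;                    // an empty subtree contributes no levels
    int hl = Height(b->leftChild);
    int hr = Height(b->rightChild);
    return 1 + (hl > hr ? hl : hr);          // this level plus the taller subtree
}

For a fully skewed tree Height(root) returns n; for a well balanced tree it is about log2 n.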

24 Binary Search Trees: Height
(Insert implementation from slide 16.)
Insert: 3, 4, 6, 5, 8
[Figure: the resulting nearly skewed tree – 3 is the root, 4 is 3's right child, 6 is 4's right child, and 5 and 8 are 6's left and right children.]

25 Binary Search Trees: Height
If insertions are made at random, the height is O(log n) on average. Random insertions are the general case, so most of the time we will achieve O(log n) height. There are also ways to guarantee O(log n) height – they require modifications to the insert and delete functions to maintain balance.

26 TreeSort:
Insertion into a binary search tree places a specific ordering on the elements.
[Figure: BST with root 30, left child 5 (children 2 and 15), right child 40 (children 35 and 50).]
For the root, everything in the left subtree is < the root and everything in the right subtree is > the root. For each subtree, everything on the left is < the subtree root and everything on the right is > the subtree root.

27 TreeSort:
Theoretically, we should be able to construct an ordering of all elements from the tree:
Generate an array of size equal to the number of elements in the tree.
The root goes in the middle of the array.
The left subtree fills in the left half of the array, the right subtree fills in the right half, and we recurse.
[Figure: an array with the root 30 in the middle slot, keys < 30 filling the left half, and keys > 30 filling the right half.]

28 TreeSort:
Extracting the ordered array from the binary tree: perform an in-order traversal (LVR) – this ensures we visit all smaller items first and larger items last.
[Figure: BST with root 30, left child 5 (children 2 and 15), right child 40 (children 35 and 50).]
LVR ordering: 2, 5, 15, 30, 35, 40, 50
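A minimal sketch (not from the slides) of the LVR traversal as a BST member function, appending elements to a std::vector in sorted order; the name InOrder and the use of std::vector are assumptions for this example.

#include <vector>

template <class Type>
void BST<Type>::InOrder(BSTNode<Type>* b, std::vector<Element<Type>>& out) {
    if (b == 0) return;
    InOrder(b->leftChild, out);     // L: visit all smaller keys first
    out.push_back(b->data);         // V: then this node
    InOrder(b->rightChild, out);    // R: then all larger keys
}

Calling InOrder(root, out) on the tree above fills out with 2, 5, 15, 30, 35, 40, 50.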

29 TreeSort:
Analysis of TreeSort:
Given an array of size n, we have to build a binary tree with n elements – this requires n insertions.
Given a binary tree with n elements, we have to traverse the tree in LVR order to extract the sorted order.
Construction: O(n log2 n) if balanced, O(n * n) if not balanced.
Traversal: O(n) every time.
Average case: O(n log2 n); worst case: O(n^2).

30 TreeSort:
Very similar to quicksort! Same average case [O(n log n)] and worst case [O(n^2)] times. The roots of the binary search tree's subtrees are the pivots: data smaller than the pivot goes to the left of the pivot (leftChild), larger data goes to the right (rightChild). The better the pivot, the more balanced the tree (the same holds for the quicksort recursion). Nearly sorted or already sorted data leads both into trouble: bad partitioning for quicksort, bad construction for treesort.

31 Rank Information
Often when working with lists of data we are interested in rank information:
What is the largest item?
What is the smallest?
What is the median?
What is the fifth smallest item?
Largest and smallest are trivial [O(n)]. What if we want to ask a lot of questions about rank, or want to know about something other than the largest or smallest?

32 Rank Information
Sorting approach to rank information: sort the list, then return list[rankOfInterest].
O(n log n) [sort] + O(1) [value retrieval]
If using dynamic data, we may not have an array to work with – a linked list would be more likely.

33 Rank Information
Linked list approach: sort the list (assuming mergesort for linked lists), then traverse the list to find the element at rankOfInterest.
O(n log n) [sort] + O(rankOfInterest) [traversal]
Can handle dynamic data, but slower!

34 Rank Information
Binary tree approach: insert the elements into a binary search tree, then do an in-order traversal up to the rankOfInterest node (the traversal goes through the elements in sorted order).
O(n log n) [building the tree] + O(rankOfInterest) [traversal]
Same cost as the linked list approach (and probably easier, since we don't have to write a sort for linked lists).
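A minimal sketch (not from the slides) of this traversal-based rank query: an in-order walk that stops after visiting rank nodes, assuming the BSTNode<Type> layout from slide 9 and counting rank 1 as the smallest element; the name KthInOrder is an assumption.

template <class Type>
BSTNode<Type>* BST<Type>::KthInOrder(BSTNode<Type>* b, int& rank) {
    if (b == 0) return 0;
    BSTNode<Type>* found = KthInOrder(b->leftChild, rank);  // L: smaller keys first
    if (found) return found;
    if (--rank == 0) return b;                               // V: this is the rank-th smallest
    return KthInOrder(b->rightChild, rank);                  // R: continue with larger keys
}

Started as KthInOrder(root, rankOfInterest), the walk stops after rankOfInterest visits, giving roughly the O(rankOfInterest) traversal quoted above.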

35 Rank Information:
Binary tree approach II: add a new variable to each node in the tree, leftSize, which counts the number of elements in the node's left subtree plus the node itself. Initially set every leftSize to 1 (for self).
Insert elements into the binary search tree; as we pass nodes while searching for the appropriate place, store references to each of these parent nodes. If we do the insertion, update the leftSize value of each stored parent whose left subtree received the new node; if we don't insert (non-unique key), there are no leftSize updates.
Search by rank using a traditional binary tree search driven by the leftSize value (function on the next slide).

36 Rank Information:

template <class Type>
BinaryTreeNode<Type>* BinarySearchTree<Type>::search(int rank) {
    BinaryTreeNode<Type>* current = root;
    while (current) {
        if (rank == current->leftSize) return current;           // this node has the desired rank
        else if (rank < current->leftSize) current = current->leftChild;
        else {
            rank = rank - current->leftSize;                      // skip the left subtree and this node
            current = current->rightChild;
        }
    }
    return 0;                                                     // rank out of range
}

37 Rank Information: Example
[Figure: BST of names with Mike at the root, John (children Georgia and Kylie) on the left, and Thomas (children Shelley and Tyler) on the right. leftSize values: Mike 4, John 2, Thomas 2, Georgia 1, Kylie 1, Shelley 1, Tyler 1.]
Real ranks for the data [first is rank 1, last is rank 7]: Georgia, John, Kylie, Mike, Shelley, Thomas, Tyler.
What is the 2nd element?
Rank 2 < leftSize(Mike) [4]: move to root->leftChild.
Rank 2 == leftSize(John) [2]: return the John node.
What is the 5th element?
Rank 5 > leftSize(Mike) [4]: move to root->rightChild, rank = 5 - 4 = 1.
Rank 1 < leftSize(Thomas) [2]: move to the left child of Thomas.
Rank 1 == leftSize(Shelley) [1]: return the Shelley node.

38 Rank Information: Analysis
Searching (by rank) is now bounded by the height of the tree – on average O(log n).
Building the tree was O(n log n), but we added more work: the original n log n comes from n insertions at log n cost each, and now we also have to update the parents' leftSize values. However, the maximum number of parents equals the height of the tree, on average log n, so the cost of a single insertion is now just 2 log n, and all the insertions together are still bounded by O(n log n).
So for dynamic data we can do rank information in O(n log n) [building] + O(log n) [searching] – better than the approaches that sort and then traverse to the rank position.
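The slides do not show the leftSize bookkeeping in code, so here is a minimal sketch under the assumptions above: a BinarySearchTree whose BinaryTreeNode has key, leftChild, rightChild and leftSize members (as used in the rank-search function on slide 36), and an Insert that remembers the nodes passed on the way down. Only the ancestors whose left subtree receives the new node have their leftSize incremented, and nothing is updated when the key is a duplicate.

#include <cstddef>
#include <vector>

template <class Type>
bool BinarySearchTree<Type>::Insert(const Type& k) {
    std::vector<BinaryTreeNode<Type>*> parents;   // nodes passed while searching downwards
    std::vector<bool> wentLeft;                   // direction taken at each of those nodes
    BinaryTreeNode<Type>* current = root;
    while (current) {
        if (k == current->key) return false;      // non-unique: no leftSize updates
        parents.push_back(current);
        wentLeft.push_back(k < current->key);
        current = wentLeft.back() ? current->leftChild : current->rightChild;
    }
    BinaryTreeNode<Type>* node = new BinaryTreeNode<Type>;
    node->key = k;
    node->leftChild = node->rightChild = 0;
    node->leftSize = 1;                           // counts only itself initially
    if (parents.empty()) root = node;
    else if (wentLeft.back()) parents.back()->leftChild = node;
    else parents.back()->rightChild = node;
    for (std::size_t i = 0; i < parents.size(); ++i)
        if (wentLeft[i]) ++parents[i]->leftSize;  // the new node joined their left subtree
    return true;
}

The extra work is bounded by the number of stored parents – at most the height of the tree – which is where the 2 log n per-insert figure above comes from.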

39 Threaded Trees: General Trees
[Figure: BST of names with Mike at the root, John (children Georgia and Kylie) and Thomas (children Shelley and Tyler) below, and Fred and Hall as children of Georgia.]
We are wasting a lot of links in this tree – all terminal nodes waste 2 links! Can we make use of those links? Yes.

40 Threaded Trees
[Figure: the same tree of names with threads added – each node carries two t/f flags indicating whether its left and right links are threads or real children; the otherwise-unused null links are replaced by threads, with one remaining NULL.]
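A minimal sketch (not from the slides) of what a threaded-tree node might look like: the boolean flags correspond to the t/f markers in the figure and record whether each link is a real child or a thread reusing an otherwise-wasted null pointer (for example, pointing to the node's in-order successor or predecessor). The name ThreadedNode is an assumption.

template <class Type>
struct ThreadedNode {
    Type data;
    ThreadedNode<Type>* leftChild;    // real child, or a thread to another node
    ThreadedNode<Type>* rightChild;   // real child, or a thread to another node
    bool leftThread;                  // true: leftChild is a thread, not a real child
    bool rightThread;                 // true: rightChild is a thread, not a real child
};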

41 Threaded Trees: Insertion
[Figure: inserting Bill into a threaded tree containing Mike, John, Kylie, and Hall – the tree before and after the insertion, with the threads adjusted.]

42 Threaded Trees: Insertion
[Figure: inserting Kate into the threaded tree containing Mike, John, Kylie, Hall, Bill, Fred, and Jane – the tree before and after the insertion, with the threads adjusted.]

