CSE 326: Data Structures Lecture #8 Balanced Dendrology


CSE 326: Data Structures Lecture #8 Balanced Dendrology Bart Niswonger Summer Quarter 2001

Today’s Outline
- Clear up BuildTree analysis
- Deletion from BSTs
- Binary Search Trees

Today we’ll start to cover trees in more detail. But let’s start with a correction (thank you, Ashish!) and a clarification.

Analysis of BuildTree
- Worst case is O(n^2): 1 + 2 + 3 + … + n = O(n^2)
- Average case, assuming all input orderings are equally likely, is O(n log n)
  - not averaging over all binary trees; rather, averaging over all input sequences (insert orders)
  - equivalently: the average depth of a node is O(log n)
  - proof: see Introduction to Algorithms (Cormen, Leiserson, and Rivest)

Average runtime is equal to the average depth of a node in the tree. We’ll calculate the average depth by finding the sum of all depths in the tree and dividing by the number of nodes. What’s the sum of all depths? D(N) = D(I) + D(N - I - 1) + N - 1, where the left subtree has I nodes and the right subtree has N - I - 1 (the root is the remaining node). D(I) is the sum of depths within the left subtree; each of its nodes sits one level deeper in the overall tree, and the same goes for the right subtree, for a total of I + (N - I - 1) = N - 1 extra depth. For randomly built BSTs, all subtree sizes are equally likely (the root’s key is a random element of the input, and the rest fall on the left or right deterministically), so each subtree averages (1/N) * sum over j = 0 to N-1 of D(j).
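The average-depth claim is easy to check empirically. Here is a minimal sketch (all names are mine, not from the lecture): build a BST from a shuffled insertion order and average the depth at which each key lands.

```cpp
#include <cassert>
#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

struct BTNode {
    int key;
    BTNode *left;
    BTNode *right;
};

// Insert key and return the depth of the new node (root is depth 0).
int insertDepth(BTNode *&root, int key) {
    int d = 0;
    BTNode **p = &root;
    while (*p != nullptr) {
        p = (key < (*p)->key) ? &(*p)->left : &(*p)->right;
        ++d;
    }
    *p = new BTNode{key, nullptr, nullptr};  // nodes leaked at exit; fine for a demo
    return d;
}

// Average node depth for one shuffled insertion order of 0..n-1.
double averageDepth(int n, unsigned seed) {
    std::vector<int> keys(n);
    std::iota(keys.begin(), keys.end(), 0);
    std::mt19937 gen(seed);
    std::shuffle(keys.begin(), keys.end(), gen);
    BTNode *root = nullptr;
    long long total = 0;
    for (int k : keys) total += insertDepth(root, k);
    return static_cast<double>(total) / n;
}
```

For n = 1024 a run typically lands near 2 ln n ≈ 14, while inserting the keys in sorted order would give an average depth of 511.5.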

Finding the Successor Digression
Find the next larger node in this node’s subtree, not the next larger in the entire tree.

Node * succ(Node * root) {
  if (root->right == NULL)
    return NULL;
  else
    return min(root->right);
}

(Figure: the example tree: 10 at the root; 5 with children 2 and 9, where 9 has left child 7; 15 with right child 20, where 20 has children 17 and 30.)

Here’s a little digression. Maybe it’ll even have an application at some point. Find the next larger node in 10’s subtree. Can we define it in terms of min and max? It’s the min of the right subtree! How many children can the successor of a node have?
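The slide’s succ leans on a min helper it doesn’t show. Here is a self-contained sketch (minNode is my name for that helper), keeping the slide’s convention that a node with no right subtree has no successor within its own subtree:

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int key;
    Node *left;
    Node *right;
};

// Leftmost (smallest) node of a non-empty subtree.
Node * minNode(Node * root) {
    while (root->left != NULL)
        root = root->left;
    return root;
}

// Successor within this node's subtree: the min of the right subtree,
// or NULL when there is no right subtree.
Node * succ(Node * root) {
    if (root->right == NULL)
        return NULL;
    return minNode(root->right);
}
```

In the example tree, the successor of 15 within its subtree is 17, the minimum of the subtree rooted at 20.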

Predecessor Digression
Find the next smaller node in this node’s subtree.

Node * pred(Node * root) {
  if (root->left == NULL)
    return NULL;
  else
    return max(root->left);
}

(Figure: the same example tree.)

Predecessor is just the mirror problem.

Deletion
Why might deletion be harder than insertion?
(Figure: the example tree again.)

And now for something completely different. Let’s say I want to delete a node. Why might it be harder than insertion? It might happen in the middle of the tree instead of at a leaf, and then I have to fix up the BST.

Lazy Deletion
Instead of physically deleting nodes, just mark them as deleted.
+ simpler
+ some inserts just flip the deleted flag back
+ physical deletions can be done in batches
- extra memory for the deleted flag
- many lazy deletions slow down finds
- some operations may have to be modified (e.g., min and max)

Now, before we move on to all the pains of true deletion, let’s do it the easy way. We’ll just pretend to delete deleted nodes. This has some real advantages…
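A minimal sketch of the idea (names and structure are my own, not from the slides): each node carries a deleted flag; delete just flips it, and a later insert of the same key can flip it back.

```cpp
#include <cassert>
#include <cstddef>

struct LazyNode {
    int key;
    bool deleted;
    LazyNode *left;
    LazyNode *right;
};

// Standard BST search; returns the node even if it is marked deleted.
LazyNode * locate(LazyNode * root, int key) {
    while (root != NULL && root->key != key)
        root = (key < root->key) ? root->left : root->right;
    return root;
}

bool lazyFind(LazyNode * root, int key) {
    LazyNode * n = locate(root, key);
    return n != NULL && !n->deleted;
}

void lazyDelete(LazyNode * root, int key) {
    LazyNode * n = locate(root, key);
    if (n != NULL)
        n->deleted = true;        // no restructuring at all
}

void lazyInsert(LazyNode * root, int key) {
    LazyNode * n = locate(root, key);
    if (n != NULL)
        n->deleted = false;       // resurrect: just flip the flag back
    // otherwise a real insert would allocate a new node here (omitted)
}
```

Note that locate still walks past deleted nodes, which is why many lazy deletions slow finds down.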

Lazy Deletion
Delete(17), Delete(15), Delete(5), Find(9), Find(16), Insert(5), Find(17)

OK, let’s do some lazy deletions. Everybody yawn, stretch, and say “Mmmm… doughnut” to get in the mood. Purple is a fruit. Those of you who are already asleep have the advantage.

Deletion - Leaf Case
Delete(17)

Alright, we did it the easy way, but what about real deletions? Leaves are easy; we just prune them.

Deletion - One Child Case
Delete(15)

Single-child nodes we remove and… do what? We can just pull up their children. Is the search tree property intact? Yes.

Deletion - Two Child Case
Delete(5)
Replace the node with a value guaranteed to be between the left and right subtrees: the successor. Could we have used the predecessor instead?

Ah, now the hard case. How do we delete a two-child node? We remove it and replace it with… what? All its left and right children need to be less than and greater than the new value (respectively). Is there any value that is guaranteed to be between the two subtrees? Two of them: the successor and the predecessor! So let’s just replace the node’s value with its successor and then delete the successor.

Deletion - Two Child Case
Delete(5)
The successor is always easy to delete: it always has either 0 or 1 children!

Delete Code

void remove(Comparable x, Node *& p) {
  if (p != NULL) {
    if (p->key < x)
      remove(x, p->right);
    else if (p->key > x)
      remove(x, p->left);
    else { /* p->key == x */
      if (p->left == NULL)
        p = p->right;
      else if (p->right == NULL)
        p = p->left;
      else {
        Node * q = succ(p);       // min of the right subtree
        p->key = q->key;
        remove(q->key, p->right);
      }
    }
  }
}

(The routine is named remove because delete is a reserved word in C++.)

Do people find these code slides useful? Here’s the code for deletion, using lots of confusing reference-to-pointer parameters but no header or dummy nodes. The iterative version of this can get somewhat messy, but it’s not really any big deal.
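For reference, here is a self-contained version of the deletion routine that compiles and runs (removeKey and minNode are my names; the keyword delete cannot name a function in C++, and keys are ints rather than Comparable to keep the sketch short):

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int key;
    Node *left;
    Node *right;
};

Node * minNode(Node * root) {
    while (root->left != NULL)
        root = root->left;
    return root;
}

void removeKey(int x, Node *& p) {
    if (p == NULL) return;
    if (p->key < x)
        removeKey(x, p->right);
    else if (p->key > x)
        removeKey(x, p->left);
    else if (p->left == NULL)
        p = p->right;                  // 0 or 1 child: splice the node out
    else if (p->right == NULL)
        p = p->left;                   // (spliced-out node not freed here)
    else {                             // two children
        Node * q = minNode(p->right);  // successor: min of the right subtree
        p->key = q->key;               // copy the successor's key up
        removeKey(q->key, p->right);   // then delete the successor
    }
}
```

Deleting 5 from the example tree hits the two-child case: 5’s slot ends up holding its successor, 7.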

Dictionary Implementations

            sorted array     sorted linked list    BST
insert      find + O(n)      find + O(1)           O(Depth)
find        O(log n)         O(n)                  O(Depth)
delete      find + O(n)      find + O(1)           O(Depth)

BSTs are looking good for shallow trees, i.e. when the depth D is small (O(log n)); otherwise they’re as bad as a linked list!

Beauty is Only Θ(log n) Deep
Binary search trees are fast if they’re shallow:
- e.g.: perfectly complete
- e.g.: perfectly complete except the “fringe” (leaves)
- any other good cases?

What makes a good BST good? Here are two examples. Are these the only good BSTs? No! Anything without too many long branches is good, right? Problems occur when one branch is much longer than the other! What matters here?

Balance
balance(t) = height(t’s left subtree) - height(t’s right subtree)
- zero everywhere ⇒ perfectly balanced
- small everywhere ⇒ balanced enough
- balance between -1 and 1 everywhere ⇒ maximum height of 1.44 log n

We’ll use the concept of balance to keep things shallow.

AVL Tree Dictionary Data Structure
- Binary search tree properties: binary tree property, search tree property
- Balance property: the balance of every node satisfies -1 ≤ b ≤ 1
- Result: depth is Θ(log n)

(Figure: an example AVL tree with root 8, children 5 and 11, and descendants 2, 6, 10, 12, 4, 7, 9, 13, 14, 15.)

So, AVL trees will be binary search trees with one extra feature: they balance themselves! The result is that all AVL trees, at any point, will have a logarithmic asymptotic bound on their depth.
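The balance property can be checked straight from the definition. A small sketch (names mine): this version recomputes height at every node and is therefore quadratic, which is exactly why real AVL nodes cache their height instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <algorithm>

struct Node {
    int key;
    Node *left;
    Node *right;
};

// Height with the slides' convention: a NULL subtree has height -1.
int height(Node * t) {
    if (t == NULL) return -1;
    return 1 + std::max(height(t->left), height(t->right));
}

// AVL balance property: |height(left) - height(right)| <= 1 at every node.
bool isAVLBalanced(Node * t) {
    if (t == NULL) return true;
    if (std::abs(height(t->left) - height(t->right)) > 1) return false;
    return isAVLBalanced(t->left) && isAVLBalanced(t->right);
}
```

On the running example tree (10; 5 with 2 and 9, 9 with left child 7; 15 with right child 20, 20 with 17 and 30), the check fails at 15, just as the next slide observes.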

Testing the Balance Property
NULLs have height -1.
(Figure: the example tree again, with heights in level order: 3, 2, 2, 0, 1, 1, 0, 0, 0.)

We need to know a few things now: How do we track the balance? How do we detect imbalance? How do we restore balance? Let’s start with this tree and see if it’s balanced. By the way, is it leftist? No! Because of 15. There’s that darn 15 again! It’s not balanced at 15.

An AVL Tree
Each node stores its data, its height, and pointers to its children.
(Figure: a balanced tree holding the same values: root 10 with height 3; 5 with children 2 and 9; 15 with children 12 and 20; 20 with children 17 and 30.)

Here’s a revision of that tree that’s balanced (same values, similar tree). This one _is_ an AVL tree (and isn’t leftist). I also have here how we might store the nodes in the AVL tree. Notice that I’m going to keep track of height all the time. Why?

Not AVL Trees
(Figure: two trees that violate the balance property, each with a node of balance -2: in one, 0 - 2 = -2; in the other, (-1) - 1 = -2.)

These, however, are not AVL trees.

But, How Do We Stay Balanced?
I need:
- the smallest person in the class
- the tallest person in the class
- the averagest (?) person in the class

Alright, so now we know what balance is and how to detect imbalance. How do we keep the tree balanced? I need some data points to do this. Can I have the {smallest, tallest, middlest} person in the class, please?

Beautiful Balance
Insert(middle), Insert(small), Insert(tall)

Let’s make a tree from these people, with their heights as the keys. We’ll start by inserting [MIDDLE] first, then [SMALL], and finally [TALL]. Is this tree balanced? Yes!

Bad Case #1
Insert(small), Insert(middle), Insert(tall)

But let’s start over… Insert [SMALL]. Now [MIDDLE]. Now [TALL]. Is this tree balanced? NO! Who do we need at the root? [MIDDLE]! Alright, let’s pull ’er up.

Single Rotation

This is the basic operation we’ll use in AVL trees. Since this node is a right child, it could legally have the parent as its left child. When we finish the rotation, we have a balanced tree!

General Single Rotation
(Figure: before the rotation, a subtree of height h + 2 rooted at a, with right subtree Z of height h - 1 and left child b; b has subtrees X, of height h after the insert, and Y, of height h - 1. After rotating b up: b is the root, with left subtree X (height h) and right child a; a has subtrees Y and Z, each of height h - 1. The whole subtree now has height h + 1.)
Height of the subtree is the same as it was before the insert! Height of all ancestors unchanged. So?

Here’s the general form of this. We insert into the red tree, which raises the three heights on the left. Basically, you just need to pull up on the child, then ensure that everything falls into place as legal subtrees of the nodes. Notice, though, that the height of this subtree is the same as it was before the insert. So? So we don’t have to worry about ancestors of the subtree becoming imbalanced; we can just stop here!
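The pull-up-the-child step takes only a few pointer moves. A minimal sketch of the left-heavy single rotation (function name is my own): a’s left child b comes up, and b’s right subtree moves across to become a’s left subtree, preserving the search-tree order X < b < Y < a < Z.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int key;
    Node *left;
    Node *right;
};

// Single rotation for a left-heavy node a: promote a's left child b.
// b's right subtree (Y in the figure) slides over to become a's left subtree.
void rotateWithLeftChild(Node *& a) {
    Node * b = a->left;
    a->left = b->right;   // Y moves across
    b->right = a;         // a becomes b's right child
    a = b;                // b is the new root of this subtree
}
```

On the Bad Case #1 chain (small below middle below tall, all hanging to the left), one rotation brings middle to the root.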

Bad Case #2
Insert(small), Insert(tall), Insert(middle)

There’s another bad case, though. What if we insert: [SMALL], [TALL], [MIDDLE]? Now, is the tree imbalanced? Will a single rotation fix it? (Try it by bringing up tall; it doesn’t work!)

Double Rotation

Let’s try two single rotations, starting a bit lower down. First, we rotate up middle. Then, we rotate up middle again! Is the new tree balanced?

General Double Rotation
(Figure: before the rotation, a subtree of height h + 2 rooted at a, with right subtree Z of height h - 1 and left child b of height h + 1; b has left subtree W of height h - 1 and right child c of height h; c has subtrees X and Y, one of which has height h - 1 after the insert. After the double rotation: c is the root, with left child b, holding subtrees W and X, and right child a, holding subtrees Y and Z. The whole subtree now has height h + 1.)
Height of the subtree is still the same as it was before the insert! Height of all ancestors unchanged.

Here’s the general form of this. Notice that the difference here is that we zigged one way, then zagged the other to find the problem. We don’t really know or care which of X or Y was inserted into, but one of them was. To fix it, we pull c all the way up. Then we put a, b, and the subtrees beneath it in the reasonable manner. The height is still the same at the end!
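The two-single-rotations view translates directly into code. A sketch (names my own): rotate c up through b, then up through a.

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int key;
    Node *left;
    Node *right;
};

// Promote a's left child (left-heavy case).
void rotateWithLeftChild(Node *& a) {
    Node * b = a->left;
    a->left = b->right;
    b->right = a;
    a = b;
}

// Mirror image: promote a's right child (right-heavy case).
void rotateWithRightChild(Node *& a) {
    Node * b = a->right;
    a->right = b->left;
    b->left = a;
    a = b;
}

// Left-right double rotation: two single rotations, starting a bit lower
// down. First c comes up above b, then c comes up above a.
void doubleWithLeftChild(Node *& a) {
    rotateWithRightChild(a->left);
    rotateWithLeftChild(a);
}
```

On the Bad Case #2 shape (tall at the root, small as its left child, middle as small’s right child), the double rotation puts middle at the root with small and tall as its children.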

To Do
- Project II-A
- Read through Section 4.6 in the book

Coming Up
- Project II – the complete version!
- More balancing acts
- A Huge Search Tree Data Structure