CSE 326: Data Structures Lecture #8 Binary Search Trees Alon Halevy Spring Quarter 2001
Binary Trees Many algorithms are efficient and easy to program for the special case of binary trees Binary tree is –a root –left subtree (maybe empty) –right subtree (maybe empty) A B DE C F HG JI
Binary Search Tree Dictionary Data Structure Search tree property –all keys in left subtree smaller than root’s key –all keys in right subtree larger than root’s key –result: easy to find any given key inserts/deletes by changing links
Example and Counter-Example BINARY SEARCH TREE NOT A BINARY SEARCH TREE 7
In Order Listing visit left subtree visit node visit right subtree In order listing: 2 5 7 9 10 15 17 20 30
Finding a Node Node *& find(Comparable x, Node * root) { if (root == NULL) return root; else if (x key) return find(x, root->left); else if (x > root->key) return find(x, root->right); else return root; } runtime:
Insert Concept: proceed down tree as in Find; if new key not found, then insert a new node at last spot traversed void insert(Comparable x, Node * root) { assert ( root != NULL ); if (x key){ if (root->left == NULL) root->left = new Node(x); else insert( x, root->left ); } else if (x > root->key){ if (root->right == NULL) root->right = new Node(x); else insert( x, root->right ); } }
BuildTree for BSTs Suppose a 1, a 2, …, a n are inserted into an initially empty BST: 1.a 1, a 2, …, a n are in increasing order 2.a 1, a 2, …, a n are in decreasing order 3.a 1 is the median of all, a 2 is the median of elements less than a 1, a 3 is the median of elements greater than a 1, etc. 4.data is randomly ordered
Examples of Building from Scratch 1, 2, 3, 4, 5, 6, 7, 8, 9 5, 3, 7, 2, 4, 6, 8, 1, 9
Analysis of BuildTree Worst case is O(n 2 ) … + n = O(n 2 ) Average case assuming all orderings equally likely is O(n log n) –not averaging over all binary trees, rather averaging over all input sequences (inserts) –equivalently: average depth of a node is log n –proof: see Introduction to Algorithms, Cormen, Leiserson, & Rivest
Bonus: FindMin/FindMax Find minimum Find maximum
Deletion Why might deletion be harder than insertion?
Deletion - Leaf Case Delete(17)
Deletion - One Child Case Delete(15)
Deletion - Two Child Case Delete(5) replace node with value guaranteed to be between the left and right subtrees: the successor Could we have used the predecessor instead?
Finding the Successor Find the next larger node in this node’s subtree. –not next larger in entire tree Node * succ(Node * root) { if (root->right == NULL) return NULL; else return min(root->right); } How many children can the successor of a node have?
Predecessor Find the next smaller node in this node’s subtree. Node * pred(Node * root) { if (root->left == NULL) return NULL; else return max(root->left); }
Deletion - Two Child Case Delete(5) always easy to delete the successor – always has either 0 or 1 children!
Delete Code void delete(Comparable x, Node *& p) { Node * q; if (p != NULL) { if (p->key right); else if (p->key > x) delete(x, p->left); else { /* p->key == x */ if (p->left == NULL) p = p->right; else if (p->right == NULL) p = p->left; else { q = successor(p); p->key = q->key; delete(q->key, p->right); }
Lazy Deletion Instead of physically deleting nodes, just mark them as deleted +simpler +physical deletions done in batches +some adds just flip deleted flag –extra memory for deleted flag –many lazy deletions slow finds –some operations may have to be modified (e.g., min and max)
Lazy Deletion Delete(17) Delete(15) Delete(5) Find(9) Find(16) Insert(5) Find(17)
Dictionary Implementations BST’s looking good for shallow trees, i.e. the depth D is small (log n), otherwise as bad as a linked list! unsorted array sorted array linked list BST insertfind + O(n)O(n)find + O(1)O(Depth) findO(n)O(log n)O(n)O(Depth) deletefind + O(1)O(n)find + O(1)O(Depth)
Beauty is Only (log n) Deep Binary Search Trees are fast if they’re shallow: –e.g.: perfectly complete –e.g.: perfectly complete except the “fringe” (leafs) –any other good cases? What matters here? Problems occur when one branch is much longer than the other!
Balance –height(left subtree) - height(right subtree) –zero everywhere perfectly balanced –small everywhere balanced enough t 5 7 Balance between -1 and 1 everywhere maximum height of 1.44 log n
AVL Tree Dictionary Data Structure Binary search tree properties –binary tree property –search tree property Balance property –balance of every node is: -1 b 1 –result: depth is (log n) 15
An AVL Tree data height children 30 0
Not AVL Trees (-1)-1 = = -2
Staying Balanced Good case: inserting small, tall and middle. Insert(middle) Insert(small) Insert(tall) M ST 00 1
Bad Case #1 Insert(small) Insert(middle) Insert(tall) T M S 0 1 2
Single Rotation T M S M ST 00 1 Basic operation used in AVL trees: A right child could legally have its parent as its left child.
General Case: Insert Unbalances a X Y b Z h h - 1 h + 1 h - 1 h + 2 a X Y b Z h-1 h h + 1
General Single Rotation Height of left subtree same as it was before insert! Height of all ancestors unchanged –We can stop here! a X Y b Z a XY b Z h h - 1 h + 1 h - 1 h + 2 h h - 1 h h + 1
Bad Case #2 Insert(small) Insert(tall) Insert(middle) M T S Will a single rotation fix this?
Double Rotation M ST 00 1 M T S T M S 0 1 2
General Double Rotation Initially: insert into either X or Y unbalances tree (root height goes to h+2) “Zig zag” to pull up c – restores root height to h+1, left subtree height to h a Z b W c XY a Z b W c XY h h - 1? h - 1 h + 2 h + 1 h - 1 h h + 1 h h - 1?
Insert Algorithm Find spot for value Hang new node Search back up looking for imbalance If there is an imbalance: case #1: Perform single rotation and exit case #2: Perform double rotation and exit
Easy Insert Insert(3)
Hard Insert (Bad Case #1) Insert(33)
Single Rotation
Hard Insert (Bad Case #2) Insert(18)
Single Rotation (oops!)
Double Rotation (Step #1) Look familiar? 18 0
Double Rotation (Step #2)
AVL Algorithm Revisited Recursive 1. Search downward for spot 2. Insert node 3. Unwind stack, correcting heights a. If imbalance #1, single rotate b. If imbalance #2, double rotate Iterative 1. Search downward for spot, stacking parent nodes 2. Insert node 3. Unwind stack, correcting heights a. If imbalance #1, single rotate and exit b. If imbalance #2, double rotate and exit
Single Rotation Code void RotateRight(Node *& root) { Node * temp = root->right; root->right = temp->left; temp->left = root; root->height = max(root->right->height, root->left->height) + 1; temp->height = max(temp->right->height, temp->left->height) + 1; root = temp; } X Y Z root temp
Double Rotation Code void DoubleRotateRight(Node *& root) { RotateLeft(root->right); RotateRight(root); } a Z b W c XY a Z c b X Y W First Rotation
Double Rotation Completed a Z c b X Y W a Z c b X Y W First Rotation Second Rotation