Outline: Scapegoat trees (O(log n) amortized time), 2-4 trees (O(log n) worst-case time), and red-black trees (O(log n) worst-case time).

Review: skiplists and treaps. So far we have seen treaps and skiplists: randomized structures that support insert/delete/search in O(log n) expected time. The expectation depends on random choices made by the data structure, namely the coin tosses made by a skiplist and the random priorities assigned by a treap.

Scapegoat trees. A deterministic, lazy data structure: it only does work when search paths get too long. Search takes O(log n) worst-case time, and insert/delete take O(log n) amortized time: starting with an empty scapegoat tree, a sequence of m insertions and deletions takes O(m log n) time.

Scapegoat philosophy. We follow a simple strategy: if the tree is not optimal, rebuild it. Is this a good binary search tree? [Figure: a balanced binary search tree with 17 nodes arranged on 5 levels.] It has 17 nodes and 5 levels. Any binary tree with 17 nodes has at least 5 levels (a binary tree with 4 levels has at most 2^4 - 1 = 15 nodes), so this is an "optimal" binary search tree.

Scapegoat philosophy: how do we know when we need to rebuild the tree? Rebuilding the tree costs O(n) time, so we cannot do it too often if we want to keep O(log n) amortized time. Scapegoat trees keep two counters: n, the number of items in the tree (its size), and q, an overestimate of n. We maintain the following two invariants: q/2 <= n <= q, and no node has depth greater than log_{3/2}(q).

Search and delete. How do we search in a scapegoat tree? Exactly as in an ordinary binary search tree; the depth invariant guarantees that the search path has length O(log n). How do we delete a value x from a scapegoat tree? Run the standard deletion algorithm for binary search trees and decrement n; if n < q/2, rebuild the entire tree and set q = n. How do we insert a value x into a scapegoat tree? (See the next slide.)
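As a rough illustration, here is a minimal Python sketch of these two operations. The Node and ScapegoatTree names, the helper bst_delete (the standard BST deletion, not shown here), and rebuild (sketched a few slides below, in the insertion analysis) are illustrative assumptions rather than anything prescribed by the slides.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

class ScapegoatTree:
    def __init__(self):
        self.root = None
        self.n = 0   # number of items stored
        self.q = 0   # overestimate of n; invariant: q/2 <= n <= q

def find(tree, x):
    # Plain BST search; the depth invariant makes this O(log n) worst case.
    u = tree.root
    while u is not None:
        if x < u.key:
            u = u.left
        elif x > u.key:
            u = u.right
        else:
            return u
    return None

def remove(tree, x):
    # bst_delete is assumed to be the usual BST deletion, returning True if x was found.
    if bst_delete(tree, x):
        tree.n -= 1
        if 2 * tree.n < tree.q:       # n < q/2: too many deletions since the last reset
            rebuild(tree, tree.root)  # rebuild the entire tree in O(n) time
            tree.q = tree.n
        return True
    return False
```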

Insert. To insert a value x into a scapegoat tree: create a node u and insert it in the normal way, then increment n and q. If the depth of u is greater than log_{3/2}(q), walk up from u towards the root until reaching a node w with size(w) > (2/3) size(w.parent), and rebuild the subtree rooted at w.parent. A code sketch follows.
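A sketch of the insertion routine following the steps above, reusing the Node/ScapegoatTree sketch from the previous slide; size() recomputes subtree sizes by traversal (discussed a few slides below), and rebuild() is the subtree rebuild sketched in the insertion analysis.

```python
import math

def log32(q):
    return math.log(q) / math.log(1.5)             # log base 3/2 of q

def size(u):
    # Subtree size by traversal; see the size-tracking slides below.
    return 0 if u is None else 1 + size(u.left) + size(u.right)

def add(tree, x):
    u = Node(x)
    depth = insert_leaf(tree, u)                   # ordinary BST leaf insertion
    if depth < 0:
        return False                               # x was already present
    tree.n += 1
    tree.q += 1
    if depth > log32(tree.q):                      # u is too deep: look for a scapegoat
        w = u
        while 3 * size(w) <= 2 * size(w.parent):   # size(w) <= (2/3) size(w.parent)
            w = w.parent                           # the lemma below guarantees this stops
        rebuild(tree, w.parent)                    # rebuild the subtree rooted at w.parent
    return True

def insert_leaf(tree, u):
    # Standard BST insertion of u as a leaf; returns the depth of u, or -1 if present.
    if tree.root is None:
        tree.root = u
        return 0
    w, depth = tree.root, 0
    while True:
        if u.key < w.key:
            if w.left is None:
                w.left = u
                break
            w = w.left
        elif u.key > w.key:
            if w.right is None:
                w.right = u
                break
            w = w.right
        else:
            return -1
        depth += 1
    u.parent = w
    return depth + 1
```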

Inserting into a scapegoat tree (easy case). [Figure: a tree with n = q = 10; inserting u = 3.5 makes n = q = 11.] Create a node u and insert it in the normal way, then increment n and q. Here depth(u) = 4 <= log_{3/2}(q) ≈ 5.913, so nothing more needs to be done.

Inserting into a scapegoat tree (bad case). [Figure: the same keys arranged along a long path after inserting u = 3.5; n = q = 11.] Now d(u) = 6 > log_{3/2}(q) ≈ 5.913, so we walk up from u looking for a node w with size(w) > (2/3) size(w.parent). The steps shown on the slides: first, size(w) = 1 <= (2/3)·2 ≈ 1.33, keep walking; next, size(w) = 2 <= (2/3)·3 = 2, keep walking; next, size(w) = 3 <= (2/3)·6 = 4, keep walking; finally, size(w) = 6 > (2/3)·7 ≈ 4.67, so we stop: this w is the scapegoat, and this is where the rebuild happens.

Inserting into a scapegoat tree (bad case), continued. [Figure: the tree after rebuilding; n = q = 11 and the rebuilt subtree is balanced.] How can we be sure that a scapegoat node always exists?

Why is there always a scapegoat? Lemma: if d > log_{3/2}(q) then there exists a scapegoat node. Proof by contradiction: assume that we do not find a scapegoat node. Then size(w) <= (2/3) size(w.parent) for every node w on the path to u, so the size of the node at depth i on this path is at most n(2/3)^i. But d > log_{3/2}(q) >= log_{3/2}(n), so size(u) <= n(2/3)^d < n(2/3)^{log_{3/2} n} = n/n = 1. This contradicts size(u) = 1 (u is the newly inserted leaf), so there must be a scapegoat node.
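The key calculation in the proof, written out in full (using the identity (2/3)^{log_{3/2} n} = 1/n):

```latex
\[
\operatorname{size}(u)
  \;\le\; n\left(\tfrac{2}{3}\right)^{d}
  \;<\; n\left(\tfrac{2}{3}\right)^{\log_{3/2} n}
  \;=\; n\cdot\left(\left(\tfrac{3}{2}\right)^{\log_{3/2} n}\right)^{-1}
  \;=\; \frac{n}{n} \;=\; 1 .
\]
```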

Summary so far. Insert and delete maintain the invariants: the depth of any node is at most log_{3/2}(q), and q <= 2n. So the depth of any node is at most log_{3/2}(2n) <= 2 + log_{3/2}(n), and we can search in a scapegoat tree in O(log n) time. Some issues remain to resolve: how do we keep track of size(w) for each node w, and how much time is spent rebuilding subtrees during deletion and insertion?

Keeping track of the size. There are two possible solutions. Solution 1: each node keeps an extra counter for its size; during insertion, each node on the path to u gets its counter incremented, during deletion each node on the path to u gets its counter decremented, and the counters are recomputed bottom-up during a rebuild. Solution 2: nodes do not keep an extra counter at all (see the next slide).

(Not) keeping track of the size. We only need size(w) while looking for a scapegoat. Knowing size(w), we can compute size(w.parent) by traversing the subtree rooted at w's sibling. So, in O(size(v)) time, we know all the sizes up to the scapegoat node v. But we do O(size(v)) work when we rebuild v anyway, so this does not add anything to the cost of rebuilding.
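A sketch of this idea in Python, continuing the node sketch used earlier: while walking up from the new leaf, the parent's size is the current subtree's size plus its sibling's size plus one, so only sibling subtrees are traversed. The helper name find_scapegoat is my own, not the slides'.

```python
def subtree_size(u):
    # Count nodes by traversal: O(size of the subtree).
    return 0 if u is None else 1 + subtree_size(u.left) + subtree_size(u.right)

def find_scapegoat(u):
    # Walk up from the newly inserted leaf u, maintaining size(w) incrementally:
    # size(w.parent) = size(w) + size(sibling of w) + 1.  Every sibling subtree we
    # traverse lies inside the subtree that will be rebuilt, so the total work is
    # O(size of the rebuilt subtree) and is absorbed by the rebuild itself.
    sz, w = 1, u
    while True:
        p = w.parent
        sibling = p.right if p.left is w else p.left
        parent_sz = sz + subtree_size(sibling) + 1
        if 3 * sz > 2 * parent_sz:           # size(w) > (2/3) size(w.parent)
            return p, parent_sz              # p is the subtree to rebuild
        w, sz = p, parent_sz
```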

Analysis of deletion. When deleting, if n < q/2, then we rebuild the whole tree, which takes O(n) time. But if n < q/2, we have done at least q - n > n/2 deletions since the last time q was set equal to n, so the amortized (average) cost of rebuilding due to deletions is O(1) per deletion.

Analysis of insertion. If no rebuild is necessary, the cost of an insertion is O(log n). After rebuilding a subtree containing a node v, both of v's children have the same size (to within one node). So if the subtree rooted at v has size n when it is rebuilt, at least n/3 insertions must have happened into it since the previous rebuild. The rebuild itself costs O(n) operations, so each of those insertions is charged only O(1) extra, and the cost of insertion is O(log n) amortized.
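A sketch of the rebuild itself, under the assumption (mine, consistent with the analysis above) that it flattens the subtree with an in-order traversal and then rebuilds a perfectly balanced subtree, both in time linear in the subtree size:

```python
def rebuild(tree, u):
    # Flatten the subtree rooted at u into sorted order, then rebuild it as a
    # perfectly balanced BST.  Both steps take O(size of the subtree) time.
    nodes = []
    flatten(u, nodes)
    p = u.parent
    r = build_balanced(nodes, 0, len(nodes))
    r.parent = p
    if p is None:
        tree.root = r
    elif p.left is u:
        p.left = r
    else:
        p.right = r

def flatten(u, nodes):
    # In-order traversal: appends the nodes of u's subtree in sorted key order.
    if u is None:
        return
    flatten(u.left, nodes)
    nodes.append(u)
    flatten(u.right, nodes)

def build_balanced(nodes, i, j):
    # Rebuild nodes[i:j] into a balanced BST; the middle node becomes the root,
    # so the two children of every rebuilt node differ in size by at most one.
    if i >= j:
        return None
    m = (i + j) // 2
    r = nodes[m]
    r.left = build_balanced(nodes, i, m)
    if r.left is not None:
        r.left.parent = r
    r.right = build_balanced(nodes, m + 1, j)
    if r.right is not None:
        r.right.parent = r
    return r
```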

Scapegoat trees summary. Theorem: the cost of a search in a scapegoat tree is O(log n) in the worst case, and the costs of insertion and deletion are O(log n) amortized time per operation. Scapegoat trees often work even better than this suggests: if we get lucky, no rebuilding is required at all.

Review: maintaining sorted sets. We have seen the following data structures for implementing a SortedSet. Skiplists: find(x)/add(x)/remove(x) in O(log n) expected time per operation. Treaps: find(x)/add(x)/remove(x) in O(log n) expected time per operation. Scapegoat trees: find(x) in O(log n) worst-case time per operation, add(x)/remove(x) in O(log n) amortized time per operation.

Review: maintaining sorted sets. No data structures course would be complete without covering 2-4 trees (find(x)/add(x)/remove(x) in O(log n) worst-case time per operation) and red-black trees (find(x)/add(x)/remove(x) in O(log n) worst-case time per operation).

The height of 2-4 trees. A 2-4 tree is a tree in which each internal node has 2, 3, or 4 children and all the leaves are at the same level.

The height of 2-4 trees (continued). Lemma: a 2-4 tree of height h >= 0 has at least 2^h leaves. Proof: the number of nodes at level i is at least 2^i (at least 1, 2, 4, 8, ... nodes at levels 0, 1, 2, 3, ...). Corollary: a 2-4 tree with n > 0 leaves has height at most log2(n). Proof: n >= 2^h, so h <= log2(n).

Adding a leaf to a 2-4 tree. To add a leaf w as a child of a node u: add w as a child of u; then, while u has 5 children, split u into two nodes with 2 and 3 children, respectively, make both of them children of u.parent, and set u = u.parent. If the root was split, create a new root with 2 children. This runs in O(h) = O(log n) time.
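A structural sketch of this loop in Python. It only maintains the shape of the tree (children lists and parent pointers); where the new leaf goes among its siblings, and the routing values a real 2-4 search tree would keep, are ignored here. Node24 and Tree24 are illustrative names.

```python
class Node24:
    def __init__(self):
        self.children = []    # 2-4 children for internal nodes, none for leaves
        self.parent = None

class Tree24:
    def __init__(self, root):
        self.root = root

def add_leaf(tree, u, w):
    # Add the leaf w as a child of u, then split overfull nodes on the way up.
    w.parent = u
    u.children.append(w)
    while len(u.children) == 5:
        # Split u into two nodes with 2 and 3 children, respectively.
        v = Node24()
        v.children = u.children[2:]
        for c in v.children:
            c.parent = v
        u.children = u.children[:2]
        p = u.parent
        if p is None:
            # The root was split: create a new root with 2 children.
            p = Node24()
            tree.root = p
            u.parent = p
            p.children.append(u)
        v.parent = p
        p.children.insert(p.children.index(u) + 1, v)   # v sits just right of u
        u = p
    # Overall: O(h) = O(log n) time.
```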

Deleting a leaf from a 2-4 tree. To delete a leaf w: remove w from its parent u; then, while u has only 1 child and u is not the root: if u has a sibling v with 3 or more children, borrow a child from v; otherwise merge u with its sibling v, remove v from u.parent, and set u = u.parent. Finally, if u is the root and has only 1 child, set root = u.child[0]. This runs in O(h) = O(log n) time.
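The matching structural sketch for deletion, reusing Node24/Tree24 from the insertion sketch; again only the shape maintenance (borrow and merge) is shown.

```python
def remove_leaf(tree, w):
    # Remove the leaf w, then borrow or merge on the way up as needed.
    u = w.parent
    u.children.remove(w)
    while len(u.children) == 1 and u is not tree.root:
        p = u.parent
        idx = p.children.index(u)
        v = p.children[idx - 1] if idx > 0 else p.children[idx + 1]   # a sibling of u
        if len(v.children) >= 3:
            # Borrow: move the child of v nearest to u over to u.
            if idx > 0:                       # v is the left sibling
                c = v.children.pop()
                u.children.insert(0, c)
            else:                             # v is the right sibling
                c = v.children.pop(0)
                u.children.append(c)
            c.parent = u
        else:
            # Merge: move v's children into u, then remove v from the parent.
            for c in v.children:
                c.parent = u
            if idx > 0:
                u.children[0:0] = v.children
            else:
                u.children.extend(v.children)
            p.children.remove(v)
            u = p                             # p lost a child; keep checking upward
    if u is tree.root and len(u.children) == 1:
        tree.root = u.children[0]             # the tree shrinks by one level
        tree.root.parent = None
```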

2-4 trees can act as search trees. [Figure: an example 2-4 search tree over the keys 1-9, with routing values stored in the internal nodes.] How? All n keys are stored in the leaves, and internal nodes store 1, 2, or 3 values that direct searches to the correct subtree, so searches take O(h) = O(log n) time. Theorem: a 2-4 tree supports the operations find(x), add(x), and remove(x) in O(log n) time per operation.
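A sketch of such a search, under an assumed convention that the slide does not spell out: each internal node stores, for every child except the last, the largest key in that child's subtree, and leaves store exactly one key. SearchNode24 and find_24 are illustrative names.

```python
class SearchNode24:
    def __init__(self, keys, children=()):
        self.keys = list(keys)          # 1-3 routing values (internal) or [key] (leaf)
        self.children = list(children)  # 2-4 children for internal nodes, [] for leaves

def find_24(root, x):
    u = root
    while u.children:                   # descend until we reach a leaf
        i = 0
        # Child i covers keys <= keys[i]; the last child covers everything larger.
        while i < len(u.keys) and x > u.keys[i]:
            i += 1
        u = u.children[i]
    return u if u.keys[0] == x else None     # leaves hold exactly one key
```

With this convention the search visits one node per level, which gives the O(h) = O(log n) bound stated above.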

Red-black trees: a binary version of 2-4 trees. 2-4 trees are nice, but they are not binary trees. How can we make them binary? Red-black trees are a binary version of 2-4 trees.

Red-black trees. A red-black tree is a binary search tree in which each node is colored red or black, and: each red node has 2 black children; the number of black nodes on every root-to-leaf path is the same; null (external) nodes are considered black; and the root is always black.
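These properties are easy to check mechanically. A small Python sketch (RBNode is an illustrative node type with key, red, left, and right fields; None children stand in for the black external nodes):

```python
class RBNode:
    def __init__(self, key, red, left=None, right=None):
        self.key, self.red, self.left, self.right = key, red, left, right

def is_red(u):
    return u is not None and u.red        # null (external) nodes count as black

def black_height(u):
    # Black height of the subtree at u, or None if some property is violated.
    if u is None:
        return 1                          # external nodes are black
    if u.red and (is_red(u.left) or is_red(u.right)):
        return None                       # a red node must have two black children
    hl, hr = black_height(u.left), black_height(u.right)
    if hl is None or hr is None or hl != hr:
        return None                       # every root-to-leaf path: same black count
    return hl + (0 if u.red else 1)

def is_red_black(root):
    # The root must be black and every property must hold below it.
    return not is_red(root) and black_height(root) is not None
```

For example, is_red_black(RBNode(2, False, RBNode(1, True), RBNode(3, True))) is True, while recoloring that root red would make it False.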

Red-black trees and 2-4 trees. A red-black tree is an encoding of a 2-4 tree as a binary tree: red nodes are "virtual nodes" that allow 3 and 4 children per black node.

The height of red-black trees. Recall the properties: each red node has 2 black children, and the number of black nodes on every root-to-leaf path is the same. Theorem: a red-black tree with n nodes has height at most 2 log2(n + 1). A red-black tree is an encoding of a 2-4 tree with n + 1 leaves, so its black height is at most log2(n + 1), and red nodes can at most double this height.

Red-black trees. Adding and removing in a red-black tree simulates adding and deleting in a 2-4 tree, but this results in a lot of cases. To get fewer cases, we add an extra property: if u has a red child then u.left is red.

Adding to a red-black tree. To add a new value to a red-black tree: create a new red node u and insert it as usual (as a leaf), then call addFixup(u) to restore the properties that there is no red-red edge and that if u has a red child then u.left is red. Each iteration of addFixup(u) moves u up in the tree, so it finishes after O(log n) iterations, in O(log n) time. The cases are worked through on the next slides, and a compact code sketch follows them.

Insertion cases. Case 1: the new node N is the root. We color N black; all the properties are then satisfied.

Insertion cases. Case 2: the parent P of the new node N is black. All the properties are already satisfied.

Insertion cases. Case 3: the parent P of the new node N and the uncle U are both red, so the red property is violated. Recolor P and U black; taken alone this breaks the path (black-count) property, so we also recolor G, P's parent, red. Are all the properties satisfied now? Not necessarily: G may now conflict with its own parent, so the process is repeated recursively from G until case 1 is reached.

Insertion cases. Case 4: the parent P of the new node N is red but the uncle U is black, P is the left child of G, and N is the left child of P. Rotate right at G (so that P moves up) and swap the colors of P and G.

Insertion cases. Case 5: the parent P of the new node N is red but the uncle U is black, P is the left child of G, and N is the right child of P. Rotate left at P (so that N takes P's place); this reduces the situation to case 4.
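For reference, the whole insertion fix-up can be packaged very compactly. The sketch below is not the parent/uncle case analysis from these slides but Sedgewick's left-leaning red-black formulation, which matches the extra "if u has a red child then u.left is red" property introduced earlier; it is an illustration under that assumption, not the slides' exact algorithm.

```python
class LLRBNode:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None
        self.red = True                   # new nodes start out red

def is_red(u):
    return u is not None and u.red        # null children count as black

def rotate_left(u):
    v = u.right
    u.right, v.left = v.left, u
    v.red, u.red = u.red, True            # v takes u's color; u becomes red
    return v

def rotate_right(u):
    v = u.left
    u.left, v.right = v.right, u
    v.red, u.red = u.red, True
    return v

def flip_colors(u):
    u.red = not u.red
    u.left.red = not u.left.red
    u.right.red = not u.right.red

def insert(root, key):
    root = _insert(root, key)
    root.red = False                      # the root is always black
    return root

def _insert(u, key):
    if u is None:
        return LLRBNode(key)
    if key < u.key:
        u.left = _insert(u.left, key)
    elif key > u.key:
        u.right = _insert(u.right, key)
    # Restore the invariants on the way back up:
    if is_red(u.right) and not is_red(u.left):
        u = rotate_left(u)                # lean a right-leaning red link to the left
    if is_red(u.left) and is_red(u.left.left):
        u = rotate_right(u)               # two reds in a row (compare cases 4 and 5)
    if is_red(u.left) and is_red(u.right):
        flip_colors(u)                    # both children red: recolor (compare case 3)
    return u
```

Starting from root = None and repeatedly calling root = insert(root, k) builds a valid left-leaning red-black tree.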

Removing from a red-black tree. To remove a value from a red-black tree: remove a node w with 0 or 1 children, set u = w.parent, and make u blacker (red becomes black, black becomes double-black); then call removeFixup(u) to restore the properties that there are no double-black nodes and that if u has a red child then u.left is red. Each iteration of removeFixup(u) moves u up in the tree, so it finishes after O(log n) iterations, in O(log n) time.

Removal: simple cases. If the node N to be removed has two children, we swap it with its successor and remove the successor instead (as in any binary search tree), so we can assume N has at most one child. If N is red, just remove it. If N's child is red, color the child black and remove N. In both cases all the properties are still satisfied.

Removal: complex cases. Both N and its child are black. We remove N and replace it by its child; from now on we call that child N and its new sibling S.

Removal cases. Case 1: N is the new root. Everything is done; all the properties are satisfied.

Removal cases. Case 2: the sibling S is red. Rotate left at P (so that S moves up) and swap the colors of S and P. Is the path property satisfied? Not yet; we pass to case 4, 5, or 6.

Removal cases. Case 3: N, P, S, and the children of S are all black. We color S red. Is the path property satisfied? Paths through P now have one black node too few, so we recursively repeat the process at node P.

Removal cases. Case 4: N, S, and the children of S are black, but P is red. We swap the colors of S and P. Is the path property satisfied? Yes, all the properties are now satisfied. Why? (Coloring P black restores the missing black node on the paths through N, while the paths through S lose S's black node and gain P's, so they are unchanged.)

Removal cases. Case 5: N is the left child of P, S is black, S's right child is black, but S's left child is red. We rotate right at S and swap the colors of S and its new parent (the old left child of S). This moves us to case 6.

Removal cases. Case 6: N is the left child of P, S is black, and S's right child is red. We rotate left at P, set S's right child to black, and swap the colors of P and S. All the properties are now satisfied.

Summary. Key point: there exist data structures (2-4 trees and red-black trees) that support SortedSet operations in O(log n) worst-case time per operation. The implementation difficulty is considerably higher than for scapegoat trees, skiplists, or treaps. Looking more closely at addFixup(u) and removeFixup(u), an amortized analysis shows that they do only O(1) work on average.

Summary (continued). Theorem: starting with an empty red-black tree, any sequence of m add(x)/remove(x) operations performs only O(m) rotations and color changes. This is useful if we want to apply persistence to remember old versions of the tree for later use.

Summary. Skiplists: find(x)/add(x)/remove(x) in O(log n) expected time per operation. Treaps: find(x)/add(x)/remove(x) in O(log n) expected time per operation. Scapegoat trees: find(x) in O(log n) worst-case time per operation, add(x)/remove(x) in O(log n) amortized time per operation. Red-black trees: find(x)/add(x)/remove(x) in O(log n) worst-case time per operation. All structures except scapegoat trees do only O(1) amortized (or expected) restructuring per add(x)/remove(x) operation.