
1 Outline
Scapegoat trees (O(log n) amortized time)
2-4 trees (O(log n) worst-case time)
Red-black trees (O(log n) worst-case time)

2 Review: Skiplists and Treaps
So far, we have seen treaps and skiplists.
Both are randomized structures: insert/delete/search run in O(log n) expected time.
The expectation depends on random choices made by the data structure: the coin tosses made by a skiplist, or the random priorities assigned by a treap.

3 Scapegoat trees
A deterministic data structure.
A lazy data structure: it only does work when search paths get too long.
Search in O(log n) worst-case time.
Insert/delete in O(log n) amortized time: starting with an empty scapegoat tree, a sequence of m insertions and deletions takes O(m log n) time.

4 Scapegoat philosophy
We follow a simple strategy: if the tree is not optimal, rebuild it.
Is this a good binary search tree? (The slide shows a balanced binary search tree with 17 nodes on 5 levels.)
It has 17 nodes and 5 levels, and any binary tree with 17 nodes has at least 5 levels (a binary tree with 4 levels has at most 1 + 2 + 4 + 8 = 15 nodes). So this is an "optimal" binary search tree.

5 How do we know when to rebuild the tree?
Rebuilding the tree costs O(n) time, so we cannot do it too often if we want to keep the O(log n) amortized bound.
Scapegoat trees keep two counters:
n: the number of items in the tree (its size)
q: an overestimate of n
We maintain the following two invariants:
q/2 ≤ n ≤ q
No node has depth greater than log_{3/2} q
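To make the O(n) rebuild concrete, here is a minimal Python sketch (not from the slides; the Node class and function names are my own assumptions): it flattens a subtree into a sorted list with an in-order walk and then builds a perfectly balanced BST from that list.

    class Node:
        def __init__(self, key):
            self.key = key
            self.left = None
            self.right = None
            self.parent = None

    def flatten(u, out):
        # In-order traversal: collects the subtree's keys in sorted order.
        if u is not None:
            flatten(u.left, out)
            out.append(u.key)
            flatten(u.right, out)

    def build_balanced(keys, lo, hi):
        # Rebuild a perfectly balanced BST from keys[lo:hi].
        if lo >= hi:
            return None
        mid = (lo + hi) // 2
        u = Node(keys[mid])
        u.left = build_balanced(keys, lo, mid)
        u.right = build_balanced(keys, mid + 1, hi)
        for c in (u.left, u.right):
            if c is not None:
                c.parent = u
        return u

    def rebuild(u):
        # O(size(u)) time: one traversal plus one balanced build.
        keys = []
        flatten(u, keys)
        return build_balanced(keys, 0, len(keys))

The later sketches on these slides reuse this Node class and rebuild().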

6 Search and Delete
How can we perform a search in a scapegoat tree? Just run the standard binary search tree search; the depth invariant guarantees it takes O(log_{3/2} q) = O(log n) time.
How can we delete a value x from a scapegoat tree?
Run the standard deletion algorithm for binary search trees.
Decrement n.
If n < q/2, rebuild the entire tree and set q = n.
How can we insert a value x into a scapegoat tree? (A sketch of search and deletion follows; insertion is on the next slide.)
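A minimal sketch of these two operations, assuming the Node class and rebuild() from the earlier sketch, a tree object t holding root and the counters n and q, and a hypothetical bst_delete helper that performs ordinary BST deletion:

    def find(t, x):
        # Standard BST search: O(depth) = O(log n) in the worst case.
        u = t.root
        while u is not None:
            if x < u.key:
                u = u.left
            elif x > u.key:
                u = u.right
            else:
                return u
        return None

    def remove(t, x):
        # Standard BST deletion, then rebuild everything if n drops below q/2.
        if bst_delete(t, x):              # hypothetical helper: ordinary BST delete
            t.n -= 1
            if 2 * t.n < t.q:
                t.root = rebuild(t.root)  # rebuild the entire tree...
                t.q = t.n                 # ...and reset the overestimate q
            return True
        return False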

7 Insert
How can we insert a value x into a scapegoat tree?
Create a node u and insert it in the normal way.
Increment n and q.
If the depth of u is greater than log_{3/2} q:
Walk up from u toward the root until reaching a node w with size(w) > (2/3) size(w.parent).
Rebuild the subtree rooted at w.parent.
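A sketch of this rule in the same style (illustrative, not the slides' code). It assumes a hypothetical bst_insert helper that attaches u as a leaf, sets parent pointers and returns u's depth, and a hypothetical replace_child helper that reattaches a rebuilt subtree; size() is the naive traversal discussed on slide 17.

    import math

    def size(u):
        return 0 if u is None else 1 + size(u.left) + size(u.right)

    def add(t, x):
        # 1. Create a node and insert it as in an ordinary BST.
        u = Node(x)
        depth = bst_insert(t, u)          # hypothetical helper: returns the depth of u
        t.n += 1
        t.q += 1
        # 2. If u is too deep, walk up to find the scapegoat and rebuild above it.
        if depth > math.log(t.q, 1.5):
            w = u
            while 3 * size(w) <= 2 * size(w.parent):     # not a scapegoat yet
                w = w.parent
            # Now size(w) > (2/3) size(w.parent): rebuild the subtree at w.parent.
            p = w.parent
            replace_child(t, p, rebuild(p))  # hypothetical helper: reattach the subtree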

8 Inserting into a scapegoat tree (easy case)
Before the insertion, n = q = 10. We create a node u = 3.5 and insert it in the normal way, then increment n and q, so n = q = 11.
depth(u) = 4 ≤ log_{3/2} q = 5.913, so nothing more needs to be done.

9 Inserting into a scapegoat tree (bad case)
Here n = q = 11 after inserting u = 3.5, and d(u) = 6 > log_{3/2} q = 5.913, so we walk up from u looking for a node w with size(w) > (2/3) size(w.parent).
Starting at w = u: size(w) = 1 ≤ (2/3) · 2 = 1.33, so w is not a scapegoat yet.

10 Inserting into a scapegoat tree (bad case)
Moving up one level: size(w) = 2 ≤ (2/3) · 3 = 2, still not a scapegoat.

11 Inserting into a scapegoat tree (bad case)
Moving up again: size(w) = 3 ≤ (2/3) · 6 = 4, still not a scapegoat.

12 Inserting into a scapegoat tree (bad case)
One more level up: size(w) = 6 > (2/3) · 7 = 4.67, so size(w) > (2/3) size(w.parent) and w is the scapegoat.

13 Inserting into a scapegoat tree (bad case)
The subtree rooted at the scapegoat's parent is rebuilt, and every node is back within the depth bound.
How can we be sure that a scapegoat node always exists?

14 Why is there always a scapegoat?
Lemma: if d(u) > log_{3/2} q, then there exists a scapegoat node.
Proof by contradiction. Assume (for contradiction) that we do not find a scapegoat node. Then size(w) ≤ (2/3) size(w.parent) for every node w on the path to u, so the size of a node at depth i is at most n(2/3)^i. But d > log_{3/2} q ≥ log_{3/2} n, so size(u) ≤ n(2/3)^d < n(2/3)^(log_{3/2} n) = n/n = 1. This contradicts size(u) = 1, so there must be a scapegoat node.
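The key step of the calculation, written out in LaTeX (my rendering of the slide's arithmetic):

\[
  \mathrm{size}(u) \;\le\; n\left(\tfrac{2}{3}\right)^{d}
  \;<\; n\left(\tfrac{2}{3}\right)^{\log_{3/2} n}
  \;=\; \frac{n}{(3/2)^{\log_{3/2} n}}
  \;=\; \frac{n}{n} \;=\; 1,
\]

which is impossible, since every node's subtree has size at least 1.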

15 Summary
So far, we know that insert and delete maintain the invariants:
The depth of any node is at most log_{3/2} q.
q ≤ 2n.
So the depth of any node is at most log_{3/2} 2n ≤ 2 + log_{3/2} n, and therefore we can search in a scapegoat tree in O(log n) time.
Some issues still to resolve:
How do we keep track of size(w) for each node w?
How much time is spent rebuilding subtrees during insertion and deletion?

16 Keeping track of the size
There are two possible solutions.
Solution 1: each node keeps an extra counter for its size.
During insertion, each node on the path to u gets its counter incremented.
During deletion, each node on the path to u gets its counter decremented.
We recompute the sizes bottom-up during a rebuild.
Solution 2: nodes do not keep an extra size counter (next slide).

17 (Not) keeping track of the size
We only need size(w) while looking for a scapegoat.
Knowing size(w), we can compute size(w.parent) by traversing only the subtree rooted at sibling(w).
So in O(size(v)) time, where v is the subtree we end up rebuilding, we know all the sizes up to the scapegoat node.
But we do O(size(v)) work when we rebuild v anyway, so this doesn't add anything to the cost of rebuilding.
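A sketch of this idea (my own illustration, reusing the Node class from earlier): while walking up from the newly inserted node we keep the size of the current subtree, and obtain the parent's size by counting only the sibling's subtree.

    def subtree_size(u):
        return 0 if u is None else 1 + subtree_size(u.left) + subtree_size(u.right)

    def find_scapegoat(u):
        # Walk up from the newly inserted leaf u, maintaining size(w) as we go.
        sz = 1                       # size of the subtree rooted at u (a fresh leaf)
        w = u
        while w.parent is not None:
            p = w.parent
            sibling = p.right if p.left is w else p.left
            parent_size = sz + 1 + subtree_size(sibling)   # only the sibling is traversed
            if 3 * sz > 2 * parent_size:                   # size(w) > (2/3) size(w.parent)
                return p             # w is the scapegoat; p's subtree gets rebuilt
            w, sz = p, parent_size
        return None                  # no scapegoat on the path (cannot happen if u is too deep)

The total counting work is proportional to the size of the subtree that is returned, which is exactly the subtree that gets rebuilt anyway.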

18 Analysis of deletion
When deleting, if n < q/2, we rebuild the whole tree. This takes O(n) time.
But if n < q/2, then at least q - n > n/2 deletions have been performed since the counters were last reset.
So the amortized (average) cost of rebuilding due to deletions is O(1) per deletion.

19 Analysis of insertion
If no rebuild is necessary, the cost of an insertion is O(log n).
After rebuilding the subtree rooted at a node v, both of v's children have the same size (to within one node).
So if the subtree rooted at v has size n, at least n/3 insertions into it must have occurred since the previous rebuild.
A rebuild of that subtree costs O(n) operations, which we can charge to those insertions.
Thus the cost of insertion is O(log n) amortized time.

20 Scapegoat trees summary
Theorem: The cost of a search in a scapegoat tree is O(log n) in the worst case, and the cost of insertion and deletion is O(log n) amortized time per operation.
Scapegoat trees often work even better than expected: if we get lucky, no rebuilding is required at all.

21 Review: Maintaining Sorted Sets
We have seen the following data structures for implementing a SortedSet:
Skiplists: find(x)/add(x)/remove(x) in O(log n) expected time per operation.
Treaps: find(x)/add(x)/remove(x) in O(log n) expected time per operation.
Scapegoat trees: find(x) in O(log n) worst-case time per operation; add(x)/remove(x) in O(log n) amortized time per operation.

22 Review: Maintaining Sorted Sets
No data structures course would be complete without covering:
2-4 trees: find(x)/add(x)/remove(x) in O(log n) worst-case time per operation.
Red-black trees: find(x)/add(x)/remove(x) in O(log n) worst-case time per operation.

23 The height of 2-4 trees
A 2-4 tree is a tree in which:
Each internal node has 2, 3, or 4 children.
All the leaves are at the same level.

24 The height of 2-4 trees
Lemma: A 2-4 tree of height h ≥ 0 has at least 2^h leaves.
Proof: The number of nodes at level i is at least 2^i (at least 2^0 = 1 at level 0, 2^1 = 2 at level 1, 2^2 = 4 at level 2, 2^3 = 8 at level 3, and so on).
Corollary: A 2-4 tree with n > 0 leaves has height at most log_2 n.
Proof: n ≥ 2^h, which is equivalent to log_2 n ≥ h.

25 Add a leaf to a 2-4 tree
To add a leaf w as a child of a node u in a 2-4 tree: add w as a child of u.

26 Add a leaf to a 2-4 tree
To add a leaf w as a child of a node u in a 2-4 tree:
Add w as a child of u.
While u has 5 children:
Split u into two nodes with 2 and 3 children, respectively, and make them children of u.parent.
Set u = u.parent.
If the root was split, create a new root with 2 children.
This runs in O(h) = O(log n) time.
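A minimal Python sketch of this add-and-split loop (illustrative only; the Node24 class is my own assumption, nodes store just a list of children, and ordering the children by key is omitted for brevity):

    class Node24:
        def __init__(self, children=None):
            self.children = children or []    # leaves have no children
            self.parent = None
            for c in self.children:
                c.parent = self

    def add_leaf(tree, u, w):
        # Add leaf w as a child of u, then split upward while some node has 5 children.
        u.children.append(w)
        w.parent = u
        while len(u.children) == 5:
            left = Node24(u.children[:2])     # split u into nodes with 2 and 3 children
            right = Node24(u.children[2:])
            p = u.parent
            if p is None:                     # u was the root: create a new root
                tree.root = Node24([left, right])
                return
            i = p.children.index(u)
            p.children[i:i + 1] = [left, right]   # replace u by its two halves
            left.parent = right.parent = p
            u = p                             # p may now have 5 children; keep going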

27 Deleting a leaf from a 2-4 tree
To delete a leaf w from a 2-4 tree:
Remove w from its parent u.
While u has 1 child and u != root:
If u has a sibling v with 3 or more children, borrow a child from v.
Otherwise, merge u with its sibling v, remove v from u.parent, and set u = u.parent.
If u == root and u has 1 child, set root = u.child[0].

28 Deleting a leaf from a 2-4 tree
To delete a leaf w from a 2-4 tree:
Remove w from its parent u.
While u has 1 child and u != root:
If u has a sibling v with 3 or more children, borrow a child from v.
Otherwise, merge u with its sibling v, remove v from u.parent, and set u = u.parent.
If u == root and u has 1 child, set root = u.child[0].
This runs in O(h) = O(log n) time.
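A sketch of the borrow/merge loop in the same style (illustrative; it reuses the Node24 class above, always picks an adjacent sibling, and again ignores key maintenance). Unlike the slide's wording it merges u into its sibling, which is equivalent:

    def remove_leaf(tree, w):
        u = w.parent
        u.children.remove(w)
        while len(u.children) == 1 and u is not tree.root:
            p = u.parent
            i = p.children.index(u)
            v = p.children[i - 1] if i > 0 else p.children[i + 1]   # adjacent sibling
            if len(v.children) >= 3:
                # Borrow: move the child of v that is closest to u over to u, then stop.
                c = v.children.pop() if i > 0 else v.children.pop(0)
                u.children.insert(0 if i > 0 else len(u.children), c)
                c.parent = u
                return
            # Merge: give u's only child to v and remove u from its parent.
            for c in u.children:
                c.parent = v
            if i > 0:
                v.children.extend(u.children)
            else:
                v.children[0:0] = u.children
            p.children.remove(u)
            u = p                        # p lost a child; it may now need fixing too
        if u is tree.root and len(u.children) == 1:
            tree.root = u.children[0]    # shrink the tree by one level
            tree.root.parent = None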

29 2-4 trees can act as search trees
How? All n keys are stored in the leaves, and internal nodes store 1, 2, or 3 values that direct searches to the correct subtree. (The slide shows a small example storing the keys 1-9 in its leaves.)
Searches take O(h) = O(log n) time.
Theorem: A 2-4 tree supports the operations find(x), add(x), and remove(x) in O(log n) time per operation.
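One possible way to code the descent (a sketch under my own assumptions: each internal node carries a sorted list keys with one entry fewer than it has children, where keys[i] is the largest key stored in children[i]'s subtree):

    import bisect

    def find(tree, x):
        # Descend from the root, choosing the child whose key range contains x.
        u = tree.root
        while u.children:                        # u is internal
            i = bisect.bisect_left(u.keys, x)    # first key >= x, or len(keys) if none
            u = u.children[i]
        return u if u.key == x else None         # u is now a leaf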

30 Red-Black Trees
2-4 trees are nice, but they aren't binary trees. How can we make them binary?
Red-black trees are a binary version of 2-4 trees.

31 Red-Black Trees
A red-black tree is a binary search tree in which each node is colored red or black, such that:
Each red node has 2 black children.
The number of black nodes on every root-to-leaf path is the same.
null (external) nodes are considered black.
The root is always black.
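A small checker for these invariants (a sketch, assuming nodes with color, left and right attributes, with None children treated as black):

    RED, BLACK = 0, 1

    def check(u):
        # Returns the black height of u's subtree; raises if a property is violated.
        if u is None:
            return 1                            # external (null) nodes count as black
        if u.color == RED:
            for c in (u.left, u.right):
                if c is not None and c.color == RED:
                    raise ValueError("red node with a red child")
        hl = check(u.left)
        hr = check(u.right)
        if hl != hr:
            raise ValueError("root-to-leaf paths have different black counts")
        return hl + (1 if u.color == BLACK else 0)

    def is_red_black(root):
        return root is None or (root.color == BLACK and check(root) > 0)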

32 Red-black trees and 2-4 trees
A red-black tree is an encoding of a 2-4 tree as a binary tree: the red nodes are "virtual nodes" that allow 3 and 4 children per black node.

33 The height of red-black trees
Recall the red-black tree properties: each red node has 2 black children, and the number of black nodes on every root-to-leaf path is the same.
Theorem: A red-black tree with n nodes has height at most 2 log_2(n + 1).
Why: a red-black tree is an encoding of a 2-4 tree with n + 1 leaves, so its black height is at most log_2(n + 1), and the red nodes can at most double this height.

34 Red-Black Trees Adding and removing in a red-black tree simulates adding/deleting in a 2-4 tree

35 Red-Black Trees
Adding and removing in a red-black tree simulates adding/deleting in a 2-4 tree.
This results in a lot of cases. To get fewer cases, we add an extra property: if u has a red child, then u.left is red.

36 Adding to a red-black tree
To add a new value to a red-black tree:
Create a new red node u and insert it as usual (as a leaf).
Call addFixup(u) to restore the properties: no red-red edge, and if u has a red child then u.left is red.
Each iteration of addFixup(u) moves u up the tree, so it finishes after O(log n) iterations, in O(log n) time. (A code sketch of the fixup appears after the case slides below.)

37 Insertion cases
Case 1: The new node N is the root. We color N black. All the properties are still satisfied.

38 Insertion cases
Case 2: The parent P of the new node N is black. All the properties are already satisfied; nothing needs to be done.

39 Insertion cases
Case 3: The parent P of the new node N and the uncle U are both red.
The red property is violated, so P and U become black. Now the path (black count) property is violated, so P's parent G becomes red.
Are all the properties satisfied now? Not necessarily: the process is repeated recursively with G in the role of the new node, possibly all the way up to case 1.

40 Insertion cases
Case 4: The parent P of the new node N is red but the uncle U is black; P is the left child of G and N is the left child of P.
Rotate right at G, so that P moves above G, and exchange the colors of P and G.

41 Insertion cases
Case 5: The parent P of the new node N is red but the uncle U is black; P is the left child of G and N is the right child of P.
Rotate left at P, so that N moves above P; this reduces the situation to case 4.
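Cases 1-5 fold into a short fixup loop. The sketch below is a standard textbook (CLRS-style) formulation of the same cases, including the mirror-image branches the slides omit; the RBNode class and rotation helpers are my own assumptions, and the rotations update parent pointers and the tree's root.

    RED, BLACK = 0, 1

    class RBNode:
        def __init__(self, key, color=RED):
            self.key, self.color = key, color
            self.parent = self.left = self.right = None

    def rotate_left(t, x):
        y = x.right
        x.right = y.left
        if y.left is not None:
            y.left.parent = x
        y.parent = x.parent
        if x.parent is None:
            t.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        y.left = x
        x.parent = y

    def rotate_right(t, x):
        y = x.left
        x.left = y.right
        if y.right is not None:
            y.right.parent = x
        y.parent = x.parent
        if x.parent is None:
            t.root = y
        elif x is x.parent.right:
            x.parent.right = y
        else:
            x.parent.left = y
        y.right = x
        x.parent = y

    def add_fixup(t, n):
        # n is the freshly inserted red leaf.
        while n.parent is not None and n.parent.color == RED:   # otherwise case 1 or 2
            p = n.parent
            g = p.parent                  # exists, because the root is always black
            u = g.right if p is g.left else g.left              # the uncle
            if u is not None and u.color == RED:                # case 3: recolor, move up
                p.color = u.color = BLACK
                g.color = RED
                n = g
                continue
            if p is g.left:
                if n is p.right:          # case 5: rotate left at P to reach case 4
                    rotate_left(t, p)
                    n, p = p, n
                rotate_right(t, g)        # case 4: rotate right at G
            else:                         # mirror images of cases 4 and 5
                if n is p.left:
                    rotate_right(t, p)
                    n, p = p, n
                rotate_left(t, g)
            p.color, g.color = BLACK, RED # exchange the colors of P and G
            break
        t.root.color = BLACK              # case 1: the root is always black

A new key is added as an ordinary BST insertion of a red leaf, followed by add_fixup(t, leaf).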

42 Removing from a red-black tree
To remove a value from a red-black tree:
Remove a node w with 0 or 1 children, as in an ordinary binary search tree.
Set u = w.parent and make u blacker: red becomes black, black becomes double-black.
Call removeFixup(u) to restore the properties: no double-black nodes, and if u has a red child then u.left is red.
Each iteration of removeFixup(u) moves u up the tree, so it finishes after O(log n) iterations, in O(log n) time.

43 Removing: simple cases
If the node N to be removed has two children, we swap it with its successor and remove the successor instead (as in any binary search tree), so we can assume N has at most one child.
If N is red, just remove it. All the properties are still satisfied.
If N is black and its child is red, color the child black and remove N. All the properties are still satisfied.

44 Removing: complex cases
Both N and its child are black.
We remove N and replace it by its child; from now on we call that child N and its new sibling S (their parent is P).

45 Removal cases
Case 1: N is the new root. Everything is done; all the properties are satisfied.

46 Removal cases
Case 2: The sibling S is red.
Rotate left at P, so that S moves above P, and swap the colors of S and P.
Is the path property satisfied? Not yet; we continue with case 4, 5, or 6.

47 Removal cases
Case 3: N, P, S and the children of S are all black.
We color S red.
Is the path property satisfied? Within P's subtree, yes, but every path through P is now one black node short, so we repeat the checking process recursively at node P.

48 Removal cases
Case 4: N, S and the children of S are black, but P is red.
We swap the colors of S and P.
Is the path property satisfied? Yes, all the properties are satisfied now. Why?

49 Removal cases
Case 5: N is the left child of P, S and its right child are black, but S's left child is red.
We rotate right at S and swap the colors of S and its former left child (which is now S's parent). We then move on to case 6.

50 Removal cases
Case 6: N is the left child of P, S is black, and S's right child is red.
We rotate left at P, set the right child of S to black, and swap the colors of P and S.
All the properties are now satisfied.
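Removal cases 1-6 also fold into one loop. The sketch below is the standard textbook (CLRS-style) version of the same case analysis, reusing the RBNode class and rotation helpers from the insertion sketch; None children are treated as black, and the caller is assumed to have already spliced out the removed black node, passing in the child x that took its place (possibly None) and that child's parent.

    def is_black(u):
        return u is None or u.color == BLACK

    def remove_fixup(t, x, parent):
        # x carries an extra unit of blackness ("double-black") until the loop fixes it.
        while x is not t.root and is_black(x):
            if x is parent.left:
                s = parent.right                         # the sibling S
                if s.color == RED:                       # case 2: red sibling
                    s.color, parent.color = BLACK, RED
                    rotate_left(t, parent)
                    s = parent.right
                if is_black(s.left) and is_black(s.right):   # cases 3 and 4
                    s.color = RED                        # push the problem up to P
                    x, parent = parent, parent.parent
                else:
                    if is_black(s.right):                # case 5: rotate into case 6
                        s.left.color, s.color = BLACK, RED
                        rotate_right(t, s)
                        s = parent.right
                    s.color = parent.color               # case 6
                    parent.color = BLACK
                    s.right.color = BLACK
                    rotate_left(t, parent)
                    x = t.root                           # done
            else:                                        # mirror image of the above
                s = parent.left
                if s.color == RED:
                    s.color, parent.color = BLACK, RED
                    rotate_right(t, parent)
                    s = parent.left
                if is_black(s.right) and is_black(s.left):
                    s.color = RED
                    x, parent = parent, parent.parent
                else:
                    if is_black(s.left):
                        s.right.color, s.color = BLACK, RED
                        rotate_left(t, s)
                        s = parent.left
                    s.color = parent.color
                    parent.color = BLACK
                    s.left.color = BLACK
                    rotate_right(t, parent)
                    x = t.root
        if x is not None:
            x.color = BLACK                              # case 1, or a red node absorbing the extra black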

51 Summary
Key point: there exist data structures (2-4 trees and red-black trees) that support the SortedSet operations in O(log n) worst-case time per operation.
The implementation difficulty is considerably higher than for scapegoat trees, skiplists, or treaps.
Looking more closely at addFixup(u) and removeFixup(u), an amortized analysis shows that they do only O(1) work on average.

52 Summary
Key point: there exist data structures (2-4 trees and red-black trees) that support the SortedSet operations in O(log n) worst-case time per operation.
Theorem: Starting with an empty red-black tree, any sequence of m add(x)/remove(x) operations performs only O(m) rotations and color changes.
This is useful if we want to apply persistence to remember old versions of the tree for later use.

53 Summary
Skiplists: find(x)/add(x)/remove(x) in O(log n) expected time per operation.
Treaps: find(x)/add(x)/remove(x) in O(log n) expected time per operation.
Scapegoat trees: find(x) in O(log n) worst-case time per operation; add(x)/remove(x) in O(log n) amortized time per operation.
Red-black trees: find(x)/add(x)/remove(x) in O(log n) worst-case time per operation.
All of these structures, except scapegoat trees, do O(1) amortized (or expected) restructuring per add(x)/remove(x) operation.

