Splay Trees and the Interleave Bound
Brendan Lucier, March 15, 2005
Summary of “Dynamic Optimality -- Almost” by Demaine et al., 2004.

Outline
• Introduction and Definitions
• Results of “Dynamic Optimality - Almost”
• Application to Splay Trees

Dynamic Optimality
• Consider a binary search tree T on n nodes, servicing a request sequence X = (x_1, x_2, …, x_m) of m values.
• Cost model: the algorithm is charged one operation for each node traversed at each time step (i.e. each access).
• The set of all traversed nodes (for a particular access) forms a connected subtree. This subtree can be rearranged at no cost.
• The offline optimal dynamic BST for X is an algorithm (A_OPT) which services X with the lowest cost. Call this cost C_OPT. Note that A_OPT is offline: it sees all of X in advance.
• We say that an online BST algorithm A is O(f(n))-competitive if C(A) = O(f(n) · C_OPT). The algorithm is dynamically optimal if f(n) = 1, i.e. it is O(1)-competitive.
• There are sequences for which C_OPT = Θ(m) (e.g. (1, …, n)^k) and others for which C_OPT = Θ(m lg n) (e.g. (n/2, n/4, 3n/4, n/8, …, n)^k).

Interleave Bound
• The interleave bound (IB) is a function that assigns a positive integer to a given sequence X.
• It has been shown that IB(X)/2 - O(n) is a lower bound on C_OPT(X). In fact, IB(X) is a simplification of a lower bound developed by Wilber in 1989.
• Demaine et al. use IB(X) to construct an O(lg lg n)-competitive BST algorithm.

Definition of IB(X)
• Consider a fixed, perfectly balanced binary tree P on n nodes (assume n = 2^k - 1). P is not a BST algorithm's tree; it is only used to define IB(X).
• Each node in P has a preferred child, either left or right. The preferred child of y is the one whose subtree contains the most recently accessed descendant of y. If the most recently accessed element is y itself, the preferred child is left.
• The interleave cost of x, IC(x), is the number of child preferences that would change if x were the next value accessed. If no preferences change, we incur a cost of 1.
• The interleave bound IB(X) is the sum of all the IC(x_i), where the state of P is updated after the access of each x_i.

Example: X = (13, 5, 10, 10)
• Access element 13
  ◦ No preferences change, so IC(13) = 1, since we always incur a cost of at least 1.
• Access element 5
  ◦ Two preferences change, so IC(5) = 2.
• Access element 10
  ◦ Note that the preference of 10 changes to left. IC(10) = 3.
• Access element 10
  ◦ No changes, so IC(10) = 1.
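The definition of IC and IB can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are made up here, P is represented implicitly by key ranges rather than by nodes, and the initial preferences (8, 12 and 10 prefer right, every other node prefers left) are an assumption chosen to reproduce the counts in the example, since the slide's figure is not part of this transcript.

```python
def path_in_p(lo, hi, x):
    """Yield the keys on the root-to-x path of the perfectly balanced tree P
    built on keys lo..hi (the root of each range is its midpoint)."""
    while True:
        mid = (lo + hi) // 2
        yield mid
        if x == mid:
            return
        if x < mid:
            hi = mid - 1
        else:
            lo = mid + 1


def interleave_costs(n, X, pref):
    """Return [IC(x_1), ..., IC(x_m)] for accesses X on P with n = 2^k - 1 keys.

    pref maps a key of P to 'L' or 'R' (missing keys default to 'L') and is
    updated in place after every access, as in the definition of IB(X).
    """
    costs = []
    for x in X:
        changes = 0
        for y in path_in_p(1, n, x):
            # Accessing y itself sets y's preference to left.
            want = 'L' if x <= y else 'R'
            if pref.get(y, 'L') != want:
                changes += 1
                pref[y] = want
        costs.append(max(1, changes))  # every access costs at least 1
    return costs


# Assumed initial state (hypothetical, see the note above).
initial = {8: 'R', 12: 'R', 10: 'R'}
costs = interleave_costs(15, [13, 5, 10, 10], initial)
print(costs, sum(costs))  # per-access IC values and IB(X)
```

Under the assumed initial state this yields interleave costs 1, 2, 3, 1, matching the example, so IB(X) = 7 for this sequence.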

Interleaving as a lower bound
• Theorem: IB(X)/2 - O(n) is a lower bound on C_OPT(X).
• This had already been proven by Wilber in 1989, but the proof of Demaine et al. is simpler and (I think) quite enlightening.

Proof Sketch:
• Idea 1: Consider, in any binary search tree T, a node y. Then the keys of y and all of y’s descendants form a contiguous range of values. That is, they are precisely the set of values [L, R] for some L and R.
• Idea 2: Given any two nodes x and y in any BST T, the lowest common ancestor of x and y lies in the range [x, y].
  ◦ We conclude that the lowest common ancestor of any range of values [L, R] must, in fact, be an element of [L, R].
• Idea 3: Given any node y in our balanced binary tree P, let IB_y(X) be the number of times the preferred child of y changes as we process X. Then IB(X) = Σ_{y in P} IB_y(X).
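Ideas 1 and 2 are easy to check empirically. The sketch below is not part of the proof: it builds a BST by inserting the keys 1..n in a random order (the helper names and the choice of an unbalanced insertion-order BST are assumptions made for illustration), then verifies that every subtree holds a contiguous key range and that the lowest common ancestor of every pair x <= y has a key in [x, y]. The LCA is computed from the two root paths, so the check does not presuppose the property being tested.

```python
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain (unbalanced) BST insertion; returns the root."""
    if root is None:
        return Node(key)
    cur = root
    while True:
        side = 'left' if key < cur.key else 'right'
        child = getattr(cur, side)
        if child is None:
            setattr(cur, side, Node(key))
            return root
        cur = child


def path_to(root, key):
    """Keys on the search path from the root down to key (key must be present)."""
    path, cur = [], root
    while cur.key != key:
        path.append(cur.key)
        cur = cur.left if key < cur.key else cur.right
    path.append(key)
    return path


def lca(root, x, y):
    """Lowest common ancestor of x and y: last key shared by both root paths."""
    last = None
    for a, b in zip(path_to(root, x), path_to(root, y)):
        if a != b:
            break
        last = a
    return last


def subtree_keys(node):
    """In-order list of the keys stored in the subtree rooted at node."""
    if node is None:
        return []
    return subtree_keys(node.left) + [node.key] + subtree_keys(node.right)


def all_nodes(node):
    if node is not None:
        yield node
        yield from all_nodes(node.left)
        yield from all_nodes(node.right)


def check_ideas(n, seed):
    """True if Ideas 1 and 2 hold for a random BST on keys 1..n."""
    keys = list(range(1, n + 1))
    random.Random(seed).shuffle(keys)
    root = None
    for k in keys:
        root = insert(root, k)
    # Idea 1: each subtree stores a contiguous range of keys.
    for node in all_nodes(root):
        ks = subtree_keys(node)
        if ks != list(range(ks[0], ks[-1] + 1)):
            return False
    # Idea 2: the LCA of x and y lies in [x, y].
    return all(x <= lca(root, x, y) <= y
               for x in range(1, n + 1) for y in range(x, n + 1))
```

Calling `check_ideas(31, seed)` for a handful of seeds is a quick sanity check; it is evidence for the two facts, not a substitute for the argument on the slide.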

Proof Sketch (cont’d)
• Choose some y in P. Let the subtree rooted at y in P correspond to the index range [L, R]. Then the left side of y (together with y itself) corresponds to [L, y], and the right side to [y+1, R].
• Let T be the BST at some point during the execution of A_OPT.
• Let r1 and r2 be the lowest common ancestors of [L, y] and [y+1, R] in T. Then one of r1 and r2 must be the lowest common ancestor of [L, R]; say it’s r1.
• We call r2 the transition point for y, and it turns out that this relationship forms a bijection between nodes of P and nodes of T.

[Figure: a node y of P whose left side is [1,4] and right side is [5,7], next to a BST T with r1 and r2 marked.]

Proof Sketch (cont’d)
• Now any access into y’s right subtree in P requires that r2 be traversed in T. But if y’s preferred child changes twice, y’s right subtree must be accessed!
• Hence r2 must be traversed when y’s preferred child changes twice, so the BST algorithm must incur a cost of 1 to touch it.
• Note that the transition point might change after it’s traversed, but there will always be one.
• This means that node y contributes at least IB_y(X)/2 - O(1) to the total cost of the BST.
• Summing over all y, we get that the BST algorithm must incur a cost of at least IB(X)/2 - O(n), as required.

[Figure: with left side [1,4] and right side [5,7] in P, any access to [5,7] in T must touch 5.]

Non-Tightness of IB(X)
• We know C_OPT(X) = Ω(IB(X)), but there are sequences for which C_OPT(X) = Θ(IB(X) lg lg n).
• Suppose X consists only of values along the “always left” path in P. There are lg(n+1) such values, and every access (except possibly the first) has an interleave cost of 1, so IB(X) = m + O(lg n).
• We can access any k values in such a way that C_OPT(X) = Θ(m lg k). In particular, we can access our “left-path” values so that C_OPT(X) = Θ(m lg lg n) = Θ(IB(X) lg lg n).

The Tango BST
• Demaine et al. developed a BST algorithm, Tango, which runs in O(IB(X) lg lg n) time. Since C_OPT = Ω(IB(X)), Tango is O(lg lg n)-competitive.
• Idea: take the preferred path of P starting at the root, and place its values into a balanced (AVL) tree T. Take all the remaining subtrees of P, recursively construct Tango trees from them, then hang those Tango trees from the leaves of T.
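The decomposition that Tango relies on can be sketched as follows. This covers only the first half of the idea, splitting P into preferred paths; it does not build the balanced trees or implement the rearrangement after an access. The function name, the implicit range-based representation of P, and the `pref` dictionary (missing keys default to left) are assumptions made for this illustration.

```python
def preferred_paths(lo, hi, pref):
    """Split the perfectly balanced tree P on keys lo..hi into preferred paths.

    pref maps a key to 'L' or 'R' (missing keys default to 'L').
    Returns a list of paths, each a top-to-bottom list of keys. In Tango,
    the keys of each path would be stored in one balanced tree.
    """
    paths = []
    pending = [(lo, hi)]           # subtrees of P whose path has not been started
    while pending:
        lo, hi = pending.pop()
        path = []
        while lo <= hi:
            mid = (lo + hi) // 2   # root of the current subtree of P
            path.append(mid)
            left, right = (lo, mid - 1), (mid + 1, hi)
            if pref.get(mid, 'L') == 'L':
                follow, other = left, right
            else:
                follow, other = right, left
            if other[0] <= other[1]:
                pending.append(other)  # non-preferred child starts a new path
            lo, hi = follow
        paths.append(path)
    return paths


paths = preferred_paths(1, 15, {8: 'R', 12: 'R'})
for p in paths:
    print(p)
```

Every key of P lands on exactly one path, and no path is longer than the height of P, which is why each balanced tree built from a path has only O(lg n) nodes.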

Illustration of Tango

The Tango Algorithm
• The difficult part of Tango is rearranging the Tango tree when a value is accessed, so that it still corresponds to the modified interleave tree P.
• This requires an extra O(1) bits per node, plus cutting and merging trees with n nodes in O(lg n) time. This can be done with AVL trees.
• For details, see the paper by Demaine et al.

Search Cost in Tango
• Each preferred path has O(lg n) nodes, so each balanced tree has depth O(lg lg n).
• The number of trees one must pass through to reach a node y is simply the number of preferred paths touched on the path to y in P.
• This is simply the number of times a non-preferred child is chosen (off by 1), so the number of trees traversed is O(IC(y)).
• The depth of y in the Tango tree is therefore O(IC(y) lg lg n). Total access time for a sequence X is therefore O(IB(X) lg lg n).

Splay Trees
• Splaying is an online BST algorithm that rotates an accessed node to the root of the tree.
• The rotations are chosen so that the ancestors of the accessed node x form a not-too-unbalanced subtree after all rotations are performed.
• Recall the Access Lemma for splay trees. Assign a weight w(x) to each node x in the tree. Define s(x) to be the sum of the weights of all nodes in the subtree rooted at x (including x itself), and r(x) = lg s(x).
• Then the amortized cost to access node x in a splay tree with root t is at most 3(r(t) - r(x)) + 1.
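To make the quantities in the Access Lemma concrete, the sketch below computes s(x), r(x) and the bound 3(r(t) - r(x)) + 1 for a given tree shape and weight assignment. It only evaluates the bound; it does not perform any splaying. The `Node` class and helper names are hypothetical, and the example tree (a right-leaning chain with all weights equal to 1) is chosen purely for illustration.

```python
import math


class Node:
    def __init__(self, key, weight, left=None, right=None):
        self.key = key
        self.weight = weight
        self.left = left
        self.right = right


def fill_ranks(node, ranks):
    """Return s(node), the total weight of node's subtree, and record
    r(x) = lg s(x) in ranks for every node x of that subtree."""
    if node is None:
        return 0.0
    s = node.weight + fill_ranks(node.left, ranks) + fill_ranks(node.right, ranks)
    ranks[node.key] = math.log2(s)
    return s


def access_lemma_bound(root, key):
    """Amortized-cost bound 3(r(t) - r(x)) + 1 for accessing key, root t."""
    ranks = {}
    fill_ranks(root, ranks)
    return 3 * (ranks[root.key] - ranks[key]) + 1


def right_chain(keys):
    """Degenerate BST: each node has only a right child; all weights are 1."""
    root = None
    for k in reversed(keys):
        root = Node(k, 1.0, right=root)
    return root


chain = right_chain(list(range(1, 9)))
print(access_lemma_bound(chain, 8))  # deepest node of the chain
print(access_lemma_bound(chain, 1))  # the root itself
```

With unit weights, s(t) = n and s(x) >= 1 for every x, so the bound is at most 3 lg n + 1 regardless of the tree's shape. That is the usual route from the Access Lemma to the O(lg n) amortized access time of splay trees.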

The Open Problem
• In 1985, Sleator and Tarjan conjectured that splay trees are O(1)-competitive. Unfortunately, it has been shown only that splay trees are O(lg n)-competitive (and this was proven by S & T in 1985!).
• I believe that splay trees perform in time O(IB(X) lg lg n), and are therefore O(lg lg n)-competitive.
• Consider the weight function w(x) = (lg n)^(-IC(x)). Then for each node, r(x) is between -IC(x) lg lg n and lg e (not obvious).
• In particular, 3(r(t) - r(x)) + 1 = O(IC(x) lg lg n), as required (?).
• The problem is that our weight function is not fixed: it changes as the access sequence is processed. The Access Lemma requires a fixed weight function.
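The step from the weight function to the O(IC(x) lg lg n) bound can be written out as follows, reading the weight function as w(x) = (lg n)^{-IC(x)}. The lower bound on r(x) follows directly from the definitions. The upper bound r(t) <= lg e is the slide's "not obvious" claim and is taken here as an assumption, not derived. The last step also assumes n >= 4, so that lg lg n >= 1, and uses IC(x) >= 1.

```latex
\begin{align*}
s(x) &\ge w(x) = (\lg n)^{-IC(x)} \\
r(x) = \lg s(x) &\ge -IC(x)\,\lg\lg n \\
3\bigl(r(t) - r(x)\bigr) + 1
  &\le 3\bigl(\lg e + IC(x)\,\lg\lg n\bigr) + 1
   = O\bigl(IC(x)\,\lg\lg n\bigr)
\end{align*}
```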

A Possible Approach
• In fact, there is a generalization of the Access Lemma that does not require the weight function to be fixed. It is only necessary that the weight of a node not increase unless that node is being accessed.
• The approach: for any period of time between two accesses to a node x, come up with a fixed value IC’(x) that depends on the values of IC(x) over that time period. Note that the assignment of weights can be offline!
• Set the weight function to be w(x) = (lg n)^(-IC’(x)), or some variant thereof (i.e. apply a multiplier), so that when x is accessed r(x) = -IC’(x) lg lg n, which is O(IC(x) lg lg n) in magnitude, and r(t) = O(lg lg n). The Access Lemma would then apply to give a total splaying cost of O(IB(X) lg lg n).

Thank You