Splay Trees and the Interleave Bound
Brendan Lucier
March 15, 2005
Summary of “Dynamic Optimality -- Almost” by Demaine et al., 2004.

Outline
- Introduction and Definitions
- Results of “Dynamic Optimality -- Almost”
- Application to Splay Trees

Dynamic Optimality
- Consider a binary search tree T on n nodes, servicing a request sequence X = (x_1, x_2, …, x_m) of m values.
- Cost model: the algorithm is charged one operation for each node traversed at each time step (i.e. each access).
- The set of all traversed nodes (for a particular access) forms a connected subtree. This subtree can be rearranged at no cost.
- The offline optimal dynamic BST for X is an algorithm (A_OPT) which services X with the lowest cost. Call this cost C_OPT. Note that A_OPT is offline.
- We say that an online BST algorithm A is O(f(n))-competitive if C(A) = O(f(n) C_OPT). The algorithm is dynamically optimal if f(n) = 1.
- There are sequences for which C_OPT = Θ(m) (e.g. (1, …, n)^k) and others for which C_OPT = Θ(m lg n) (e.g. (n/2, n/4, 3n/4, n/8, …, n)^k).

Interleave Bound
- The interleave bound (IB) is a function that assigns a positive integer to a given sequence X.
- It has been shown that IB(X)/2 - O(n) is a lower bound on C_OPT(X). In fact, IB(X) is a simplification of a lower bound developed by Wilber in 1989.
- Demaine et al. use IB(X) to construct an O(lg lg n)-competitive BST algorithm.

Definition of IB(X)
- Consider a fixed, perfectly balanced binary tree P on n nodes (assume n = 2^k - 1). P is not the BST that services the requests; it is only used to define IB(X).
- Each node in P has a preferred child, either left or right. The preferred child of y is the one whose subtree contains the most recently accessed descendant of y. If the most recently accessed element is y itself, the preferred child is left.
- The interleave cost of x, IC(x), is the number of child preferences that would change if x were the next value accessed. If no preferences change, we incur a cost of 1.
- The interleave bound IB(X) is the sum of all the IC(x_i), where the state of P is updated after the access of each x_i.

Example: X = (13,5,10,10)
- Access Element 13
  - No preferences change, so IC(13) = 1, since we always incur a cost of at least 1.
- Access Element 5
  - Two preferences change, so IC(5) = 2.
- Access Element 10
  - Note the preference of 10 changes to left. IC(10) = 3.
- Access Element 10
  - No changes. IC(10) = 1.
[Figure: the reference tree P on keys 1-15; levels from leaves to root: 1 3 5 7 9 11 13 15 / 2 6 10 14 / 4 12 / 8.]
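
The interleave costs in this example can be checked mechanically. The following is a minimal Python sketch of the definition of IC(x) and IB(X), not code from the paper. The function name `interleave_costs` and the `initial_pref` argument are illustrative. The starting preferences of the figure are not recoverable from the transcript, so the `start` state below is an assumption chosen so that the costs stated on the slide come out.

```python
def interleave_costs(n, accesses, initial_pref=None):
    """Return [IC(x) for x in accesses] for the perfect tree P on keys 1..n.

    n must have the form 2**k - 1. Preferences are stored as "L" or "R" per
    internal node; a node missing from the dict is assumed to prefer left.
    """
    pref = dict(initial_pref or {})
    costs = []
    for x in accesses:
        lo, hi = 1, n
        changes = 0
        while lo < hi:                      # leaves have no children
            mid = (lo + hi) // 2            # root of the subtree on keys lo..hi
            # Accessing mid itself sets its preferred child to left.
            want = "R" if x > mid else "L"
            if pref.get(mid, "L") != want:
                pref[mid] = want
                changes += 1
            if x == mid:
                break
            if x < mid:
                hi = mid - 1
            else:
                lo = mid + 1
        costs.append(max(1, changes))       # every access costs at least 1
    return costs


# Assumed starting state (not recoverable from the transcript): nodes 8, 12
# and 10 prefer right, every other node prefers left.
start = {8: "R", 12: "R", 10: "R"}
costs = interleave_costs(15, [13, 5, 10, 10], start)
print(costs, sum(costs))   # [1, 2, 3, 1] 7
```

Under that assumed starting state the sketch gives IB(X) = 1 + 2 + 3 + 1 = 7 for X = (13,5,10,10).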

Interleaving as a lower bound
- Theorem: IB(X)/2 - O(n) is a lower bound on C_OPT(X).
- This had already been proven by Wilber in 1989, but the proof of Demaine et al. is simpler and (I think) quite enlightening.

Proof Sketch:
- Idea 1: Consider, in any binary search tree T, a node y. Then the indices of y and all of y’s descendants form a contiguous range of values. That is, they are precisely the set of values [L, R] for some L and R.
- Idea 2: Given any two nodes in any tree T, say x and y, the lowest common ancestor of x and y lies in the range [x, y].
  - We conclude that the lowest common ancestor of any range of values [L, R] must, in fact, be an element of [L, R].
- Idea 3: Given any node y in our balanced binary tree P, let IB_y(X) be the number of times the preferred child of y changes as we process X. Then IB(X) = Σ_{y in P} IB_y(X).

Proof Sketch (cont’d)
- Choose some y in P. Let the subtree rooted at y in P correspond to index range [L, R]. Then the left side of y corresponds to [L, y], and the right side to [y+1, R].
- Let T be a binary search tree that occurs at some point during the execution of A_OPT.
- Let r1 and r2 be the lowest common ancestors of [L, y] and [y+1, R] in T. Then one of r1 and r2 must be the lowest common ancestor of [L, R]; say it’s r1.
- We call r2 the transition point for y, and it turns out that this relationship forms a bijection between nodes of P and nodes of T.
[Figure: P on keys 1-7 with y = 4, so Left is [1,4] and Right is [5,7]; a tree T on the same keys with r1 and r2 marked.]

Proof Sketch (cont’d)
- Now any access into y’s right subtree in P requires that r2 be traversed in T. But if y’s preferred child changes twice, y’s right subtree must be accessed!
- Hence r2 must be traversed when y’s preferred child changes twice, so the BST algorithm must incur a cost of 1 to touch it.
- Note that the transition point might change after it’s traversed, but there will always be one.
- This means that node y contributes at least IB_y(X)/2 - O(1) to the total cost of the BST.
- Summing over all y, we get that the BST algorithm must incur a cost of at least IB(X)/2 - O(n), as required.
[Figure: P on keys 1-7 with Left [1,4] and Right [5,7], and a tree T on the same keys. Any access to [5,7] must touch 5.]
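
The final summation step can be written out as follows. This is only a restatement of the sketch above in LaTeX, using the slide's symbols; it relies on the slide's claim that distinct nodes y of P have distinct transition points, so the per-node charges are not double-counted.

```latex
C_{\mathrm{OPT}}(X)
  \;\ge\; \sum_{y \in P} \left( \frac{IB_y(X)}{2} - O(1) \right)
  \;=\; \frac{1}{2} \sum_{y \in P} IB_y(X) \;-\; n \cdot O(1)
  \;=\; \frac{IB(X)}{2} - O(n).
```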

Non-Tightness of IB(X)
- We know C_OPT(X) = Ω(IB(X)), but there are sequences for which C_OPT(X) = Θ(IB(X) lg lg n).
- Suppose X consists only of values along the “always left” path in P. There are lg n + 1 such values, and every access (except possibly the first) has an interleave cost of 1, so IB(X) = m + O(lg n).
- We can access any k values in such a way that C_OPT(X) = Θ(m lg k). In particular, we can access our “left-path” values so that C_OPT(X) = Θ(m lg lg n) = Θ(IB(X) lg lg n).
[Figure: the reference tree P on keys 1-15; levels from leaves to root: 1 3 5 7 9 11 13 15 / 2 6 10 14 / 4 12 / 8.]

The Tango BST
- Demaine et al. developed a BST algorithm, Tango, which runs in O(IB(X) lg lg n) time. Since C_OPT = Ω(IB(X)), Tango is O(lg lg n)-competitive.
- Idea: take the preferred path of P (the path from the root that follows preferred children), and place its values into a balanced (AVL) tree T. Take all the remaining subtrees of P, recursively construct Tango trees from them, then hang those Tango trees from the leaves of T.

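Illustration of Tango
[Figure: the reference tree P on keys 1-15 (levels from leaves to root: 1 3 5 7 9 11 13 15 / 2 6 10 14 / 4 12 / 8) and the Tango tree built from it in three steps: first the top balanced tree on keys 8, 9, 10, 12; then balanced trees on 1, 2, 4, on 11, and on 13, 14; finally the remaining keys 3, 5, 6, 7, 15.]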
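
The decomposition of P into preferred paths, which is the first half of the Tango idea, can be sketched in Python as below. This is an illustrative sketch only: it lists the key set of each preferred path and does not build or maintain the balanced trees. The name `preferred_paths` and the preference state `pref` are assumptions (8 prefers right, 12 prefers left, every other node prefers left), chosen so that the top path matches the group 8, 9, 10, 12 in the illustration.

```python
def preferred_paths(lo, hi, pref):
    """List the preferred paths of the perfect tree on keys lo..hi, top path first.

    pref maps an internal node to "L" or "R"; a missing node prefers left.
    """
    if lo > hi:
        return []
    path, hanging = [], []
    while lo <= hi:
        mid = (lo + hi) // 2                # root of the subtree on keys lo..hi
        path.append(mid)
        if pref.get(mid, "L") == "L":
            hanging.append((mid + 1, hi))   # non-preferred right subtree
            hi = mid - 1
        else:
            hanging.append((lo, mid - 1))   # non-preferred left subtree
            lo = mid + 1
    paths = [path]
    for a, b in hanging:                    # recurse on the subtrees left over
        paths.extend(preferred_paths(a, b, pref))
    return paths


# Assumed preference state: 8 prefers right, 12 prefers left, the rest left.
paths = preferred_paths(1, 15, {8: "R", 12: "L"})
print(paths[0])   # [8, 12, 10, 9]
```

Each returned list would become one balanced tree in Tango, with the trees for the later lists hanging below the tree for the path they branch off from.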

The Tango Algorithm
- The difficult part of Tango is rearranging the Tango tree when a value is accessed, so it still corresponds to the modified interleave tree P.
- This requires an extra O(1) bits per node, plus cutting and merging trees with n nodes in O(lg n) time. This can be done with AVL trees.
- For details, see the paper by Demaine et al.

Search Cost in Tango
- Each preferred path has O(lg n) nodes, so each balanced tree has depth O(lg lg n).
- The number of trees one must pass through to reach a node y is simply the number of preferred paths touched on the path to y in P.
- This is simply the number of times a non-preferred child is chosen (off by 1), so the number of trees traversed is O(IC(y)).
- The depth of y in the Tango tree is therefore O(IC(y) lg lg n). Total access time for a sequence X is therefore O(IB(X) lg lg n).

Splay Trees
- Splaying is an online BST algorithm that rotates an accessed node to the root of the tree.
- The rotations are chosen so that the ancestors of the accessed node x form a not-too-unbalanced subtree after all rotations are performed.
- Recall: Access Lemma for splay trees. Assign a positive weight w(x) to each node x in the tree. Define s(x) to be the sum of the weights of x and all descendants of x, and r(x) = lg s(x).
- Then the amortized cost to access node x in a splay tree with root t is at most 3(r(t) - r(x)) + 1.
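
For readers who want to see the rotations, here is a minimal Python sketch of splaying. It is a common recursive formulation that pairs rotations (zig-zig, zig-zag) starting from the root, which may group the steps differently from the bottom-up description by Sleator and Tarjan; treat it as an illustration rather than the definitive algorithm. The names `Node`, `splay`, `build` and `inorder` are illustrative.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def rotate_right(x):
    y = x.left
    x.left, y.right = y.right, x
    return y


def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    return y


def splay(root, key):
    """Bring the node holding key (or the last node on its search path) to the root."""
    if root is None or root.key == key:
        return root
    if key < root.key:
        if root.left is None:
            return root
        if key < root.left.key:                      # zig-zig: rotate grandparent first
            root.left.left = splay(root.left.left, key)
            root = rotate_right(root)
        elif key > root.left.key:                    # zig-zag
            root.left.right = splay(root.left.right, key)
            if root.left.right is not None:
                root.left = rotate_left(root.left)
        return rotate_right(root) if root.left is not None else root
    if root.right is None:
        return root
    if key > root.right.key:                         # zig-zig (mirror image)
        root.right.right = splay(root.right.right, key)
        root = rotate_left(root)
    elif key < root.right.key:                       # zig-zag (mirror image)
        root.right.left = splay(root.right.left, key)
        if root.right.left is not None:
            root.right = rotate_right(root.right)
    return rotate_left(root) if root.right is not None else root


def build(lo, hi):
    """Perfectly balanced BST on keys lo..hi, used only as a starting tree."""
    if lo > hi:
        return None
    mid = (lo + hi) // 2
    return Node(mid, build(lo, mid - 1), build(mid + 1, hi))


def inorder(t):
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


root = build(1, 15)
for x in [13, 5, 10, 10]:
    root = splay(root, x)      # the accessed key ends up at the root
```

After each call the accessed key is at the root and the in-order sequence of keys is unchanged, which is all the sketch is meant to demonstrate.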

The Open Problem
- In 1985, Sleator and Tarjan conjectured that splay trees are O(1)-competitive. Unfortunately, it has been shown only that splay trees are O(lg n)-competitive (and this was proven by S & T in 1985!).
- I believe that splay trees perform in time O(IB(X) lg lg n), and are therefore O(lg lg n)-competitive.
- Consider the weight function w(x) = (lg n)^(-IC(x)). Then for each node, r(x) is between -IC(x) lg lg n and lg e (not obvious).
- In particular, 3(r(t) - r(x)) + 1 = O(IC(x) lg lg n), as required (?).
- The problem is that our weight function is not fixed: it changes as the access sequence is processed. The access lemma requires a fixed weight function.
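
The arithmetic behind the tentative bound can be spelled out as follows. This is a sketch that takes the slide's unproven upper bound r(t) ≤ lg e as given and assumes n ≥ 4, so that lg lg n ≥ 1; the lower bound on r(x) uses only that s(x) includes w(x).

```latex
r(x) = \lg s(x) \;\ge\; \lg w(x) = \lg\!\left( (\lg n)^{-IC(x)} \right) = -\,IC(x)\,\lg\lg n ,
\qquad
3\bigl(r(t) - r(x)\bigr) + 1 \;\le\; 3\bigl(\lg e + IC(x)\,\lg\lg n\bigr) + 1 = O\bigl(IC(x)\,\lg\lg n\bigr),
```

where the last step uses IC(x) ≥ 1.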

A Possible Approach
- In fact, there is a generalization of the access lemma that does not require the weight function to be fixed. It is only necessary for the weight of a node to not increase, unless that node is being accessed.
- The approach: for any period of time between two accesses to a node x, come up with a fixed value IC’(x) that depends on the values of IC(x) over that time period. Note that assignment of weights can be offline!
- Set the weight function to be w(x) = (lg n)^(IC’(x)), or some variant thereof (i.e. apply a multiplier), so that when x is accessed r(x) = IC’(x) lg lg n = O(IC(x) lg lg n) and r(t) = O(lg lg n). The Access Lemma would then apply to give a total splaying cost of O(IB(X) lg lg n).

Thank You
