# An Improved Succinct Dynamic k-Ary Tree Representation (work in progress) Diego Arroyuelo Department of Computer Science, Universidad de Chile.


## Roadmap

- Succinct data structures
  - Static tree representations
  - Dynamic tree representations
- Our basic dynamic tree representation
  - Representing blocks
  - Representing the frontier of blocks
  - Representing inter-block pointers
- Solving operations
  - Basic operations
  - Specialized operations
- Discussion


## Succinct data structures

- In a k-ary tree each node has at most k children, each child labeled with a symbol in the set {1, …, k} (as in tries).
- A succinct data structure requires space close to the information-theoretic lower bound.
- There are [formula lost in the transcript] different k-ary trees with n nodes.
- Therefore, the information-theoretic lower bound is about [formula lost in the transcript] bits if k is not a constant with respect to n.
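The counting formula on this slide did not survive in the transcript. As a hedged illustration, the sketch below uses the standard count of k-ary (cardinal) trees with n nodes, C(kn+1, n) / (kn+1), and takes its base-2 logarithm as the lower bound. That this is the formula the slide showed is an assumption; the function names are hypothetical.

```python
from math import comb, log2


def count_kary_trees(n: int, k: int) -> int:
    """Number of k-ary (cardinal) trees with n nodes: C(kn+1, n) / (kn+1)."""
    return comb(k * n + 1, n) // (k * n + 1)


def lower_bound_bits(n: int, k: int) -> float:
    """Information-theoretic lower bound: log2 of the number of distinct trees."""
    return log2(count_kary_trees(n, k))


if __name__ == "__main__":
    # Compare the lower bound with the 2n + n log k leading terms
    # (tree structure plus edge labels) quoted later in the talk.
    for n, k in [(1000, 4), (1000, 256)]:
        print(n, k, round(lower_bound_bits(n, k)), round(2 * n + n * log2(k)))
```

For k = 2 the count reduces to the Catalan numbers, which is a quick sanity check on the formula.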

## Succinct data structures

- We are interested in succinct representations that can be navigated.
- We are interested in the following operations:
  - parent(x): parent of node x
  - child(x, i): i-th child of node x
  - child(x, a): child of node x by label a
  - depth(x)
  - degree(x)
  - subtree-size(x)
  - preorder(x)
  - is-ancestor(x, y): is node x an ancestor of node y?
  - insertions (assumed to happen at the leaves)
  - deletions (only for unary nodes and leaves)
- The traditional representation of trees requires n log n bits for (almost) each operation.

## Succinct tree representations

- Succinct representations for static trees:
  - LOUDS [Jacobson, FOCS’89]
  - Balanced Parentheses [MR, STOC’97]
  - DFUDS [Benoit et al., Algorithmica 2005]
  - xbw [Ferragina et al., FOCS’05]
  - Ultra succinct trees [Jansson et al., SODA’07]
- These must be rebuilt from scratch upon insertion or deletion of nodes.

## Succinct tree representations

- The case of succinct dynamic trees has been studied only for binary trees:
  - Munro, Raman, and Storm [SODA’01]
    - 2n + o(n) bits
    - parent and child in constant time
    - updates and subtree-size in O(polylog(n)) time
  - Raman and Rao [ICALP’03]
    - 2n + o(n) bits
    - parent, child, preorder, and subtree-size in O(1) time
    - updates in $O((\log\log n)^{1+\epsilon})$ amortized time (O(log n loglog n) worst case)
- k-ary trees: basic navigation in O(k) time (assume k is not a constant)

## Dynamic balanced parentheses

- Chan et al. [TALG 2007] define a dynamic representation for balanced parentheses.
- This can be used to represent a dynamic k-ary tree using O(n) bits of space.
- The time for all operations depends on the number of nodes in the tree rather than on k (O(log n) time).
- This data structure cannot take advantage of the case where k is asymptotically smaller than n (e.g., k = O(polylog(n))).
- We aim to achieve o(log n) time whenever log k = o(log n).

## Motivations

- This work is motivated by previous work on LZ-indices:
  - Space-efficient construction of LZ-index [AN, ISAAC’05]
    - Very preliminary representation: n log n bits for pointers; child operation and insertions in O(k) worst-case time
  - LZ-index on disk [AN, CPM’07]
    - Basic operations in O(1) CPU time, yet n log n bits are needed for pointers, and neither insertions nor deletions are supported


## Our basic tree representation

- We incrementally divide the tree into disjoint blocks [MRS, RR, AN].
- Every block represents a subtree of N nodes such that $N_{min} \le N \le N_{max}$.
- We arrange these blocks in a tree by adding inter-block pointers (the entire tree is a tree of subtrees).

## Our basic tree representation

[Figure: a tree divided into blocks, marking the frontier of a block and the duplicated nodes]

## Our basic tree representation

- We define $N_{min}$ (minimum block size) as follows:
  - Inter-block pointers should require o(n) bits.
  - Therefore we define $N_{min} = \Theta(\log^2 n)$ (in general, $N_{min} = \Theta(\log n \cdot f(n))$, for $f(n) = \omega(1)$).
  - In this way we have (in the worst case) one pointer for every $\Theta(\log^2 n)$ nodes,
  - and hence o(n) bits for pointers.
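The o(n) bound for pointers can be checked with a short calculation. It assumes that one inter-block pointer costs O(log n) bits, in line with the n log n bits for n pointers quoted on the Motivations slide, and that there are at most about n / N_min blocks:

```latex
\[
\frac{n}{N_{min}} \cdot O(\log n)
  = O\!\left(\frac{n \log n}{\log^2 n}\right)
  = O\!\left(\frac{n}{\log n}\right)
  = o(n) \text{ bits.}
\]
% General case: with N_min = \Theta(\log n \cdot f(n)) and f(n) = \omega(1),
\[
\frac{n}{\log n \cdot f(n)} \cdot O(\log n) = O\!\left(\frac{n}{f(n)}\right) = o(n) \text{ bits.}
\]
```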

## Our basic tree representation

- We define $N_{max}$ (maximum block size) as follows:
  - In case of block overflow we should be able to create a new block of size at least $N_{min}$ from the full block.
  - In the worst case, the root of the block has its k children, all of them having a subtree of the same size.
  - By choosing $N_{max} = \Theta(k \log^2 n)$ we solve this problem …

## Our basic tree representation

- The blocks cannot be as small as we would like.
- We support dynamic operations on the tree by:
  - Dividing the tree into blocks (we only need to rebuild a block upon updates)
  - Making these smaller trees dynamic (unlike other approaches)
- We represent the blocks using a dynamic DFUDS representation on top of Chan et al.’s [TALG, 2007]:
  - We solve the basic navigation inside blocks in O(log N) = O(log k + loglog n) time.
  - Insertions can also be handled in the same time.
  - We require 2n + o(n) bits overall.
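For readers unfamiliar with DFUDS, here is a minimal static sketch of the encoding: one initial opening parenthesis, then, for each node in preorder, one opening parenthesis per child followed by a closing parenthesis. The dictionary-based tree and the function name are hypothetical, and the talk's dynamic version (built on Chan et al.'s structure) is not reproduced here.

```python
def dfuds(children: dict, root) -> str:
    """DFUDS string of a tree given as {node: [children in left-to-right order]}."""
    out = ["("]                      # extra opening parenthesis that balances the sequence
    stack = [root]
    while stack:                     # iterative preorder traversal
        v = stack.pop()
        kids = children.get(v, [])
        out.append("(" * len(kids) + ")")   # degree in unary, then a terminator
        stack.extend(reversed(kids))        # leftmost child is visited first
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical 9-node tree: root 1 has children 2, 3, 6; node 3 has 4, 5; node 6 has 7, 8, 9.
    tree = {1: [2, 3, 6], 3: [4, 5], 6: [7, 8, 9]}
    print(dfuds(tree, root=1))       # (((())(()))((())))
```

The output uses exactly 2n parentheses for n nodes, which is where the 2n + o(n) bits come from. It also coincides with the string shown as T_p in the frontier example, although reading that string as DFUDS is an assumption.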


## Representing the blocks

- We represent the symbols $S_p$ labeling the arcs of the trie with a data structure for rank and select [GN, submitted].
  - We compute $child_p(x, a)$ by rank and select on $S_p$, combined with $child_p(x, i)$ on p.
- $child_p(x, a)$ can be computed in $O(\log N \log k / \log\log N) = O\big((\log^2 k + \log k \log\log n) / \log(\log k + \log\log n)\big)$ time, since $\log N = O(\log k + \log\log n)$.
- The space requirement is n log k + o(n log k) bits.
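The reduction from child by label to child by position can be sketched as follows. The sketch assumes that the labels of the children of a node are stored contiguously in S_p, in child order, starting at a known offset. That layout, the naive linear-time `rank` and `select`, and all names are assumptions for illustration, not the [GN] structure.

```python
def rank(S, a, i):
    """Number of occurrences of symbol a in S[0:i]."""
    return S[:i].count(a)


def select(S, a, j):
    """Position of the j-th occurrence (1-based) of a in S, or -1 if there is none."""
    pos = -1
    for _ in range(j):
        try:
            pos = S.index(a, pos + 1)
        except ValueError:
            return -1
    return pos


def child_rank_by_label(S, start, degree, a):
    """Return i (1-based) such that the child labeled a is the i-th child, or None.

    The children's labels are assumed to occupy S[start : start + degree].
    """
    before = rank(S, a, start)          # occurrences of a before this node's labels
    pos = select(S, a, before + 1)      # first occurrence at or after `start`
    if pos == -1 or pos >= start + degree:
        return None                     # no child of this node is labeled a
    return pos - start + 1


if __name__ == "__main__":
    S = "acd" + "bd" + "abd"            # labels of three internal nodes, concatenated
    print(child_rank_by_label(S, 3, 2, "d"))   # 2
    print(child_rank_by_label(S, 3, 2, "a"))   # None
```

Once i is known, the labeled child is obtained with the positional operation child_p(x, i) on the parentheses of the block.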


## Representing the frontier of a block

- We need to indicate which nodes in a block have a pointer to a child block.
- This can be done by using a bit vector.
  - However, this would require 3n + o(n) bits overall for the tree structure.
- We define an array $F_p$ storing the preorders of the nodes having a child pointer.
  - Since there are $O(n / \log^2 n)$ pointers, this requires o(n) bits.

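## Representing the frontier of a block

- $T_p$: (((())(()))((())))
- Array $F_p$ is represented in differential form with a data structure for Searchable Partial Sums.
- Upon an insertion we must change all the preorders in $F_p$ from the affected position onwards; in differential form this is a single update, in O(log N) time:

| | $F_p$ (differential form) | Preorders represented |
|---|---|---|
| Before the insertion | 3 5 8 4 | 3, 8, 16, 20 |
| After the insertion | 3 6 8 4 | 3, 9, 17, 21 |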
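The effect of the differential form can be reproduced with a small sketch. It uses a Fenwick (binary indexed) tree as a simple stand-in for a Searchable Partial Sums structure: it supports changing one gap and searching by prefix sum in logarithmic time, but unlike a full dynamic structure it does not support inserting or deleting entries. The class and method names are hypothetical.

```python
class PartialSums:
    """Fenwick tree over the gaps between consecutive frontier preorders."""

    def __init__(self, gaps):
        self.n = len(gaps)
        self.tree = [0] * (self.n + 1)
        for i, g in enumerate(gaps):
            self.add(i, g)

    def add(self, i, delta):
        """Add delta to gap i (0-based); shifts every preorder from position i on."""
        i += 1
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def sum(self, i):
        """Preorder stored at position i, i.e. gaps[0] + ... + gaps[i]."""
        i += 1
        total = 0
        while i > 0:
            total += self.tree[i]
            i -= i & -i
        return total

    def search(self, value):
        """Smallest position whose preorder is >= value, or n if there is none.

        Assumes all gaps are non-negative.
        """
        pos, remaining = 0, value
        step = 1 << self.n.bit_length()
        while step:
            nxt = pos + step
            if nxt <= self.n and self.tree[nxt] < remaining:
                pos = nxt
                remaining -= self.tree[nxt]
            step >>= 1
        return pos


if __name__ == "__main__":
    F = PartialSums([3, 5, 8, 4])
    print([F.sum(i) for i in range(4)])   # [3, 8, 16, 20]
    F.add(1, 1)                           # a node is inserted before the second frontier node
    print([F.sum(i) for i in range(4)])   # [3, 9, 17, 21]
```

Checking whether a preorder v belongs to the frontier then amounts to `i = F.search(v)` followed by testing `i < F.n and F.sum(i) == v`.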


## Representing inter-block pointers

- Pointers to child blocks
  - We store the pointers to child blocks in array $PTR_p$,
  - sorted increasingly according to the preorders of the nodes in the frontier.
- Pointers to parent block
  - In each block p we need a pointer to the representation of the root of p in the parent block.
  - However, the position of a node changes upon updates.
  - A parent pointer is therefore composed of:
    - a pointer to the parent block q;
    - if p is the j-th child of q, the value j, stored in p.

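## Representing inter-block pointers

[Figure: block p with $T_p$ = (((())(()))((()))), its frontier array $F_p$, and $PTR_p$ holding pointers 1 to 4 to its child blocks; the child blocks store the parent pointers (p,1), (p,2), (p,3), (p,4)]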


## Solving the basic operations

- child(x, i):
  - Look for the preorder of x in $F_p$.
  - If we find it, follow the child pointer to block q and apply $child_q$ on the root of q.
  - Otherwise, use the $child_p$ operation.
  - This takes O(log N) = O(log k + loglog n) time.
- child(x, a) is solved in the same way, but using $child_p(x, a)$ instead.
- parent(x): if x is the root of its block, follow the parent pointer to block p; then apply $parent_p(x)$.
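The dispatch logic of child(x, i) can be sketched as below. Each block is modeled with a plain dictionary of children (a stand-in for the dynamic DFUDS sequence), a sorted list `F` of frontier preorders and a parallel list `PTR` of child blocks. The `Block` class, its fields and the binary search over a plain list are assumptions made for illustration.

```python
from bisect import bisect_left
from dataclasses import dataclass, field


@dataclass
class Block:
    children: dict                             # preorder -> list of child preorders (stand-in for DFUDS)
    F: list = field(default_factory=list)      # sorted preorders of nodes with a child-block pointer
    PTR: list = field(default_factory=list)    # PTR[j] is the child block hanging from node F[j]
    root: int = 1                              # preorder of the block's root


def child(block, x, i):
    """i-th child (1-based) of node x of `block`, as a (block, preorder) pair, or None."""
    j = bisect_left(block.F, x)                # look for the preorder of x in F_p
    if j < len(block.F) and block.F[j] == x:
        block = block.PTR[j]                   # x is duplicated as the root of child block q
        x = block.root
    kids = block.children.get(x, [])           # child_p (or child_q) inside a single block
    return (block, kids[i - 1]) if 1 <= i <= len(kids) else None


if __name__ == "__main__":
    q = Block(children={1: [2, 3]})
    p = Block(children={1: [2, 3]}, F=[3], PTR=[q])
    blk, x = child(p, 3, 1)                    # node 3 of p is on the frontier
    print(blk is q, x)                         # True 2
```

In this toy model a node is identified by its block together with its preorder inside that block, so an operation that crosses a pointer returns a node of the child block.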

## Solving the basic operations

- Insert:
  - We use the corresponding insertion operation on the block.
  - When a block p becomes full:
    1. Choose a node z in block p.
    2. Reinsert the nodes in the subtree of z into a new block q (along with the corresponding part of the frontier of p).
    3. Delete the subtree of z from p.
- The total cost is O(log k + loglog n) amortized (if we are able to spend time proportional to the size of the subtree of z).
  - List of candidate subtrees in each block (o(n) bits overall)
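The three overflow steps can be sketched on a dictionary-based block. The talk does not say how z is chosen, so the policy below (follow the heaviest child while its subtree still has at least `n_min` nodes) is only one plausible assumption, as are the function names. Moving the frontier entries and pointers is omitted.

```python
def subtree_sizes(children, root):
    """Size of the subtree rooted at every node reachable from root."""
    order, stack = [], [root]
    while stack:
        v = stack.pop()
        order.append(v)                        # parents are recorded before their children
        stack.extend(children.get(v, []))
    sizes = {}
    for v in reversed(order):                  # so children are sized before their parents
        sizes[v] = 1 + sum(sizes[c] for c in children.get(v, []))
    return sizes


def choose_split_node(children, root, n_min):
    """Step 1: deepest node on the heavy path whose subtree has at least n_min nodes."""
    sizes = subtree_sizes(children, root)
    z = root
    while True:
        heavy = max(children.get(z, []), key=sizes.get, default=None)
        if heavy is None or sizes[heavy] < n_min:
            break
        z = heavy
    return None if z == root else z


def split_block(children, root, n_min):
    """Steps 2 and 3: return (p_children, q_children, z) after moving the subtree of z."""
    z = choose_split_node(children, root, n_min)
    if z is None:
        return children, {}, None
    q, moved, stack = {}, set(), [z]
    while stack:                               # copy the subtree of z into the new block q
        v = stack.pop()
        moved.add(v)
        kids = children.get(v, [])
        if kids:
            q[v] = list(kids)
        stack.extend(kids)
    # z stays in p as a leaf (the duplicated node); the rest of its subtree is deleted from p
    p = {v: list(k) for v, k in children.items() if v not in moved}
    return p, q, z


if __name__ == "__main__":
    block = {1: [2, 3], 2: [4, 5, 6], 3: [7]}
    print(split_block(block, root=1, n_min=3))
```

With at most k children per node, the heaviest child of a full block's root holds at least (N_max - 1) / k nodes, which is why a block size of Θ(k log² n) leaves room for a new block of at least N_min nodes.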


## Solving specialized operations

- We can solve other operations by using this representation:
  - degree(x)
  - depth(x)
  - subtree-size(x)

[Figure labels: node x, $Size_p$]

## Solving specialized operations

- We can solve other operations by using this representation:
  - preorder(x)
  - is-ancestor(x, y)
  - lca(x, y)

## Conclusions

- We have defined a representation for dynamic k-ary trees requiring space close to the information-theoretic lower bound.
- We can profit from smaller alphabets:
  - o(log n) time for operations whenever log k = o(log n)
  - In particular, O(loglog n) time for k = O(polylog(n))
  - Versus the O(log n) time of Chan et al. for any alphabet size
- We need o(n log k) extra bits of space.

## Discussion

- What happens if we have external pointers to the tree nodes?
- Can we compress the dynamic DFUDS representation of blocks (just as in [JSS, SODA’07])?
- Suffix links in little space (assuming a suffix-closed trie)?
