Space Efficient Data Structures for Dynamic Orthogonal Range Counting Meng He and J. Ian Munro University of Waterloo.

Presentation transcript:

Space Efficient Data Structures for Dynamic Orthogonal Range Counting Meng He and J. Ian Munro University of Waterloo

Dynamic Orthogonal Range Counting
• A fundamental geometric query problem
• Definitions
  - Data set: a set P of n points in the plane
  - Query: given an axis-aligned query rectangle R, compute the number of points in P∩R
  - Update: insertion or deletion of a point
• Applications
  - Geometric data processing (GIS, CAD)
  - Databases

Example

Classic Solutions and Our Result

                    Space   Query                  Update
Chazelle (1988)     O(n)    O(lg n)                -
JáJá (2004)*        O(n)    O(lg n / lglg n)       -
Chazelle (1988)     O(n)    O(lg^2 n)              O(lg^2 n)
Nekrich (2009)      O(n)    O((lg n / lglg n)^2)   O(lg^(4+ε) n) (0 < ε < 1)
Our result          O(n)    O((lg n / lglg n)^2)   O((lg n / lglg n)^2)

• Matches the lower bound under the group model (Pătraşcu 2007)
* For integer coordinates.

Background: Succinct Data Structures
• What are succinct data structures (Jacobson 1989)
  - Representing data structures using, ideally, the information-theoretic minimum space
  - Supporting efficient navigational operations
• Why succinct data structures
  - Large data sets in modern applications: textual, genomic, spatial or geometric
• A novel and unusual way of using succinct data structures (this paper)
  - Matching the storage cost of standard data structures
  - Improving the time efficiency

Dynamic Range Sum
• Data
  - A 2D array A[1..r, 1..c] of numbers
• Operations
  - range_sum(i_1, j_1, i_2, j_2): the sum of the numbers in A[i_1..i_2, j_1..j_2]
  - modify(i, j, δ): A[i, j] ← A[i, j] + δ
  - insert(j): insert a 0 between A[i, j-1] and A[i, j] for i = 1, 2, …, r
  - delete(j): delete A[i, j] for i = 1, 2, …, r. To perform this, A[i, j] must be 0 for all i.
• Restrictions on r, c and δ, and on the operations supported, may apply.

Dynamic Range Sum: An Example
[Figure: example 2D array; the array contents are not preserved in this transcript]
• range_sum(2, 3, 3, 6) = 25
• insert(6) / delete(6)
• range_sum(2, 3, 3, 7) = 30
• modify(2, 6, 5) / modify(2, 6, -5)
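The semantics of the four dynamic range sum operations can be pinned down with a naive reference version. This is only a sketch for checking behaviour: the class name `NaiveRangeSum` and the small test array are hypothetical, every operation here costs time proportional to the array size, and nothing in it reflects the constant-time or O(lg c / lglg c) structures from the talk.

```python
class NaiveRangeSum:
    """Reference semantics only; indices are 1-based as on the slides."""

    def __init__(self, rows):
        # rows: list of equal-length lists of numbers (hypothetical input format)
        self.a = [list(row) for row in rows]

    def range_sum(self, i1, j1, i2, j2):
        # Sum of A[i1..i2, j1..j2], both ends inclusive.
        return sum(sum(row[j1 - 1:j2]) for row in self.a[i1 - 1:i2])

    def modify(self, i, j, delta):
        # A[i, j] <- A[i, j] + delta
        self.a[i - 1][j - 1] += delta

    def insert(self, j):
        # Insert a 0 between A[i, j-1] and A[i, j] in every row.
        for row in self.a:
            row.insert(j - 1, 0)

    def delete(self, j):
        # Column j may only be deleted when it is all zeros.
        if any(row[j - 1] != 0 for row in self.a):
            raise ValueError("column must be all zeros before delete")
        for row in self.a:
            del row[j - 1]
```

Inserting an all-zero column shifts later columns by one without changing any sum, which is why a query widened by one column returns the same value until a modify touches the new column.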

Dynamic Range Sum in a Small 2D Array
• Assumptions and restrictions
  - Word size w: Ω(lg n)
  - Each number: nonnegative, O(lg n) bits
  - rc = O(lg^λ n), 0 < λ < 1
  - modify(i, j, δ): |δ| ≤ lg n
  - insert and delete: not supported
• Our solution
  - Space: O(lg^(1+λ) n) bits, with an o(n)-bit universal table
  - Time: modify and range_sum in O(1) time
  - Generalization of the 1D array version (Raman et al. 2001)
  - Deamortization is interesting

Range Sum in a Narrow 2D Array
• Assumptions and restrictions
  - b = O(w): number of bits required to encode each number
  - "Narrow": r = O(lg^γ c), 0 < γ < 1
  - |δ| ≤ lg c
• Our results
  - Space: O(rcb + w) bits, with an O(c lg c)-bit buffer
  - Operations: O(lg c / lglg c) time
• A generalization of the B-tree-based solution to the CSPSI problem (He and Munro 2010), using our small 2D array structure on each B-tree node

Range Counting in Dynamic Integer Sequences
• Notation
  - Integer range: [1..σ]
  - Sequence: S[1..n]
• Operations
  - access(x): S[x]
  - rank(α, x): number of occurrences of α in S[1..x]
  - select(α, r): position of the r-th occurrence of α in S
  - range_count(p_1, p_2, v_1, v_2): number of entries in S[p_1..p_2] whose values are in the range [v_1..v_2]
  - insert(α, i): insert α between S[i-1] and S[i]
  - delete(i): delete S[i] from S

Range Counting in Integer Sequences: An Example
S = 5,5,2,5,3,1,3,4,7,6,4,1,2,2,5,8
• rank(5, 8) = 3
• select(2, 3) = 14
• range_count(6, 12, 2, 6) = 4
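The three example values can be checked against linear-time reference definitions of the query operations. The function names follow the slides; the list-based implementation is only an assumption made for illustration and is unrelated to the succinct representation in the talk.

```python
def rank(S, alpha, x):
    """Number of occurrences of alpha in S[1..x] (1-based, inclusive)."""
    return S[:x].count(alpha)


def select(S, alpha, r):
    """1-based position of the r-th occurrence of alpha in S, or None."""
    seen = 0
    for pos, value in enumerate(S, start=1):
        if value == alpha:
            seen += 1
            if seen == r:
                return pos
    return None


def range_count(S, p1, p2, v1, v2):
    """Number of entries in S[p1..p2] whose values lie in [v1..v2]."""
    return sum(1 for value in S[p1 - 1:p2] if v1 <= value <= v2)


S = [5, 5, 2, 5, 3, 1, 3, 4, 7, 6, 4, 1, 2, 2, 5, 8]
print(rank(S, 5, 8), select(S, 2, 3), range_count(S, 6, 12, 2, 6))
```

For range_count(6, 12, 2, 6), the window S[6..12] is 1,3,4,7,6,4,1, and the four entries 3, 4, 6, 4 fall in [2..6].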

Range Counting in Sequences of Small Integers
• Restrictions
  - σ = O(lg^ρ n) for any constant 0 < ρ < 1
• Our result
  - Space: nH_0 + o(n lg σ) + O(w) bits
  - Time: O(lg n / lglg n)
• This is achieved by combining:
  - Our solution to range sum on narrow 2D arrays
  - A succinct dynamic string representation (He and Munro 2010)

Dynamic Range Counting: An Augmented Red-Black Tree
• T_x: a red-black tree storing all the x-coordinates
• Each node also stores the number of its descendants
• Purpose: conversions between real x-coordinates and rank space in O(lg n) time
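How stored subtree sizes turn an x-coordinate into a rank can be sketched with a plain binary search tree. This is an illustration under simplifying assumptions: the tree below is not balanced (so it lacks the O(lg n) guarantee that a red-black tree provides), and the names `Node`, `insert`, `count_below` and `to_rank_range` are hypothetical.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # number of nodes in the subtree rooted here


def insert(node, key):
    """Insert key (no rebalancing) and keep subtree sizes up to date."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    node.size += 1
    return node


def count_below(node, x, inclusive=False):
    """Count stored keys < x, or <= x when inclusive is True."""
    count = 0
    while node is not None:
        qualifies = node.key <= x if inclusive else node.key < x
        if qualifies:
            # This node and its whole left subtree qualify.
            count += 1 + (node.left.size if node.left else 0)
            node = node.right
        else:
            node = node.left
    return count


def to_rank_range(root, x1, x2):
    """Map the x-range [x1, x2] to a 1-based rank range (empty if lo > hi)."""
    return count_below(root, x1) + 1, count_below(root, x2, inclusive=True)
```

Each step of `count_below` descends one level, so on a balanced tree the conversion visits O(lg n) nodes.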

Dynamic Range Counting: A Range Tree
• T_y: a weight-balanced B-tree (Arge and Vitter 2003) constructed over all the y-coordinates
  - Branching factor d = Θ(lg^ε n) for constant 0 < ε < 1
  - Leaf parameter: 1
  - The levels are numbered 0, 1, … from top to bottom
• Essentially a range tree
  - Each node represents a range of y-coordinates
• Choice of weight-balanced B-tree: amortizing the rebuilding cost

Dynamic Range Counting: A Wavelet Tree
• Ideas from generalized wavelet trees (Ferragina et al. 2006)
• For each node v of T_y, construct a sequence S_v:
  - Each entry of S_v corresponds to a point whose y-coordinate is in the range represented by node v
  - S_v[i] corresponds to the point with the i-th smallest x-coordinate among all these points
  - S_v[i] indicates which child of v contains the y-coordinate of that point
• For each level m, construct a sequence L_m[1..n] of integers from [1..4d] by concatenating all the S_v's constructed at level m
• L_m: stored as a dynamic sequence of small integers
• Space: O(n lg d + w) bits per level, O(n) words overall

Range Counting Queries
• Query range: [x_1..x_2] × [y_1..y_2]
• Use T_x to convert the query x-range to a range in rank space
• Perform a top-down traversal to locate the (up to two) leaves in T_y whose ranges contain y_1 and y_2
• Perform range_count on S_v for each node v visited in the above traversal
• Sum up the query results to get the answer
• Time: O(lg n / lglg n) per level over O(lg n / lglg n) levels, so O((lg n / lglg n)^2) in total
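The query procedure can be sketched on a much simpler, static stand-in. The code below is an illustration under stated assumptions, not the structure from the talk: it uses a binary tree over integer y-values instead of a weight-balanced B-tree with branching factor Θ(lg^ε n), plain prefix-count arrays instead of succinct dynamic sequences, a sorted list with binary search instead of T_x, and it supports no updates. The names `WaveletNode`, `build` and `range_count` are hypothetical.

```python
from bisect import bisect_left, bisect_right


class WaveletNode:
    def __init__(self, ys, lo, hi):
        # ys: y-values of the points in this node, in increasing x order.
        # [lo, hi]: the range of y-values this node represents.
        self.lo, self.hi = lo, hi
        self.left = self.right = None
        if lo == hi or not ys:
            return
        mid = (lo + hi) // 2
        # prefix[i] = how many of the first i entries go to the left child
        # (a stand-in for rank on the child-index sequence S_v).
        self.prefix = [0]
        for y in ys:
            self.prefix.append(self.prefix[-1] + (y <= mid))
        self.left = WaveletNode([y for y in ys if y <= mid], lo, mid)
        self.right = WaveletNode([y for y in ys if y > mid], mid + 1, hi)

    def count(self, i, j, y1, y2):
        """Count entries at positions [i, j) whose value lies in [y1, y2]."""
        if i >= j or y2 < self.lo or self.hi < y1:
            return 0
        if y1 <= self.lo and self.hi <= y2:
            return j - i  # node range fully inside the query y-range
        li, lj = self.prefix[i], self.prefix[j]
        return (self.left.count(li, lj, y1, y2)
                + self.right.count(i - li, j - lj, y1, y2))


def build(points):
    """points: non-empty list of (x, y) pairs with integer y (assumption)."""
    pts = sorted(points)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    return xs, WaveletNode(ys, min(ys), max(ys))


def range_count(xs, root, x1, x2, y1, y2):
    # Convert the x-range to a half-open range of positions in rank space.
    i = bisect_left(xs, x1)
    j = bisect_right(xs, x2)
    return root.count(i, j, y1, y2)
```

At each level the position range is mapped into the children with two prefix lookups, which mirrors how rank-space positions are carried down the tree during the traversal.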

Insertions and Deletions
• More complicated: splits and merges; changes to child ranks
• The choice of storing T_y as a weight-balanced B-tree allows us to amortize the updating cost of subsequences of the L_m's
• Additional techniques supporting batch updating of integer sequences are also developed

Our Results
• Dynamic orthogonal range counting
  - Space: O(n) words
  - Time: O((lg n / lglg n)^2)
• Points on a U×U grid
  - Space: O(n) words
  - Time (worst-case): O(lg n lg U / (lglg n)^2)
• Succinct representations of dynamic integer sequences
  - Space: nH_0 + o(n lg σ) + O(w) bits
  - Time (including range_count): O((lg n / lglg n) (lg σ / lglg n + 1))

Conclusions
• Results
  - The best result for dynamic orthogonal range counting
  - Same problem for points on a grid
  - The first succinct representations of dynamic integer sequences supporting range counting
  - Two preliminary results on dynamic range sum
• Techniques
  - The first that combines wavelet trees with range trees
  - Deamortization on 2D arrays
• Future work
  - Lower bound
  - Use techniques from succinct data structures to improve standard data structures

Thank you!