Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency. Konstantinos Tsakalidis, PhD Defense, 23 September 2011.

Presentation transcript:

Konstantinos Tsakalidis 1 Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency Konstantinos Tsakalidis PhD Defense 23 September 2011

Konstantinos Tsakalidis 2 Κωνσταντίνος Τσακαλίδης (Konstantinos Tsakalidis)
- B. Eng., Computer Engineering and Informatics Dpt., University of Patras, Greece
- Sum. Intern, Google Inc., Mountain View, California, USA
- Ph.D. Student (Part A), MADALGO, Aarhus University, Denmark
- Sum. Visiting Prof. Ian Munro, D. Cheriton School of Computer Science, University of Waterloo, Canada
- Ph.D. Student (Part B)

Konstantinos Tsakalidis 3 Overview
- Dynamic Planar Orthogonal 3-Sided Range Reporting Queries
  - [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
  - [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
- Dynamic Planar Orthogonal Range Maxima Reporting Queries
  - [ICALP ’11] “Dynamic Planar Range Maxima Queries”
- Multi-Versioned Indexed Databases
  - [SODA ’12] “Fully Persistent B-Trees”

Konstantinos Tsakalidis 4 Databases and Geometry
[Table: employee records with columns Name, Age, Salary, Date, Phone, …; rows Andreas, Maria, John, Helen, Jacob; cell values not preserved]
[Figure: the records plotted as points, with axes Age and Salary; further dimensions Date, Name, Phone, …]
- N points in D dimensions; planar (D=2) Euclidean space
- Query operation: question about stored data
- Update operation/transaction: insert/delete tuple, change value

Konstantinos Tsakalidis 5 Models of Computation
- Pointer Machine: a record has O(1) fields. Space: #occupied records. Time: #arithmetic operations + #pointer traversals.
- word-RAM: cells of w bits, accessed in O(1) time. Space: #occupied cells. Time: #arithmetic operations + #cell READ/WRITEs.
- I/O Model [Aggarwal, Vitter ’88] (specialized database): a memory of M < N words and a disk, both divided into blocks of B words (N/B blocks of data, M/B blocks of memory); an I/O operation transfers one block. Space: #occupied blocks. Time: #I/O operations.

Konstantinos Tsakalidis 6 Overview
- Dynamic Planar Orthogonal 3-Sided Range Reporting Queries
  - [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
  - [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
- Dynamic Planar Orthogonal Range Maxima Reporting Queries
  - [ICALP ’11] “Dynamic Planar Range Maxima Queries”
- Multi-Versioned Indexed Databases
  - [SODA ’12] “Fully Persistent B-Trees”

Konstantinos Tsakalidis 7 Orthogonal Range Reporting Queries
[Figure: Employees plotted as points, with axes Age and Salary]
- Contour Query: report all points with Salary > 1000
- Dominance Query: report all points with Salary > 1000 and Age > …
- 3-Sided Query: report all points with 2000 > Salary > 1000 and Age > 35
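The three query types can be stated directly as filters over a point set. The following is a minimal brute-force sketch, not one of the data structures from the talk: it scans all N points per query. The `(age, salary)` tuple layout, the function names, and the sample employees are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (age, salary); this field order is an assumption


def contour_query(points: List[Point], salary_min: float) -> List[Point]:
    """One-sided query: all points with salary > salary_min."""
    return [p for p in points if p[1] > salary_min]


def dominance_query(points: List[Point], salary_min: float, age_min: float) -> List[Point]:
    """Two-sided query: salary > salary_min and age > age_min."""
    return [p for p in points if p[1] > salary_min and p[0] > age_min]


def three_sided_query(points: List[Point], salary_min: float,
                      salary_max: float, age_min: float) -> List[Point]:
    """Three-sided query: salary_min < salary < salary_max and age > age_min."""
    return [p for p in points if salary_min < p[1] < salary_max and p[0] > age_min]


# Hypothetical sample data, not taken from the slides.
employees = [(29, 900), (38, 1500), (45, 2500), (33, 1800), (52, 1200)]
contour = contour_query(employees, 1000)
dominance = dominance_query(employees, 1000, 35)
three_sided = three_sided_query(employees, 1000, 2000, 35)
```

Each added side only restricts the previous answer further, which is why a structure for 3-sided queries also answers contour and dominance queries.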

Konstantinos Tsakalidis 8 Worst-Case Efficient Dynamic 3-Sided Range Reporting
[Tables with columns Space, Query Time (I/Os), Update Time (I/Os); the bound values were not preserved]
- Pointer Machine: Priority Search Tree [McCreight ’85]
- word-RAM: Fusion Tree [Willard ’00]; [Mortensen ’06]
- I/O Model: External Priority Search Tree [Arge ’99], amortized updates
Average-Case Efficient Dynamic 3-Sided Range Reporting
- word-RAM and I/O Model: [ISAAC ’09] and [ICDT ’10], with expected (w.h.p.) query bounds and expected amortized update bounds
- Input distributions assumed by the rows: X: smooth; X: smooth, Y: restricted; X, Y: μ-random

Konstantinos Tsakalidis 9 Probabilistic Distributions
- Unknown, non-changing μ-random probability distribution
- (f,g)-smooth distribution
  - The probability of a subinterval does not exceed a specific bound, no matter how small the subinterval
  - Includes regular and uniform distributions
  - Any distribution is (f,Θ(n))-smooth
- Restricted class of distributions
  - Few elements occur very often
  - Many elements occur rarely
  - Zipfian, power law distributions

Konstantinos Tsakalidis 10 Priority Search Tree [McCreight ’85] (Pointer Machine)
- Move up the point with maximum Y
- Space: O(n)
- Update: O(log n)

Konstantinos Tsakalidis 11 Query by X-Coordinate (Pointer Machine)
- Query: O(log n + t)
[Figure: search path of length O(log n) and the subtrees inside the X-range]

Konstantinos Tsakalidis 12 Query by Y-Coordinate
- Query: O(log n + t)
- Pointer Machine: at node u with children u_l and u_r, find the next point to be reported in O(1) time
- word-RAM: 1D Range Maximum Queries over the children of u [Alstrup, Brodal, Rauhe ’00], O(1) time
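The priority search tree slides can be illustrated with a small static sketch: each node stores the point with maximum y of its subtree (the point that was "moved up"), and the remaining points are split by x. A 3-sided query prunes by x using the split value and by y using the heap order. This is only an assumed, simplified variant: it is built once from sorted input, supports no updates, and uses closed ranges; the names `Node`, `build`, and `three_sided` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


@dataclass
class Node:
    point: Point                     # point with maximum y in this subtree
    split: float                     # left subtree: x <= split, right subtree: x >= split
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(points_by_x: List[Point]) -> Optional[Node]:
    """Build a static priority search tree from points sorted by x."""
    if not points_by_x:
        return None
    # Move the point with maximum y up to this node.
    top = max(range(len(points_by_x)), key=lambda i: points_by_x[i][1])
    rest = points_by_x[:top] + points_by_x[top + 1:]
    if not rest:
        return Node(points_by_x[top], points_by_x[top][0])
    mid = (len(rest) - 1) // 2       # split the remaining points evenly by x
    return Node(points_by_x[top], rest[mid][0],
                build(rest[:mid + 1]), build(rest[mid + 1:]))


def three_sided(node: Optional[Node], x1: float, x2: float, y_min: float,
                out: Optional[List[Point]] = None) -> List[Point]:
    """Report all points with x1 <= x <= x2 and y >= y_min."""
    out = [] if out is None else out
    if node is None or node.point[1] < y_min:
        return out                   # heap order: nothing below has y >= y_min
    if x1 <= node.point[0] <= x2:
        out.append(node.point)
    if x1 <= node.split:
        three_sided(node.left, x1, x2, y_min, out)
    if x2 >= node.split:
        three_sided(node.right, x1, x2, y_min, out)
    return out


pts = sorted([(1, 5), (2, 9), (3, 3), (4, 7), (5, 1), (6, 8), (7, 2)])
root = build(pts)
result = sorted(three_sided(root, 2, 6, 4))
```

The y-pruning is what bounds the work inside the x-range by the output size t, and the x-pruning confines the remaining work to the two search paths.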

Konstantinos Tsakalidis 13 [ISAAC ’09] (word-RAM)
- Update: O(log log n) expected amortized
- Query: O(log log n + t) expected w.h.p.
- Space: O(n)
- Weight_i = Θ(2^{2^i})
- O(log log n) expected w.h.p. [Mehlhorn, Tsakalidis ’93; Kaporis et al. ’06]
- [Andersson, Thorup ’07]
- RMQ: O(1) expected amortized
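A short arithmetic sketch of why doubly exponential weights lead to doubly logarithmic bounds: if level i has weight about 2^(2^i), the weight reaches n after roughly log log n levels. The function below is a hypothetical illustration that takes the constant hidden in Θ to be 1.

```python
def levels_needed(n: int) -> int:
    """Smallest k with 2**(2**k) >= n, assuming the Theta-constant is 1."""
    k = 0
    while 2 ** (2 ** k) < n:
        k += 1
    return k


levels_for_million = levels_needed(10 ** 6)   # weights 2, 4, 16, 256, 65536, 2**32
levels_for_2_64 = levels_needed(2 ** 64)
```

Squaring the weight from one level to the next is what makes the number of levels grow so slowly: going from a million to 2^64 elements adds a single level.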

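Konstantinos Tsakalidis 14 Average-Case Efficient Dynamic 3-Sided Range Reporting (X: smooth)
[Tables with columns Space, Query, Update; the I/O Model bound values were not preserved]
- word-RAM, [ISAAC ’09]: Space O(n); Query O(log log n + t) expected w.h.p.; Update O(log log n) expected amortized
- I/O Model, [ISAAC ’09]: Query I/Os expected w.h.p.; Update I/Os amortized expected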

Konstantinos Tsakalidis 15 Overview
- Dynamic Planar Orthogonal 3-Sided Range Reporting Queries
  - [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
  - [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
- Dynamic Planar Orthogonal Range Maxima Reporting Queries
  - [ICALP ’11] “Dynamic Planar Range Maxima Queries”
- Multi-Versioned Indexed Databases
  - [SODA ’12] “Fully Persistent B-Trees”

Konstantinos Tsakalidis 16 Orthogonal Range MAXIMA Reporting Queries, or “Generalized Planar SKYLINE Operator”
[Figure: Employees plotted as points, with axes Age and Salary; the interesting points are the oldest and best paid]
- Dominates: is “above”. Maximal point: is NOT dominated.
- Dominance Maxima Queries: report all maximal points among points with x in [x_l, +∞) and y in [y_b, +∞)
- Contour Maxima Queries: report all maximal points among points with x in (-∞, x_l]
- 3-Sided Maxima Queries: report all maximal points among points with x in [x_l, x_r] and y in [y_b, +∞)
- 4-Sided Maxima Queries: report all maximal points among points with x in [x_l, x_r] and y in [y_b, y_t]
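The maxima queries above can be sketched as "filter, then take the skyline". The code below is a brute-force illustration that sorts the filtered points on every query, not the dynamic structure from [ICALP ’11]. It assumes that p dominates q when p is at least as large in both coordinates, and that the points are distinct; the function names and sample points are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y)


def maxima(points: List[Point]) -> List[Point]:
    """Points that are not dominated by any other point.

    Sweep by decreasing x; a point is maximal exactly when its y exceeds
    every y seen so far (all seen points have x at least as large).
    """
    result = []
    best_y = float("-inf")
    for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
        if y > best_y:
            result.append((x, y))
            best_y = y
    return result


def four_sided_maxima(points: List[Point], xl: float, xr: float,
                      yb: float, yt: float) -> List[Point]:
    """Maximal points among the points inside [xl, xr] x [yb, yt]."""
    inside = [p for p in points if xl <= p[0] <= xr and yb <= p[1] <= yt]
    return maxima(inside)


def three_sided_maxima(points: List[Point], xl: float, xr: float, yb: float) -> List[Point]:
    """The 3-sided case is the 4-sided case with yt = +infinity."""
    return four_sided_maxima(points, xl, xr, yb, float("inf"))


# Hypothetical sample data, not taken from the slides.
pts = [(1, 9), (2, 4), (3, 7), (4, 2), (5, 5), (6, 1)]
all_maxima = maxima(pts)
three = three_sided_maxima(pts, 2, 5, 3)
four = four_sided_maxima(pts, 2, 5, 3, 6)
```

Note that the 4-sided answer is not a subset of the global maxima in general: lowering y_t can expose points that were dominated by points now cut off.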

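Konstantinos Tsakalidis 17 Worst-Case Efficient Dynamic Range MAXIMA Reporting
[Table: query bounds per query type, then Insert and Delete; the query column headers were not preserved, and cells shared by adjacent columns appear once]
Pointer Machine:
- Overmars, van Leeuwen ’81: log n + t; -; log² n
- Frederickson, Rodger ’90: log n + t; log² n + t; log n (1 + t); log n; log² n
- Janardan ’91: log n + t; log n; log² n
- Kapoor ’00: log n + t amo.; -; log n
- [ICALP ’11]: log n + t; log n
word-RAM:
- [ICALP ’11]: bound values not preserved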

Konstantinos Tsakalidis 18 Tournament Tree (Pointer Machine)
- Copy up the point with maximum Y
- Y-winning paths

Konstantinos Tsakalidis 19 Tournament Tree (Pointer Machine)
[Figure: node u, with Right(u) and MAX( ) marked]
- Find next point to be reported in O(1) time

Konstantinos Tsakalidis 20 3-Sided Range Maxima Queries (Pointer Machine)
- Query time: O(log n + t)
[Figure: MAX( ) over Subtrees(Paths); O(log n)]

Konstantinos Tsakalidis 21 Update Operation (Pointer Machine)
- Previous update: O(log² n)

Konstantinos Tsakalidis 22 Update Operation (Pointer Machine)
[Figure: node u with children u_L and u_R, labelled MAX(Right(u)), MAX(Right(u_L)) and MAX(Right(u_R))]
- [Sundar ’89] Priority Queue with Attrition: O(1) time
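As a rough illustration of the abstract data type, a priority queue with attrition removes every stored element that is at least as large as a newly inserted one. The sketch below is a simple monotone-deque version with amortized O(1) operations. It is not Sundar's structure, which achieves O(1) worst-case time; the class and method names are hypothetical, and treating ties as attrited is an assumption.

```python
from collections import deque


class AttritionQueue:
    """Priority queue with attrition, amortized O(1) per operation.

    Invariant: stored elements are strictly increasing from front to back,
    so the minimum is always at the front.
    """

    def __init__(self):
        self._items = deque()

    def insert_and_attrite(self, e):
        # Attrition: drop every stored element >= e, then append e.
        while self._items and self._items[-1] >= e:
            self._items.pop()
        self._items.append(e)

    def find_min(self):
        return self._items[0] if self._items else None

    def delete_min(self):
        return self._items.popleft() if self._items else None

    def __len__(self):
        return len(self._items)


q = AttritionQueue()
for value in (5, 7, 9, 6):
    q.insert_and_attrite(value)      # inserting 6 attrites 9 and 7
first = q.delete_min()
second = q.find_min()
q.insert_and_attrite(3)              # attrites everything that is left
```

Each element is appended once and removed at most once, which gives the amortized bound; making every single operation O(1) in the worst case is the harder part that the cited structure addresses.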

Konstantinos Tsakalidis 23 Update Operation (Pointer Machine)
- Reconstruct, Rollback
- Partially Persistent Priority Queue with Attrition: O(1) time and space overhead per update step; [Brodal ’96] worst case, [Driscoll et al. ’89] amortized
- Space: O(n)
- Update: O(log n)

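Konstantinos Tsakalidis 24 [ICALP ’11]
[Two tables with columns Space, Query, Insert, Delete; the query column header was not preserved, and a bound shared by Insert and Delete appears once]
- Pointer Machine: Space n; Query log n + t; Insert/Delete log n
- word-RAM: Space n; remaining bound values not preserved
- Pointer Machine: Space n log n; Query log² n + t; Insert/Delete log² n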

Konstantinos Tsakalidis 25 Rectangular Visibility Queries
- Rectangular visibility query = 4 × 4-Sided Range Maxima Queries, one per direction: (+∞,+∞), (+∞,-∞), (-∞,+∞), (-∞,-∞)
- Proximity Queries / Similarity Search

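Konstantinos Tsakalidis 26 Worst-Case Efficient 4-Sided Range MAXIMA Reporting and Rectangular Visibility Queries
[Table, Pointer Machine, columns Space, Query, Insert, Delete; the query column header was not preserved]
- Overmars, Wood ’88: Space n log n; Query log² n + t; Insert log² n; Delete log³ n
- Overmars, Wood ’88: Space n log n; Query log² n + t log n; Insert/Delete log² n
- [ICALP ’11]: Space n log n; Query log² n + t; Insert/Delete log² n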

Konstantinos Tsakalidis 27 Overview
- Dynamic Planar Orthogonal 3-Sided Range Reporting Queries
  - [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
  - [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
- Dynamic Planar Orthogonal Range Maxima Reporting Queries
  - [ICALP ’11] “Dynamic Planar Range Maxima Queries”
- Multi-Versioned Indexed Databases
  - [SODA ’12] “Fully Persistent B-Trees”

Konstantinos Tsakalidis 28 B-Trees [Bayer, McCreight ’72]
[Table: indexed database with columns Name, Age, Salary, …; rows Andreas, Maria, John, Helen, Jacob; cell values not preserved]
- Space: O(N/B) blocks
- Update: O(log_B N) I/Os
- Access: O(log_B N) I/Os
- Multi-Versioned Databases: Btrfs, Data Platform

Konstantinos Tsakalidis 29 Fully Persistent B-Trees (I/O Model)
- Lanka, Mays ’91: Space n/B; Query (log_B n + t/B) · log_B m I/Os; Update log_B n · log_B m I/Os amortized
- [SODA ’12]: Space n/B; Query log_B n + t/B I/Os; Update log_B n + log_2 B I/Os amortized
- n: elements in one version; m: update operations = #versions; B: block size
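To get a feel for the bounds in this comparison, the expressions can be evaluated with all hidden constants dropped. The sketch below is only arithmetic on the formulas as listed on the slide; the function names and the sample values of n, m, t and B are hypothetical, and real I/O counts depend on constants that the asymptotic bounds omit.

```python
import math


def lanka_mays_query(n: int, m: int, t: int, B: int) -> float:
    """(log_B n + t/B) * log_B m, constants dropped."""
    return (math.log(n, B) + t / B) * math.log(m, B)


def soda12_query(n: int, t: int, B: int) -> float:
    """log_B n + t/B, constants dropped."""
    return math.log(n, B) + t / B


def lanka_mays_update(n: int, m: int, B: int) -> float:
    """log_B n * log_B m, constants dropped."""
    return math.log(n, B) * math.log(m, B)


def soda12_update(n: int, B: int) -> float:
    """log_B n + log_2 B, constants dropped."""
    return math.log(n, B) + math.log2(B)


# Hypothetical parameters: a billion elements and versions, 1024-word blocks.
n, m, t, B = 10 ** 9, 10 ** 9, 10 ** 4, 1024
query_ratio = lanka_mays_query(n, m, t, B) / soda12_query(n, t, B)
```

The query bounds differ exactly by the multiplicative factor log_B m, so `query_ratio` equals log_B m here. For updates, the multiplicative log_B m is traded for an additive log_2 B term, so which expression is numerically smaller depends on how m compares with B.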

Konstantinos Tsakalidis 30 [SODA ’12]
I/O-Efficient Full Persistence
- Input is a pointer-based structure
  - Node occupies O(1) blocks
  - Node has indegree O(1)
- Interface of primitive operations: READ, WRITE, ACCESS, NEW_NODE, NEW_VERSION
- [Driscoll et al. ’89] Node-Splitting Method
- O(1) I/O-overhead per access to a block
- O(log_2 B) I/O-overhead per change to a block
Incremental B-Trees
- Lazy updates
- O(log_B N) READs
- O(1) WRITEs that make O(1) changes to a block
Result
- Space: O(N/B)
- Query: O(log_B N + t/B) I/Os
- Update: O(log_B N + log_2 B) I/Os

Konstantinos Tsakalidis 31 Mange Tak (Thank you very much)
Konstantinos Tsakalidis, Ph.D. Student
Tsakalidis K., et al.:
- [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
- [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
- [ICALP ’11] “Dynamic Planar Range Maxima Queries”
- [SODA ’12] “Fully Persistent B-Trees”