Konstantinos Tsakalidis 1 Dynamic Data Structures: Orthogonal Range Queries and Update Efficiency Konstantinos Tsakalidis PhD Defense 23 September 2011
Κωνσταντίνος Τσακαλίδης (Konstantinos Tsakalidis)
B. Eng.: Computer Engineering and Informatics Dpt., University of Patras, Greece
Summer: Intern, Google Inc., Mountain View, California, USA
Ph. D. Student (Part A): MADALGO, Aarhus University, Denmark
Summer: Visiting Prof. Ian Munro, D. Cheriton School of Computer Science, University of Waterloo, Canada
Ph. D. Student (Part B)
Overview
Dynamic Planar Orthogonal 3-Sided Range Reporting Queries
  [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
  [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
Dynamic Planar Orthogonal Range Maxima Reporting Queries
  [ICALP ’11] “Dynamic Planar Range Maxima Queries”
Multi-Versioned Indexed Databases
  [SODA ’12] “Fully Persistent B-Trees”
Databases and Geometry
Database table: columns Name, Age, Salary, Date, Phone, …; rows Andreas, Maria, John, Helen, Jacob, …
Geometric view: N points in D dimensions, one dimension per column. Figure: planar (D=2) Euclidean space with axes Age and Salary.
Query operation: a question about the stored data.
Update operation/transaction: insert or delete a tuple, change a value.
Models of Computation
Pointer Machine: records with O(1) fields. Space: #occupied records. Time: #arithmetic operations + #pointer traversals.
word-RAM: cells of w bits, accessed in O(1) time. Space: #occupied cells. Time: #arithmetic operations + #cell READ/WRITEs.
I/O Model [Aggarwal, Vitter ’88] (specialized database): N elements on disk, memory of M < N words, blocks of B words, so N/B blocks on disk and M/B blocks in memory; one I/O operation moves one block between disk and memory. Space: #occupied blocks. Time: #I/O operations.
Orthogonal Range Reporting Queries
Example: Employees, plotted by Age and Salary.
Contour Query: report all points with Salary > 1000.
Dominance Query: report all points with Salary > 1000 and Age > …
3-Sided Query: report all points with 2000 > Salary > 1000 and Age > 35.
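To make the three query types concrete, here is a minimal brute-force sketch in Python. It scans every point, so it only illustrates the semantics of the queries and none of the data structures from the talk. The employee data is made up for illustration, and the age threshold 35 used for the dominance query is an assumption borrowed from the 3-sided example.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (salary, age); hypothetical sample layout


def contour(points: List[Point], s_lo: float) -> List[Point]:
    """Contour query: one-sided constraint on salary."""
    return [p for p in points if p[0] > s_lo]


def dominance(points: List[Point], s_lo: float, a_lo: float) -> List[Point]:
    """Dominance query: one-sided constraints on salary and on age."""
    return [p for p in points if p[0] > s_lo and p[1] > a_lo]


def three_sided(points: List[Point], s_lo: float, s_hi: float, a_lo: float) -> List[Point]:
    """3-sided query: two-sided constraint on salary, one-sided on age."""
    return [p for p in points if s_lo < p[0] < s_hi and p[1] > a_lo]


# Made-up employees as (salary, age) pairs.
employees = [(900, 40), (1500, 30), (1800, 45), (2500, 50)]

print(contour(employees, 1000))            # salary > 1000
print(dominance(employees, 1000, 35))      # salary > 1000 and age > 35 (assumed threshold)
print(three_sided(employees, 1000, 2000, 35))
```

Each function takes O(N) time per query; the structures discussed in the talk aim at query times that depend on the output size t plus a small additive term instead.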
Worst-Case Efficient Dynamic 3-Sided Range Reporting (columns: Space, Query, Update)
Pointer Machine: Priority Search Tree [McCreight ’85]
word-RAM: Fusion Tree [Willard ’00]; [Mortensen ’06]
I/O Model: External Priority Search Tree [Arge ’99], amortized

Average-Case Efficient Dynamic 3-Sided Range Reporting (columns: Space, Query, Update)
Input distributions: X, Y: μ-random; X: smooth, Y: restricted; X: smooth
word-RAM (Query Time, Update Time):
  [ICDT ’10]: expected w.h.p.
  [ICDT ’10]: expected w.h.p.; expected w.h.p.
  [ISAAC ’09]: expected w.h.p.; expected amortized
I/O Model (Query I/Os, Update I/Os):
  [ICDT ’10]: amortized expected w.h.p.
  [ICDT ’10]: expected w.h.p.; amortized expected w.h.p.
  [ISAAC ’09]: expected w.h.p.; amortized expected
Probabilistic Distributions
μ-random: an unknown, non-changing probability distribution.
Smooth, (f,g)-smooth distribution: the probability mass of any subinterval does not exceed a specific bound, no matter how small the subinterval. Includes regular and uniform distributions. Any distribution is (f,Θ(n))-smooth.
Restricted class of distributions: few elements occur very often, many elements occur rarely. Examples: Zipfian and power-law distributions.
Priority Search Tree [McCreight ’85] (Pointer Machine)
Move up the point with maximum Y.
Space: O(n). Update: O(log n).
Query by X-Coordinate: O(log n + t) (Pointer Machine)
Search paths for the X-range: O(log n). Subtrees inside the X-range hang off the paths.
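The following is a minimal static sketch of a priority search tree in Python, under the assumption that all x-coordinates are distinct. It shows the "move up maximum Y" construction and a 3-sided query over [x1, x2] × [y0, +∞). It does not implement the O(log n) updates mentioned in the talk, and the names `Node`, `build`, and `query` are illustrative, not taken from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


@dataclass
class Node:
    point: Point                    # max-y point of this subtree, moved up to here
    split: float                    # left subtree holds x < split, right holds x >= split
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(pts: List[Point]) -> Optional[Node]:
    """Build from points sorted by x (distinct x-coordinates assumed)."""
    if not pts:
        return None
    i = max(range(len(pts)), key=lambda k: pts[k][1])   # index of the max-y point
    top, rest = pts[i], pts[:i] + pts[i + 1:]
    mid = len(rest) // 2
    split = rest[mid][0] if rest else top[0]
    return Node(top, split, build(rest[:mid]), build(rest[mid:]))


def query(node: Optional[Node], x1: float, x2: float, y0: float, out: List[Point]) -> None:
    """Append to out every point with x1 <= x <= x2 and y >= y0."""
    if node is None or node.point[1] < y0:
        return                      # heap order on y: nothing below can qualify
    if x1 <= node.point[0] <= x2:
        out.append(node.point)
    if x1 < node.split:
        query(node.left, x1, x2, y0, out)
    if x2 >= node.split:
        query(node.right, x1, x2, y0, out)


points = [(1, 5), (2, 9), (3, 2), (4, 7), (5, 1), (6, 8)]
tree = build(sorted(points))
found: List[Point] = []
query(tree, 2, 5, 5, found)
print(sorted(found))
```

The query descends only along the two search paths for x1 and x2 and into subtrees whose root still satisfies the y-constraint, which is where the log n + t shape of the bound comes from.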
Query by Y-Coordinate: O(log n + t)
Pointer Machine: at a node u with children u_l and u_r, find the next point to be reported in O(1) time.
word-RAM: 1D Range Maximum Queries over the children of u [Alstrup, Brodal, Rauhe ’00], O(1) time.
[ISAAC ’09] (word-RAM)
Space: O(n). Query: O(log log n + t) expected w.h.p. Update: O(log log n) expected amortized.
Weight_i = Θ(2^(2^i)).
O(log log n) expected w.h.p. [Mehlhorn, Tsakalidis ’93; Kaporis et al. ’06]; [Andersson, Thorup ’07]
RMQ: O(1) expected amortized.
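A short worked calculation may help show why doubly exponential weights go together with doubly logarithmic bounds. The sketch below assumes, as one reading of the weight formula on this slide, that level i covers on the order of 2^(2^i) elements, and it counts how many levels are needed before n elements are covered. The function name `levels_needed` is illustrative.

```python
def levels_needed(n: int) -> int:
    """Smallest i such that 2**(2**i) >= n."""
    i = 0
    while 2 ** (2 ** i) < n:
        i += 1
    return i


for n in (2, 65536, 10**6, 10**18):
    print(n, levels_needed(n))
```

Even for n = 10^18 only 6 levels are required, which matches the intuition that the number of levels grows like log log n.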
Average-Case Efficient Dynamic 3-Sided Range Reporting: [ISAAC ’09], X: smooth
word-RAM: Query Time expected w.h.p.; Update Time expected amortized.
I/O Model: Query I/Os expected w.h.p.; Update I/Os amortized expected.
Orthogonal Range MAXIMA Reporting Queries, or “Generalized Planar SKYLINE Operator”
Example: Employees, plotted by Age and Salary. Interesting points: the oldest and best paid.
Dominates: is “above”. Maximal point: is NOT dominated.
Dominance Maxima Queries: report all maximal points among points with x in [x_l, +∞) and y in [y_b, +∞).
Contour Maxima Queries: report all maximal points among points with x in (-∞, x_l].
3-Sided Maxima Queries: report all maximal points among points with x in [x_l, x_r] and y in [y_b, +∞).
4-Sided Maxima Queries: report all maximal points among points with x in [x_l, x_r] and y in [y_b, y_t].
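As an illustration of what these queries return, here is a brute-force Python sketch: it filters the points inside the query rectangle and then sweeps them from right to left, keeping every point whose y exceeds all y-values seen so far. It takes O(N log N) time per query and is meant only to pin down the definitions, not to reflect the dynamic structures of the talk. The sample data (age, salary) is made up, and a point is taken to dominate another if it is at least as large in both coordinates.

```python
from math import inf
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), e.g. (age, salary)


def maxima(points: List[Point]) -> List[Point]:
    """Points not dominated by any other point, listed by decreasing x."""
    result: List[Point] = []
    best_y = -inf
    # Sort by decreasing x (ties broken by decreasing y), then sweep.
    for x, y in sorted(points, reverse=True):
        if y > best_y:              # nothing to the right is at least as high
            result.append((x, y))
            best_y = y
    return result


def range_maxima(points: List[Point], x_l: float = -inf, x_r: float = inf,
                 y_b: float = -inf, y_t: float = inf) -> List[Point]:
    """Maximal points among the points inside [x_l, x_r] x [y_b, y_t]."""
    inside = [p for p in points if x_l <= p[0] <= x_r and y_b <= p[1] <= y_t]
    return maxima(inside)


staff = [(25, 3000), (30, 2000), (35, 4000), (40, 1000), (45, 2500), (50, 1500)]

print(range_maxima(staff))                                   # all maxima
print(range_maxima(staff, x_l=25, x_r=40, y_b=1500))          # 3-sided
print(range_maxima(staff, x_l=25, x_r=45, y_b=1500, y_t=3000))  # 4-sided
```

Leaving bounds at their infinite defaults gives the dominance and contour variants, so one function covers all four query types in this sketch.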
Worst-Case Efficient Dynamic Range MAXIMA Reporting
Pointer Machine (query bounds, followed by Insert and Delete bounds):
  Overmars, van Leeuwen ’81: log n + t | - | log² n
  Frederickson, Rodger ’90: log n + t | log² n + t | log n (1+t) | log n | log² n
  Janardan ’91: log n + t | log n | log² n
  Kapoor ’00: log n + t amortized | - | log n
  [ICALP ’11]: log n + t | log n
word-RAM (Insert, Delete):
  [ICALP ’11]
Tournament Tree (Pointer Machine)
Copy up the point with maximum Y. Y-winning paths.
Tournament Tree (Pointer Machine)
Node u stores MAX(Right(u)). Find the next point to be reported in O(1) time.
3-Sided Range Maxima Queries (Pointer Machine)
Query time: O(log n + t).
MAX( ) over Subtrees(Paths): O(log n).
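The sketch below illustrates the tournament-tree idea in Python in a deliberately simplified form. Each internal node stores a copy of the max-y point of its subtree, and a 3-sided maxima query repeatedly asks for the highest point in the remaining x-range, then continues strictly to the right of it. This costs O(log n) per reported point, that is O((1 + t) log n) in total, which is weaker than the O(log n + t) bound on the slide; the MAX(Right(u)) information that achieves the stronger bound is not implemented here. The class name and the array layout are assumptions for illustration, and all coordinates are assumed distinct.

```python
import bisect
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (x, y)


class TournamentTree:
    def __init__(self, points: List[Point]) -> None:
        self.pts = sorted(points)                    # leaves ordered by x
        self.xs = [p[0] for p in self.pts]
        self.size = 1
        while self.size < max(len(self.pts), 1):
            self.size *= 2
        # win[u] is the max-y point in the subtree of node u (copied up).
        self.win: List[Optional[Point]] = [None] * (2 * self.size)
        for i, p in enumerate(self.pts):
            self.win[self.size + i] = p
        for u in range(self.size - 1, 0, -1):
            self.win[u] = self._higher(self.win[2 * u], self.win[2 * u + 1])

    @staticmethod
    def _higher(a: Optional[Point], b: Optional[Point]) -> Optional[Point]:
        if a is None:
            return b
        if b is None:
            return a
        return a if a[1] >= b[1] else b

    def range_max(self, lo: int, hi: int) -> Optional[Point]:
        """Max-y point among leaves lo..hi-1, or None if the range is empty."""
        best = None
        lo += self.size
        hi += self.size
        while lo < hi:
            if lo & 1:
                best = self._higher(best, self.win[lo])
                lo += 1
            if hi & 1:
                hi -= 1
                best = self._higher(best, self.win[hi])
            lo //= 2
            hi //= 2
        return best

    def maxima_3sided(self, x_l: float, x_r: float, y_b: float) -> List[Point]:
        """Maximal points with x in [x_l, x_r] and y >= y_b, by increasing x."""
        lo = bisect.bisect_left(self.xs, x_l)
        hi = bisect.bisect_right(self.xs, x_r)
        out: List[Point] = []
        while lo < hi:
            p = self.range_max(lo, hi)
            if p is None or p[1] < y_b:
                break                                # everything left is too low
            out.append(p)
            lo = bisect.bisect_right(self.xs, p[0])  # continue right of p
        return out


tree = TournamentTree([(25, 3000), (30, 2000), (35, 4000),
                       (40, 1000), (45, 2500), (50, 1500)])
print(tree.maxima_3sided(25, 50, 0))
print(tree.maxima_3sided(25, 40, 1500))
```

The highest point in an x-range is always maximal, and every point to its left is dominated by it, which is why restarting to its right enumerates exactly the staircase of maximal points.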
Update Operation (Pointer Machine)
Previous update: O(log² n).
Update Operation (Pointer Machine)
Node u with children u_L and u_R: MAX(Right(u_L)), MAX(Right(u)), MAX(Right(u_R)).
Priority Queue with Attrition [Sundar ’89]: O(1) time.
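For readers unfamiliar with priority queues with attrition, here is a minimal Python sketch of the interface: inserting an element also removes ("attrites") every stored element that is greater than or equal to it. This version uses a deque and achieves O(1) amortized time per operation; it is a simplification and not the O(1) worst-case structure of [Sundar ’89] cited on the slide. The class and method names are illustrative.

```python
from collections import deque
from typing import Deque, Optional


class AttritionQueue:
    def __init__(self) -> None:
        # Invariant: elements are stored in strictly increasing order.
        self._items: Deque[float] = deque()

    def insert_and_attrite(self, e: float) -> None:
        """Insert e and discard every stored element >= e."""
        while self._items and self._items[-1] >= e:
            self._items.pop()
        self._items.append(e)

    def find_min(self) -> Optional[float]:
        return self._items[0] if self._items else None

    def delete_min(self) -> Optional[float]:
        return self._items.popleft() if self._items else None

    def __len__(self) -> int:
        return len(self._items)


q = AttritionQueue()
for value in (5, 7, 3, 4, 6, 4):
    q.insert_and_attrite(value)
print(q.find_min(), len(q))
```

Each element is appended once and popped at most once, which gives the amortized constant bound of this sketch.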
Update Operation (Pointer Machine): Reconstruct, Rollback
Partially Persistent Priority Queue with Attrition: O(1) time and space overhead per update step ([Brodal ’96] worst case, [Driscoll et al. ’89] amortized).
Space: O(n). Update: O(log n).
[ICALP ’11] Results (Space | Query | Insert, Delete)
Pointer Machine: n | log n + t | log n
word-RAM: n
Pointer Machine: n log n | log² n + t | log² n
Rectangular Visibility Queries
4 × 4-Sided Range Maxima Queries, one towards each of (+∞,+∞), (+∞,-∞), (-∞,+∞), (-∞,-∞).
Proximity Queries/Similarity Search.
Worst-Case Efficient 4-Sided Range MAXIMA Reporting and Rectangular Visibility Queries
Pointer Machine (Space | Query | Insert | Delete):
  Overmars, Wood ’88: n log n | log² n + t | log² n | log³ n
  Overmars, Wood ’88: n log n | log² n + t log n | log² n
  [ICALP ’11]: n log n | log² n + t | log² n
B-Trees [Bayer, McCreight ’72]
Indexed database: table with columns Name, Age, Salary, …; rows Andreas, Maria, John, Helen, Jacob, …
Space: O(N/B) blocks. Update: O(log_B N) I/Os. Access: O(log_B N) I/Os.
Multi-Versioned Databases: Btrfs, Data Platform.
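To give a feel for the O(log_B N) bound, the following Python sketch counts how many levels a tree with fanout B needs to index N items. It ignores constant factors and treats the block size B directly as the fanout, which is an assumption made for illustration; the function name is hypothetical.

```python
def btree_height(n_items: int, fanout: int) -> int:
    """Smallest h such that fanout**h >= n_items (integer arithmetic only)."""
    height, capacity = 0, 1
    while capacity < n_items:
        capacity *= fanout
        height += 1
    return height


print(btree_height(10**9, 1000))  # fanout of a block with 1000 keys
print(btree_height(10**9, 2))     # binary tree, for comparison
```

With a billion items, a fanout of 1000 gives 3 levels against 30 for a binary tree, which is why the base-B logarithm matters in the I/O model.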
Fully Persistent B-Trees
I/O Model (Space | Query I/Os | Update I/Os, amortized):
  Lanka, Mays ’91: n/B | (log_B n + t/B) log_B m | log_B n log_B m
  [SODA ’12]: n/B | log_B n + t/B | log_B n + log_2 B
n: elements in one version. m: update operations = #versions. B: block size.
[SODA ’12]
I/O-Efficient Full Persistence (Node-Splitting Method [Driscoll et al. ’89])
  Input is a pointer-based structure: a node occupies O(1) blocks and has indegree O(1).
  Interface of primitive operations: ACCESS, READ, WRITE, NEW_NODE, NEW_VERSION.
  O(1) I/O overhead per access to a block. O(log_2 B) I/O overhead per change to a block.
Incremental B-Trees (Lazy Updates)
  O(log_B N) READs. O(1) WRITEs that make O(1) changes to a block.
Result
  Space: O(N/B). Query: O(log_B N + t/B) I/Os. Update: O(log_B N + log_2 B) I/Os.
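The semantics of full persistence (any version can be queried and any version can be updated, creating a new branch) can be shown with a toy Python sketch. It stores every version as a complete immutable tuple, so each update costs O(n) time and space; it is therefore only an illustration of the interface and does not reflect the node-splitting method or the I/O bounds of the slide. The class `VersionedSet` and its methods are hypothetical names.

```python
import bisect
from typing import List, Tuple


class VersionedSet:
    """Toy fully persistent sorted set: one immutable tuple per version."""

    def __init__(self) -> None:
        self._versions: List[Tuple[int, ...]] = [()]   # version 0 is empty

    def _commit(self, data: Tuple[int, ...]) -> int:
        self._versions.append(data)
        return len(self._versions) - 1                 # id of the new version

    def insert(self, version: int, key: int) -> int:
        """Create a new version equal to `version` plus `key`."""
        old = self._versions[version]
        i = bisect.bisect_left(old, key)
        if i < len(old) and old[i] == key:
            return self._commit(old)
        return self._commit(old[:i] + (key,) + old[i:])

    def delete(self, version: int, key: int) -> int:
        """Create a new version equal to `version` minus `key`."""
        old = self._versions[version]
        i = bisect.bisect_left(old, key)
        if i < len(old) and old[i] == key:
            return self._commit(old[:i] + old[i + 1:])
        return self._commit(old)

    def range_query(self, version: int, lo: int, hi: int) -> List[int]:
        """Keys k of `version` with lo <= k <= hi."""
        data = self._versions[version]
        return list(data[bisect.bisect_left(data, lo):bisect.bisect_right(data, hi)])


store = VersionedSet()
v1 = store.insert(0, 5)
v2 = store.insert(v1, 8)      # continues from v1
v3 = store.insert(v1, 2)      # branches from v1 as well
v4 = store.delete(v2, 5)
print(store.range_query(v2, 0, 10), store.range_query(v3, 0, 10))
```

Versions v2 and v3 both descend from v1, so the versions form a tree rather than a list, which is the distinction between full and partial persistence.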
Thank you (Mange tak)
Konstantinos Tsakalidis, Ph.D. Student
Tsakalidis K., et al.:
  [ISAAC ’09] “Dynamic 3-Sided Planar Range Queries with Expected Doubly Logarithmic Time”
  [ICDT ’10] “Efficient Processing of 3-Sided Range Queries with Probabilistic Guarantees”
  [ICALP ’11] “Dynamic Planar Range Maxima Queries”
  [SODA ’12] “Fully Persistent B-Trees”