Micha Streppel TU Eindhoven  NCIM-Groep, the Netherlands and Ke Yi AT&T Labs, USA  HKUST, Hong Kong.


Problem: given a set S of N points in R^d, build a data structure such that, for a query range Q, the points in S ∩ Q can be returned efficiently. The talk focuses on range reporting; range aggregation is treated in the paper.
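
As a baseline for the problem statement above, range reporting can always be answered by testing every point, at a cost linear in N. The sketch below is only that baseline, not any structure from the talk; the names `range_report` and `rectangle` are hypothetical.

```python
def range_report(points, in_range):
    """Return the points of S that lie in Q by testing each one (linear scan)."""
    return [p for p in points if in_range(p)]


def rectangle(lo, hi):
    """Build a membership test for the axis-parallel box [lo, hi] in any dimension."""
    def contains(p):
        return all(l <= x <= h for l, x, h in zip(lo, p, hi))
    return contains


S = [(1.0, 1.0), (2.0, 5.0), (4.0, 3.0), (6.0, 2.0)]
Q = rectangle((1.5, 1.5), (5.0, 5.0))
reported = range_report(S, Q)  # the two points inside the box
```

The point of an index is to avoid this scan and pay only for a search plus the output size k.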

I/O model: main memory of size M, disk of unbounded size, block size B. One I/O transfers one block between disk and main memory.

Exact range searching in external memory:
1D, B-tree: size O(N/B), query O(log_B(N/B) + k/B).
2D, half planes [Agarwal et al. 2000]: size O(N/B), query O(log_B(N/B) + k/B).
2D, orthogonal rectangles [Arge et al. 1999]: size O(N/B) with query Θ((N/B)^ε + k/B), or query O(log_B(N/B) + k/B) with size Θ((N/B) log(N/B) / log log_B N).
2D, kdB-tree [Robinson 1981]: size O(N/B), query O((N/B)^{1/2} + k/B).
Exact range searching is difficult!
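
To make the k/B term of the 1D bound concrete, here is a minimal sketch that stores sorted keys in blocks of B and counts how many blocks a range query reads. It is a simplified stand-in for the leaf level of a B-tree, not a B-tree: the in-memory list `firsts` replaces the internal nodes, and the function names are hypothetical.

```python
from bisect import bisect_left


def build_blocks(keys, B):
    """Sort the keys and cut them into consecutive blocks of at most B keys."""
    keys = sorted(keys)
    return [keys[i:i + B] for i in range(0, len(keys), B)]


def range_query(blocks, x1, x2):
    """Report keys in [x1, x2]; also return the number of blocks read."""
    if not blocks:
        return [], 0
    firsts = [blk[0] for blk in blocks]      # stand-in for the internal nodes
    i = max(bisect_left(firsts, x1) - 1, 0)  # first block that may hold x1
    out, reads = [], 0
    while i < len(blocks) and blocks[i][0] <= x2:
        reads += 1                           # one I/O per block scanned
        out.extend(k for k in blocks[i] if x1 <= k <= x2)
        i += 1
    return out, reads


blocks = build_blocks(range(100), B=8)
keys, reads = range_query(blocks, 10, 40)    # 31 keys reported
```

Here 31 keys are reported from 5 blocks, which matches the k/B term up to an additive constant; the log_B(N/B) search term is not modelled.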

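Approximate range searching: points within distance ε · diam(Q) of the query range Q may also be reported. [Figure: Q enlarged by radius = ε · diam(Q)]
Internal memory: BBD-tree [Arya and Mount, 1995], BAR-tree [Duncan et al. 2001]. Size: O(N), query: O(log(N) + 1/ε + k_ε) for any convex Q.
External memory: this paper!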
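
The approximate model can be stated as a check on an answer set. The sketch below assumes the usual definition, which the slide only hints at: every point in Q must be reported, no point farther than ε · diam(Q) from Q may be reported, and points in between are optional. It uses a disk query for simplicity; `is_valid_answer` is a hypothetical name.

```python
from math import dist


def is_valid_answer(points, answer, center, r, eps):
    """Check an answer against an approximate disk query (center, r)."""
    outer = r + eps * 2 * r          # Q grown by eps * diam(Q), diam(Q) = 2r
    answer = set(answer)
    for p in points:
        d = dist(p, center)
        if d <= r and p not in answer:
            return False             # a point inside Q was missed
        if d > outer and p in answer:
            return False             # a point outside the enlarged range
    return True


pts = [(0.0, 0.0), (1.1, 0.0), (3.0, 0.0)]
exact = [(0.0, 0.0)]
relaxed = [(0.0, 0.0), (1.1, 0.0)]   # also reports a point in the fringe
ok_exact = is_valid_answer(pts, exact, (0.0, 0.0), 1.0, 0.1)
ok_relaxed = is_valid_answer(pts, relaxed, (0.0, 0.0), 1.0, 0.1)
```

Both answers are acceptable, which is the freedom the approximate structures exploit.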

Query bounds of linear-size structures in internal and external memory:

Problem                            Internal memory          External memory
1D                                 O(log(N) + k)            O(log_B(N/B) + k/B)
2D: half planes                    O(log(N) + k)            O(log_B(N/B) + k/B)
2D: orthogonal rectangles          O(N^ε + k)               O((N/B)^ε + k/B)
2D: kd-trees                       O(N^{1/2} + k)           O((N/B)^{1/2} + k/B)
2D: approximate range searching    O(log(N) + 1/ε + k_ε)    O(log_B(N/B) + 1/ε + k_ε/B)

All bounds except the external-memory bound for approximate range searching were known previously.
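
To get a feel for how far apart these external-memory bounds are, the sketch below plugs sample values into three of them. Constants hidden by the O-notation are dropped, so the numbers only indicate orders of magnitude; the function name and the sample values of N, B, k, k_ε and ε are assumptions made for illustration.

```python
from math import log, sqrt


def query_bounds(N, B, k, k_eps, eps):
    """Evaluate three external-memory query bounds with constants dropped."""
    n = N / B                              # number of blocks holding the input
    search = log(n) / log(B)               # log_B(N/B)
    return {
        "1D / half planes": search + k / B,
        "kd-tree": sqrt(n) + k / B,
        "approximate": search + 1 / eps + k_eps / B,
    }


b = query_bounds(N=10**9, B=1000, k=1000, k_eps=1000, eps=0.1)
```

With these sample values the kd-tree bound is about 1001 I/Os, while the approximate bound is about 13.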

[Figure: example with B = 3]
Query bounds for orthogonal rectangle ranges: internal memory O(N^{1/2} + k), external memory O((N/B)^{1/2} + k/B).

The BAR-tree: a space-partitioning scheme similar to the kd-tree that also uses diagonal cuts.
All cells are convex and fat.
Some cuts have to be unbalanced, but no two consecutive cuts are unbalanced.
Height: O(log N).
Any convex query range intersects O(log(N) + 1/ε + k_ε) cells.

Top-down blocking (figure: B = 8). Rules for a node u:
Check u's two subtrees T1 and T2.
Add u if both have ≥ B/2 nodes.
If T1 is small, check whether the entire T1 fits: if it does, add T1; otherwise do not add u.
It is not possible for both T1 and T2 to be small.

Any subtree T_u is stored in O(|T_u|/B + 1) blocks.

Query cost, with Q_ε the enlarged query range and ∂Q_ε its boundary:
Nodes intersecting both Q and ∂Q_ε: total number O(1/ε), total I/O O(1/ε).
Nodes completely inside Q_ε: total number O(k_ε), organized in O(1/ε) subtrees, total I/O O(1/ε + k_ε/B).

There are O(log N) such nodes, but we would like O(log_B N) I/Os.

size = B/2 − 1

Identify shallow nodes top-down. A node u is shallow if there is a path of length log(B) beneath u that is stored in more than c blocks.
For such a u: do a BFS for log(B) levels and move these nodes from their original blocks to a new block (size = B/2 − 1).
This achieves the desired query I/O: O(log_B(N/B) + 1/ε + k_ε/B).
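
The BFS step can be sketched as follows. The exact level count behind the slide's "size = B/2 − 1" is not recoverable from the transcript, so the number of levels is a parameter here; the example relies only on the fact that a complete binary tree with h levels has 2^h − 1 nodes, so h = log2(B) − 1 gives B/2 − 1. The `Node` class and helper names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def bfs_levels(u, levels):
    """Nodes in the first `levels` levels of the subtree rooted at u, in BFS order."""
    out, queue = [], deque([(u, 1)])
    while queue:
        node, depth = queue.popleft()
        if node is None or depth > levels:
            continue
        out.append(node)
        queue.append((node.left, depth + 1))
        queue.append((node.right, depth + 1))
    return out


def complete_tree(height, key=1):
    """Complete binary tree with heap-style keys, used only as test data."""
    if height == 0:
        return None
    return Node(key, complete_tree(height - 1, 2 * key),
                complete_tree(height - 1, 2 * key + 1))


B = 16
root = complete_tree(6)
moved = bfs_levels(root, levels=3)  # log2(16) - 1 = 3 levels
```

The nodes in `moved` are the ones that would be copied into the new block; how the original blocks are repaired afterwards is not covered by this sketch.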

Construction: O(N/B · log_{M/B}(N/B)) I/Os, the same as sorting.
Insertions and deletions: use partial rebuilding, O(log_B N + 1/B · log_{M/B}(N/B) · log(N/B)) I/Os amortized.
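
A quick numeric reading of the two bounds above, with constants dropped. The sample sizes are arbitrary, the `max(1.0, ...)` guard for inputs that fit in memory is an addition, and taking the unlabelled log(N/B) to base 2 is an assumption.

```python
from math import log


def sort_ios(N, M, B):
    """N/B * log_{M/B}(N/B), constants dropped, at least one pass over the data."""
    n, m = N / B, M / B
    return n * max(1.0, log(n) / log(m))


def update_ios(N, M, B):
    """log_B N + (1/B) * log_{M/B}(N/B) * log2(N/B); base 2 is an assumption."""
    n, m = N / B, M / B
    return log(N) / log(B) + (1.0 / B) * (log(n) / log(m)) * (log(n) / log(2))


N, M, B = 2**30, 2**20, 2**10
construction = sort_ios(N, M, B)    # about 2 passes over 2**20 blocks
per_update = update_ios(N, M, B)    # dominated by the log_B N search term
```

For these values construction takes about 2 million I/Os, and the amortized update cost is about 3 I/Os, almost all of it the log_B N term.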

S: a collection of objects. The density of S is the smallest number λ such that any ball b is intersected by at most λ objects o in S with radius(o) ≥ radius(b) [de Berg et al. 1997].
[Figure: a low-density scene and a high-density scene]
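
The definition can be probed for disks in the plane by counting, for one given ball b, the objects at least as large as b that intersect it. The density λ is the maximum of this count over all balls, so evaluating a single ball only gives a lower bound on λ. The function name and the two sample scenes are hypothetical.

```python
from math import dist


def large_objects_hit(disks, ball):
    """Count disks with radius >= radius(ball) that intersect the ball."""
    center, r = ball
    return sum(1 for (c, rad) in disks
               if rad >= r and dist(center, c) <= r + rad)


# Low density: disjoint unit disks on a grid with spacing 3.
grid = [((3.0 * i, 3.0 * j), 1.0) for i in range(3) for j in range(3)]
# High density: five nested disks around the origin.
nested = [((0.0, 0.0), float(r)) for r in range(1, 6)]

probe = ((0.0, 0.0), 1.0)
low = large_objects_hit(grid, probe)
high = large_objects_hit(nested, probe)
```

The same probe ball meets one large object in the grid scene and five in the nested scene.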

The object-BAR-tree (using guarding sets [de Berg et al. 2003]):
Size: O(λN/B)
Query: O(log_B(N/B) + λ/B · 1/ε + λ · k_ε/B)
Construction: O(λN/B · log_{M/B}(N/B))

Extends to d dimensions: query becomes O(log_B(N/B) + 1/ε^{d-1} + k_ε/B).
Non-convex query ranges: query becomes O(log_B(N/B) + 1/ε^d + k_ε/B).
Construction and the query process do not depend on ε. The actual cost is O(log_B(N/B) + min_ε {1/ε^{d-1} + k_ε/B}).
Open problem: how to update the object-BAR-tree efficiently?

Thank you!