Indexing Network Voronoi Diagrams*

Presentation transcript:

Indexing Network Voronoi Diagrams*
Ugur Demiryurek and Cyrus Shahabi, 2012. Speaker: Yihao Jhang. Adviser: Prof. Yuling Hsueh.
Hello, today I want to present the paper "Indexing Network Voronoi Diagrams". It comes from the international conference DASFAA 2012.

Outline
Abstract
Introduction: Voronoi Diagram, Network Voronoi Diagram, R-Tree, The Voronoi R-Tree
Approach: Quad-Tree, The Voronoi Quad-Tree
Experiment
Conclusion
This is the outline that I will go through.
National Chung Cheng University

Abstract
First is the abstract.

Abstract
The Network Voronoi diagram and its variants have been extensively used in numerous road-network applications, particularly to efficiently evaluate spatial proximity queries such as kNN. Existing index structures, which treat a network Voronoi cell as a simple polygon, may yield inaccurate results because of the network topology, and they fail to scale to large networks with numerous Voronoi generators. The authors propose the Voronoi Quad-tree to solve this problem.
Although existing approaches successfully use the network Voronoi diagram to partition the space for their specific problems, there is little emphasis on how to efficiently find and access the network Voronoi cell containing a particular point or edge of the network. In this paper, the authors study index structures on network Voronoi diagrams that enable exact and fast responses to contain queries in road networks. They propose a method, termed Voronoi-Quad-tree (VQ-tree for short), which uses a quad-tree to index network Voronoi diagrams and addresses both of these shortcomings. Finally, they demonstrate the efficiency of the VQ-tree through experimental evaluations with real-world datasets consisting of a variety of large road networks with numerous data objects.

Introduction
Next is the introduction. I will talk about some background techniques.

Voronoi Diagram (VD)
Let $P = \{p_1, p_2, \ldots, p_n\}$ be a set of $n$ distinct sites distributed in the Euclidean space. A point $q$ lies in the cell of site $p_i$ if and only if $distance(q, p_i) < distance(q, p_j)$ for each $p_j \in P$ with $j \neq i$. Each edge of $VC(p_i)$ is a segment of the perpendicular bisector of the line segment connecting $p_i$ to another point of the set $P$.
First is the Voronoi diagram. Let P be a set of n distinct sites, which act as generator points. These generator points can be any type of spatial object, such as gas stations or restaurants. We define the Voronoi diagram of P as the subdivision of the space into n cells, one for each site in P, with the property that a point q lies in the cell corresponding to a site pi if and only if the distance from q to pi is less than the distance from q to every other site pj. Each edge of a Voronoi cell is a segment of the perpendicular bisector of the line segment connecting pi to another point of P. This figure shows the ordinary Voronoi diagram of eight points, where the distance metric is Euclidean.

Voronoi Diagram (cont.)
Definition 1. Voronoi Cell (VC): the region given by $VC(p_i) = \{p \mid d(p, p_i) \le d(p, p_j) \text{ for all } j \neq i\}$, where $d(p, p_i)$ is the minimum Euclidean distance between $p$ and $p_i$.
Definition 2. Voronoi Diagram (VD): $VD(P) = \{VC(p_1), \ldots, VC(p_n)\}$.
Here we have two definitions. Definition 1 defines the Voronoi cell, and Definition 2 defines the Voronoi diagram. As shown in the figure, the gray region around p4 is a cell as defined in Definition 1: every point in the gray region is nearer to p4 than to any other generator. The Voronoi diagram is the set of these cells.
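Definition 1 can be read as a membership test: a point belongs to the cell of whichever site is nearest to it. The following is a minimal Python sketch of that test, assuming 2D points stored as tuples. The function name `nearest_site` and the sample coordinates are illustrative and do not come from the paper.

```python
import math

def nearest_site(q, sites):
    """Return the index i of the site whose Voronoi cell contains q,
    i.e. the site with the smallest Euclidean distance to q."""
    return min(range(len(sites)), key=lambda i: math.dist(q, sites[i]))

# Hypothetical generator points (e.g. three restaurants).
sites = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]

print(nearest_site((1.0, 1.0), sites))  # cell of sites[0]
print(nearest_site((3.0, 0.5), sites))  # cell of sites[1]
```

Ties (points exactly on a cell boundary) go to the lowest index here, which is one arbitrary way to handle the "less than or equal" in Definition 1.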

Network Voronoi Diagram (NVD)
The Voronoi diagram on a spatial network. $p_1$, $p_2$, and $p_3$ are the Voronoi generators (i.e., data objects such as restaurants or hotels). $p_4$ to $p_{16}$ are the intersections of a road network, interconnected by a set of edges.
Next, after the Voronoi diagram, I am going to talk about the network Voronoi diagram. The network Voronoi diagram generalizes the Voronoi diagram described above by replacing the Euclidean space with a spatial network, such as a road network. Let's look at figure (a). p1, p2, and p3 are the Voronoi generators, such as restaurants or hotels, and p4 to p16 are the intersections of a road network, interconnected by a set of edges. Figure (b) shows the NVD of the road network, where each line style corresponds to the shortest path tree rooted at one of the generators p1, p2, and p3. Each shortest path tree forms a network Voronoi cell, and some edges can be partially contained in different network Voronoi cells. The border points b1 to b7 are the points where the shortest path trees meet as a result of the parallel Dijkstra algorithm. The border points between any two generators pi and pj are equally distant from pi and pj. For example, b1, which lies between p1 and p2, is equally distant from p1 and p2.
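The construction described above can be sketched as a multi-source Dijkstra search in which every generator starts at distance zero and each node is claimed by the first shortest path tree that reaches it. This is a minimal Python sketch under assumptions: the graph is undirected and connected, it is stored as an adjacency dictionary, and the names `network_voronoi` and `border_points` as well as the toy graph are illustrative, not taken from the paper.

```python
import heapq

def network_voronoi(graph, generators):
    """Label each node with its nearest generator by network distance.

    graph: dict mapping node -> list of (neighbor, edge_length)
    Returns (dist, owner): network distance to, and identity of,
    the nearest generator for every reachable node.
    """
    dist = {g: 0.0 for g in generators}
    owner = {g: g for g in generators}
    heap = [(0.0, g, g) for g in generators]
    heapq.heapify(heap)
    while heap:
        d, node, gen = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale entry: a shorter path was found already
        for nbr, w in graph[node]:
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                owner[nbr] = gen
                heapq.heappush(heap, (nd, nbr, gen))
    return dist, owner

def border_points(graph, dist, owner):
    """Find the point on each edge (u, v) where two cells meet.

    The offset x from u satisfies dist[u] + x == dist[v] + (w - x).
    """
    borders = []
    for u in graph:
        for v, w in graph[u]:
            if u < v and owner[u] != owner[v]:
                borders.append((u, v, (w + dist[v] - dist[u]) / 2.0))
    return borders

# Toy road network: a path a - b - c - d with generators at a and d.
edges = [("a", "b", 2.0), ("b", "c", 4.0), ("c", "d", 2.0)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))
    graph.setdefault(v, []).append((u, w))

dist, owner = network_voronoi(graph, ["a", "d"])
print(owner)
print(border_points(graph, dist, owner))
```

In the toy network the edge between b and c is split between the two cells, which mirrors the remark that some edges are only partially contained in a network Voronoi cell.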

Network Voronoi Diagram (cont.)
This figure shows a real network Voronoi diagram with respect to 50 data objects in the Los Angeles road network. Each color marks the network nodes of one network Voronoi cell.

Network Voronoi Diagram (cont.)
Definition: $NVD(P) = \{V_{edge}(p_1), \ldots, V_{edge}(p_n)\}$, where NVD is the Network Voronoi Diagram and $V_{edge}(p_i)$ is the Voronoi edge set of $p_i$.
This is a simple definition of the network Voronoi diagram. Vedge is the Voronoi edge set of pi, and the network Voronoi diagram is the set of these edge sets.

R-tree
R-trees are tree data structures used for spatial access methods. The key idea is to group nearby objects and represent them with their minimum bounding rectangle (MBR) in the next higher level of the tree. The R-tree is a balanced search tree.
Next, I am going to introduce the R-tree. R-trees are tree data structures used for spatial access methods, proposed by Antonin Guttman in 1984. The key idea of the data structure is to group nearby objects and represent them with their minimum bounding rectangle in the next higher level of the tree. At the leaf level, each rectangle describes a single object; at higher levels, each rectangle aggregates an increasing number of objects. This can also be seen as an increasingly coarse approximation of the data set. Similar to the B-tree, the R-tree is a balanced search tree, organizes the data in pages, and is designed for storage on disk. The figure shows the basic concept of the R-tree: the MBRs and the tree structure.

The Voronoi R-tree
VR-tree for short. The VR-tree is based on the R-tree and splits the network space with hierarchically nested Minimum Bounding Rectangles (MBRs) generated around network Voronoi cells. Query: $contain(q)$.
Having introduced the R-tree and the network Voronoi diagram, I am now going to talk about the Voronoi R-tree, or VR-tree for short. The VR-tree is based on the R-tree and splits the network space with hierarchically nested MBRs generated around network Voronoi cells. Given the location of a query point q, a contain(q) query invoked on the VR-tree starts from the root node and iteratively checks the MBRs against q to decide whether or not to search the child nodes further.
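The contain(q) traversal described above can be illustrated with a small Python sketch. It is not the authors' implementation and not a full R-tree (there is no insertion or node splitting); it only shows the descent through nested MBRs. The `Node` class, the name `vr_contain`, and the rectangles are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

@dataclass
class Node:
    mbr: Rect
    children: List["Node"] = field(default_factory=list)
    generator: Optional[str] = None  # set only on leaf entries

def mbr_contains(r: Rect, q: Tuple[float, float]) -> bool:
    xmin, ymin, xmax, ymax = r
    return xmin <= q[0] <= xmax and ymin <= q[1] <= ymax

def vr_contain(node: Node, q: Tuple[float, float]) -> List[str]:
    """Collect the generators of every leaf MBR that contains q."""
    if not mbr_contains(node.mbr, q):
        return []  # prune this subtree
    if node.generator is not None:
        return [node.generator]
    hits: List[str] = []
    for child in node.children:
        hits.extend(vr_contain(child, q))
    return hits

# Two Voronoi cells whose bounding rectangles overlap for 4 <= x <= 6.
root = Node(
    mbr=(0, 0, 10, 6),
    children=[
        Node(mbr=(0, 0, 6, 6), generator="p1"),
        Node(mbr=(4, 0, 10, 6), generator="p2"),
    ],
)

print(vr_contain(root, (1, 1)))  # only p1's MBR contains the point
print(vr_contain(root, (5, 3)))  # inside the overlap: two candidates
```

A query point inside the overlap returns more than one candidate, so the MBR test alone cannot decide which cell actually contains q.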

The Voronoi R-tree (cont.)
First shortcoming: inaccurate results for a $contain(q)$ query, caused by false-negative edges.
The VR-tree has two main shortcomings. Let's look at the first one. The VR-tree may yield inaccurate results for a contain query. This is because the VR-tree makes a simplifying assumption: although the network Voronoi diagram is computed with the network distance metric, its network Voronoi cells are treated as regular polygons and indexed using the R-tree. Such an approach may misclassify network edges in a network Voronoi cell and hence give inaccurate results. We call these network edges false-negative edges. In Figure (b), although the new edges lie inside the Voronoi cell of p1, the network distance from any point on the false-negative edges to p3 is shorter than the distance to p1. Thus, with the VR-tree, when q is located on a false-negative edge, contain(q) returns an incorrect Voronoi generator as the nearest neighbor.

The Voronoi R-tree (cont.)
This figure depicts the network Voronoi cell of a particular data object in the Los Angeles road network, where border nodes are marked in light blue and false-negative edges in red.

The Voronoi R-tree (cont.)
Second shortcoming: inefficiency due to non-disjoint partitioning of the space. The effect depends on the network topology and the distribution of the objects.
This is the second shortcoming of the VR-tree: it is inefficient because it partitions the space into non-disjoint regions. Depending on the topology of the road network and the distribution of the objects on the network segments, the overlapping areas of the MBRs of network Voronoi cells may be quite large, which causes significant computation overhead when traversing the R-tree for a contain(q) query. For example, the parent nodes of the overlapping MBRs have to be accessed repeatedly in order to search the child nodes that contain q. Thus, with the VR-tree, the amount of work often depends on the overlapping areas of the MBRs. The authors also implemented the VR-tree with an R+-tree to reduce the impact of overlapping areas. However, they observe that the performance of this VR+-tree is still worse than that of the VQ-tree they propose.

Approach
Next, I am going to present the approach.

Quad-Tree
Here, I introduce the quad-tree. A quadtree is a tree data structure in which each internal node has exactly four children. Quadtrees are most often used to partition a two-dimensional space by recursively subdividing it into four quadrants or regions, as in this figure.

The Point Quad-tree
There are two types of quadtree: the point quadtree and the region quadtree. First, I am going to introduce the point quadtree. The point quadtree is an adaptation of a binary tree used to represent two-dimensional point data. It shares the features of all quadtrees but is a true tree, as the center of a subdivision is always on a point.

The Region Quad-tree
The second type is the region quadtree. The region quadtree represents a partition of space in two dimensions by decomposing the region into four equal quadrants, subquadrants, and so on, with each leaf node containing data corresponding to a specific subregion. Each node in the tree either has exactly four children or has no children (a leaf node). A region quadtree may also be used as a variable-resolution representation of a data field. For example, the temperatures in an area may be stored as a quadtree, with each leaf node storing the average temperature over the subregion it represents.
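The decomposition into equal quadrants can be sketched in a few lines of Python. This is a simplified illustration with assumptions: the data field is a square grid whose side is a power of two, a block becomes a leaf only when all its values are identical (rather than storing an average), and the names `build_region_quadtree` and `count_leaves` are hypothetical.

```python
def build_region_quadtree(grid, x, y, size):
    """Return a leaf value if the size-by-size block at (x, y) is uniform,
    otherwise a list of four subtrees in the order [NW, NE, SW, SE]."""
    first = grid[y][x]
    if all(grid[y + dy][x + dx] == first
           for dy in range(size) for dx in range(size)):
        return first  # leaf node: one value for the whole subregion
    half = size // 2
    return [
        build_region_quadtree(grid, x, y, half),
        build_region_quadtree(grid, x + half, y, half),
        build_region_quadtree(grid, x, y + half, half),
        build_region_quadtree(grid, x + half, y + half, half),
    ]

def count_leaves(tree):
    if isinstance(tree, list):
        return sum(count_leaves(t) for t in tree)
    return 1

# Hypothetical 4x4 temperature field.
grid = [
    [20, 20, 21, 21],
    [20, 20, 21, 21],
    [20, 20, 22, 23],
    [20, 20, 22, 23],
]

tree = build_region_quadtree(grid, 0, 0, 4)
print(tree)
print(count_leaves(tree))
```

Only the south-east quadrant needs a second level of subdivision, which is the variable resolution mentioned above: uniform areas are stored coarsely and varied areas finely.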

The VQ-tree
Enables disjoint decomposition of the underlying space. Uses color codes and the concept of the region quadtree.
Here is the main approach proposed by the authors, the Voronoi Quad-tree. It enables disjoint decomposition of the underlying space, which solves the problems of the VR-tree. The main observation behind the VQ-tree is that each color-coded area in the left figure is a spatially contiguous region in the network space. The regions are mutually exclusive, as they do not have any overlapping areas, and collectively exhaustive, as every location in the network space is associated with at least one generator. Therefore, an exact approximation of the network Voronoi diagram can be obtained by using a region quad-tree whose leaf nodes each correspond to a region of a single Voronoi cell in the network Voronoi diagram. The VQ-tree algorithm recursively subdivides the quadrants until each quadrant contains the information of only one network Voronoi cell. The right figure illustrates the quad-blocks generated on the road network in the left figure.

The VQ-tree (cont.)
Algorithm 1 presents the outline of the VQ-tree construction. The input is a set of nodes N with their color codes, together with the bounding box (x1, x2, y1, y2) that contains N. The algorithm creates the VQ-tree by recursively splitting the quadrants until all the nodes in a quadrant have the same color code.
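The recursive splitting rule can be sketched in Python as follows. This is a hedged reading of the outline above, not a transcription of the paper's Algorithm 1: the tuple-based tree layout, the names `build_vq` and `vq_contain`, the `max_depth` guard, and the sample nodes are all assumptions made for illustration.

```python
def build_vq(nodes, x1, y1, x2, y2, depth=0, max_depth=32):
    """Build a quad-tree over color-coded network nodes.

    nodes: list of (x, y, color), where color identifies the generator
    whose network Voronoi cell the node belongs to.
    Returns ("leaf", color_or_None) or ("split", xm, ym, [4 children]).
    """
    colors = {c for _, _, c in nodes}
    if len(colors) <= 1:
        return ("leaf", next(iter(colors), None))
    if depth >= max_depth:
        # Guard added for this sketch: differently colored nodes that
        # (nearly) coincide would otherwise recurse forever.
        raise ValueError("nodes with different color codes cannot be separated")
    xm, ym = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    quads = [[], [], [], []]
    for x, y, c in nodes:
        quads[(1 if x >= xm else 0) + (2 if y >= ym else 0)].append((x, y, c))
    boxes = [(x1, y1, xm, ym), (xm, y1, x2, ym),
             (x1, ym, xm, y2), (xm, ym, x2, y2)]
    return ("split", xm, ym,
            [build_vq(q, *b, depth + 1, max_depth)
             for q, b in zip(quads, boxes)])

def vq_contain(tree, q):
    """Descend to the single leaf whose quadrant contains q."""
    while tree[0] == "split":
        _, xm, ym, kids = tree
        tree = kids[(1 if q[0] >= xm else 0) + (2 if q[1] >= ym else 0)]
    return tree[1]

# Hypothetical color-coded nodes inside the bounding box [0, 8] x [0, 8].
nodes = [(1, 1, "p1"), (2, 3, "p1"), (6, 1, "p2"), (7, 6, "p3"),
         (5, 7, "p3"), (3, 6, "p1"), (3, 5, "p3")]
tree = build_vq(nodes, 0, 0, 8, 8)

print(vq_contain(tree, (1, 2)))
print(vq_contain(tree, (3, 5.5)))
print(vq_contain(tree, (6, 2)))
```

Because the quadrants are disjoint, a lookup follows exactly one root-to-leaf path. In this sketch a quadrant that holds no network nodes becomes a leaf with color `None`.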

Experiment
Next are the experiments.

Experiment
Experimental setup: VQ-tree vs. VR-tree. Datasets: California (CA), Los Angeles (LA), and San Joaquin County (SJ). Workstation with a 2.7 GHz CPU and 12 GB RAM.
The authors conducted experiments with different spatial networks and various parameters to evaluate the performance of the VQ-tree and the VR-tree. The datasets are the California (CA), Los Angeles (LA), and San Joaquin County (SJ) road networks. The workstation has a 2.7 GHz Pentium Core Duo processor and 12 GB RAM. The experimental parameter values are listed in Table 1.

Experiment (cont.)
Ratio of False-Negative Edges
The first experiment measures the ratio of false-negative edges. To identify false-negative edges, the authors compare the encoded values of each node in the VR-tree and the VQ-tree. Figure (a) shows the ratio of false-negative edges for both networks, with the object cardinality ranging from 10 to 1000. The ratio of incorrectly identified edges is 16% on average in both networks, and the maximum recorded for LA and CA is 24% and 29% respectively. Figure (b) illustrates the ratio of false-negative edges under different object distributions for both the CA and LA road networks. We can observe that the number of false-negative edges is lower with the Gaussian distribution. This is because objects are clustered in the spatial network under the Gaussian distribution, so the corresponding shortest path trees are less dispersed and the border points are spatially close.

Experiment (cont.)
Precomputation Time
The next experiment compares the precomputation time of the VQ-tree with that of the VR-tree. Figure (a) shows the precomputation time with varying network size, and Figure (b) illustrates the impact of object cardinality on precomputation time in the LA road network. We observe that as the number of objects increases, the preprocessing time of both approaches increases.

Experiment (cont.)
Ratio of Index Reconstruction Time
The next experiment compares the index reconstruction time of the VQ-tree with that of the VR-tree. In this experiment, the authors update the locations of randomly selected data objects and measure the index reconstruction overhead in both the VR-tree and the VQ-tree. We can observe that the VQ-tree outperforms the VR-tree with respect to index reconstruction. This is because the insert operations in the VR-tree are expensive.

Experiment (cont.)
Response Time
The final experiment compares the response time of the VQ-tree with that of the VR-tree. The results indicate that the VQ-tree outperforms the VR-tree and scales better with a large number of data objects. This is because, with the VR-tree, the amount of work often depends on the size of the overlapping areas. The overlapping areas may belong to more than one network Voronoi cell (NVC), and hence during the search the parent nodes of the overlapping MBRs have to be accessed repeatedly.

Conclusion
Finally, we draw a conclusion. The VQ-tree enables efficient access to the network Voronoi cell containing a particular point or edge of the network. The authors also intend to pursue this study in two directions. The first is to investigate disk organization strategies for the Voronoi Quad-tree. The second is to work on incremental index update techniques that avoid the node reconstruction overhead caused by updates to the locations of Voronoi generators.
Q&A
OK, this is the end of the presentation. Thank you for listening!