1 Chapter 5 : Query Processing and Optimization Group 4: Nipun Garg, Surabhi Mithal

1 Chapter 5: Query Processing and Optimization
Group 4: Nipun Garg, Surabhi Mithal
http://www-users.cs.umn.edu/~smithal/

2 Chapter Organization

Old organization
5.1 Evaluation of Spatial Operations
5.2 Query Optimization
5.3 Analysis of Spatial Index Structures
5.4 Distributed Spatial Database Systems
5.5 Parallel Spatial Database Systems
5.6 Summary

New organization
5.1 Evaluation of Spatial Operations
    - Parallel spatial joins
    - Top-k spatial joins
5.2 Query Optimization
5.3 Analysis of Spatial Index Structures
5.4 Distributed Spatial Database Systems
5.5 Parallel Spatial Database Systems
5.6 Introduction to Query Models
5.7 Spatial Query Types
    - Reverse nearest neighbour (RNN) queries
    - Skyline queries
5.8 Trends: Spatial Query Evaluation on Hadoop
5.9 Summary

3 New Learning Objectives

Learning Objectives (LO)
LO2: Learn about alternative algorithms to process spatial queries
LO6: Introduction to query models
LO7: Understanding new spatial query types
    LO7.1: Understanding the concept of RNN queries
    LO7.2: Understanding the concept of skyline queries
LO8: Trends: spatial queries on Hadoop MapReduce

Mapping sections to learning objectives
LO2 - 5.1.6
LO6 - 5.7
LO7 - 5.8
LO8 - 5.9

4 Parallel spatial joins
Concept: In a parallel architecture, work is distributed among several processors. For a spatial join, the work can be distributed in both the filtering and the refinement stages.

Top-k spatial joins
Concept: A spatial join finds all pairs of objects that satisfy a given spatial relation. Given two data sets A and B, the top-k spatial join retrieves the k objects in data set A or B that intersect the maximum number of objects from the other data set.
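The top-k spatial join described above can be sketched as a brute-force count of intersections. This is only an illustration under stated assumptions: objects are axis-aligned rectangles given as `(xmin, ymin, xmax, ymax)`, rectangles that merely touch count as intersecting, and the names `intersects` and `top_k_spatial_join` are hypothetical, not taken from the slides.

```python
from collections import Counter


def intersects(a, b):
    """True if rectangles a and b, given as (xmin, ymin, xmax, ymax), overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def top_k_spatial_join(set_a, set_b, k):
    """Return the k objects (from A or B) that intersect the most objects of the other set.

    set_a and set_b map an object name to its rectangle (an assumed layout).
    """
    counts = Counter()
    for name_a, rect_a in set_a.items():
        for name_b, rect_b in set_b.items():
            if intersects(rect_a, rect_b):
                counts[("A", name_a)] += 1
                counts[("B", name_b)] += 1
    # Highest count first; ties are broken by (data set, name) for determinism.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


set_a = {"a1": (0, 0, 10, 10), "a2": (20, 20, 21, 21)}
set_b = {"b1": (1, 1, 2, 2), "b2": (5, 5, 6, 6), "b3": (20, 20, 22, 22)}
top1 = top_k_spatial_join(set_a, set_b, 1)
```

The nested loop compares every pair, so it costs |A| x |B| intersection tests; a practical implementation would normally use a spatial index to prune pairs first.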

5 Example: Parallel spatial join
Source: Thomas Brinkhoff, Hans-Peter Kriegel, Bernhard Seeger, "Parallel Processing of Spatial Joins Using R-trees"

Steps
- Task creation: creating a set of tasks to be executed in parallel
- Task assignment
- Task execution
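The three steps above can be sketched with Python's standard thread pool. This is a simplified illustration, not the R-tree based method of the cited paper: tasks are created here by cutting data set A into fixed-size chunks (an assumption), assignment is left to the pool, and the names `create_tasks`, `execute_task`, and `parallel_spatial_join` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def intersects(a, b):
    """True if rectangles a and b, given as (xmin, ymin, xmax, ymax), overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def create_tasks(rects_a, rects_b, chunk_size):
    """Task creation: one task per chunk of A, each joined against all of B."""
    return [
        (rects_a[i:i + chunk_size], rects_b)
        for i in range(0, len(rects_a), chunk_size)
    ]


def execute_task(task):
    """Task execution: a plain nested-loop join inside one task."""
    chunk_a, rects_b = task
    return [(a, b) for a in chunk_a for b in rects_b if intersects(a, b)]


def parallel_spatial_join(rects_a, rects_b, chunk_size=2, workers=2):
    tasks = create_tasks(rects_a, rects_b, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Task assignment: the pool hands each task to an idle worker.
        parts = list(pool.map(execute_task, tasks))
    return [pair for part in parts for pair in part]


rects_a = [(0, 0, 1, 1), (5, 5, 6, 6), (10, 10, 11, 11)]
rects_b = [(0.5, 0.5, 2, 2), (10.5, 10.5, 12, 12)]
pairs = parallel_spatial_join(rects_a, rects_b)
```

Threads keep the sketch short; for CPU-bound joins in CPython a process pool or a distributed framework would be the more realistic choice.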

7 LO6: Introduction to query models
Concept: Overview of query models for Oracle Spatial and ArcSDE.

Oracle Spatial provides a SQL schema and functions that facilitate the storage, retrieval, update, and query of collections of spatial features in an Oracle database. It uses a two-tier query model to resolve spatial queries and spatial joins, implementing the filter-refine paradigm. The two operations are referred to as the primary and secondary filter operations.
- The primary filter permits fast selection of candidate records to pass along to the secondary filter.
- The secondary filter is more expensive, but yields an accurate answer to the spatial query.

8 Example
The primary filter checks whether the minimum bounding rectangles (MBRs) of the candidate objects interact, not whether the objects themselves interact. The secondary filter ensures that only candidate objects that actually interact are selected.
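A minimal sketch of the two-tier idea, assuming for simplicity that every object is a circle `(x, y, r)` so that the exact test stays short. It does not use or imitate the Oracle Spatial API; `mbr`, `mbrs_interact`, `circles_interact`, and `filter_refine` are hypothetical names.

```python
import math


def mbr(circle):
    """Minimum bounding rectangle (xmin, ymin, xmax, ymax) of a circle (x, y, r)."""
    x, y, r = circle
    return (x - r, y - r, x + r, y + r)


def mbrs_interact(a, b):
    """Primary filter test: cheap rectangle overlap check."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def circles_interact(c1, c2):
    """Secondary filter test: exact geometry check on the circles themselves."""
    return math.hypot(c1[0] - c2[0], c1[1] - c2[1]) <= c1[2] + c2[2]


def filter_refine(query, objects):
    """Return (candidates after the primary filter, exact answers after the secondary filter)."""
    q_mbr = mbr(query)
    candidates = [o for o in objects if mbrs_interact(q_mbr, mbr(o))]
    answers = [o for o in candidates if circles_interact(query, o)]
    return candidates, answers


query = (0, 0, 1)
objects = [(1.5, 0, 1), (1.8, 1.8, 1), (5, 5, 1)]
candidates, answers = filter_refine(query, objects)
```

In this data the second circle passes the primary filter because its MBR overlaps the query's MBR at a corner, but the secondary filter rejects it because the circles themselves do not meet.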

10 LO7.1: Understanding the concept of RNN queries
Reverse Nearest Neighbor (RNN) Queries
Concept: focuses on inverse relations among points.
Example: given 5 data points (labelled 1 to 5), what are the RNNs of point 1?
[Figure: five data points labelled 1 to 5]

11 Example: Business Impact Analysis

12 Algorithm
Step 1: For each point $p \in S$, determine the distance to the nearest neighbor of $p$ in $S$, denoted $N(p)$:
$N(p) = \min_{q \in S \setminus \{p\}} d(p, q)$.
For each $p \in S$, generate a circle $(p, N(p))$, where $p$ is its center and $N(p)$ its radius.
Step 2: For any query $q$ (for example, a Target store), determine all the circles $(p, N(p))$ that contain $q$ and return their centers $p$.
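The two steps above can be sketched directly in Python. This is a brute-force illustration under stated assumptions: points are distinct 2-D tuples, distance is Euclidean, a query on the circle boundary counts as contained, and the function names are hypothetical.

```python
import math


def nn_distances(points):
    """Step 1: N(p) for every p, the distance from p to its nearest neighbor in S - {p}."""
    return {
        p: min(math.dist(p, q) for q in points if q != p)
        for p in points
    }


def reverse_nearest_neighbors(query, points):
    """Step 2: return the centers p of all circles (p, N(p)) that contain the query."""
    radius = nn_distances(points)
    return [p for p in points if math.dist(p, query) <= radius[p]]


stores = [(0, 0), (1, 0), (5, 0), (6, 0)]
rnn_near_second = reverse_nearest_neighbors((1.5, 0), stores)
rnn_in_gap = reverse_nearest_neighbors((3, 0), stores)
```

Here a new store at `(1.5, 0)` would become the nearest neighbor of `(1, 0)` only, while a store at `(3, 0)` would affect none of the existing points. Step 1 is quadratic as written; in practice the circles would be precomputed once and indexed.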

14 LO7.2: Understanding the concept of skyline queries
Example: You have to attend a conference and are looking for a good hotel for your stay. The goal is to find hotels for which both the distance from the conference centre and the booking price are low.

15 Concept
Domination: a point A dominates another point B if and only if the coordinate of A on every axis is not larger than the corresponding coordinate of B, and is strictly smaller on at least one axis.

16 Example
Given a set of points, the skyline query returns the subset of points (referred to as the skyline points) such that no point in the skyline is dominated by any other point in the dataset.
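A brute-force sketch of the skyline query, assuming the usual definition of dominance (no larger on every axis and strictly smaller on at least one) and that smaller values are better on both axes. The hotel coordinates are made up for illustration and are not the values from the slides.

```python
def dominates(a, b):
    """True if point a dominates point b (smaller is better on every axis)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical hotels as (distance from conference centre, price).
hotels = [(1, 9), (2, 5), (3, 6), (4, 2), (5, 4), (6, 3)]
best = skyline(hotels)
```

The hotel at `(3, 6)` is dropped because `(2, 5)` is both closer and cheaper, and `(5, 4)` and `(6, 3)` are dropped because `(4, 2)` dominates them. The pairwise comparison is quadratic in the number of points.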

17 Example contd.
[Figure: hotels h1 to h13 plotted by distance from conference center and price, with regions S1 to S4 marked]

18 Example contd.
[Figure: the same plot of hotels h1 to h13, with regions S1 to S4 marked]

19 Result: h1, h2, h4
[Figure: plot of distance from conference center and price showing hotels h1, h2, and h4]

21 Spatial Query Evaluation on Hadoop
Hadoop
- HDFS: Hadoop Distributed File System
- MapReduce: programming paradigm

22 Parallel Databases vs. MapReduce
Parallel DBMS or MapReduce (Hadoop)

Parallel DBMS                | Hadoop
Structured data              | Semi-structured data
Expensive to set up          | Can be done with a low budget
Complex analytics not easy   | Complex analytics easier

Source: A. Pavlo, E. Paulson, A. Rasin, D. J. Abadi, D. J. DeWitt, S. Madden & M. Stonebraker, "A comparison of approaches to large-scale data analysis," SIGMOD '09

Conclusion: Hadoop/MapReduce cannot replace a DBMS.
Combination of MapReduce and SQL: Aster Data

23 Spatial Query Evaluation
Map stage
1) Homogenize data
2) Map to tiles
3) Merge tiles into buckets
Reduce stage
1) Filter to find overlapping MBRs
2) Refine results
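The map and reduce stages above can be imitated in plain Python, without Hadoop, to show the data flow. This is a sketch under several assumptions: objects are already MBRs `(xmin, ymin, xmax, ymax)`, "homogenizing" means tagging each record with its data set name, every tile is its own bucket, and refinement is omitted because the rectangles are exact. `TILE` and all function names are hypothetical.

```python
from collections import defaultdict

TILE = 10  # hypothetical tile side length


def intersects(a, b):
    """True if rectangles a and b, given as (xmin, ymin, xmax, ymax), overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def tiles_for(rect):
    """Yield the ids of all grid tiles that the rectangle touches."""
    xmin, ymin, xmax, ymax = rect
    for tx in range(int(xmin // TILE), int(xmax // TILE) + 1):
        for ty in range(int(ymin // TILE), int(ymax // TILE) + 1):
            yield (tx, ty)


def map_stage(dataset_name, rects):
    """Map: emit (tile id, tagged record) for every tile a rectangle touches."""
    for rect in rects:
        for tile in tiles_for(rect):
            yield tile, (dataset_name, rect)


def reduce_stage(records):
    """Reduce: within one bucket, keep the A-B pairs whose MBRs overlap."""
    rects_a = [r for name, r in records if name == "A"]
    rects_b = [r for name, r in records if name == "B"]
    return {(a, b) for a in rects_a for b in rects_b if intersects(a, b)}


def spatial_join(rects_a, rects_b):
    buckets = defaultdict(list)  # stands in for the shuffle between map and reduce
    for source in (map_stage("A", rects_a), map_stage("B", rects_b)):
        for tile, record in source:
            buckets[tile].append(record)
    result = set()  # a set removes pairs reported by more than one tile
    for records in buckets.values():
        result |= reduce_stage(records)
    return result


rects_a = [(1, 1, 3, 3), (8, 8, 12, 12)]
rects_b = [(2, 2, 4, 4), (11, 11, 13, 13), (30, 30, 31, 31)]
joined = spatial_join(rects_a, rects_b)
```

A rectangle that spans several tiles is replicated to each of them, so the same pair can be found in more than one bucket; collecting results in a set is one simple way to deduplicate.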

