Chapter 5: Query Processing and Optimization
Group 4: Nipun Garg, Surabhi Mithal



Chapter Organization

Old organization
5.1 Evaluation of Spatial Operations
5.2 Query Optimization
5.3 Analysis of Spatial Index Structures
5.4 Distributed Spatial Database Systems
5.5 Parallel Spatial Database Systems
5.6 Summary

New organization
5.1 Evaluation of Spatial Operations
  - Parallel spatial joins
  - Top-k spatial joins
5.2 Query Optimization
5.3 Analysis of Spatial Index Structures
5.4 Distributed Spatial Database Systems
5.5 Parallel Spatial Database Systems
5.6 Introduction to Query Models
5.7 Spatial Query Types
  - Reverse nearest neighbour (RNN) queries
  - Skyline queries
5.8 Trends: Spatial Query Evaluation on Hadoop
5.9 Summary

New Learning Objectives

Learning Objectives (LO)
LO2: Learn about alternative algorithms to process spatial queries
LO6: Introduction to query models
LO7: Understanding new spatial query types
  LO7.1: Understanding the concept of RNN queries
  LO7.2: Understanding the concept of skyline queries
LO8: Trends: Spatial queries on Hadoop MapReduce

Mapping of sections to learning objectives: [table not preserved in the transcript]

Parallel spatial joins
Concept: In a parallel architecture, work is distributed among several processors. For a spatial join, the work can be distributed in both the filtering and the refinement stages.

Top-k spatial joins
Concept: A spatial join finds all pairs of objects that satisfy a given spatial relation. Given two data sets A and B, the top-k spatial join retrieves the k objects in A or B that intersect the maximum number of objects from the other data set.
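The top-k spatial join defined above can be illustrated with a brute-force sketch. It is not taken from the slides: the `Rect` type, the use of axis-aligned rectangles as stand-ins for spatial objects, and the nested-loop strategy are all assumptions made for illustration, and object names are assumed to be unique across both data sets.

```python
from collections import Counter
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle standing in for a spatial object (assumption)."""
    name: str
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Rect, b: Rect) -> bool:
    """True when the two rectangles share at least one point."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def top_k_spatial_join(A, B, k):
    """Return the k objects of A or B that intersect the most objects
    of the other data set, as (name, count) pairs."""
    counts = Counter()
    for a in A:
        for b in B:
            if intersects(a, b):
                counts[a.name] += 1  # a intersects one more object of B
                counts[b.name] += 1  # b intersects one more object of A
    return counts.most_common(k)


# Hypothetical data: a1 overlaps three objects of B, a2 overlaps one.
A = [Rect("a1", 0, 0, 10, 10), Rect("a2", 20, 20, 21, 21)]
B = [Rect("b1", 1, 1, 2, 2), Rect("b2", 5, 5, 6, 6),
     Rect("b3", 9, 9, 12, 12), Rect("b4", 20.5, 20.5, 30, 30)]
print(top_k_spatial_join(A, B, 1))
```

A practical implementation would avoid the quadratic loop, for example by using a spatial index; the sketch only fixes the meaning of the result.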

Example: Parallel spatial join
Source: "Parallel Processing of Spatial Joins Using R-trees", Thomas Brinkhoff, Hans-Peter Kriegel, Bernhard Seeger

Steps:
- Task creation: create a set of tasks to be executed in parallel.
- Task assignment
- Task execution
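The three steps can be sketched as follows. This is a simplified illustration and not the R-tree-based method of the cited paper: it assumes MBRs given as `(xmin, ymin, xmax, ymax)` tuples, creates tasks by chunking one input (a hypothetical choice), and leaves task assignment to a thread pool. In CPython, threads mainly show the structure rather than deliver a CPU speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def mbr_intersects(a, b):
    """a and b are (xmin, ymin, xmax, ymax) tuples."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def create_tasks(A, B, chunk_size):
    """Task creation: each task joins one chunk of A against all of B."""
    return [(A[i:i + chunk_size], B) for i in range(0, len(A), chunk_size)]


def execute_task(task):
    """Task execution: nested-loop filter step on a single task."""
    chunk, B = task
    return [(a, b) for a in chunk for b in B if mbr_intersects(a, b)]


def parallel_spatial_join(A, B, chunk_size=2, workers=4):
    tasks = create_tasks(A, B, chunk_size)
    # Task assignment: the executor hands each task to an idle worker.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_results = list(pool.map(execute_task, tasks))
    # Merge the per-task results into one candidate list.
    return [pair for part in partial_results for pair in part]


# Hypothetical input rectangles.
A = [(0, 0, 2, 2), (5, 5, 6, 6), (10, 10, 11, 11)]
B = [(1, 1, 3, 3), (5.5, 5.5, 7, 7)]
print(parallel_spatial_join(A, B))
```

The result is a list of candidate pairs from the filter step; a refinement step on exact geometries could be parallelized with the same task structure.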


LO6: Introduction to query models
Concept: Overview of query models for Oracle Spatial and ArcSDE.

Oracle Spatial provides a SQL schema and functions that facilitate the storage, retrieval, update, and query of collections of spatial features in an Oracle database.

Oracle Spatial uses a two-tier query model to resolve spatial queries and spatial joins, implementing the filter-refine paradigm. The two operations are referred to as the primary and secondary filter operations.
- The primary filter permits fast selection of candidate records to pass along to the secondary filter.
- The secondary filter, which is computationally expensive, yields an accurate answer to the spatial query.

Example: The primary filter checks whether the MBRs (minimum bounding rectangles) of the candidate objects interact, not whether the objects themselves interact. The secondary filter ensures that only candidate objects that actually interact are selected.
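A minimal sketch of the two filters follows. It does not use Oracle Spatial: circles given as `(cx, cy, r)` are an assumed stand-in for exact geometries, chosen because their MBRs can interact even when the circles themselves do not. All function names are hypothetical.

```python
import math
from itertools import product


def mbr(circle):
    """Minimum bounding rectangle of a circle (cx, cy, r)."""
    cx, cy, r = circle
    return (cx - r, cy - r, cx + r, cy + r)


def mbrs_interact(m1, m2):
    """Cheap test on bounding rectangles only."""
    return (m1[0] <= m2[2] and m2[0] <= m1[2] and
            m1[1] <= m2[3] and m2[1] <= m1[3])


def circles_interact(c1, c2):
    """Exact test on the actual geometries."""
    return math.hypot(c1[0] - c2[0], c1[1] - c2[1]) <= c1[2] + c2[2]


def primary_filter(A, B):
    """Fast selection of candidate pairs using MBRs."""
    return [(a, b) for a, b in product(A, B) if mbrs_interact(mbr(a), mbr(b))]


def secondary_filter(candidates):
    """Exact check applied to the candidates only."""
    return [(a, b) for a, b in candidates if circles_interact(a, b)]


# Hypothetical data: the MBRs of a and b overlap, but the circles do not.
a, b, c, d = (0, 0, 1), (1.8, 1.8, 1), (1.5, 0, 1), (10, 10, 1)
candidates = primary_filter([a], [b, c, d])
print(candidates)                    # pairs whose MBRs interact
print(secondary_filter(candidates))  # pairs that actually interact
```

The pair `(a, b)` passes the primary filter and is rejected by the secondary filter, which is the kind of false positive the refinement step exists to remove.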


LO7.1: Understanding the concept of RNN queries
Reverse Nearest Neighbor (RNN) Queries
Concept: Focuses on inverse relations among points. The RNNs of a point q are the points that have q as their nearest neighbor.
Example: 5 data points. What are the RNNs of point 1?

Example: Business Impact Analysis

Algorithm
Step 1: For each point $p \in S$, determine the distance to the nearest neighbor of $p$ in $S$, denoted $N(p)$:
$N(p) = \min_{q \in S \setminus \{p\}} d(p, q)$
For each $p \in S$, generate a circle $(p, N(p))$ with center $p$ and radius $N(p)$.
Step 2: For any query point $q$ (for example, a Target store), determine all the circles $(p, N(p))$ that contain $q$ and return their centers $p$.
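The two steps translate almost directly into code. The sketch below is a brute-force illustration under a few assumptions: points are coordinate tuples, S holds at least two distinct points, d is Euclidean distance, and a circle contains q when q lies inside it or on its boundary. The function names are hypothetical.

```python
import math


def nn_distance(p, S):
    """N(p): distance from p to its nearest neighbor in S - {p}."""
    return min(math.dist(p, q) for q in S if q != p)


def build_circles(S):
    """Step 1: one circle (p, N(p)) per point of S."""
    return [(p, nn_distance(p, S)) for p in S]


def reverse_nearest_neighbors(q, circles):
    """Step 2: centers of all circles that contain the query point q."""
    return [p for p, radius in circles if math.dist(p, q) <= radius]


# Hypothetical data set of five points on a line.
S = [(0, 0), (1, 0), (5, 0), (6, 0), (20, 0)]
circles = build_circles(S)
print(reverse_nearest_neighbors((10, 0), circles))
print(reverse_nearest_neighbors((0.5, 0), circles))
```

For the query point `(10, 0)` only the isolated point `(20, 0)` is returned, because its circle has radius 14 and is the only one large enough to contain the query. Step 1 is quadratic here; an index over the circles would make Step 2 faster on large data sets.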


LO7.2: Understanding the concept of skyline queries
Example: You have to attend a conference and are looking for a good hotel for your stay. You want to optimize the hotel search so that both the distance from the conference centre and the price of the booking are low.

Concept
Domination: a point A dominates another point B if and only if the coordinate of A on every axis is not larger than the corresponding coordinate of B, and A is strictly smaller than B on at least one axis.

Example
Given a set of points, the skyline query returns the set of points (referred to as the skyline points) that are not dominated by any other point in the dataset.

Example contd.
[Figure: hotels h1 to h13 plotted by distance from conference center and price, with regions S1 to S4 marked.]

Result
[Figure: same axes; the skyline consists of h1, h2 and h4.]
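A brute-force skyline computation can be sketched as follows. The hotel coordinates are invented for illustration, since the figure's values are not in the transcript, and the domination test uses the usual strict form: no larger on every axis and strictly smaller on at least one.

```python
def dominates(a, b):
    """a dominates b: no larger on every axis, strictly smaller on one."""
    return (all(x <= y for x, y in zip(a, b)) and
            any(x < y for x, y in zip(a, b)))


def skyline(points):
    """Return the points that no other point dominates."""
    return {name: p for name, p in points.items()
            if not any(dominates(other, p) for other in points.values())}


# Hypothetical hotels as (distance from conference center, price).
hotels = {
    "h1": (1, 200),
    "h2": (3, 120),
    "h3": (4, 150),  # dominated by h2: farther and more expensive
    "h4": (6, 60),
    "h5": (7, 90),   # dominated by h4
}
print(sorted(skyline(hotels)))
```

With these made-up values the skyline is h1, h2 and h4: each offers a trade-off between distance and price that no other hotel beats on both axes at once.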


Spatial Query Evaluation on Hadoop
Hadoop:
- HDFS: Hadoop Distributed File System
- MapReduce: programming paradigm

Parallel Databases vs. MapReduce
Parallel DBMS or MapReduce (Hadoop)

Parallel DBMS                  Hadoop
---------------------------    ---------------------------
Structured data                Semi-structured data
Expensive to set up            Can be done on a low budget
Complex analytics not easy     Complex analytics easier

Source: A. Pavlo, E. Paulson, A. Rasin, D. J. Abadi, D. J. DeWitt, S. Madden and M. Stonebraker, "A comparison of approaches to large-scale data analysis," SIGMOD '09.

Conclusion: Hadoop/MapReduce cannot replace a DBMS.
Combination of MapReduce and SQL: Aster Data

Spatial Query Evaluation
Map stage:
1) Homogenize data.
2) Map to tiles.
3) Merge tiles into buckets.
Reduce stage:
1) Filter to find overlapping MBRs.
2) Refine results.
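The map and reduce stages above can be simulated in plain Python without Hadoop. This sketch rests on several assumptions: objects arrive as `(id, mbr)` pairs, homogenizing means tagging each record with its data set, tiles come from a fixed grid, and the tile size, bucket count, bucket function and `refine` placeholder are all hypothetical.

```python
from collections import defaultdict
from itertools import product

TILE = 10.0       # tile side length (assumed value)
NUM_BUCKETS = 4   # number of buckets, one per reduce task (assumed value)


def overlaps(m1, m2):
    """MBR overlap test on (xmin, ymin, xmax, ymax) tuples."""
    return (m1[0] <= m2[2] and m2[0] <= m1[2] and
            m1[1] <= m2[3] and m2[1] <= m1[3])


def tiles_for(mbr):
    """Map step 2: grid tiles (col, row) touched by an MBR."""
    xmin, ymin, xmax, ymax = mbr
    for col in range(int(xmin // TILE), int(xmax // TILE) + 1):
        for row in range(int(ymin // TILE), int(ymax // TILE) + 1):
            yield (col, row)


def bucket_of(tile):
    """Map step 3: merge tiles into a small number of buckets."""
    return (tile[0] * 31 + tile[1]) % NUM_BUCKETS


def map_stage(dataset, objects):
    """Map step 1 (homogenize): tag records with their data set,
    then emit one record per bucket the object falls into."""
    for obj_id, mbr in objects:
        for bucket in {bucket_of(t) for t in tiles_for(mbr)}:
            yield bucket, (dataset, obj_id, mbr)


def refine(id_a, id_b):
    """Placeholder for the exact-geometry check (assumption: always true)."""
    return True


def reduce_stage(values):
    """Filter overlapping MBRs within one bucket, then refine."""
    left = [v for v in values if v[0] == "A"]
    right = [v for v in values if v[0] == "B"]
    for (_, id_a, m_a), (_, id_b, m_b) in product(left, right):
        if overlaps(m_a, m_b) and refine(id_a, id_b):
            yield (id_a, id_b)


def spatial_join(A, B):
    groups = defaultdict(list)  # stands in for the shuffle phase
    for key, value in list(map_stage("A", A)) + list(map_stage("B", B)):
        groups[key].append(value)
    pairs = set()  # a pair can appear in several buckets, so deduplicate
    for values in groups.values():
        pairs.update(reduce_stage(values))
    return sorted(pairs)


A = [("a1", (0, 0, 5, 5)), ("a2", (18, 18, 22, 22))]
B = [("b1", (4, 4, 12, 12)), ("b2", (21, 21, 25, 25)), ("b3", (50, 50, 51, 51))]
print(spatial_join(A, B))
```

Two overlapping MBRs always share at least one tile and therefore one bucket, so no result pair is lost by partitioning. The price is that a pair may be found in more than one bucket, which is why the results are collected in a set.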