NiagaraCQ A Scalable Continuous Query System for Internet Databases Jianjun Chen, David J DeWitt, Feng Tian, Yuan Wang University of Wisconsin – Madison.

Presentation transcript:

NiagaraCQ: A Scalable Continuous Query System for Internet Databases. Jianjun Chen, David J. DeWitt, Feng Tian, Yuan Wang. University of Wisconsin-Madison, 2000. Slides adapted from Rachel Pottlinger and Yehoshua Sagiv. Presented by Andrea Connell.

Slide 2: Problem. There is no scalable, efficient system that supports persistent queries, which let users receive new results as they become available. Example: "Notify me whenever the price of Dell or Micron stock drops by more than 5% and the price of Intel stock remains unchanged over the next three months." The Internet holds a large amount of frequently updated data. How do we manage continuous queries (CQs) efficiently?

Slide 3: Approach. Incremental grouping by similar query structure: grouped CQs share computation and data, which reduces I/O and unnecessary query invocations. Change-based or timer-based queries. Incremental evaluation. User interface: a high-level query language.

Slide 4: Command Language.
Create a continuous query: CREATE CQ_name XML-QL query DO action {START start_time} {EVERY time_interval} {EXPIRE expiration_time}
Delete a continuous query: DELETE CQ_name
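The shape of the CREATE command can be sketched as a small record type. This is only an illustration: the class name `ContinuousQuery`, its field names, and the rule that a query with an EVERY clause counts as timer-based are assumptions made for the example, not part of NiagaraCQ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContinuousQuery:
    """Hypothetical in-memory form of one CREATE command."""
    name: str                       # CQ_name
    query: str                      # XML-QL query text
    action: str                     # action to run on new results
    start: Optional[str] = None     # optional START clause
    every: Optional[str] = None     # optional EVERY clause
    expire: Optional[str] = None    # optional EXPIRE clause

    def is_timer_based(self) -> bool:
        # Assumption for this sketch: EVERY present means timer-based,
        # otherwise the query is change-based.
        return self.every is not None

    def to_command(self) -> str:
        # Render the command in the order shown on the slide,
        # skipping optional clauses that were not given.
        parts = [f"CREATE {self.name}", self.query, f"DO {self.action}"]
        for keyword, value in (("START", self.start),
                               ("EVERY", self.every),
                               ("EXPIRE", self.expire)):
            if value is not None:
                parts.append(f"{keyword} {value}")
        return " ".join(parts)


# Usage with placeholder query and action text.
timer_cq = ContinuousQuery("intc_watch", "QUERY", "ACTION", every="1h")
change_cq = ContinuousQuery("msft_watch", "QUERY", "ACTION")
```
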

Slide 5: Expression Signature. An expression signature represents the same syntactic structure, possibly with different constant values, in different queries.
Query 1: Where <Quotes><Quote><Symbol>INTC</></></> element_as $g in “quotes.xml” construct $g
Query 2: Where <Quotes><Quote><Symbol>MSFT</></></> element_as $g in “quotes.xml” construct $g
Shared signature: Quotes.Quote.Symbol = constant in quotes.xml

Slide 6: Query Plan (each plan is listed top-down).
Plan 1: Trigger Action I; Select Symbol=“INTC”; File Scan quotes.xml
Plan 2: Trigger Action J; Select Symbol=“MSFT”; File Scan quotes.xml

Slide 7: Groups. Groups are created for queries based on their expression signatures. Each group consists of three parts: a group signature, a group constant table, and a group query plan.

Slide 8: Groups, part 1: group signature. Example: Quotes.Quote.Symbol = constant in quotes.xml

Slide 9: Groups, part 2: group constant table. Stored on disk.
Constant_value   Destination_buffer
……               ……
INTC             Dest. I
MSFT             Dest. J
……               ……

Slide 10: Groups, part 3: group query plan. File Scan quotes.xml and the Constant Table feed a Join on Symbol = Constant_value; the join output goes to a Split operator, which feeds Action I, Action J, and so on. Stored in memory-resident hash table.

Slide 11: Incremental Grouping Algorithm.
1. The group optimizer traverses the query plan bottom-up.
2. It matches the query's expression signature against the signatures of existing groups.
3. It breaks the query plan into two parts: the lower part is removed, and the upper part is added to the group plan.
4. It adds the constant and the action to the constant table.
Example of a new query plan (top-down): Trigger Action; Select Symbol=“AOL”; File Scan quotes.xml
Note: because queries are added incrementally, the resulting groups may not be optimal.
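The signature-matching and constant-table steps above can be sketched as follows. This is a minimal illustration, not NiagaraCQ's implementation: the names `Selection`, `Group`, and `add_query` are hypothetical, only a single selection predicate is modelled, and the splitting of the plan into lower and upper parts is left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Selection:
    """Hypothetical model of one selection predicate over a source file."""
    path: str       # e.g. "Quotes.Quote.Symbol"
    op: str         # e.g. "="
    constant: str   # e.g. "INTC"
    source: str     # e.g. "quotes.xml"

    def signature(self):
        # The signature keeps the structure and drops the constant.
        return (self.path, self.op, self.source)


@dataclass
class Group:
    signature: tuple
    # Rows of (constant_value, destination_buffer).
    constant_table: list = field(default_factory=list)


def add_query(groups, selection, destination):
    """Add a query to the matching group, creating the group if needed."""
    sig = selection.signature()
    group = groups.get(sig)
    if group is None:
        group = groups[sig] = Group(sig)
    group.constant_table.append((selection.constant, destination))
    return group


# Usage: three queries that differ only in their constant share one group.
groups = {}
for symbol, dest in [("INTC", "Dest. I"), ("MSFT", "Dest. J"), ("AOL", "Dest. K")]:
    add_query(groups, Selection("Quotes.Quote.Symbol", "=", symbol, "quotes.xml"), dest)
quotes_group = groups[("Quotes.Quote.Symbol", "=", "quotes.xml")]
```

Keying the groups by signature in a dictionary mirrors the idea of looking up an existing group for each new query; a query over a different file or path would simply start a new group.
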

Slide 12: Example. Using the constant table, the split function moves all values for MS to buffer A and all values for SUN to buffer B. What are these buffers? How do they work?
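One way to picture the join with the constant table followed by the split is the sketch below. It is an assumption-laden illustration: the function name `join_and_split` is hypothetical, tuples are plain dictionaries, and each destination buffer is modelled as an in-memory list, which says nothing about how NiagaraCQ actually stores its buffers.

```python
from collections import defaultdict


def join_and_split(tuples, constant_table, key="Symbol"):
    """Join tuples with the constant table, then route matches to buffers."""
    # Index the constant table: constant value -> destination buffers.
    destinations = defaultdict(list)
    for constant, destination in constant_table:
        destinations[constant].append(destination)

    # Split: each matching tuple is appended to every buffer that wants it.
    buffers = defaultdict(list)
    for row in tuples:
        for destination in destinations.get(row[key], ()):
            buffers[destination].append(row)
    return dict(buffers)


# Usage with made-up quote tuples.
constant_table = [("MS", "A"), ("SUN", "B")]
quotes = [
    {"Symbol": "MS", "Price": 10},
    {"Symbol": "SUN", "Price": 20},
    {"Symbol": "IBM", "Price": 30},   # no query asks for IBM, so it is dropped
    {"Symbol": "MS", "Price": 11},
]
routed = join_and_split(quotes, constant_table)
```
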

Slide 13: Pipeline Approach. Tuples are pipelined directly from the output of one operator into the input of the next operator. All parts of the group are combined (including trigger actions), creating a single execution plan. Disadvantages: it doesn't work for grouping timer-based queries; the split operator may become a bottleneck; not all trigger actions may need to be executed.

Slide 14: Intermediate Files (Figure 3.8).

Slide 15: Intermediate Files. Advantages: each query is scheduled independently; intermediate files and original data sources are monitored in the same way; the potential bottleneck problem of the pipelined approach is avoided. Disadvantages: extra disk I/Os; the split operator becomes a blocking operator.

Slide 16: Range Queries. What if we want to return every stock with a price increase of more than 5%? A range query may have an upper bound and a lower bound, so the constant table is modified to include a column for each of the two bounds.
Query 1: Where <Change_ratio>$c</> element_as $g in “quotes.xml”, $c>0.05 construct $g
Query 2: Where <Change_ratio>$c</> element_as $g in “quotes.xml”, $c>0.15 construct $g
The outputs of such queries overlap in the intermediate files.

Slide 17: Virtual Intermediate Files. All outputs from the split operator are stored in one real intermediate file. This file has a clustered index on the range. Virtual intermediate files store a value range. The value range is used to retrieve data from the real intermediate file. Modification of virtual intermediate files can trigger upper-level queries.
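The relationship between the real file and its virtual views can be sketched as below. This is a hedged illustration only: the class names are hypothetical, a sorted in-memory list with binary search stands in for the clustered index, and bounds are treated as strict (matching the `$c>0.05` style of predicate).

```python
import bisect


class RealIntermediateFile:
    """Holds every output row once, ordered by the range attribute."""

    def __init__(self, rows, key):
        # Sorted order stands in for the clustered index (an assumption).
        self.rows = sorted(rows, key=key)
        self.keys = [key(row) for row in self.rows]

    def read_range(self, low=None, high=None):
        """Return rows with low < key < high; None means unbounded."""
        start = 0 if low is None else bisect.bisect_right(self.keys, low)
        stop = len(self.keys) if high is None else bisect.bisect_left(self.keys, high)
        return self.rows[start:stop]


class VirtualIntermediateFile:
    """Stores only a value range and reads through to the real file."""

    def __init__(self, real, low=None, high=None):
        self.real, self.low, self.high = real, low, high

    def read(self):
        return self.real.read_range(self.low, self.high)


# Usage: two overlapping range queries share one real file.
real = RealIntermediateFile(
    [{"Symbol": "A", "Change_ratio": 0.02},
     {"Symbol": "B", "Change_ratio": 0.07},
     {"Symbol": "C", "Change_ratio": 0.20},
     {"Symbol": "D", "Change_ratio": 0.15}],
    key=lambda row: row["Change_ratio"],
)
over_5 = VirtualIntermediateFile(real, low=0.05)
over_15 = VirtualIntermediateFile(real, low=0.15)
```

The point of the design is visible in the usage: the rows for the two overlapping queries are stored once, and each virtual file costs only its two bounds.
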

Slide 18: Grouping of Join Operators. Since joins can be very expensive, joins with the same expression are grouped. Which order: join first, or selection first? This paper says selection; future work says join.

Slide 19: Event Detection. Types of events:
Data-source change: push-based (the source informs NiagaraCQ of changes) or pull-based (NiagaraCQ checks periodically).
Timer: set to a specific time interval; grouped with other timer-based queries; only fired if data has changed.
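The rule that a timer event only fires a query when its data has changed can be sketched like this. The function name `firing_timer_queries`, the dictionary layout, and the use of plain numbers as timestamps are assumptions made for the example.

```python
def firing_timer_queries(now, queries, last_modified):
    """Return names of timer-based queries that should fire at time `now`.

    queries: name -> {"source": str, "every": number, "last_fired": number}
    last_modified: source name -> time of the last known change
    """
    firing = []
    for name, q in queries.items():
        # The timer interval has elapsed since the last firing.
        due = now - q["last_fired"] >= q["every"]
        # The source changed after the query last fired.
        changed = last_modified.get(q["source"], float("-inf")) > q["last_fired"]
        if due and changed:
            firing.append(name)
    return firing


# Usage with made-up timestamps.
queries = {
    "q_changed": {"source": "quotes.xml", "every": 10, "last_fired": 0},
    "q_unchanged": {"source": "news.xml", "every": 10, "last_fired": 0},
    "q_not_due": {"source": "quotes.xml", "every": 60, "last_fired": 0},
}
last_modified = {"quotes.xml": 7, "news.xml": 0}
fired = firing_timer_queries(12, queries, last_modified)
```
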

Slide 20: Incremental Evaluation. Queries are invoked only on changed data. For each file, NiagaraCQ keeps a “delta file”. Queries are run over delta files when possible. Incremental evaluation of join operators requires complete data files. A time stamp is added to each tuple in the delta file in order to support timer-based queries. Tuples remain in the delta file for the longest time interval within the group.
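A time-stamped delta file of this kind can be sketched as follows. The class `DeltaFile` and its method names are hypothetical, timestamps are plain numbers, and the retention rule is one simple reading of "tuples remain for the longest time interval within the group".

```python
class DeltaFile:
    """Hypothetical delta file: changed tuples tagged with a time stamp."""

    def __init__(self):
        self.entries = []  # (timestamp, row) pairs

    def append(self, timestamp, row):
        self.entries.append((timestamp, row))

    def changes_between(self, last_fire, now):
        # A query sees only the tuples added since it last fired.
        return [row for ts, row in self.entries if last_fire < ts <= now]

    def prune(self, now, longest_interval):
        # Keep tuples that the slowest query in the group may still need.
        cutoff = now - longest_interval
        self.entries = [(ts, row) for ts, row in self.entries if ts > cutoff]


# Usage with made-up timestamps and rows.
delta = DeltaFile()
delta.append(1, "row-1")
delta.append(5, "row-5")
delta.append(9, "row-9")
recent = delta.changes_between(last_fire=4, now=10)
delta.prune(now=10, longest_interval=6)
remaining = [row for _, row in delta.entries]
```
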

Slide 21: System Architecture (Figure 4.1).

Slide 22: Continuous Query Processing (Figure 4.2). Components: Continuous Query Manager (CQM), Event Detector (ED), Data Manager (DM), Query Engine (QE).
1. CQM adds continuous queries with file and timer information to enable ED to monitor the events.
2. ED asks DM to monitor changes to files.
3. When a timer event happens, ED asks DM for the last modified time of files.
4. DM informs ED of changes to push-based data sources.
5. If file changes and timer events are satisfied, ED provides CQM with a list of firing CQs.
6. CQM invokes QE to execute firing CQs.
7. The file scan operator calls DM to retrieve selected documents.
8. DM only returns changes between the last fire time and the current fire time.

Slide 23: Experimental Results. Simple Selection; Range Selection; Selection & Join; Equal & Range Mixed Queries.

Slide 24: References. NiagaraCQ: A Scalable Continuous Query System for Internet Databases. Follow-up papers: Design and Evaluation of Alternative Selection Placement Strategies in Optimizing Continuous Queries; Dynamic Re-grouping of Continuous Queries.

Slide 25: Discussion. What kinds of applications other than stock quotes would this be appropriate for? What would it not work for? NiagaraCQ is somewhat similar to RSS. What types of applications are better with RSS and which are better with NiagaraCQ? Are expression signatures too simple? Do they group together enough of the kinds of queries that this system is meant to handle? Do you think they would work better or worse for SQL queries instead of XML?