Presentation is loading. Please wait.

Presentation is loading. Please wait.

NiagaraCQ : A Scalable Continuous Query System for Internet Databases Jianjun Chen et al Computer Sciences Dept. University of Wisconsin-Madison SIGMOD.

Similar presentations


Presentation on theme: "NiagaraCQ : A Scalable Continuous Query System for Internet Databases Jianjun Chen et al Computer Sciences Dept. University of Wisconsin-Madison SIGMOD."— Presentation transcript:

1 NiagaraCQ : A Scalable Continuous Query System for Internet Databases Jianjun Chen et al Computer Sciences Dept. University of Wisconsin-Madison SIGMOD 2000 Talk by S. Sudarshan

2 Continuous Queries Example Inform me when there is a new publication related to multi-query optimization A broad classification  Change based  Timer based

3 NiagaraCQ A CQ system for the Internet Continuous Queries on XML data sets Scalable CQ processing Incremental group optimization Handles both change based and timer based queries in a uniform way

4 NiagaraCQ command language Creating a CQ Create CQ_name XML-QL query Do action { START start_time} { EVERY time_interval} { EXPIRE expiration_time} Delete CQ_name

5 Expression Signature Query examples Where INTC element_as $g in “http://www.stock.com/quotes.xml” construct $g Where MSFT element_as $g in “http://www.stock.com/quotes.xml” construct $g Expression signatures Quotes.Quote.Symbol in quotes.xml constant =

6 Query plans Trigger Action ITrigger Action J File Scan Select Symbol = “MSFT” Select Symbol = “INTC” File Scan quotes.xml

7 Group Group Signature  Common signature of all queries in the group Group constant table Constant_value Dest_buffer INTC Dest. I MSFT Dest. J

8 The group plan

9 Incremental Grouping Algo When a new query is submitted If the expression signature of the new query matches that of existing groups Break the query plan into two parts Remove the lower part Add the upper part onto the group plan else create a new group

10

11 Query split with materialized intermediate files Why not use a pipeline scheme ?  Split operator may block simple queries  Gives a single complicated execution plan  A large portion of query plan may not need to be executed at each invocation  Does not work for grouping timer based queries Using intermediate files  Cut query plan into 2 parts at split operator  Add a file scan operator to upper part to read intermediate file  Intermediate files are monitored just like other data sources

12 The query split scheme

13 Trade-offs Other advantages of materialized intermediate files  Only the necessary queries are executed  Uniform handling of intermediate files and original data source files Disadvantages  Split operator becomes a blocking operator  Extra disk I/Os

14 Range Predicates E.g. R.a < val or val1 < R.a < val2 Multiple such ranges Problem  Intermediate files may contain duplicate tuples Idea: Virtual intermediate files Use an index to implement this

15 Incremental grouping of selection predicates Multiple selection predicates in a query CNF for predicates on same data source Incremental grouping  Choose the most selective conjunct and implement virtual file on this conjunct Evaluation of other predicates  Upper levels of continuous query Example query Where ”INTC” $p element_as $g in “quotes.xml”, $p < 100 Construct $g

16 Incremental grouping of join operators A join query Quotes.Quote.Change_Ratio constant in “quotes.xml” Where $s element_as $g in “quotes.xml”, $s element_as $t in “companies.xml” construct $g, $t

17 Queries that contain both join and selection Example query : Where $s ”Computer Service” element_as $g in “quotes.xml”, $s element_as $t in “companies.xml” construct $g, $t Where to place the selection operator ?  Below the join Eliminates irrelevant tuples  Above the join Allows sharing Pick based on cost model

18 Grouping timer-based queries Challenge  Sharing common computation Event List  Stores time events sorted in time order

19 Incremental evaluation Invoke queries only on changed data For each source file, NiagaraCQ keeps a delta file Also for the intermediate files Time stamp store the each tuple Incremental evaluation of join operators requires complete data files

20 Memory Caching Thousands of continuous queries can’t fit in memory What should we cache ?  Grouped query plans What about non-grouped queries ?  Favor small delta files  Front part of the event list

21 System Architecture

22 CQ processing

23 Experimental Results Example query : Where ”INTC” element_as $g in “quotes.xml”, construct $g N = number of installed queries F= number of fired queries C = number of tuples modified

24 Performance Results Case 1: F=N, C=1000 Case 2: F=100, C=1000

25 Performance Results F=N=2000, vary data size

26 Thank You

27 Outline General strategy of incremental group optimization Query split with materialized intermediate files Incremental grouping of selection and join operators System architecture Experimental results

28 Incremental group optimization General Strategy Why can’t we regroup all queries when a new query is added ? Use of expression signatures for grouping  Same syntax structure  Different constant values


Download ppt "NiagaraCQ : A Scalable Continuous Query System for Internet Databases Jianjun Chen et al Computer Sciences Dept. University of Wisconsin-Madison SIGMOD."

Similar presentations


Ads by Google