Rate-Based Query Optimization for Streaming Information Sources Stratis D. Viglas Jeffrey F. Naughton.

1 Rate-Based Query Optimization for Streaming Information Sources Stratis D. Viglas Jeffrey F. Naughton

2 Outline Introduction Estimation of Output Rates Using Estimates to Optimize Queries Experimental Validation Conclusion

3 Introduction Cost-Based Query Optimization Decide between execution plans by minimizing the execution cost of evaluating the query Cardinality Estimation Use cardinalities of tables at leaves of query tree Very effective in most traditional DBMS applications

4 Introduction Drawback of Cardinality Estimation In case of streaming data, cardinality is often not known In case of infinite streams it may not even be well defined Rate-Based Optimization Uses estimates of rates of the streams in the query evaluation tree

5 Introduction Motivating Example 2 selection operators with selectivity 0.1 Input rate = 500 per second Operator 1 processes 50 inputs per second Operator 2 processes data as fast as it arrives Run a query that applies the 2 selections in series on an infinite stream
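The arithmetic behind this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: each operator is modeled as consuming at most a fixed number of tuples per second (its capacity) and passing on a fixed fraction of what it consumes. The function names and the treatment of "as fast as it arrives" as infinite capacity are hypothetical choices, not taken from the slides.

```python
import math

def selection_output_rate(input_rate, capacity, selectivity):
    """Output rate of one selection operator (tuples per second).

    Assumption: the operator consumes at most `capacity` tuples per second
    and emits a `selectivity` fraction of the tuples it consumes.
    """
    processed = min(input_rate, capacity)
    return processed * selectivity

def pipeline_output_rate(input_rate, operators):
    """Chain selections in series; `operators` is a list of (capacity, selectivity)."""
    rate = input_rate
    for capacity, selectivity in operators:
        rate = selection_output_rate(rate, capacity, selectivity)
    return rate

slow = (50.0, 0.1)      # Operator 1: processes 50 inputs per second
fast = (math.inf, 0.1)  # Operator 2: keeps up with any arrival rate

slow_first = pipeline_output_rate(500.0, [slow, fast])
fast_first = pipeline_output_rate(500.0, [fast, slow])
print(slow_first, fast_first)
```

Under these assumptions, placing the slow operator first yields about 0.5 results per second, while placing the fast operator first yields about 5 results per second, even though both operators have the same selectivity.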

6 Introduction

8 Rate-Based Query Optimization Focus – what is the expected output rate of this query plan? Advantages Can optimize for most possible outputs in the first n seconds Can optimize for first k results

9 Estimation of Output Rates Output Rate = (Number of outputs transmitted) / (Time needed to make the transmission)

10 Projections 2 cases Cost of one projection is <= inter-arrival time Cost of one projection is > inter-arrival time
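The two cases can be illustrated with a small sketch. It assumes a single input stream with a steady arrival rate and a fixed per-tuple projection cost; the function name and the exact form of the result are assumptions made for illustration, since the slide's own formula did not survive in this transcript.

```python
def projection_output_rate(input_rate, cost_per_tuple):
    """Estimated output rate of a projection (tuples per second).

    Assumptions: tuples arrive every 1/input_rate seconds and each
    projection takes `cost_per_tuple` seconds.
    """
    inter_arrival = 1.0 / input_rate
    if cost_per_tuple <= inter_arrival:
        # Case 1: the operator keeps up, so output rate equals input rate.
        return input_rate
    # Case 2: the operator is the bottleneck, so it emits one tuple per cost interval.
    return 1.0 / cost_per_tuple
```

For instance, with 100 tuples per second arriving, a cost of 0.005 s per tuple falls in the first case, while a cost of 0.02 s per tuple falls in the second and caps the output rate.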

11 Selections Output Rate,

12 Joins and Cartesian Products If rl is the left input rate and rr is the right input rate, what is the output rate for results generated by the arrival of these tuples?
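One way to build intuition for this question is a deliberately simplified model, which is not necessarily the formula used in the presentation. It assumes a symmetric join that retains every tuple seen so far, zero processing cost, and a fixed join selectivity (a hypothetical parameter here). Under those assumptions the number of results by time t is the selectivity times the product of the tuples seen on each side, so the output rate grows linearly with time.

```python
def join_results_by(t, left_rate, right_rate, selectivity):
    """Cumulative join results by time t (simplified model, see assumptions above)."""
    left_seen = left_rate * t    # tuples received on the left input so far
    right_seen = right_rate * t  # tuples received on the right input so far
    return selectivity * left_seen * right_seen

def join_output_rate_at(t, left_rate, right_rate, selectivity):
    """Instantaneous output rate: the time derivative of join_results_by."""
    return 2.0 * selectivity * left_rate * right_rate * t
```

Setting the selectivity to 1 gives the Cartesian product case. Real operators have nonzero processing cost, which bounds this growth.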

13 Cost Model for Join Operator Implementations

15 Using Estimates to Optimize Queries General framework for rate-based optimization Optimize for a specific time point in the execution process Which plan will produce the most results by time t0? Optimize for output production size Which plan is the first to reach N results?
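The two objectives can pick different plans, which the following sketch tries to show. The plan model is hypothetical: each plan is reduced to a startup delay followed by a constant output rate, which is far simpler than the rate estimates the presentation derives. The class and function names are illustrative only.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    name: str
    startup_delay: float  # seconds before the first result (assumed)
    rate: float           # results per second after startup (assumed constant)

    def results_by(self, t):
        """Results produced by time t."""
        return max(0.0, t - self.startup_delay) * self.rate

    def time_to_reach(self, n):
        """Time at which the plan has produced n results."""
        return self.startup_delay + n / self.rate

def best_by_time(plans, t0):
    """Objective 1: the plan with the most results by time t0."""
    return max(plans, key=lambda p: p.results_by(t0))

def first_to_reach(plans, n):
    """Objective 2: the plan that reaches n results first."""
    return min(plans, key=lambda p: p.time_to_reach(n))

plans = [Plan("quick_start", 0.0, 5.0), Plan("slow_start", 10.0, 20.0)]
```

With these made-up numbers, `quick_start` wins for an early time point or a small result target, and `slow_start` wins once the time point or target is large enough to amortize its startup delay.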

16 Examples of heuristics Local Rate Maximization

17 Examples of heuristics Local Time Minimization

18 Grammar

19 Algorithm

20 Experimental Validation Focus Does the cost model correctly estimate individual plan performance? Is the framework capable of providing correct decisions regarding the best choices among a set of plans?

21 Experimental Validation Experimental Setup Synthetically generated XML data set To investigate streaming behavior, view files as prefixes of streams

22 Experimental Validation

23 Validation of the Cost Model

28 Complex Plans

33 Comparison to the Traditional Cost Model

34 Conclusions Rate-based optimization enables query optimization when working with infinite input streams It allows optimization for specific points in time during query evaluation Experiments indicate that rate-based optimization is a potentially viable approach

