Published by Oswaldo Week. Modified over 2 years ago.

1
View Usability and Safety for the Answering of Top-k Queries via Materialized Views
Eftychia Baikousi, Panos Vassiliadis
University of Ioannina, Dept. of Computer Science

2
DOLAP 2009, Hong Kong, 6 Nov 2009
Forecast
  Problem of answering a top-k query through materialized top-n views
  Theoretical guarantees for when a top-n materialized view can answer a top-k query
  Algorithmic techniques for answering a top-k query from a materialized view
  Properties of the safe areas of views

3
Contents
  Motivation & Problem Definition
  Overview of the Method
    Theoretical guarantees
    Strictness of theorem
    Safe area properties
  Experiments
  Conclusions
  Future extensions

5
Top-k query
  Given a relation R (id, x1, x2, x3) and a query Q: sum(x1, x2, x3)
  Find the k tuples with the highest grades according to Q

  R
  id   x1    x2    x3    sum
  a    0.3   0.6   0.7   1.6
  b    0.2   0.3   0.4   0.9
  c    0.4   0.5   0.9   1.8
  d    0.7   0.6   0.1   1.4

  Top-2 tuples: c (1.8) and a (1.6)
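As a minimal sketch of the top-k definition above, the following Python snippet ranks the four tuples of R by sum(x1, x2, x3) and keeps the best two. The helper name top_k is illustrative only, and the x1 value of tuple c (0.4) is an assumption inferred from its listed sum of 1.8.

```python
import heapq

# Relation R(id, x1, x2, x3) from the slide.
# Assumption: x1 of tuple c is 0.4, inferred from its listed sum of 1.8.
R = {
    "a": (0.3, 0.6, 0.7),
    "b": (0.2, 0.3, 0.4),
    "c": (0.4, 0.5, 0.9),
    "d": (0.7, 0.6, 0.1),
}

def top_k(relation, k, score=sum):
    """Return the k (id, score) pairs with the highest score, best first."""
    scored = ((tid, score(attrs)) for tid, attrs in relation.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

top2 = top_k(R, 2)
print([tid for tid, _ in top2])  # ['c', 'a']
```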

6
Motivating Example
  Telecommunication company: executives see sales reports on PDAs
  Given
    a relation Region (id, name, today_traffic, yesterday_traffic, budget, ..)
    a materialized view V of the top-2 regions according to the query Q: 0.6 * diff traffic + 0.4 * budget

  Region
  id   Name      t_traffic   y_traffic   budget   V
  1    LA        18          20          21       7.2
  2    NY        42          54          15       -1.2
  3    Dallas    26          22          8        4.4
  4    Chicago   30          28          11       5.6

  V
  name     V
  LA       7.2
  Dallas   4.4

  Can a new top-k query (e.g. 0.5 * diff traffic + 0.3 * budget) be answered from V?

7
Problem definition
  Given
    a base relation R (ID, X, Y)
    a materialized view V (ID, X, Y, s) that contains the top-n tuples of the form (id, s), where s = w (a · x + y) and w, a are positive parameters
    a query Q (ID, X, Y, s_Q) that requests the top-k (k ≤ n) tuples of the form (id, s_Q), where s_Q = w_Q (a_Q · x + y) and w_Q, a_Q are positive parameters
  Introduce an algorithm that decides whether V by itself is suitable to answer Q, and compute Q's answer

8
Related Work
  Gautam Das, Dimitrios Gunopulos, Nick Koudas, Dimitris Tsirogiannis: Answering Top-k Queries Using Views, VLDB '06
  Answer a top-k query Q by making use of ranking views V
  LPTA in 2 steps
    SelectViews (V, Q)
      Selects an efficient subset of views U for answering Q
      U contains the sorted lists over each attribute of the relation
    Answer Q from U
      Linear programming adaptation of the TA algorithm
      Stopping condition: solution of linear program ≤ min (top-k)

9
Related Work – Geometric Representation (0)
  Assume
    Relation R (ID, X, Y)
    Two views V_u (id, Score_1) and V_d (id, Score_2)
    Query Q (id, Score)
  Scoring functions of the form Score = w (a · x + y)
  Depicted as y = a^{-1} · x

10
Related Work – Geometric Representation (1)
  M: the k-th tuple in Q
  Stopping condition: sweeping line ( ) crosses position A_1 B
  Any point below line AB has a smaller score than M with regard to Q

11
Related Work – Geometric Representation (2)
  Stopping condition: intersection point S of sweeping lines (, ) lies on line AB
  Any point below line AB has a smaller score than M with regard to Q

12
Related Work
  SelectViews (V, Q) is data dependent
    Based on an estimation of the last tuple of Q according to the data distribution
  No theoretically established guarantees that the set of views will answer Q

14
Overview of the method
  1. Theoretical guarantees of answering a query Q via a view V_U
  2. Theoretical guarantees are too strict
  3. Parallelism of safe areas

15
Example
  R
  id   x   y   V    Q
  a    7   4   15   18
  b    2   7   16   11
  c    4   2   8    10
  d    1   1   3    3

  V: top-3 with score x + 2y
  Q: top-1 with score 2x + y
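The scores in this example can be reproduced with a short sketch. It writes both scoring functions in the form s = w (a · x + y) from the problem definition, so x + 2y becomes w = 2, a = 0.5 and 2x + y becomes w = 1, a = 2. The helper names score and top are illustrative, not taken from the paper.

```python
# Tuples of R(id, x, y) from the example slide.
R = {"a": (7, 4), "b": (2, 7), "c": (4, 2), "d": (1, 1)}

def score(a, x, y, w=1.0):
    """Scoring function s = w * (a*x + y); a positive w does not change the ranking."""
    return w * (a * x + y)

def top(relation, a, w, k):
    """Ids of the top-k tuples under s = w * (a*x + y), highest score first."""
    ranked = sorted(relation, key=lambda t: score(a, *relation[t], w), reverse=True)
    return ranked[:k]

# View V:  x + 2y = 2 * (0.5*x + y)  ->  a = 0.5, w = 2
# Query Q: 2x + y = 1 * (2*x + y)    ->  a = 2,   w = 1
view_ids = top(R, a=0.5, w=2, k=3)
query_ids = top(R, a=2, w=1, k=1)
print(view_ids, query_ids)  # ['b', 'a', 'c'] ['a']
```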

16
Construction of safe area
  V_U (ID, X, Y, s_U): contains the top-n tuples with score s_U = w_U (a_U · x + y)
  t_N: the n-th tuple in V_U
  L_U: x_NU y_NU, the line perpendicular to V_U passing through t_N and meeting axes X and Y
  L_Q: x_NU y_Q, the line perpendicular to Q passing through x_NU
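The construction can be sketched numerically under one reading of the slide: since a scoring function is depicted as the line y = a^{-1} · x, a line perpendicular to it is a line of constant score a · x + y = c. The sketch further assumes non-negative attribute values and a_U < a_Q (the view's line lies above the query's line); the function name safe_area_line is hypothetical.

```python
def safe_area_line(a_u, t_n, a_q):
    """Return (x_nu, y_nu, y_q) for the view parameter a_u, the view's n-th
    tuple t_n = (x_n, y_n), and the query parameter a_q.

    Assumption: L_U and L_Q are lines of constant score, a*x + y = c.
    """
    x_n, y_n = t_n
    c_u = a_u * x_n + y_n   # L_U: a_u*x + y = c_u passes through t_n
    x_nu = c_u / a_u        # point where L_U meets the X axis
    y_nu = c_u              # point where L_U meets the Y axis
    y_q = a_q * x_nu        # L_Q: a_q*x + y = y_q passes through (x_nu, 0)
    return x_nu, y_nu, y_q

# Example slide: V has a_u = 0.5, its 3rd tuple is c = (4, 2); Q has a_q = 2.
print(safe_area_line(0.5, (4, 2), 2.0))  # (8.0, 4.0, 16.0)
```

With these numbers L_Q is the line 2x + y = 16, and only tuple a (score 18) lies above it.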

17
Safe area
  Safe area defined as the area above line L_Q (shaded area)
  Observations
    Any tuple in the safe area has a higher score (with regard to Q) than any tuple outside the safe area
    Tuples in the safe area belong to both V_U and Q
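One way to see why the observations hold is the short derivation below. It is a sketch, not the paper's proof, and it relies on two hypotheses that the slide does not state: attribute values are non-negative, and a_Q ≥ a_U > 0. Here t = (x, y) is any tuple of R that is not in V_U, and positive weights w_U, w_Q are dropped because they do not change the order.

```latex
% Assumptions (not stated on the slide): x, y >= 0 and a_Q >= a_U > 0.
% t_N = (x_N, y_N) is the n-th tuple of V_U; x_{NU} is where L_U meets the X axis.
\begin{align*}
a_U x + y &\le a_U x_N + y_N = a_U x_{NU}
  && \text{($t$ is not in the top-$n$ of $V_U$)} \\
x &\le x_{NU}
  && \text{(since $y \ge 0$ and $a_U > 0$)} \\
a_Q x + y &= (a_U x + y) + (a_Q - a_U)\,x \\
  &\le a_U x_{NU} + (a_Q - a_U)\,x_{NU} = a_Q x_{NU} = y_Q
\end{align*}
```

So every tuple outside V_U lies on or below L_Q. A tuple strictly above L_Q must therefore be in V_U, and it outscores every tuple outside V_U with regard to Q.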

18
Answering Q from V_U
  THEOREM 1
  V_U can answer Q if the safe area contains at least k tuples
  The inverse does not always hold

20
Answering Q from V_U (cont.)
  THEOREM 2
  It is possible that V_U can answer Q even if the safe area contains fewer than k tuples
  This holds when the area (yellow triangle) defined by
    line L_U,
    the X-axis, and
    line L_1, which produces the lowest possible score for Q from tuples of V_U,
  is void of tuples

21
Algorithm TestViewSuitability
  Three main steps
    Step 1: Compute safe area (Q, V)
    Step 2: Count tuples in V that belong in the safe area
    Step 3: If there are at least k, then return (true)
            Else return (false)

23
Combining two views
  Lines L_QU and L_QD, both perpendicular to Q, characterize the safe areas for V_U and V_D
  L_QU is parallel to L_QD, so the safe area of one view (V_U) is encompassed in the safe area of the other view (V_D)

24
Combining two views
  THEOREM 3
  If two views are not safe for answering Q by themselves, then the combination of them cannot safely guarantee the answer to Q, with regard to the safe areas.
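The nesting behind this theorem can be illustrated with a small sketch. It assumes each safe area is the half-plane strictly above a line a_Q · x + y = threshold, so two safe areas for the same query differ only in their threshold. The thresholds 16 and 10.5 are hypothetical values chosen for illustration, and safe_ids is an illustrative helper.

```python
def safe_ids(points, a_q, threshold):
    """Ids of points strictly above the line a_q*x + y = threshold."""
    return {pid for pid, (x, y) in points.items() if a_q * x + y > threshold}

# Tuples from the earlier example; a_q = 2 as in Q: 2x + y.
points = {"a": (7, 4), "b": (2, 7), "c": (4, 2), "d": (1, 1)}
a_q = 2.0
thr_u, thr_d = 16.0, 10.5   # hypothetical thresholds for L_QU and L_QD

area_u = safe_ids(points, a_q, thr_u)
area_d = safe_ids(points, a_q, thr_d)
union = area_u | area_d
larger = safe_ids(points, a_q, min(thr_u, thr_d))
print(area_u <= area_d, union == larger)  # True True
```

Because the union of the two safe areas is just the larger one, combining the views adds no safe tuples beyond what the view with the larger safe area already has.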

26
Experimental methodology
  Test the following methods
    Our algorithm
    TA algorithm (it can guarantee view usability correctness)
  For the following goals
    Effectiveness: number of queries answered by views
    Efficiency: time savings from usage of queries

27
Experimental methodology
  Synthetic data sets
    Random data sets of different sizes for a relation of the form R (ID, X, Y)
    Sequence of queries with random coefficients and result size k
  Experimental parameters:
    Size of source table R (tuples)    |R|   1x10^4, 5x10^4, 1x10^5
    Max size of mat. view (tuples)     k     10, 50, 100, 500, 1000
    Number of queries asked            |Q|   100, 1000
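A setup of this kind could be generated along the following lines. The slide does not give the value distribution or the coefficient range, so the uniform draws, the range (0.1, 10.0), the fixed seed, and the function names are all assumptions made for illustration. The weight w is left out because a positive w does not change the ranking.

```python
import random

def make_relation(size, rng):
    """Random relation R(ID, X, Y); values uniform in [0, 1) by assumption."""
    return [(i, rng.random(), rng.random()) for i in range(size)]

def make_queries(count, max_k, rng):
    """Random queries (a_q, k) for the score a_q*x + y.

    The coefficient range (0.1, 10.0) is an assumption, not from the slides.
    """
    return [(rng.uniform(0.1, 10.0), rng.randint(1, max_k)) for _ in range(count)]

rng = random.Random(0)                  # fixed seed so runs are repeatable
R = make_relation(10_000, rng)          # |R| = 1x10^4, the smallest size listed
queries = make_queries(100, 100, rng)   # |Q| = 100 queries, result size k <= 100
```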

28
Effectiveness
  Percentage of views used for 100 queries

29
Effectiveness
  Percentage of views used for different time spans

30
Efficiency
  Time savings from the usage of queries for different database sizes and requested results
  Conflicting case
    The number of stored results rises, while the savings drop
    Due to the size of used memory
      Memory allocation becomes slow
      Probably one view is able to answer a lot of queries
  Savings increase for reasonable k's of size 0.1%

32
Conclusions
  We have provided theoretical and algorithmic results for the problem of answering top-k queries via materialized views
  Theoretical – algorithmic results:
    Theorem 1: Theoretical guarantees for a view to answer a top-k query
    Theorem 2: Strictness of Theorem 1
    Parallelism of safe areas

34
Future Work
  Optimization in case of time and storage constraints
  View Caching
    Hierarchical structures for the set of views
    Sorting techniques

35
Thank you for your attention!
… many thanks to our hosts!

36
Auxiliary
  Time Savings
