Presentation is loading. Please wait.

Presentation is loading. Please wait.

July 29HDMS'08 Caching Dynamic Skyline Queries D. Sacharidis 1, P. Bouros 1, T. Sellis 1,2 1 National Technical University of Athens 2 Institute for Management.

Similar presentations


Presentation on theme: "July 29HDMS'08 Caching Dynamic Skyline Queries D. Sacharidis 1, P. Bouros 1, T. Sellis 1,2 1 National Technical University of Athens 2 Institute for Management."— Presentation transcript:

1 July 29HDMS'08 Caching Dynamic Skyline Queries D. Sacharidis 1, P. Bouros 1, T. Sellis 1,2 1 National Technical University of Athens 2 Institute for Management of Information Systems – R.C. Athena

2 July 29HDMS'08 Outline Introduction –Skyline (SL) and dynamic skyline queries (DSL) Related work Evaluating dynamic skyline queries –Computing orthant skylines (OSL) –Computing dynamic skyline via caching LRU, LFU, LPP cache replacement policies Experimental evaluation Conclusions and Future work

3 July 29HDMS'08 Skyline queries (SL) Given a dataset of d- dimensional points –SL contains points not dominated by others –x dominates y iff x as good as y in all dimensions and strictly better in at least one

4 July 29HDMS'08 Skyline queries (SL) Given a dataset of d- dimensional points –SL contains points not dominated by others –x dominates y iff x as good as y in all dimensions and strictly better in at least one Example –Dataset of hotels –Prefer cheap hotels close to the sea Distance from sea Price

5 July 29HDMS'08 Skyline queries (SL) Given a dataset of d- dimensional points –SL contains points not dominated by others –x dominates y iff x as good as y in all dimensions and strictly better in at least one Example –Dataset of hotels –Prefer cheap hotels close to the sea Distance from sea Price Skyline points

6 July 29HDMS'08 Skyline queries (SL) Given a dataset of d- dimensional points –SL contains points not dominated by others –x dominates y iff x as good as y in all dimensions and strictly better in at least one Example –Dataset of hotels –Prefer cheap hotels close to the sea Distance from sea Price Skyline points p1p1

7 July 29HDMS'08 Skyline queries (SL) Given a dataset of d- dimensional points –SL contains points not dominated by others –x dominates y iff x as good as y in all dimensions and strictly better in at least one Example –Dataset of hotels –Prefer cheap hotels close to the sea Distance from sea Price Skyline points p1p1 p2p2

8 July 29HDMS'08 Dynamic skyline queries (DSL) Extension of skyline queries –Given a query point q –DSL contains points not dynamically dominated by others w.r.t q –x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one Can be treated as static SL –Transform points w.r.t. q

9 July 29HDMS'08 Dynamic skyline queries (DSL) Extension of skyline queries –Given a query point q –DSL contains points not dynamically dominated by others w.r.t q –x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one Can be treated as static SL –Transform points w.r.t. q Example –User defines “ideal” hotel q Distance from sea Price Query point q

10 July 29HDMS'08 Dynamic skyline queries (DSL) Extension of skyline queries –Given a query point q –DSL contains points not dynamically dominated by others w.r.t q –x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one Can be treated as static SL –Transform points w.r.t. q Example –User defines “ideal” hotel q Distance from sea Price Dynamic Skyline points q

11 July 29HDMS'08 Dynamic skyline queries (DSL) Extension of skyline queries –Given a query point q –DSL contains points not dynamically dominated by others w.r.t q –x dynamically dominates y iff x as preferable as y w.r.t. q in all dimensions and strictly more preferable w.r.t. q in at least one Can be treated as static SL –Transform points w.r.t. q Example –User defines “ideal” hotel q Distance from sea Price Dynamic Skyline points q p4p4 p5p5

12 July 29HDMS'08 Intuition (1) Traditional SL algorithms need to run anew for each DSL query Our idea –Exploit results from past queries to reduce processing cost for future DSL queries –Cache past queries –Decide which queries in cache are useful

13 July 29HDMS'08 Intuition (2) Distance from sea Price

14 July 29HDMS'08 Intuition (2) 2 past DSL queries –q a, q b Each query partitions space in 4 quadrants Distance from sea Price qaqa qbqb

15 July 29HDMS'08 Intuition (3) A new query q arrives Consider DSL for q a –p 1 is contained DSL(q a ) –p 1 dominates p 2, p 3, p 4 p 1 lies in upper right quadrant w.r.t. q a q a lies in upper right quadrant w.r.t. q p 1 dominates also p 2, p 3, p 4 w.r.t. to q –Exclude p 2, p 3, p 4 from dominance test for DSL(q) Distance from sea Price qaqa qbqb p1p1 p2p2 p3p3 p4p4 q Shaded area denotes points dominated by p 1

16 July 29HDMS'08 Contribution in brief Caching past DSL queries cannot reduce processing cost for future ones –We need more information about dominance relationships Introduce orthant skylines (OSL) and examine their relationship with DSL Extend Bitmap algorithm to compute OSL in parallel with DSL Cache OSL to enhance DSL queries evaluation –Present 3 cache replacement policies LRU, LFU, LPP Experimental evaluation of caching mechanism

17 July 29HDMS'08 Related work Non-indexed methods –Block-Nested Loops (BnL) –Bitmap –Multidimensional Divide and Conquer (DnC) –Sort First Scan (SFS) Index-based methods –B-tree sort points according to the lowest valued coordinate –R-tree Nearest neighbor based (NN) Branch and bound (BBS)

18 July 29HDMS'08 Related work Non-indexed methods –Block-Nested Loops (BnL) –Bitmap –Multidimensional Divide and Conquer (DnC) –Sort First Scan (SFS) Index-based methods –B-tree sort points according to the lowest valued coordinate –R-tree Nearest neighbor based (NN) Branch and bound (BBS)

19 July 29HDMS'08 Bitmap BnL variant Suitable for domains with low cardinality and discrete In brief –Computes a bitmap representation of the points in the dataset –Examines each point separately (dominance test) Checks whether it is contained in the skyline or not Exploits fast bitwise operations OR/AND

20 July 29HDMS'08 Bitmap – Dominance test For each point p –Define A = A 1 & A 2 & … & A d Denotes the points as good as p in all dimensions –Define B = B 1 | B 2 | … | B d Denotes the points strictly better than p in at least one dimension –Dominance test: If C = A & B has all bits set to 0 then p is in SL

21 July 29HDMS'08 Orthant skyline (OSL) OSL provides more information about dominance relationships than DSL –Useful for pruning Given a dataset of d- dimensional points and a query point q –Space partitioned in 2 d orthants –o-th orthant skyline (OSL) of q contains points of the o-th orthant not dynamically dominated by others inside orthant o w.r.t q

22 July 29HDMS'08 Orthant skyline (OSL) OSL provides more information about dominance relationships than DSL –Useful for pruning Given a dataset of d- dimensional points and a query point q –Space partitioned in 2 d orthants –o-th orthant skyline (OSL) of q contains points of the o-th orthant not dynamically dominated by others inside orthant o w.r.t q Quadrant 2 Distance from sea Price Quadrant 0 Quadrant 3 Quadrant 1 Query point q

23 July 29HDMS'08 Orthant skyline (OSL) OSL provides more information about dominance relationships than DSL –Useful for pruning Given a dataset of d- dimensional points and a query point q –Space partitioned in 2 d orthants –o-th orthant skyline (OSL) of q contains points of the o-th orthant not dynamically dominated by others inside orthant o w.r.t q Quadrant 2 Distance from sea Price Quadrant 0 Quadrant 3 Quadrant 1 Query point q

24 July 29HDMS'08 Orthant skyline (OSL) OSL provides more information about dominance relationships than DSL –Useful for pruning Given a dataset of d- dimensional points and a query point q –Space partitioned in 2 d orthants –o-th orthant skyline (OSL) of q contains points of the o-th orthant not dynamically dominated by others inside orthant o w.r.t q Quadrant 2 Distance from sea Price Quadrant 0 Quadrant 3 Quadrant 1 Query point q Quadrant 2 skyline points

25 July 29HDMS'08 OSL and DSL relationship Quadrant 2 Distance from sea Price Quadrant 0 Quadrant 3 Quadrant 1 q

26 July 29HDMS'08 OSL and DSL relationship Quadrant 2 Distance from sea Price Quadrant 0 Quadrant 3 Quadrant 1 q

27 July 29HDMS'08 OSL and DSL relationship Map points from quadrants 1,2,3 to points inside quadrant 0 Quadrant 2 Distance from sea Price Quadrant 0 Quadrant 3 Quadrant 1 q

28 July 29HDMS'08 OSL and DSL relationship Map points from quadrants 1,2,3 to points inside quadrant 0 Compute DSL w.r.t. q Quadrant 2 Distance from sea Price Quadrant 0 Quadrant 3 Quadrant 1 q

29 July 29HDMS'08 OSL and DSL relationship Map points from quadrants 1,2,3 to points inside quadrant 0 Compute DSL w.r.t. q Union of all OSLs is superset of DSL w.r.t. to q Quadrant 2 Distance from sea Price Quadrant 0 Quadrant 3 Quadrant 1 q

30 July 29HDMS'08 OSL and DSL relationship Map points from quadrants 1,2,3 to points inside quadrant 0 Compute DSL w.r.t. q Union of all OSLs is superset of DSL w.r.t. to q Quadrant 2 Distance from sea Price Quadrant 0 Quadrant 3 Quadrant 1 q p1p1 p2p2

31 July 29HDMS'08 OSL and DSL relationship Map points from quadrants 1,2,3 to points inside quadrant 0 Compute DSL w.r.t. q Union of all OSLs is superset of DSL w.r.t. to q Quadrant 2 Distance from sea Price Quadrant 0 Quadrant 3 Quadrant 1 q p3p3 p2p2

32 July 29HDMS'08 Computing orthant skylines Algorithm DBM –Extends Bitmap to compute DSL and OSLs at the same time Method: –Compute bitmap representation Transform each point coordinates w.r.t. to query q –Dominance test, point p, orthant o p not in OSL o and not in DSL p not in DSL, but in OSL o p in DSL and in OSL o

33 July 29HDMS'08 Dynamic skylines Via Caching Cache OSLs instead of DSLs –Query cache contains (query point q j, OSLs) –OSLs encode by bitmaps Algorithm cDBM –OSL contains information about dominance test inside orthant –Discard points inside orthants from dominance tests Method: –Compute bitmap representation –For each point p consider its position (orthant) w.r.t. to cache queries q j –If p in the same orthant o w.r.t q j as q j w.r.t. q and p not in OSL o (q j ) then exclude it from OSL o (q), DSL(q)

34 July 29HDMS'08 Cache Replacement Policies General idea –Limited cache space –Identify least useful query point in cache –Replace it with new one

35 July 29HDMS'08 Usage-based policies Only a few queries in cache are useful Log cache query usage Given a new query q –Consider as input the query point cache Q –Only query points in OSL of Q w.r.t. q are useful –Update cache - remove: Least Recently Used (LRU) query point Least Frequently Used (LFU) query point

36 July 29HDMS'08 Usage-based policies Only a few queries in cache are useful Log cache query usage Given a new query q –Consider as input the query point cache Q –Only query points in OSL of Q w.r.t. q are useful –Update cache - remove: Least Recently Used (LRU) query point Least Frequently Used (LFU) query point Distance from sea Price qaqa qbqb qcqc qdqd q

37 July 29HDMS'08 Usage-based policies Only a few queries in cache are useful Log cache query usage Given a new query q –Consider as input the query point cache Q –Only query points in OSL of Q w.r.t. q are useful –Update cache - remove: Least Recently Used (LRU) query point Least Frequently Used (LFU) query point Distance from sea Price qaqa qbqb qcqc qdqd q Redundant queries

38 July 29HDMS'08 Usage-based policies Only a few queries in cache are useful Log cache query usage Given a new query q –Consider as input the query points in cache Q –Only query points in OSL of Q w.r.t. q are useful –Update cache - remove: Least Recently Used (LRU) query point Least Frequently Used (LFU) query point Distance from sea Price qaqa qbqb qcqc qdqd q Redundant queries

39 July 29HDMS'08 Usage-based policies Only a few queries in cache are useful Log cache query usage Given a new query q –Consider as input the query points in cache Q –Only query points in OSL of Q w.r.t. q are useful –Update cache - remove: Least Recently Used (LRU) query point Least Frequently Used (LFU) query point Distance from sea Price qaqa qbqb qcqc qdqd q Redundant queries

40 July 29HDMS'08 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query –Great pruning power Probability that a query can prune points of dataset from DSL computation –Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache – remove –Query point with less pruning power (LPP) Distance from sea Price qaqa q

41 July 29HDMS'08 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query –Great pruning power Probability that a query can prune points of dataset from DSL computation –Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache – remove –Query point with less pruning power (LPP) Distance from sea Price qaqa q 2:2 4 5:2 4 3:2 4 74:2 4

42 July 29HDMS'08 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query –Great pruning power Probability that a query can prune points of dataset from DSL computation –Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache – remove –Query point with less pruning power (LPP) Distance from sea Price qaqa q 2:3 4 5:7 4 3:4 4 74:88 4

43 July 29HDMS'08 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query –Great pruning power Probability that a query can prune points of dataset from DSL computation –Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache – remove –Query point with less pruning power (LPP) Distance from sea Price qaqa q 2:3 176 5:7 4 3:4 4 74:88 4

44 July 29HDMS'08 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query –Great pruning power Probability that a query can prune points of dataset from DSL computation –Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache – remove –Query point with less pruning power (LPP) Distance from sea Price qaqa q 2:3 176 5:7 20 3:4 21 74:88 222

45 July 29HDMS'08 Pruning power-based policy Usage-based policies do not indicate usefulness Useful cached query –Great pruning power Probability that a query can prune points of dataset from DSL computation –Depends on Points dominated by query in an orthant j Points contained in the antisymetric orthant of j Update cache – remove –Query point with less pruning power (LPP) Distance from sea Price qaqa q 2:3 176 5:7 20 3:4 21 74:88 222

46 July 29HDMS'08 Experimental Evaluation Synthetic datasets –Distribution types Independent, correlated, anti-correlated –Number of points N 10k, 20k, 50k, 100k, –Dimensionality d = {2,3,4,5,6} –Domain size for dimension |D| = {10,20,50} Compare –Bitmap (NO-CACHE) –cDBM with LFU,LRU,LPP cache replacement policies –Query cache |Q| = {10,20,30,40,50} past query points Cache size is |Q|*N bits uncompressed

47 July 29HDMS'08 Varying query cache size Dataset: N = 50k points, with d = 4 dimensions of |D| = 20 domain size LFU,LRU cache queries not representative for future ones LPP caches queries with great pruning power Anti-correlatedIndependent

48 July 29HDMS'08 Effect of distribution parameters Relative improvement in running time over NO-CACHE Vary number of points N –d = 4 dimensions of |D| = 20 domain size Vary number of dimensions d –N = 50k, |D| = 20 Correlated vary d Correlated vary N

49 July 29HDMS'08 Conclusions and Future work Conclusions –Introduced orthant skylines (OSLs) and discussed its relationship with DSL –Extended Bitmap to compute OSLs and DSL at the same time (DBM algorithm) –Proposed caching mechanism of OSLs to reduce cost for future DSL queries LRU, LFU, LPP cache replacement policies –Experimentally verified the efficiency of caching mechanism Future work –Apply caching mechanism to index-based methods –Further increase pruning power of cached queries

50 July 29HDMS'08 Questions ?


Download ppt "July 29HDMS'08 Caching Dynamic Skyline Queries D. Sacharidis 1, P. Bouros 1, T. Sellis 1,2 1 National Technical University of Athens 2 Institute for Management."

Similar presentations


Ads by Google