The σ-neighborhood skyline queries Chen, Yi-Chung; LEE, Chiang. The σ-neighborhood skyline queries. Information Sciences, 2015, 322: 92-114. 張天彥 2015/12/05.


1 The σ-neighborhood skyline queries Chen, Yi-Chung; LEE, Chiang. The σ-neighborhood skyline queries. Information Sciences, 2015, 322: 92-114. 張天彥 2015/12/05

2 Outline: Introduction to skyline queries; The σ-Neighborhood Skyline Queries; k-dominant Skyline; Conclusions

3 Introduction to skyline queries: the concept of domination. [Figure: hotels A, B, and C plotted by distance to the beach and price of a hotel room; C is labeled 1.8 km, $13.] A dominates B and C.
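The domination test on this slide can be sketched in a few lines. This is a minimal illustration, assuming smaller is better in every dimension; the coordinates for A and B are hypothetical (only C's 1.8 km and $13 appear on the slide), and the function name `dominates` is ours.

```python
from typing import Sequence


def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """Return True if p dominates q, assuming smaller values are preferred."""
    # p must be no worse than q in every dimension ...
    no_worse = all(a <= b for a, b in zip(p, q))
    # ... and strictly better in at least one dimension.
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


# (distance_km, price_usd); A and B are hypothetical values, C is from the slide.
A = (1.0, 10.0)
B = (1.2, 12.0)
C = (1.8, 13.0)

print(dominates(A, B), dominates(A, C), dominates(B, A))  # True True False
```

A point never dominates itself under this definition, because the strict inequality fails.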

4 Introduction to skyline queries: definition of the skyline points. Find all points that are not dominated by any other point.

5 Introduction to skyline queries: definition of the skyline points. Find all points that are not dominated by any other point.

6 Introduction to skyline queries: definition of the skyline points. [Figure: points A to H plotted by distance and price.] Find all points that are not dominated by any other point.
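As a rough sketch of this definition, the skyline can be computed by comparing every point against every other point. The coordinates below are hypothetical stand-ins for the points A to H in the figure, and the quadratic loop is for illustration only, not one of the algorithms discussed in the talk.

```python
def dominates(p, q):
    """True if p is no worse than q everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def skyline(points):
    """Return the sorted names of points that no other point dominates."""
    return sorted(
        name
        for name, p in points.items()
        if not any(dominates(q, p) for q in points.values())
    )


# Hypothetical (distance_km, price_usd) coordinates, smaller is better.
hotels = {
    "A": (0.2, 16.0), "B": (0.6, 10.0), "C": (1.6, 4.0), "D": (0.9, 14.0),
    "E": (1.2, 11.0), "F": (1.0, 7.0), "G": (1.8, 9.0), "H": (1.5, 12.0),
}

print(skyline(hotels))  # ['A', 'B', 'C', 'F']
```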

7 Outline: Introduction to skyline queries; The σ-Neighborhood Skyline Queries; k-dominant Skyline

8 The σ-Neighborhood Skyline Queries: the problem to be solved by σ-N skyline queries is unquantifiable attributes. [Figure: points A to I plotted by distance and price; labels: unquantifiable attribute, quantifiable attribute.]

9 The σ-Neighborhood Skyline Queries: σ-N skyline queries. [Figure: points A to H plotted by distance (0 to 2 km) and price ($0 to $20), showing a σ-N skyline region of 0.2 km by $2 and a σ-N skyline point.] User: "I can tolerate a 10% error: $2 in price, 0.2 km in distance."
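One way to read the 10% tolerance is as a distance threshold on normalized coordinates. The sketch below assumes each attribute is divided by its axis extent (2 km and $20, as on the slide) and that closeness is measured with Euclidean distance; the metric, the sample points, and the function names are assumptions made for illustration, not the paper's stated definition.

```python
import math


def normalized_distance(p, q, ranges):
    """Euclidean distance after dividing each dimension by its axis extent."""
    return math.sqrt(sum(((a - b) / r) ** 2 for a, b, r in zip(p, q, ranges)))


def in_sigma_neighborhood(p, skyline_point, sigma, ranges):
    """True if p lies within sigma of the skyline point (assumed metric)."""
    return normalized_distance(p, skyline_point, ranges) <= sigma


ranges = (2.0, 20.0)      # axis extents: 0 to 2 km, $0 to $20
sigma = 0.10              # the user's 10% tolerance
s = (0.6, 10.0)           # hypothetical skyline point
near = (0.7, 11.0)        # 0.1 km and $1 away from s
far = (0.9, 14.0)         # 0.3 km and $4 away from s

print(in_sigma_neighborhood(near, s, sigma, ranges))  # True
print(in_sigma_neighborhood(far, s, sigma, ranges))   # False
```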

10 The σ-Neighborhood Skyline Queries: applying the σ-N skyline query to a dataset with an unquantifiable attribute. [Figure: points A to H plotted by distance and price.] The user can tolerate a 10% error ($2 in price, 0.2 km in distance).

11 The σ-Neighborhood Skyline Queries: difficulties of answering σ-N skyline queries. [Figure: points A to H plotted by distance and price.] Naïve algorithm: 1. Find the skyline points with an existing skyline algorithm (first scan). 2. Find the σ-N skyline points from the skyline points (second scan). Assuming there are 1M data points in the dataset, 2M data points need to be checked.

12 The σ-Neighborhood Skyline Queries: difficulties of answering σ-N skyline queries. [Figure: points A to H plotted by distance and price.] Naïve algorithm: 1. Find the skyline points with an existing skyline algorithm (first scan). 2. Find the σ-N skyline points from the skyline points (second scan). Assuming there are 1M data points in the dataset, 2M data points need to be checked. The cost can be too high to afford when the dataset is large. Can we solve the σ-N skyline query in one scan? Can we solve it without scanning all data points in the dataset?
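The naïve two-pass procedure can be sketched as follows. This is an illustration under assumptions: the skyline pass is a plain quadratic filter rather than an existing indexed skyline algorithm, closeness uses normalized Euclidean distance, and the data points are hypothetical. The point is only that the dataset is traversed twice.

```python
import math


def dominates(p, q):
    """True if p is no worse than q everywhere and strictly better somewhere."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def normalized_distance(p, q, ranges):
    """Euclidean distance after scaling each dimension by its axis extent."""
    return math.sqrt(sum(((a - b) / r) ** 2 for a, b, r in zip(p, q, ranges)))


def naive_sigma_n_skyline(points, sigma, ranges):
    """Return (skyline points, sigma-neighborhood points) using two passes."""
    # First pass over the dataset: collect the skyline points.
    sky = [p for p in points if not any(dominates(q, p) for q in points)]
    # Second pass over the dataset: collect non-skyline points near a skyline point.
    sky_set = set(sky)
    neighbors = [
        p for p in points
        if p not in sky_set
        and any(normalized_distance(p, s, ranges) <= sigma for s in sky)
    ]
    return sky, neighbors


# Hypothetical (distance_km, price_usd) data.
data = [(0.2, 16.0), (0.6, 10.0), (1.6, 4.0), (0.7, 11.0), (0.9, 14.0)]
sky, near = naive_sigma_n_skyline(data, sigma=0.10, ranges=(2.0, 20.0))
print(sky)   # [(0.2, 16.0), (0.6, 10.0), (1.6, 4.0)]
print(near)  # [(0.7, 11.0)]
```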

13 The σ-Neighborhood Skyline Queries: the algorithms for σ-N skyline queries. Rσ-N algorithm (based on the R-tree, an existing index structure); Mσ-N algorithm (based on the M+-tree, a newly developed index structure). Same searching idea. Two pruning mechanisms.

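14 The σ-Neighborhood Skyline Queries: R-tree. The R-tree is constructed based on the size of area. Example: [Figure: points s1 to s3 and p1 to p5 grouped into rectangles e1 to e7; tree: root e7 → e5, e6; e5 → e1, e2; e6 → e3, e4; e1 = {s1, s2}; e2 = {p1, p2}; e3 = {p3, p4}; e4 = {s3, p5}.]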

15 The σ-Neighborhood Skyline Queries: searching idea. Objective: find skyline points and σ-N skyline points in one scan. Example: [Figure: the example R-tree with points s1 to s3 and p1 to p5 and rectangles e1 to e7.]

16 The σ-Neighborhood Skyline Queries: searching idea. Objective: find skyline points and σ-N skyline points in one scan. Example: [Figure: search step showing e7.]

17 The σ-Neighborhood Skyline Queries: searching idea. Objective: find skyline points and σ-N skyline points in one scan. Example: [Figure: search step showing e5 and e6.]

18 The σ-Neighborhood Skyline Queries: searching idea. Objective: find skyline points and σ-N skyline points in one scan. Example: [Figure: search step showing e1 and e2.]

19 The σ-Neighborhood Skyline Queries: searching idea. Objective: find skyline points and σ-N skyline points in one scan. Example: [Figure: search step showing s1 and s2.] Range query: e2 vs. s1 and s2.

20 The σ-Neighborhood Skyline Queries: searching idea. Objective: find skyline points and σ-N skyline points in one scan. Example: [Figure: search step showing p1 and p2.] Range queries: p1 vs. s1 and s2; p2 vs. s1 and s2; e2 vs. s1 and s2; e6 vs. s1 and s2.
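A range query of the form "e2 vs. s1" can be read as asking whether rectangle e2 could contain any point within σ of skyline point s1. A common way to answer that for an axis-aligned bounding rectangle is the minimum point-to-rectangle distance. The sketch below assumes coordinates already normalized to [0, 1] and Euclidean distance; the rectangles and the function names are hypothetical, not taken from the slides.

```python
import math


def min_distance_to_rect(point, low, high):
    """Smallest Euclidean distance from a point to the box [low, high]."""
    # Per dimension, the gap is zero when the coordinate falls inside the box.
    return math.sqrt(
        sum(max(l - x, 0.0, x - h) ** 2 for x, l, h in zip(point, low, high))
    )


def may_contain_neighbors(skyline_point, low, high, sigma):
    """True if the box could hold a point within sigma of the skyline point."""
    return min_distance_to_rect(skyline_point, low, high) <= sigma


s1 = (0.30, 0.50)                              # hypothetical skyline point
close_box = ((0.35, 0.45), (0.60, 0.70))       # 0.05 away from s1
distant_box = ((0.60, 0.80), (0.90, 0.95))     # about 0.42 away from s1

print(may_contain_neighbors(s1, *close_box, sigma=0.10))    # True
print(may_contain_neighbors(s1, *distant_box, sigma=0.10))  # False
```

A box that fails this test can be skipped for that skyline point without opening it, which is why every visited entry is tested against the skyline points found so far.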

21 The σ-Neighborhood Skyline Queries: disadvantages of the Rσ-N algorithm. 1. Too many redundant points, caused by the property of the R-tree: it is constructed based on the size of area. [Figure: two insertion examples asking "Insert into e1 or e2?"; the answer is e1 in one case and e2 in the other; label: unrelated points.]

22 The σ-Neighborhood Skyline Queries: disadvantages of the Rσ-N algorithm. 1. Too many redundant points. [Figure: skyline points s1 and s2.] Consequences: additional I/O cost and additional range queries.

23 The σ-Neighborhood Skyline Queries: disadvantages of the Rσ-N algorithm. 2. Too many range queries. [Figure: skyline points s1 to s6, node e1, and points A and B.] e1, A, and B are each checked against every one of s1 to s6.

24 The σ-Neighborhood Skyline Queries: using the M+-tree to solve the problems. The R-tree is constructed based on area; the M+-tree is constructed based on distance. [Figure label: related points.]

25 The σ-Neighborhood Skyline Queries: using the M+-tree to solve the problems (number of redundant points). The R-tree is constructed based on area; the M+-tree is constructed based on distance.

26 The σ-Neighborhood Skyline Queries: using the M+-tree to solve the problems (number of redundant points). Triangle inequality. [Figure: skyline point s; A, the center of the M+BR; points B and C; radius σ; distance labels 8, 2, 5, 5, and <7.] Only C needs a further check.
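The triangle-inequality pruning used by metric trees can be sketched as below. For a skyline point s, a node center c, and a stored point x, the inequality gives |d(s, c) − d(c, x)| ≤ d(s, x), so a point whose lower bound already exceeds σ can be discarded without computing d(s, x). The distances here are hypothetical and are not read off the slide's figure; the function names are ours.

```python
def distance_lower_bound(d_s_center: float, d_center_x: float) -> float:
    """Lower bound on d(s, x) given d(s, center) and d(center, x)."""
    return abs(d_s_center - d_center_x)


def needs_exact_check(d_s_center: float, d_center_x: float, sigma: float) -> bool:
    """True if x cannot be ruled out and d(s, x) must be computed exactly."""
    return distance_lower_bound(d_s_center, d_center_x) <= sigma


sigma = 2.0
d_s_center = 8.0                                  # distance from s to the node center
stored = {"B": 5.0, "C": 6.5}                     # precomputed distances to the center

to_check = [
    name for name, d in stored.items()
    if needs_exact_check(d_s_center, d, sigma)
]
print(to_check)  # ['C']
```

The distances from each point to its node center are stored when the tree is built, so the test costs one subtraction per point at query time.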

27 The σ-Neighborhood Skyline Queries: using the M+-tree to solve the problems (number of range queries). [Figure: skyline points s1 to s6, node e1, and points A and B, with a summation line.] Original: e1 vs. s1 to s6, A vs. s1 to s6, and B vs. s1 to s6, for 18 range queries. Reduced: e1 vs. s1 to s6, A vs. s1 and s6, and B vs. s1 and s6, for 10 range queries. It is easy to get the summation of e1.

28 The σ-Neighborhood Skyline Queries: simulations. Number of data points: 1M. Number of dimensions: 2, 3, 4, 5, 6. Datasets: independent and anti-correlated.
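For readers unfamiliar with the two dataset types, here is one possible way to generate them. This is a sketch under our own assumptions, not the generator used in the paper: independent points draw every attribute uniformly from [0, 1], and anti-correlated points are shifted so their coordinates sum to a value concentrated around half the number of dimensions, which makes a point that is good in one attribute tend to be poor in another.

```python
import random


def independent(n, dims, rng):
    """n points with each attribute drawn uniformly and independently from [0, 1)."""
    return [tuple(rng.random() for _ in range(dims)) for _ in range(n)]


def anti_correlated(n, dims, rng):
    """n points in [0, 1]^dims whose coordinate sums cluster around dims / 2."""
    points = []
    while len(points) < n:
        target_sum = rng.gauss(0.5, 0.05) * dims
        raw = [rng.random() for _ in range(dims)]
        # Shift all coordinates equally so that they add up to target_sum.
        shift = (target_sum - sum(raw)) / dims
        p = tuple(x + shift for x in raw)
        # Reject points that the shift pushed outside the unit cube.
        if all(0.0 <= x <= 1.0 for x in p):
            points.append(p)
    return points


def covariance(points, i, j):
    """Population covariance between attributes i and j."""
    n = len(points)
    mi = sum(p[i] for p in points) / n
    mj = sum(p[j] for p in points) / n
    return sum((p[i] - mi) * (p[j] - mj) for p in points) / n


sample = anti_correlated(1000, 2, random.Random(0))
print(covariance(sample, 0, 1) < 0)  # True: the two attributes trade off
```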

29 The σ-Neighborhood Skyline Queries: simulations, selection of σ. [Figure: results for the independent dataset and the anti-correlated dataset.]

30 The σ-Neighborhood Skyline Queries: simulations.

31 The σ-Neighborhood Skyline Queries: conclusion. Using the Mσ-N algorithm for σ-N skyline queries is far more efficient than using the Rσ-N algorithm.

32 Outline: Introduction to skyline queries; The σ-Neighborhood Skyline Queries; k-dominant Skyline

33 k-dominant skylines: k-dominant skylines help reduce the number of skyline points in high-dimensional data. [Figure: points A to I, K, and L plotted by distance and price; label: k-Dominant Skylines.]
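The slide does not spell out the definition, so the sketch below uses the usual one from the k-dominant skyline literature: p k-dominates q if p is no worse than q in at least k dimensions and strictly better in at least one of them. With k equal to the number of dimensions this is ordinary dominance; a smaller k lets more points be dominated, so the result set shrinks. The sample points are hypothetical.

```python
def k_dominates(p, q, k):
    """True if p is no worse than q in at least k dimensions and better in one."""
    no_worse_count = sum(1 for a, b in zip(p, q) if a <= b)
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse_count >= k and strictly_better


def k_dominant_skyline(points, k):
    """Points that no other point k-dominates (quadratic, for illustration)."""
    return [p for p in points if not any(k_dominates(q, p, k) for q in points)]


# Hypothetical three-attribute points, smaller is better.
data = [(1, 1, 5), (2, 3, 1), (3, 2, 2), (4, 4, 4)]

print(k_dominant_skyline(data, 3))  # [(1, 1, 5), (2, 3, 1), (3, 2, 2)]
print(k_dominant_skyline(data, 2))  # [(1, 1, 5)]
```

Because k-dominance is not transitive, points can k-dominate one another in a cycle, and the k-dominant skyline can even be empty for small k.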
