Efficient Skyline Querying with Variable User Preferences on Nominal Attributes Raymond Chi-Wing Wong 1, Ada Wai-Chee Fu 2, Jian Pei 3, Yip Sing Ho 2,

1 Efficient Skyline Querying with Variable User Preferences on Nominal Attributes
Raymond Chi-Wing Wong (1), Ada Wai-Chee Fu (2), Jian Pei (3), Yip Sing Ho (2), Tai Wong (2) and Yubao Liu (4)
(1) The Hong Kong University of Science and Technology
(2) The Chinese University of Hong Kong
(3) Simon Fraser University
(4) Sun Yat-Sen University
Prepared by Raymond Chi-Wing Wong
Presented by Raymond Chi-Wing Wong

2 Outline
1. Introduction
   a. Skyline
   b. Contributions
2. Problem Definition
3. Adaptive SFS
4. IPO-Tree
5. Conclusion

3 1. Introduction
Suppose we want to look for a vacation package among 3 packages:
Package ID | Price | Hotel-class
a | 1600 | 4
b | 2400 | 1
c | 3000 | 5
We want a cheaper package and a higher hotel-class.
Package a “dominates” package b, because (1) package a has a cheaper price and (2) package a has a higher hotel-class.
We want to find the set of packages that are NOT dominated by any other package, i.e., all of the “best” possible choices. Here the skyline is {a, c}.
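The dominance test and the skyline of the three packages can be sketched in a few lines of Python. This is only an illustration of the definition on this slide; the names `packages`, `dominates` and `skyline` are made up for the example and do not come from the presentation.

```python
# (price, hotel_class) for the three packages on the slide
packages = {
    "a": (1600, 4),
    "b": (2400, 1),
    "c": (3000, 5),
}

def dominates(p, q):
    """p dominates q: no worse on both attributes, strictly better on one."""
    no_worse = p[0] <= q[0] and p[1] >= q[1]   # cheaper and higher class are better
    better = p[0] < q[0] or p[1] > q[1]
    return no_worse and better

def skyline(points):
    """Names of all points that no other point dominates (quadratic scan)."""
    return {
        name for name, p in points.items()
        if not any(dominates(q, p) for q in points.values())
    }

print(sorted(skyline(packages)))  # ['a', 'c']
```

The scan compares every pair of points, which is fine for a toy table but is exactly the cost that the methods later in the deck try to avoid.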

4 1. Introduction
Suppose we want to look for a vacation package among 6 packages:
Package ID | Price | Hotel-class | Hotel-group
a | 1600 | 4 | T (Tulips)
b | 2400 | 1 | T (Tulips)
c | 3000 | 5 | H (Horizon)
d | 3600 | 4 | H (Horizon)
e | 2400 | 2 | M (Mozilla)
f | 3000 | 3 | M (Mozilla)
We want a cheaper package and a higher hotel-class. How about Hotel-group? Different customers may have different preferences on Hotel-group.
Suppose a customer has the preference H < T < M. The skyline points are packages a and c.
Suppose another customer has the preference H < M < T. The skyline points are packages a, c and e.
In other words, different preferences give different skyline points.

5 1. Introduction
With the preference H < T < M, the skyline points are packages a and c. With the preference H < M < T, the skyline points are packages a, c and e. Different preferences give different skyline points.
Problem: Given a preference on Hotel-group, we want to find the skyline with respect to this preference efficiently.
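To see how the preference on Hotel-group changes the skyline, the sketch below recomputes it for both customers. It assumes a total order on Hotel-group given as a list with the most preferred value first; the function and variable names are illustrative, not from the slides.

```python
# (price, hotel_class, hotel_group) for the six packages on the slide
PACKAGES = {
    "a": (1600, 4, "T"),
    "b": (2400, 1, "T"),
    "c": (3000, 5, "H"),
    "d": (3600, 4, "H"),
    "e": (2400, 2, "M"),
    "f": (3000, 3, "M"),
}

def skyline(points, order):
    """Skyline when Hotel-group follows a total order, best value first."""
    rank = {group: i for i, group in enumerate(order)}

    def dominates(p, q):
        no_worse = p[0] <= q[0] and p[1] >= q[1] and rank[p[2]] <= rank[q[2]]
        better = p[0] < q[0] or p[1] > q[1] or rank[p[2]] < rank[q[2]]
        return no_worse and better

    return {
        name for name, p in points.items()
        if not any(dominates(q, p) for q in points.values())
    }

print(sorted(skyline(PACKAGES, ["H", "T", "M"])))  # ['a', 'c']
print(sorted(skyline(PACKAGES, ["H", "M", "T"])))  # ['a', 'c', 'e']
```

Under H < M < T, package a no longer beats e on Hotel-group, so e joins the skyline.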

8 1. Introduction
Problem: Given a preference on Hotel-group, we want to find the skyline with respect to this preference efficiently.
Straightforward solution: adopt an existing skyline technique such as SFS (Sort-First Skyline) to compute the skyline on the fly whenever a skyline query arrives.
It works. However, this solution is not scalable and the results cannot be returned efficiently.

9 1. Introduction
Full materialization solution:
Pre-computation: for each possible preference, (1) pre-compute the skyline and (2) store it.
Skyline query: return the stored skyline directly.
It works when there is a limited number of preferences. However, it is not scalable when there are many possible preferences. E.g., with three nominal attributes (like Hotel-group), each containing 40 possible values, there are 4.1 × 10^9 possible preferences (in our problem setting).

10 1. Introduction
Semi-materialization solution:
Pre-computation: for SOME possible preferences, (1) pre-compute the skyline and (2) store it.
Skyline query: return the stored skyline directly OR after simple operations.
This gives a good tradeoff between storage consumption and efficiency.

11 1. Introduction
Methods: Adaptive SFS; IPO-Tree (Implicit Preference Order Tree).
Questions:
1. What preferences should be stored?
2. With these preferences, how can we perform a skyline query efficiently?

12 1. Contributions
Most existing work: assumes that each attribute has a fixed ordering (either total or partial) on its attribute values.
Our work: different users can have different preferences (i.e., the ordering on attribute values differs from user to user). We propose a semi-materialization method, the IPO-tree, to answer the skyline query efficiently.

13 2. Problem Definition
Usually, a user does not specify an ordering on all possible values of attribute Hotel-group, but lists only a few of the most favorite choices, e.g. M < H < *. We call this an implicit preference.

14 2. Problem Definition
M < H: the user prefers M to H.

15 2. Problem Definition
H < *: the user prefers H to *, where * stands for all possible values of Hotel-group other than “M” and “H” (in this case, “T”). This is why we call it an implicit preference.
Problem: Given an implicit preference on Hotel-group, we want to find the skyline with respect to this preference efficiently.

16 2. Problem Definition
Binary orders of M < H < *: {M<H}

17 2. Problem Definition
Binary orders of M < H < *: {M<H, M<T}

18 2. Problem Definition
Binary orders of M < H < *: {M<H, M<T, H<T}

19 2. Problem Definition
Since the user gives only TWO choices, we define the order of this preference to be TWO. We also call it a second-order implicit preference.
Idea of our proposed semi-materialization IPO-tree:
1. Store the skylines wrt the first-order implicit preferences ONLY.
2. Find the skyline wrt an implicit preference of any order from the skylines wrt the first-order implicit preferences.
Questions:
1. What preferences should be stored?
2. With these preferences, how can we perform a skyline query efficiently?
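The expansion of an implicit preference into its binary orders can be sketched as follows. The helper name `binary_orders` and the list-based representation are assumptions made for this example; values covered by * are left mutually incomparable, as the slides describe.

```python
def binary_orders(chosen, domain):
    """Expand an implicit preference such as M < H < * into binary orders.

    `chosen` lists the user's favorite values, best first; `*` stands for
    every other value in `domain`. Each pair (x, y) means x is preferred to y.
    """
    rest = [v for v in domain if v not in chosen]
    orders = set()
    for i, better in enumerate(chosen):
        # A chosen value beats every later chosen value and everything in *.
        for worse in chosen[i + 1:] + rest:
            orders.add((better, worse))
    return orders

domain = ["T", "H", "M"]
print(sorted(binary_orders(["M", "H"], domain)))
# [('H', 'T'), ('M', 'H'), ('M', 'T')]
```

The order of the preference is simply `len(chosen)`: one for M < *, two for M < H < *.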

21 3. Adaptive SFS
Original SFS idea:
Suppose we have a function f.
Each tuple is assigned a score obtained by f.
Sort the tuples in ascending order of their scores.
Process the tuples in this order.
Adaptive SFS: a similar idea. However, the original score function is based on numeric attributes, NOT nominal attributes. What we change is the score function.
Idea:
1. Pre-computation: pre-sort the tuples according to this new score function.
2. Skyline query: re-sort the tuples for a skyline query.
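A minimal sketch of the original SFS idea on numeric attributes only is given below. It assumes the score function f is the sum of the attribute values and that smaller is better on every attribute, so hotel-class is replaced by a "reverse hotel-class" (5 minus the class). All names are illustrative.

```python
def sfs_skyline(points, score, dominates):
    """Sort by a monotone score, then scan once against the skyline so far."""
    result = []
    for name in sorted(points, key=lambda n: score(points[n])):
        # Any dominator has a strictly smaller score, so it was seen earlier.
        if not any(dominates(points[s], points[name]) for s in result):
            result.append(name)
    return result

# (price, reverse hotel-class); smaller is better on both attributes
POINTS = {"a": (1600, 1), "b": (2400, 4), "c": (3000, 0),
          "d": (3600, 1), "e": (2400, 3), "f": (3000, 2)}

def dominates(p, q):
    """p is no worse than q everywhere and differs from q somewhere."""
    return all(x <= y for x, y in zip(p, q)) and p != q

print(sfs_skyline(POINTS, sum, dominates))  # ['a', 'c']
```

Because a dominating point always sorts before the points it dominates, each tuple only needs to be compared with the skyline points found so far.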

23 4. IPO-Tree
Idea of our proposed semi-materialization IPO-tree:
1. Store the skylines with respect to the first-order implicit preferences ONLY.
2. Find the skyline with respect to an implicit preference of any order from the skylines with respect to the first-order implicit preferences.
Questions:
1. What preferences should be stored?
2. With these preferences, how can we perform a skyline query efficiently?

24 4. IPO-Tree
Package ID | Price | Hotel-class | Hotel-group
a | 1600 | 4 | T (Tulips)
b | 2400 | 1 | T (Tulips)
c | 3000 | 5 | H (Horizon)
d | 3600 | 4 | H (Horizon)
e | 2400 | 2 | M (Mozilla)
f | 3000 | 3 | M (Mozilla)
M < *, where * stands for the values other than “M” (i.e., “H” and “T”).
Binary orders: {M < T, M < H}
SKY1 = {a, c, e, f}

25 4. IPO-Tree
H < *, where * stands for the values other than “H” (i.e., “T” and “M”).
Binary orders: {H < T, H < M}
SKY2 = {a, c, e}
f is NOT a skyline point. Why?

26 4. IPO-Tree
With the binary order H<M, c dominates f. We say that “H<M” disqualifies f as a skyline point.

27 4. IPO-Tree
M < H < *
Binary orders: {M<H}

28 4. IPO-Tree
M < H < *, where * stands for the values other than “M” and “H” (i.e., “T”).
Binary orders: {M<H, M<T}

29 4. IPO-Tree
Binary orders: {M<H, M<T, H<T}

30 4. IPO-Tree
PSKY1 = the set of data points in SKY1 with value “M” = {e, f}

31 4. IPO-Tree
PSKY1 = {e, f}

32 4. IPO-Tree
PSKY1 = {e, f}

33 4. IPO-Tree
SKY3 = (SKY1 ∩ SKY2) ∪ PSKY1 = {a, c, e} ∪ {e, f} = {a, c, e, f}
Additional binary order! H < M belongs to the binary orders of H < * but not to those of M < H < *. This binary order may disqualify some data points of SKY3, like “f”.
Observation: these points must be in PSKY1.

34 4. IPO-Tree
SKY1 (M < *): skyline wrt a first-order preference.
SKY2 (H < *): skyline wrt a first-order preference.
SKY3 (M < H < *): skyline wrt the second-order preference, SKY3 = (SKY1 ∩ SKY2) ∪ PSKY1 = {a, c, e, f}.

35 4. IPO-Tree
M < *: SKY1 = {a, c, e, f} (skyline wrt a first-order preference)
H < *: SKY2 = {a, c, e} (skyline wrt a first-order preference)
M < H < *: SKY3 = {a, c, e, f} (skyline wrt the second-order preference)
Merging Property: the skyline wrt v1 < v2 < * is obtained from the skylines wrt v1 < * and v2 < *.
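The merging step on the running example can be checked with a short Python sketch. It assumes the merge takes the form SKY3 = (SKY1 ∩ SKY2) ∪ PSKY1, which is consistent with the sets listed on the slides, and compares the merged result with a direct computation of the skyline wrt M < H < *. Function names are illustrative.

```python
# (price, hotel_class, hotel_group) for the six packages
PACKAGES = {
    "a": (1600, 4, "T"), "b": (2400, 1, "T"), "c": (3000, 5, "H"),
    "d": (3600, 4, "H"), "e": (2400, 2, "M"), "f": (3000, 3, "M"),
}
DOMAIN = ["T", "H", "M"]

def binary_orders(chosen, domain):
    """Binary orders implied by chosen[0] < chosen[1] < ... < *."""
    rest = [v for v in domain if v not in chosen]
    return {(b, w) for i, b in enumerate(chosen) for w in chosen[i + 1:] + rest}

def skyline(points, chosen, domain):
    """Skyline wrt an implicit preference on Hotel-group (partial order)."""
    orders = binary_orders(chosen, domain)

    def dominates(p, q):
        group_better = (p[2], q[2]) in orders
        group_no_worse = p[2] == q[2] or group_better
        no_worse = p[0] <= q[0] and p[1] >= q[1] and group_no_worse
        return no_worse and (p[0] < q[0] or p[1] > q[1] or group_better)

    return {n for n, p in points.items()
            if not any(dominates(q, p) for q in points.values())}

sky1 = skyline(PACKAGES, ["M"], DOMAIN)             # M < *
sky2 = skyline(PACKAGES, ["H"], DOMAIN)             # H < *
psky1 = {n for n in sky1 if PACKAGES[n][2] == "M"}  # points of SKY1 with value M
merged = (sky1 & sky2) | psky1                      # merging step
direct = skyline(PACKAGES, ["M", "H"], DOMAIN)      # M < H < * computed from scratch
print(sorted(merged), merged == direct)
```

Only set operations on stored skylines are needed at query time; the from-scratch computation is included here purely as a cross-check.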

36 4. IPO-Tree
Second-order preference: the skyline wrt v1 < v2 < * is obtained from the skyline wrt v1 < * (first-order) and the skyline wrt v2 < * (first-order).
Third-order preference: the skyline wrt v1 < v2 < v3 < * is obtained from the skyline wrt v1 < v2 < * (second-order) and the skyline wrt v3 < * (first-order).
Fourth-order preference: the skyline wrt v1 < v2 < v3 < v4 < * is obtained from the skyline wrt v1 < v2 < v3 < * (third-order) and the skyline wrt v4 < * (first-order).

37 5. Empirical Study
Datasets:
Synthetic dataset: anti-correlated dataset
Real dataset (from UCI): Nursery dataset
Default values (synthetic):
No. of tuples = 500K
No. of numeric dimensions = 3
No. of nominal dimensions = 2
No. of values in a nominal dimension = 20
Order of implicit preference = 3

38 5. Empirical Study
Variation:
No. of data points
No. of numeric dimensions
No. of nominal dimensions
Cardinality of nominal dimensions
Order of implicit preference
Comparison:
SFS-D: original SFS
SFS-A: Adaptive SFS
IPO Tree
IPO Tree-10: IPO Tree which stores the 10 most frequent values for each nominal attribute (for comparison)

39 5. Empirical Study Synthetic Data Set

40 5. Empirical Study Real Data Set

41 6. Conclusion
Different customers have different preferences → different skylines
Skyline query on nominal attributes
Adaptive SFS algorithm
IPO-Tree algorithm
Experiments

42 Q&A

43 3. Adaptive SFS
Package ID | Price | Hotel-class | Reverse Hotel-class | Hotel-group
a | 1600 | 4 | 1 | T (Tulips)
b | 2400 | 1 | 4 | T (Tulips)
c | 3000 | 5 | 0 | H (Horizon)
d | 3600 | 4 | 1 | H (Horizon)
e | 2400 | 2 | 3 | M (Mozilla)
f | 3000 | 3 | 2 | M (Mozilla)

44 3. Adaptive SFS
Using some existing algorithms, we can first remove the data points that cannot be in the skyline with respect to any implicit preference.

45 3. Adaptive SFS
Step 1 (Pre-computation): pre-sort the tuples according to the new score function.
Each value in attribute Hotel-group is assigned a SPECIAL value. This special value is set to the total number of possible values in Hotel-group (i.e., 3).
Scores are computed for packages a, c, e and f.

46 3. Adaptive SFS
Score of point a is 1600 + 1 + 3 = 1604.

47 3. Adaptive SFS
Score of point c is 3000 + 0 + 3 = 3003.

48 3. Adaptive SFS
Package ID | Score
a | 1604
c | 3003
e | 2406
f | 3005
Sorted by score:
Package ID | Score
a | 1604
e | 2406
c | 3003
f | 3005

49 3. Adaptive SFS
Pre-sorted list: a (1604), e (2406), c (3003), f (3005)

50 3. Adaptive SFS
Step 2 (Skyline Query): re-sort the tuples for a skyline query (e.g., H<T<*).
Value “H” is assigned value 1. Value “T” is assigned value 2. All values other than “H” and “T” (i.e., “M”) keep value 3.
Score of point a is 1600 + 1 + 2 = 1603.

51 3. Adaptive SFS
Score of point c is 3000 + 0 + 1 = 3001.
Package ID | Score (pre-computation) | Score (skyline query)
a | 1604 | 1603
e | 2406 | 2406
c | 3003 | 3001
f | 3005 | 3005
Since the scores of a and c are updated, we need to re-sort a and c. Note that the ordering of all OTHER points, which contain neither “H” nor “T”, remains unchanged.

52 3. Adaptive SFS
We just use the original SFS. With this sorted list, we find the skyline = {a, c}.
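The two steps of Adaptive SFS on this example can be sketched as below. The sketch starts from the four candidates that appear in the score tables (a, c, e, f), re-sorts the whole list at query time for simplicity rather than moving only the affected tuples, and uses made-up names (`presort`, `query`, `DEFAULT`).

```python
# Candidates as (price, reverse hotel-class, hotel-group)
CANDIDATES = {
    "a": (1600, 1, "T"),
    "c": (3000, 0, "H"),
    "e": (2400, 3, "M"),
    "f": (3000, 2, "M"),
}
DEFAULT = 3  # special value: the number of possible Hotel-group values

def score(point, ranks):
    price, reverse_class, group = point
    return price + reverse_class + ranks.get(group, DEFAULT)

def presort(points):
    """Step 1: sort once with every Hotel-group value set to DEFAULT."""
    return sorted(points, key=lambda n: score(points[n], {}))

def query(points, presorted, chosen):
    """Step 2: re-score for chosen[0] < chosen[1] < ... < *, re-sort, run SFS."""
    ranks = {group: i + 1 for i, group in enumerate(chosen)}
    ordered = sorted(presorted, key=lambda n: score(points[n], ranks))

    def dominates(p, q):
        rp, rq = ranks.get(p[2], DEFAULT), ranks.get(q[2], DEFAULT)
        group_better = rp < rq  # values inside * share DEFAULT: incomparable
        no_worse = p[0] <= q[0] and p[1] <= q[1] and (p[2] == q[2] or group_better)
        return no_worse and (p[0] < q[0] or p[1] < q[1] or group_better)

    result = []
    for name in ordered:
        if not any(dominates(points[s], points[name]) for s in result):
            result.append(name)
    return result

order = presort(CANDIDATES)
print(order)                                 # ['a', 'e', 'c', 'f']
print(query(CANDIDATES, order, ["H", "T"]))  # ['a', 'c']
```

Only tuples whose Hotel-group value is named in the query change score, so the pre-sorted list is already nearly in order when the query arrives.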

53 4. IPO-Tree
Idea:
Pre-computation: store the skylines wrt the first-order preferences.
Skyline query: find the skyline wrt a preference of any order from the stored skylines wrt the first-order preferences.
e.g.1 Hotel-group: M<*, Airline: G<*
e.g.2 Hotel-group: M<*, Airline: (no preference)
e.g.3 Hotel-group: (no preference), Airline: G<*
How can we do it efficiently? We propose an indexing structure called IPO-tree.

54 4. IPO-Tree
Package ID | Price | Reverse Hotel-class | Hotel-group | Airline
a | 1600 | 1 | T (Tulips) | G (Gonna)
b | 2400 | 4 | T (Tulips) | G (Gonna)
c | 3000 | 0 | H (Horizon) | G (Gonna)
d | 3600 | 1 | H (Horizon) | R (Redish)
e | 2400 | 3 | M (Mozilla) | R (Redish)
f | 3000 | 2 | M (Mozilla) | W (Wings)
IPO-tree:
root
Level 1 (Hotel-group): T<*, H<*, M<*, (no preference)
Level 2 (Airline), under each level-1 node: G<*, R<*, W<*, (no preference)
Example nodes: “Hotel-group: T<*, Airline: G<*”; “Hotel-group: T<*, Airline: (no preference)”; “Hotel-group: (no preference), Airline: G<*”.
E.g., with three nominal attributes (like Hotel-group), each containing 40 possible values:
Full materialization: there are 4.1 × 10^9 possible preferences (in our problem setting).
Semi-materialization (IPO-tree): there are 70,644 nodes (significantly fewer than 4.1 × 10^9).
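The node count quoted on this slide can be reproduced with a short calculation, assuming (as the drawn tree suggests) that each level has one child per value of its nominal attribute plus one "no preference" child. The function name is made up for the example.

```python
def ipo_tree_nodes(cardinalities):
    """Count IPO-tree nodes, one tree level per nominal attribute.

    Assumes every node at level i has (cardinality_i + 1) children:
    one per attribute value plus one 'no preference' child.
    """
    total = 1  # the root
    level = 1  # number of nodes on the current level
    for cardinality in cardinalities:
        level *= cardinality + 1
        total += level
    return total

print(ipo_tree_nodes([40, 40, 40]))  # 70644 = 1 + 41 + 41**2 + 41**3
print(ipo_tree_nodes([3, 3]))        # 21 for the Hotel-group / Airline example
```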

55 4. IPO-Tree
One nominal attribute: use the Merging Property.
Multiple nominal attributes:
Consider ONE nominal attribute at a time with the Merging Property.
Fix the ordering of the OTHER nominal attributes.
Then, consider each of the other nominal attributes with the Merging Property.

56 4. IPO-Tree
Package ID | Price | Hotel-class | Hotel-group
a | 1600 | 4 | T (Tulips)
b | 2400 | 1 | T (Tulips)
c | 3000 | 5 | H (Horizon)
d | 3600 | 4 | H (Horizon)
e | 2400 | 2 | M (Mozilla)
f | 3000 | 3 | M (Mozilla)

57 4. IPO-Tree
Package ID | Price | Hotel-class | Hotel-group | Airline
a | 1600 | 4 | T (Tulips) | G (Gonna)
b | 2400 | 1 | T (Tulips) | G (Gonna)
c | 3000 | 5 | H (Horizon) | G (Gonna)
d | 3600 | 4 | H (Horizon) | R (Redish)
e | 2400 | 2 | M (Mozilla) | R (Redish)
f | 3000 | 3 | M (Mozilla) | W (Wings)
Query: Hotel-group: M<H<*, Airline: G<R<*
Split on Hotel-group:
Hotel-group: M<*, Airline: G<R<*
Hotel-group: H<*, Airline: G<R<*
Then split each on Airline:
Hotel-group: M<*, Airline: G<*
Hotel-group: M<*, Airline: R<*
Hotel-group: H<*, Airline: G<*
Hotel-group: H<*, Airline: R<*

58 4. IPO-Tree
M < *: SKY1 = {a, c, e, f}
H < *: SKY2 = {a, c, e}
PSKY1 = {e, f}
M < H < *: SKY3 = (SKY1 ∩ SKY2) ∪ PSKY1 = {a, c, e} ∪ {e, f} = {a, c, e, f}

59 4. IPO-Tree
Theorem: Given a user query with an x-th order implicit preference on m'' nominal attributes, the number of set operations required is $O(x^{m''})$.
E.g., for Hotel-group: M<H<* and Airline: G<R<*, we have m'' = 2 and x = 2, so the number of set operations is $O(2^2)$.

