Influence Computation in Spatial Databases. Muhammad Aamir Cheema, Faculty of Information Technology, Monash University, Australia


1 Influence Computation in Spatial Databases. Muhammad Aamir Cheema, Faculty of Information Technology, Monash University, Australia. aamir.cheema@monash.edu, www.aamircheema.com

2 Faculty of Information Technology Outline  Introduction  Reverse k Nearest Neighbors Queries  Reverse Top-k Queries  Reverse Skyline Queries  Other work

3 Faculty of Information Technology Introduction: Influence Set Influence Influence Set

4 Faculty of Information Technology Introduction: Influence Set. Important facility? A facility f is important for a user u if it is one of the top-k facilities for u considering her preferences, e.g., distance, rating, price.

5 Faculty of Information Technology Introduction: Influence Set  Important to identify potential users/customers  Used in various applications such as marketing, cluster and outlier analysis, and decision support systems Significance Types

6 Faculty of Information Technology Outline  Introduction  Reverse k Nearest Neighbors Queries  Reverse Top-k Queries  Reverse Skyline Queries  Other work

7 Faculty of Information Technology Outline: Reverse k Nearest Neighbors  Introduction  Pre-computation based approach  On-the-fly algorithms  Six-regions [2000]  TPL [2004]  FINCH [2008]  Influence Zone [2011]  SLICE [2014]  TPL++ [2015]  Comparison of RkNN algorithms

8 Faculty of Information Technology Reverse k Nearest Neighbors (RkNN). Definition of importance: a facility f is important to a user u if f is one of the k closest facilities of u. Reverse k nearest neighbors query: find every user u for which the query facility q is one of its k closest facilities. Example (k = 1, users u1, u2, u3 and facilities f1, f2): the influence set of f1 is {u1, u2} and the influence set of f2 is {u3}.
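The RkNN definition can be read directly as a brute-force procedure. The sketch below is only an illustration of the definition, not one of the algorithms covered in the talk; the coordinates are hypothetical and were chosen so that the result matches the k = 1 example on the slide.

```python
from math import dist

def knn_facilities(user, facilities, k):
    """Return the k facilities closest to `user` (ties broken by list order)."""
    return sorted(facilities, key=lambda f: dist(user, f))[:k]

def rknn(query, users, facilities, k):
    """Return every user that has `query` among its k closest facilities."""
    return [u for u in users if query in knn_facilities(u, facilities, k)]

# Hypothetical coordinates (not taken from the slide's figure).
f1, f2 = (0.0, 0.0), (10.0, 0.0)
u1, u2, u3 = (1.0, 1.0), (2.0, -1.0), (9.0, 0.5)
users, facilities = [u1, u2, u3], [f1, f2]

influence_f1 = rknn(f1, users, facilities, k=1)
influence_f2 = rknn(f2, users, facilities, k=1)
```

This computes a kNN query per user, which is exactly the cost that the pruning-based algorithms in the rest of the deck try to avoid.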

9 Faculty of Information Technology Outline: Reverse k Nearest Neighbors  Introduction  Pre-computation based approach  On-the-fly algorithms  Six-regions [2000]  TPL [2004]  FINCH [2008]  Influence Zone [2011]  SLICE [2014]  TPL++ [2015]  Comparison of RkNN algorithms

10 Faculty of Information Technology Pre-computation based approach [F. Korn et al., SIGMOD 2000]. Pre-computation: for each user u, draw a circle centered at u that contains its k closest facilities; index these circles using an R-tree. Query processing: find the circles that contain q. Problems: how to support arbitrary k? How to handle data updates? (Figure: users u1 to u4, facilities f1 to f3, query q, k = 1.)
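A minimal sketch of the pre-computation idea, assuming a linear scan over the circles in place of the R-tree mentioned on the slide. The function names and coordinates are hypothetical.

```python
from math import dist

def precompute_circles(users, facilities, k):
    """Map each user to the radius of the smallest circle containing its k closest facilities."""
    return {u: sorted(dist(u, f) for f in facilities)[k - 1] for u in users}

def rknn_from_circles(q, circles):
    """Return the users whose pre-computed circle contains the query location q."""
    return [u for u, radius in circles.items() if dist(u, q) <= radius]

# Hypothetical data: four users, three facilities, k = 1.
facilities = [(0.0, 0.0), (10.0, 0.0), (5.0, 8.0)]
users = [(1.0, 0.0), (4.0, 0.0), (9.0, 1.0), (5.0, 5.0)]
circles = precompute_circles(users, facilities, k=1)

answer = rknn_from_circles((5.0, 2.5), circles)
```

The radii are tied to one value of k and to the current facility set, which is why the slide lists arbitrary k and data updates as problems.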

11 Faculty of Information Technology Outline: Reverse k Nearest Neighbors  Introduction  Pre-computation based approach  On-the-fly algorithms  Six-regions [2000]  TPL [2004]  FINCH [2008]  Influence Zone [2011]  SLICE [2014]  TPL++ [2015]  Comparison of RkNN algorithms

12 Faculty of Information Technology On-the-fly RkNN Algorithms. Data are indexed by R-trees. Pruning: prune the search space using nearby facilities of q. Verification: find the users that lie in the unpruned space and, for each such user, check whether it is an RkNN of q.

13 Faculty of Information Technology On-the-fly RkNN Algorithms (pruning and verification). Half-space: TPL (VLDB 2004), TPL++ (PVLDB 2015), FINCH (PVLDB 2008), InfZone (ICDE 2011). Region-based: Six-regions (SIGMOD Workshop 2000), SLICE (ICDE 2014).

14 Faculty of Information Technology Outline: Reverse k Nearest Neighbors  Introduction  Pre-computation based approach  On-the-fly algorithms  Six-regions [2000]  TPL [2004]  FINCH [2008]  Influence Zone [2011]  SLICE [2014]  TPL++ [2015]  Comparison of RkNN algorithms

15 Faculty of Information Technology Six-regions: Pruning [I. Stanoi et al., SIGMOD Workshop 2000]. 1. Divide the whole space centred at the query q into six equal regions of 60° each. 2. Let f be a facility in a partition P. 3. Let u be a user in P for which dist(u,q) > dist(q,f). 4. Then q cannot be the closest facility of u. Proof sketch: ∠fqu ≤ 60° and ∠ufq > 60° ⇒ ∠ufq > ∠fqu ⇒ dist(u,q) > dist(u,f).
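The proof sketch can be expanded using the triangle f, q, u and the fact that, in a triangle, the longer side is opposite the larger angle. This is a reconstruction of the argument under that standard fact, assuming the three points are not collinear:

```latex
\begin{align*}
\angle fqu \le 60^\circ
  &\;\Rightarrow\; \angle ufq + \angle quf = 180^\circ - \angle fqu \ge 120^\circ, \\
\mathrm{dist}(u,q) > \mathrm{dist}(q,f)
  &\;\Rightarrow\; \angle ufq > \angle quf
  \;\Rightarrow\; \angle ufq > 60^\circ \ge \angle fqu, \\
\angle ufq > \angle fqu
  &\;\Rightarrow\; \mathrm{dist}(u,q) > \mathrm{dist}(u,f).
\end{align*}
```

So u is strictly closer to f than to q, and q cannot be the nearest facility of u.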

16 Faculty of Information Technology Six-regions: Pruning [I. Stanoi et al., SIGMOD Workshop 2000] (k = 2). 1. Divide the whole space centred at the query q into six equal regions. 2. Find the k-th nearest facility of q in each partition. 3. The k-th nearest facility of q in each region defines the area that can be pruned.

17 Faculty of Information Technology Six-regions: Verification [I. Stanoi et al., SIGMOD Workshop 2000] (k = 2). Access the users R-tree and prune the entries that lie in the pruned area. For each unpruned user u, issue a boolean range query to check whether u is an RkNN. Disadvantage: requires a boolean range query for each candidate user.

18 Faculty of Information Technology Outline: Reverse k Nearest Neighbors  Introduction  Pre-computation based approach  On-the-fly algorithms  Six-regions [2000]  TPL [2004]  FINCH [2008]  Influence Zone [2011]  SLICE [2014]  TPL++ [2015]  Comparison of RkNN algorithms

19 Faculty of Information Technology TPL: Pruning [Y. Tao et al., VLDB 2004] (k = 2). Half-space pruning: the bisector between q and a facility f divides the space into two half-spaces. q cannot be the closest facility of a user u that lies in the half-space containing f, and q cannot be among the k closest facilities of u if u lies in k such half-spaces. Pruning algorithm: 1. Find the nearest unseen facility f in the unpruned area. 2. Draw the bisector between q and f to prune the search space. 3. Go to step 1 unless all facilities in the unpruned area have been accessed.
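The half-space test needs no explicit bisector: a point lies on f's side of the bisector between q and f exactly when it is closer to f than to q. A minimal sketch for point users, with hypothetical function names and coordinates:

```python
from math import dist

def prunes(f, q, u):
    """True if u lies in the half-space (bounded by the q-f bisector) that contains f."""
    return dist(u, f) < dist(u, q)

def is_pruned(u, q, facilities, k):
    """True if u lies in at least k pruning half-spaces, so q cannot be among its k closest."""
    return sum(prunes(f, q, u) for f in facilities) >= k

# Hypothetical data: query q and two facilities a, b.
q, a, b = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
u_far = (3.0, 3.0)    # closer to both a and b than to q
u_side = (3.0, -1.0)  # closer to a than to q, but not closer to b
```

With k = 2, `u_far` is pruned and `u_side` is kept as a candidate for verification.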

20 Faculty of Information Technology TPL: Pruning [Y. Tao et al., VLDB 2004]. Advantage: prunes more space than six-regions. Disadvantage: pruning is more expensive, especially when k is not small.

21 Faculty of Information Technology TPL: Pruning [Y. Tao et al., VLDB 2004] (k = 2). Advantage: prunes more space than six-regions. Disadvantage: pruning is more expensive, especially when k is not small, because it must find k half-spaces that contain the user. With m facilities this requires considering subsets of k half-spaces, e.g., {a,b}, {b,c}, {c,d}, {a,c}; there are m! / (k! (m-k)!) such subsets.

22 Faculty of Information Technology TPL: Pruning [Y. Tao et al., VLDB 2004] (k = 2). Solution: TPL does not use all possible subsets. 1. Sort the facilities by their Hilbert values. 2. Consider only the subsets consisting of k consecutive facilities, e.g., {a,b}, {b,c}, {c,d}, {d,a} for the sorted order a, b, c, d. This considers only m subsets, but some pruning power is lost.

23 Faculty of Information Technology TPL: Verification [Y. Tao et al., VLDB 2004] (k = 2). Prune the user R-tree entries using the k half-spaces approach. Determine the candidate users. Issue a bulk boolean range query to verify all candidate users.

24 Faculty of Information Technology Outline: Reverse k Nearest Neighbors  Introduction  Pre-computation based approach  On-the-fly algorithms  Six-regions [2000]  TPL [2004]  FINCH [2008]  Influence Zone [2011]  SLICE [2014]  TPL++ [2015]  Comparison of RkNN algorithms

25 Faculty of Information Technology FINCH: Pruning [W. Wu et al., PVLDB 2008] (k = 2). Key idea: approximate the unpruned area by a convex polygon. Advantage: pruning is more efficient (e.g., point containment in logarithmic time).

26 Faculty of Information Technology FINCH: Pruning [W. Wu et al., PVLDB 2008] (k = 2). Computing the polygon: get the intersection points of the half-spaces and the boundary of the data space. For each intersection point, compute a counter that denotes the number of half-spaces that contain it, and remove the intersections with counter ≥ k. Compute the convex hull of the remaining intersection points.

27 Faculty of Information Technology FINCH: Pruning [W. Wu et al., PVLDB 2008] (k = 2). Pruning algorithm: 1. Initialize the convex polygon as the whole space. 2. Find the nearest facility that lies inside the convex polygon. 3. Draw its half-space, compute the new intersections and their counters, and update the convex polygon. 4. Go to step 2 as long as there is an un-accessed facility inside the polygon.

28 Faculty of Information Technology FINCH: Verification [W. Wu et al., PVLDB 2008] (k = 2). Prune the user R-tree entries that lie outside the convex polygon. For each user that lies inside the polygon, issue a boolean range query to check whether it is an RkNN.

29 Faculty of Information Technology Outline: Reverse k Nearest Neighbors  Introduction  Pre-computation based approach  On-the-fly algorithms  Six-regions [2000]  TPL [2004]  FINCH [2008]  Influence Zone [2011]  SLICE [2014]  TPL++ [2015]  Comparison of RkNN algorithms

30 Faculty of Information Technology Influence Zone (InfZone): Motivation [M. Cheema et al., ICDE 2011]. Existing approach: Pruning (prune the search space using nearby facilities of q), then Verification (find the users that lie in the unpruned space and, for each such user, issue a boolean range query to verify it). Influence zone: an area such that a user u is an RkNN of q if and only if u is inside this area. InfZone approach: compute the influence zone using nearby facilities, then find the users that lie in the influence zone.

31 Faculty of Information Technology Influence Zone (InfZone): Challenges [M. Cheema et al., ICDE 2011] (k = 2). The influence zone corresponds to the unpruned polygon when the bisectors of all the facilities have been considered for pruning. Challenges: how to compute the unpruned polygon? Using all facilities for pruning would be very expensive.

32 Faculty of Information Technology Influence Zone (InfZone): Construction [M. Cheema et al., ICDE 2011] (k = 2). Challenge 1: constructing the polygon. Like FINCH, compute the counters of all intersections. Remove the intersections with counter ≥ k. Keep only the intersections that either lie on the boundary of the data space or have a counter equal to k-1 or k-2. Keep only the extreme intersections on each boundary. Sort the intersections according to their angles with q. Connect the intersections in the sorted order.

33 Faculty of Information Technology Influence Zone (InfZone): Construction [M. Cheema et al., ICDE 2011] (k = 2). Challenge 2: avoid accessing all facilities. Let C_v denote the circle centered at a vertex v with radius dist(v,q). A facility f can be ignored if it lies outside C_v for every vertex v of the current influence zone. An entry e of the facility R-tree can be ignored if it lies outside C_v for every vertex v of the current influence zone.
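The two ignore rules translate into short distance tests. The sketch below assumes 2-d points, an R-tree entry given as an axis-aligned rectangle (xmin, ymin, xmax, ymax), and "outside C_v" interpreted as the minimum distance from v exceeding dist(v,q). All names and coordinates are hypothetical.

```python
from math import dist, hypot

def can_ignore_facility(f, q, vertices):
    """True if f lies outside the circle C_v for every vertex v of the current zone."""
    return all(dist(f, v) > dist(v, q) for v in vertices)

def min_dist_to_rect(p, rect):
    """Minimum distance from point p to the rectangle (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = rect
    dx = max(xmin - p[0], 0.0, p[0] - xmax)
    dy = max(ymin - p[1], 0.0, p[1] - ymax)
    return hypot(dx, dy)

def can_ignore_entry(rect, q, vertices):
    """True if the whole rectangle lies outside C_v for every vertex v."""
    return all(min_dist_to_rect(v, rect) > dist(v, q) for v in vertices)

# Hypothetical current influence zone: a square around q.
q = (0.0, 0.0)
zone = [(2.0, 2.0), (-2.0, 2.0), (-2.0, -2.0), (2.0, -2.0)]
```

For instance, a facility at (3, 0) cannot be ignored here, since its bisector with q (the line x = 1.5) cuts the square, while a facility at (10, 0) can.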

34 Faculty of Information Technology Influence Zone (InfZone): Construction [M. Cheema et al., ICDE 2011] (k = 2). Influence zone construction algorithm: 1. Initialize InfZone as the whole data space. 2. Insert the root of the facility R-tree into a heap. 3. While the heap is not empty: remove an entry e from the heap; if e lies outside every C_v, ignore e; else if e is an intermediate node, insert the children of e into the heap; else draw the bisector between q and e and update the current influence zone.

35 Faculty of Information Technology Influence Zone (InfZone): Verification [M. Cheema et al., ICDE 2011] (k = 2). Prune the user R-tree entries that lie outside the influence zone. Return the users that lie inside the influence zone. Point containment can be checked in logarithmic time, O(log m). Rectangle containment takes linear time, O(m).

36 Faculty of Information Technology Outline: Reverse k Nearest Neighbors  Introduction  Pre-computation based approach  On-the-fly algorithms  Six-regions [2000]  TPL [2004]  FINCH [2008]  Influence Zone [2011]  SLICE [2014]  TPL++ [2015]  Comparison of RkNN algorithms

37 Faculty of Information Technology SLICE: Motivation [S. Yang et al., ICDE 2014]. Pruning cost: regions-based (Six-regions) O(m log k); half-space (InfZone) O(km^2); SLICE O(m log m). Pruning power: regions-based low; half-space high; SLICE high. Verification cost: regions-based range query; half-space O(log m); SLICE O(k). Here m is the number of facilities considered for pruning.

38 Faculty of Information Technology SLICE: Key Idea [S. Yang et al., ICDE 2014] (k = 2). 1. Divide the whole space centred at the query q into t equal regions. 2. Draw arcs for each facility. 3. The k-th arc in each partition defines the pruning region. Pruning requires checking only one distance.
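The slide does not give the arc radius, so the sketch below derives one from the bisector geometry and should be read as an assumption rather than the paper's exact formulation. A point at angle θ from the ray q→f is closer to f than to q once its distance from q exceeds dist(q,f) / (2 cos θ). Taking the largest such angle within a partition, θ_max, gives a radius beyond which every point of that partition is pruned by f. The function names are hypothetical.

```python
from math import cos, sin, dist, inf, pi, radians

def upper_arc_radius(dist_fq, theta_max):
    """Radius beyond which every point of the partition is closer to f than to q.

    theta_max is the largest angle (in radians) between the ray q->f and any
    ray inside the partition; no finite radius exists once it reaches 90 degrees.
    """
    if theta_max >= pi / 2:
        return inf
    return dist_fq / (2 * cos(theta_max))

def bounding_radius(arc_radii, k):
    """Radius of the k-th smallest arc in a partition (inf if fewer than k arcs)."""
    radii = sorted(arc_radii)
    return radii[k - 1] if len(radii) >= k else inf

def is_pruned(dist_uq, bound):
    """A user is pruned by a single distance comparison against the bounding radius."""
    return dist_uq > bound

# Hypothetical example: f on the x-axis, partition spanning 10 to 40 degrees.
q, f = (0.0, 0.0), (4.0, 0.0)
r = upper_arc_radius(dist(q, f), radians(40))
```

A user beyond the k-th smallest arc of its partition lies beyond k arcs and is therefore pruned by at least k facilities, which is why one distance check suffices.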

39 Faculty of Information Technology SLICE: Comparison with six-regions [S. Yang et al., ICDE 2014] q f Six-regionSLICE Partitions Pruned No. of Partitions One 6 6 Area pruned dist(f,q) any θ max

40 Faculty of Information Technology SLICE: Verification [S. Yang et al., ICDE 2014]. Bounding arc: the k-th arc in each partition. Significant facility: a facility f that prunes at least one point p ∈ P lying inside the bounding arc of a partition P. An insignificant facility cannot prune any candidate user. Verification for a candidate: regions-based approaches issue a range query for each candidate (high I/O and CPU cost); SLICE accesses the significant facilities during pruning and uses them to verify each candidate in O(k).

41 Faculty of Information Technology Outline: Reverse k Nearest Neighbors  Introduction  Pre-computation based approach  On-the-fly algorithms  Six-regions [2000]  TPL [2004]  FINCH [2008]  Influence Zone [2011]  SLICE [2014]  TPL++ [2015]  Comparison of RkNN algorithms

42 Faculty of Information Technology TPL++: Optimization 1 [S. Yang et al., PVLDB 2015] (k = 2). TPL: 1. Sort the facilities by their Hilbert values. 2. Consider only the subsets consisting of k consecutive facilities, e.g., {a,b}, {b,c}, {c,d}, {d,a}. Drawbacks: considers m subsets, costing O(km), and some pruning power is lost. TPL++: 1. Initialize a counter to 0. 2. Access the facilities one by one. 3. Increment the counter whenever a facility prunes the user u. 4. Prune u when the counter ≥ k. This costs O(m).
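The difference in pruning power can be seen on a small example. The sketch below simplifies both tests to point users and takes the facility order as given (standing in for the Hilbert order); the windows wrap around, following the {d,a} subset on the slide. Names and coordinates are hypothetical.

```python
from math import dist

def prunes(f, q, u):
    """True if u is closer to facility f than to the query q."""
    return dist(u, f) < dist(u, q)

def pruned_by_consecutive_subsets(u, q, ordered, k):
    """TPL-style test: some window of k consecutive facilities must all prune u."""
    m = len(ordered)
    return any(
        all(prunes(ordered[(i + j) % m], q, u) for j in range(k))
        for i in range(m)
    )

def pruned_by_counter(u, q, facilities, k):
    """TPL++-style test: count pruning facilities and stop as soon as k is reached."""
    counter = 0
    for f in facilities:
        if prunes(f, q, u):
            counter += 1
            if counter >= k:
                return True
    return False

# Hypothetical data: u is pruned by a and c only, which are not consecutive.
q, u = (0.0, 0.0), (3.0, 0.0)
a, b, c, d = (4.0, 0.0), (-4.0, 0.0), (3.0, 2.0), (0.0, -6.0)
order = [a, b, c, d]
```

Here the counter prunes u for k = 2, while none of the windows {a,b}, {b,c}, {c,d}, {d,a} does, illustrating the lost pruning power.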

43 Faculty of Information Technology Pruning power: TPL vs TPL++ [S. Yang et al., PVLDB 2015]

44 Faculty of Information Technology TPL++: Optimization 2 [S. Yang et al., PVLDB 2015]. TPL: a facility entry e or a facility point that lies in the pruned space is ignored. TPL++: a facility entry e that lies in the pruned space is ignored, but a facility point is used for pruning even if it lies in the pruned space.

45 Faculty of Information Technology TPL vs TPL++ 2 times better 20 times better

46 Faculty of Information Technology Outline: Reverse k Nearest Neighbors  Introduction  Pre-computation based approach  On-the-fly algorithms  Six-regions [2000]  TPL [2004]  FINCH [2008]  Influence Zone [2011]  SLICE [2014]  TPL++ [2015]  Comparison of RkNN algorithms

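47 Faculty of Information Technology Comparison of RkNN Algorithms (columns: Six-regions | TPL | TPL++ | FINCH | InfZone | SLICE; rows with fewer than six values have cells that span several columns on the slide). Pruning, node: O(1) | O(km) | O(m) | O(1). Pruning, point: O(1) | O(km) | O(m) | O(log m) | O(m) | O(1). Pruning, adding f: O(log k) | O(log m) | O(m^2) | O(log m). Verification, node: O(1) | O(km) | O(m) | O(1). Verification, point: O(1) | O(km) | O(m) | O(log m) | O(1). Verification, #candidates: Large | Small | Medium | Minimal | Small. Verification, verifying u: Range query | Bulk range query | Range query | O(log m) | O(k).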

48 Faculty of Information Technology Experimental Comparison [Yang et al., PVLDB 2015]. Setup: Intel Xeon 2.66 GHz CPU, 4GB memory and hard disk; index: R*-tree; 100 buffers; I/O cost and CPU cost; average cost per query. Data sets: three real data sets (CA, LA and NA; up to 25M points) and synthetic data sets following different distributions (up to 20M points). Source code and data sets are available online.

49 Faculty of Information Technology Experimental Comparison [Yang et al., PVLDB 2015]

50 Faculty of Information Technology Experimental Comparison [Yang et al., PVLDB 2015]. Ranking criteria, algorithms listed from 1st to 6th place; algorithms joined by "=" share a place. I/O (no buffer): TPL++ = InfZone, SLICE, TPL, FINCH, SIX. I/O (small buffer): TPL++ = InfZone, FINCH, SLICE, TPL = SIX. CPU (k<10): SLICE, InfZone, TPL++, FINCH, SIX = TPL. CPU (10<k<25): SLICE, InfZone = TPL++, FINCH, SIX, TPL. CPU (25<k<200): SLICE, TPL++, SIX, FINCH, InfZone, TPL. Implementation: SIX = SLICE, TPL = TPL++, FINCH = InfZone.

51 Faculty of Information Technology Outline  Introduction  Reverse k Nearest Neighbors Queries  Reverse Top-k Queries  Introduction  Monochromatic algorithms (2d)  Bichromatic algorithms (≥2d)  Reverse Skyline Queries  Other work

52 Faculty of Information Technology Reverse Top-k (RTk) Queries. Introduced by [Vlachou et al., ICDE 2010]; examples are from [Vlachou et al., ICDE 2010]. Definition of importance (top-k queries): each user u has a preference function; the score of a facility f is score(f) = w[1]*f[1] + … + w[d]*f[d]; a facility f is important to a user u if f is one of the top-k facilities for u. Example: Score(p2) = 0.2×3 + 0.8×2 = 2.2. Bichromatic reverse top-k query (RTk): find every user u for which the query facility q is one of her top-k facilities. In the example, Tom and Max are the reverse top-1 users of p2; Bob is not a reverse top-1 user of p2.

53 Faculty of Information Technology Reverse Top-k (RTk) Queries: Types. Introduced by [Vlachou et al., ICDE 2010]; examples are from [Vlachou et al., ICDE 2010] with q = p2 and k = 1. Bichromatic RTk queries: find every user u for which the query facility q is one of her top-k facilities (e.g., the result is {Tom, Max}). Monochromatic RTk queries: find every weighting vector for which q is one of the top-k facilities (e.g., the result is the line segment where w[price] ∈ [1/7, 5/6]).

54 Faculty of Information Technology Outline  Introduction  Reverse k Nearest Neighbors Queries  Reverse Top-k Queries  Introduction  Monochromatic algorithms (2d)  Bichromatic algorithms (≥2d)  Reverse Skyline Queries  Other work

55 Faculty of Information Technology Monochromatic Reverse Top-k Algorithms [Vlachou et al., ICDE 2010]. Score(q) is the projection of q on the vector w (e.g., w = [0.5, 0.5]). Rank(q) w.r.t. w is given by the number of facilities below the red line, i.e., the line through q perpendicular to w. Rank(q) < Rank(f) for every w if q dominates f, so facilities that are dominated by q can be ignored. The result is empty if k facilities dominate q.

56 Faculty of Information Technology The relative rank of q and f depends on the rotation of the red line Monochromatic Reverse Top-k Algorithms [Vlachou et al., ICDE 2010] q f w w` w``

57 Faculty of Information Technology Monochromatic Reverse Top-k Algorithms [Vlachou et al., ICDE 2010]. Algorithm: start with a vertical line; initialize Rank(q) by counting the number of facilities on the left; rotate the line counter-clockwise; update Rank(q) when the line intersects a facility; report the weighting vectors for which Rank(q) ≤ k.

58 Faculty of Information Technology RTk using k-lower envelope (2d) [Cheema et al., EDBT 2014]. Given a point a = (u,v) and a weighting vector W = (w1, w2), a.score = u*w1 + v*w2. In the dual space, a point a = (u,v) is mapped to the line a*: y = ux + v, and the weighting vector W = (w1, w2) is mapped to the vertical line W*: x = w1/w2. The intersection of a* and W* is the point where y = u(w1/w2) + v = (u*w1 + v*w2)/w2 = a.score/w2. Hence, for two points a and b, y_a = a.score/w2 and y_b = b.score/w2, so the order of the intersections along W* is the order of the scores.

59 Faculty of Information Technology RTk using k-lower envelope (2d) [Cheema et al., EDBT 2014]. Query: given a weighting vector W = (w1, w2), return the k objects with the smallest scores. Solution: map W and all the objects to the dual space; return the k lowest lines intersecting W*. The figure shows the rankings (a, b, c, d) and (d, b, a, c) of four objects at two different vertical lines, W*: x = w1/w2 and W*: x = w3/w4.
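A small check that the dual-space view returns the same top-k as scoring in the primal, assuming w2 > 0 and that smaller scores are better, as stated on the slide. The points and weights are hypothetical.

```python
def topk_primal(points, w, k):
    """k points with the smallest score u*w1 + v*w2."""
    return sorted(points, key=lambda p: p[0] * w[0] + p[1] * w[1])[:k]

def topk_dual(points, w, k):
    """k lowest dual lines y = u*x + v at the vertical line x = w1/w2 (needs w2 > 0)."""
    x = w[0] / w[1]
    return sorted(points, key=lambda p: p[0] * x + p[1])[:k]

# Hypothetical 2-d objects (u, v) and two weighting vectors.
points = [(1, 5), (2, 3), (4, 1), (5, 5)]
w_a, w_b = (0.2, 0.8), (0.9, 0.1)
```

Both functions sort by the same quantity up to the positive factor 1/w2, so they always agree when no two objects have equal scores.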

60 Faculty of Information Technology RTk using k-lower envelope (2d) [Cheema et al., EDBT 2014] Given a set of lines L, mass of a point p is the number of lines that lie strictly below p k-lower envelope consists of every point p that lies on one of the lines in L and has mass equal to k-1. p p’ 2-lower envelope

61 Faculty of Information Technology RTk using k-lower envelope (2d) [Cheema et al., EDBT 2014]. Map all facilities to the dual space and compute the k-lower envelope. Map the query point to the dual space. Return the weighting vectors for which the query line is below the k-lower envelope.
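The definitions of mass and the k-lower envelope can be checked numerically at a single vertical line. The sketch below evaluates the envelope height at x = w1/w2 as the k-th lowest facility line and tests whether the query line is on or below it; treating "on the envelope" as a hit is an assumption about tie handling, and the data are hypothetical.

```python
def line_value(point, x):
    """Height at abscissa x of the dual line y = u*x + v of the primal point (u, v)."""
    u, v = point
    return u * x + v

def mass(facilities, x, y):
    """Number of facility lines that pass strictly below the point (x, y)."""
    return sum(line_value(f, x) < y for f in facilities)

def k_lower_envelope_height(facilities, x, k):
    """Height of the k-lower envelope at x: the k-th lowest facility line."""
    return sorted(line_value(f, x) for f in facilities)[k - 1]

def q_in_top_k(q, facilities, w, k):
    """True if the query line is on or below the k-lower envelope at x = w1/w2."""
    x = w[0] / w[1]
    return line_value(q, x) <= k_lower_envelope_height(facilities, x, k)

# Hypothetical facilities and weighting vector.
facilities = [(1, 5), (2, 3), (4, 1), (5, 5)]
w = (0.2, 0.8)
height = k_lower_envelope_height(facilities, w[0] / w[1], 2)
```

The point of the envelope found this way has mass k-1, matching the definition; a full algorithm would trace the envelope over all x instead of probing one vertical line.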

62 Faculty of Information Technology Computing k-lower envelope (2d) [Cheema et al., EDBT 2014]. Start from the leftmost point on the k-lower envelope and always move towards the right. Upon reaching an intersection, make a turn (i.e., leave the current road). The path travelled is the k-lower envelope.

63 Faculty of Information Technology Computing k-lower envelope (2d) [Cheema et al., EDBT 2014]. Start from the leftmost point on the k-lower envelope and always move towards the right. Upon reaching an intersection, make a turn (i.e., leave the current road). The path travelled is the k-lower envelope. Starting line: the line with the k-th largest slope, i.e., the point in the primal with the k-th largest x-value, since a point (u,v) in the primal is mapped to the line y = ux + v.


65 Faculty of Information Technology Outline  Introduction  Reverse k Nearest Neighbors Queries  Reverse Top-k Queries  Introduction  Monochromatic algorithms (2d)  Bichromatic algorithms (≥2d)  Reverse Skyline Queries  Other work

66 Faculty of Information Technology Bichromatic Reverse Top-k (≥2d) [Vlachou et al., ICDE 2010] Given a set of facilities F and a set of weighting vectors W, return every weighting vector for which q is one of the top-k facilities Brute Force Algorithm:  For each vector w in W  Compute top-k facilities  Return w if q is among the top-k facilities
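The brute-force algorithm can be sketched in a few lines, assuming smaller scores are better (as in the top-k definition used for the dual-space slides) and that q is part of the facility set. The facilities and weighting vectors are hypothetical.

```python
def score(w, f):
    """Weighted sum score(f) = w[1]*f[1] + ... + w[d]*f[d]."""
    return sum(wi * fi for wi, fi in zip(w, f))

def top_k(w, facilities, k):
    """The k facilities with the smallest scores under w."""
    return sorted(facilities, key=lambda f: score(w, f))[:k]

def reverse_top_k_brute_force(q, weights, facilities, k):
    """Every weighting vector for which q is among the top-k facilities."""
    return [w for w in weights if q in top_k(w, facilities, k)]

# Hypothetical facilities and user weighting vectors.
p1, p2, p3 = (1, 5), (3, 2), (6, 1)
facilities = [p1, p2, p3]
weights = [(0.2, 0.8), (0.5, 0.5), (0.9, 0.1)]

result_k1 = reverse_top_k_brute_force(p2, weights, facilities, k=1)
result_k2 = reverse_top_k_brute_force(p2, weights, facilities, k=2)
```

One top-k query is evaluated per weighting vector, which is the cost the threshold-based and branch-and-bound algorithms aim to reduce.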

67 Faculty of Information Technology Bichromatic Reverse Top-k (≥2d) [Vlachou et al., ICDE 2010] Threshold based algorithm (RTA) Sort the weighting vectors by their pair-wise similarity (Similar vectors have similar top-k results) Evaluate the first top-k query, calculate a threshold For each weighting vector –Try to prune using the threshold –Refine threshold

68 Faculty of Information Technology Bichromatic Reverse Top-k (≥2d) [Vlachou et al., ICDE 2010]. Example from [Vlachou et al., ICDE 2010] with W = [w1, w2, w3] and facilities p1, …, p10: evaluate the top-2 query for w1 and keep the result in the buffer (p1, p2); set the threshold based on w2; score(q) for w2 > threshold, so discard w2; compute the top-k for w3 and update the buffer.
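A simplified sketch of the threshold idea, assuming smaller scores are better and that the weighting vectors are already ordered so that neighbours are similar. It is not the exact RTA from the paper: the buffer simply holds the most recent top-k result, and the threshold for the next vector is the worst score of the buffered facilities under that vector. Names and data are hypothetical.

```python
def score(w, f):
    """Weighted sum of the facility's attributes."""
    return sum(wi * fi for wi, fi in zip(w, f))

def top_k(w, facilities, k):
    """The k facilities with the smallest scores under w."""
    return sorted(facilities, key=lambda f: score(w, f))[:k]

def rta_sketch(q, weights, facilities, k):
    """Return (qualifying weighting vectors, number of top-k queries evaluated)."""
    result, buffer, evaluated = [], [], 0
    for w in weights:
        if len(buffer) == k:
            threshold = max(score(w, p) for p in buffer)
            if score(w, q) > threshold:
                continue  # k buffered facilities already beat q under w
        buffer = top_k(w, facilities, k)
        evaluated += 1
        if score(w, q) <= score(w, buffer[-1]):
            result.append(w)
    return result, evaluated

# Hypothetical facilities, query and (pre-sorted) weighting vectors.
p1, p2, p3 = (1, 5), (3, 2), (6, 1)
facilities = [p1, p2, p3]
weights = [(0.1, 0.9), (0.2, 0.8), (0.5, 0.5), (0.9, 0.1)]

result, evaluated = rta_sketch(p2, weights, facilities, k=1)
```

In this run the vector (0.2, 0.8) is discarded by the threshold alone, so only three of the four top-k queries are evaluated.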

69 Faculty of Information Technology Bichromatic Reverse Top-k (≥2d) [Vlachou et al., SIGMOD 2013]. Branch-and-bound algorithm, key idea: weighting vectors and facilities are indexed (e.g., by an R-tree); compute upper and lower bounds; prune using the bounds; process the unpruned entries.

70 Faculty of Information Technology Outline  Introduction  Reverse k Nearest Neighbors Queries  Reverse Top-k Queries  Reverse Skyline Queries  Introduction  Pre-computation based approach  On-the-fly algorithm  Other work

71 Faculty of Information Technology Reverse Skyline [Dellis et al., VLDB 2007]. Dominance: a point x dominates y if x is at least as good as y on all the dimensions and x is better than y on at least one dimension. Skyline: return every point that is not dominated by any other point. (Figure: points plotted by distance and price.)

72 Faculty of Information Technology Reverse Skyline [Dellis et al., VLDB 2007]. Dynamic dominance: a user u gives her ideal point. A point x dynamically dominates y if its difference from u is not larger than y's difference on each dimension and is smaller on at least one dimension. Dynamic skyline: return every point that is not dynamically dominated by any other point; equivalently, transform each x[i] to |u[i] - x[i]| and compute the skyline. (Figure: points plotted by distance from airport and room size.)
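Both definitions can be sketched with a quadratic scan, assuming smaller values are better in every dimension for the static skyline. The data points and the ideal point u are hypothetical.

```python
def dominates(x, y):
    """x dominates y: no worse in every dimension and strictly better in at least one."""
    return all(a <= b for a, b in zip(x, y)) and any(a < b for a, b in zip(x, y))

def skyline(points):
    """Points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(o, p) for o in points)]

def dynamic_skyline(u, points):
    """Skyline w.r.t. user u after mapping each x[i] to |u[i] - x[i]|."""
    t = {p: tuple(abs(ui - pi) for ui, pi in zip(u, p)) for p in points}
    return [p for p in points if not any(dominates(t[o], t[p]) for o in points)]

# Hypothetical data.
points = [(1, 9), (3, 4), (5, 6), (8, 2), (9, 9)]
static_result = skyline(points)
dynamic_result = dynamic_skyline((6, 3), points)
```

The point (5, 6) is dominated in the static skyline but belongs to the dynamic skyline of u = (6, 3), which shows how the result depends on the user's ideal point.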

73 Faculty of Information Technology Reverse Skyline [Dellis et al., VLDB 2007]. Definition of importance: a user u considers a facility f to be important if f is in the dynamic skyline of u. Reverse skyline: return every user u for which the query facility is in its dynamic skyline.

74 Faculty of Information Technology Outline  Introduction  Reverse k Nearest Neighbors Queries  Reverse Top-k Queries  Reverse Skyline Queries  Introduction  Pre-computation based approach  On-the-fly algorithm  Other work

75 Faculty of Information Technology Pre-computation based approach [Dellis et al., VLDB 2007]. Pre-computation: for each user u, compute and store its dynamic skyline. Query processing: u is not an answer if q is dominated by its pre-computed skyline; u is an answer if q is not dominated by its pre-computed skyline.

76 Faculty of Information Technology Pre-computation based approach [Dellis et al., VLDB 2007]. Reducing the storage requirement: for each user u, store only k of its dynamic skyline points. Query processing: u is not an answer if q is dominated by any of the k stored points; u is guaranteed to be an answer if q dominates any of the k stored points; otherwise, call verification to check whether u is an answer.

77 Faculty of Information Technology Outline  Introduction  Reverse k Nearest Neighbors Queries  Reverse Top-k Queries  Reverse Skyline Queries  Introduction  Pre-computation based approach  On-the-fly algorithm  Other work

78 Faculty of Information Technology On-the-fly Algorithm [Dellis et al., VLDB 2007]. The window of a user u is the rectangle centered at u with q at one of its corners. A user u is an answer iff its window is empty. Key idea: divide the space around q into 2^d partitions; compute the skyline for each partition; any user dominated by these skylines cannot be an answer.
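The window test can be sketched as follows. It assumes that facilities located exactly at a corner of the window (including q itself) do not count, because they are as far from u as q in every dimension and so do not dynamically dominate q. The data are hypothetical and use integer coordinates to keep the comparisons exact.

```python
def window(u, q):
    """Per-dimension (low, high) bounds of the rectangle centered at u with q at a corner."""
    half = [abs(qi - ui) for qi, ui in zip(q, u)]
    return [(ui - h, ui + h) for ui, h in zip(u, half)]

def window_is_empty(q, u, facilities):
    """True if no facility other than the window corners lies inside u's window."""
    box = window(u, q)
    for f in facilities:
        inside = all(lo <= fi <= hi for fi, (lo, hi) in zip(f, box))
        at_corner = all(fi in (lo, hi) for fi, (lo, hi) in zip(f, box))
        if inside and not at_corner:
            return False
    return True

def reverse_skyline(q, users, facilities):
    """Every user whose window w.r.t. q is empty."""
    return [u for u in users if window_is_empty(q, u, facilities)]

# Hypothetical facilities, query and users.
facilities = [(1, 9), (3, 4), (5, 6), (8, 2), (9, 9)]
q = (5, 6)
users = [(6, 6), (6, 3), (2, 8), (9, 1)]
answer = reverse_skyline(q, users, facilities)
```

This checks every user against every facility; the partition-and-skyline idea on the slide is what lets the actual algorithm discard most users without this test.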

79 Faculty of Information Technology Outline  Introduction  Reverse k Nearest Neighbors Queries  Reverse Top-k Queries  Reverse Skyline Queries  Other work

80 Faculty of Information Technology Other work on reverse spatial queries  Uncertain data  Continuous Monitoring (e.g., moving objects, data stream)  Influence Maximization  Other spaces (e.g., road network, general metric space, non-metric space, obstructed space)  Spatial Keyword Queries  …

81 Faculty of Information Technology Open problems on reverse spatial queries  Location-based reverse top-k queries  Location-based reverse skyline queries

82 Faculty of Information Technology Location-based Reverse Top-k. Definition of importance: each user u has a preference function; a facility f is important to a user u if f is one of the top-k facilities for u. Reverse top-k query (RTk): find every user u for which the query facility q is one of her top-k facilities. Example (k = 1; facility prices 1 and 2; user preference functions 0.9*price + 0.1*distance, 0.5*price + 0.5*distance and 1*distance): the influence set of f1 is {u2} and the influence set of f2 is {u1, u3}.

83 Faculty of Information Technology Location-based Reverse Skyline. Dominance: a facility x dominates another facility y w.r.t. a user u if, for every attribute, u prefers x over y. Definition of importance: a facility f is important to a user u if f is not dominated by any other facility. Reverse skyline: find every user u for which the query facility q is not dominated by any other facility. Example (facility prices 1 and 2): the influence set of f1 is {u1, u2} and the influence set of f2 is {u1, u2, u3}.

84 Faculty of Information Technology Acknowledgments Prof. Xuemin Lin (UNSW Australia) A/Prof. Wei Wang (UNSW Australia) Dr. Wenjie Zhang (UNSW Australia) Dr. Ying Zhang (UTS Australia) Shiyu Yang (UNSW Australia) Zhitao Shen (CISCO)

85 Faculty of Information Technology References
1. Flip Korn, S. Muthukrishnan: Influence Sets Based on Reverse Nearest Neighbor Queries. SIGMOD 2000:201-212
2. Ioana Stanoi, Divyakant Agrawal, Amr El Abbadi: Reverse Nearest Neighbor Queries for Dynamic Databases. ACM SIGMOD Workshop on Research Issues in Data Mining and Knowledge Discovery 2000:44-53
3. Yufei Tao, Dimitris Papadias, Xiang Lian: Reverse kNN Search in Arbitrary Dimensionality. VLDB 2004:744-755
4. Evangelos Dellis, Bernhard Seeger: Efficient Computation of Reverse Skyline Queries. VLDB 2007:291-302
5. Wei Wu, Fei Yang, Chee Yong Chan, Kian-Lee Tan: FINCH: Evaluating Reverse k-Nearest-Neighbor Queries on Location Data. PVLDB 1(1):1056-1067 (2008)
6. Akrivi Vlachou, Christos Doulkeridis, Yannis Kotidis, Kjetil Nørvåg: Reverse Top-k Queries. ICDE 2010:365-376
7. Muhammad Aamir Cheema, Xuemin Lin, Wenjie Zhang, Ying Zhang: Influence Zone: Efficiently Processing Reverse k Nearest Neighbors Queries. ICDE 2011:577-588
8. Akrivi Vlachou, Christos Doulkeridis, Yannis Kotidis, Kjetil Nørvåg: Monochromatic and Bichromatic Reverse Top-k Queries. IEEE Trans. Knowl. Data Eng. (TKDE) 23(8):1215-1229 (2011)
9. Muhammad Aamir Cheema, Wenjie Zhang, Xuemin Lin, Ying Zhang: Efficiently Processing Snapshot and Continuous Reverse k Nearest Neighbors Queries. VLDB J. 21(5):703-728 (2012)
10. Akrivi Vlachou, Christos Doulkeridis, Kjetil Nørvåg, Yannis Kotidis: Branch-and-Bound Algorithm for Reverse Top-k Queries. SIGMOD 2013:481-492
11. Shiyu Yang, Muhammad Aamir Cheema, Xuemin Lin, Ying Zhang: SLICE: Reviving Regions-Based Pruning for Reverse k Nearest Neighbors Queries. ICDE 2014:760-771
12. Muhammad Aamir Cheema, Zhitao Shen, Xuemin Lin, Wenjie Zhang: A Unified Framework for Efficiently Processing Ranking Related Queries. EDBT 2014:427-438
13. Shiyu Yang, Muhammad Aamir Cheema, Xuemin Lin, Wei Wang: Reverse k Nearest Neighbors Query Processing: Experiments and Analysis. PVLDB 8(5):605-616 (2015)

86 Faculty of Information Technology Thanks


