
1 The University of Hong Kong
Capacity Constrained Assignment in Spatial Databases
Authors:
- Leong Hou U, University of Hong Kong
- Man Lung Yiu, Aalborg University
- Kyriakos Mouratidis, Singapore Management University
- Nikos Mamoulis, University of Hong Kong

2 Outline
- Motivation
- Related Work (Assignment Problems)
- Solutions
- Approximate Solutions
- Conclusion

3 Motivation
- Assume that:
  - Our system has a set of service providers (Q) which serve a set of customers
  - Each service provider q can serve at most k customers simultaneously
  - For every provider-customer pair (q, p), our central server knows the cost of assigning p to q
- Our aim is to maximize service utilization:
  1. Maximize the number of served customers
  2. Minimize the total sum of weights

4 Case Study I
- Consider wireless routers serving laptops:
  - Each router can serve at most 3 users concurrently
  - Signal strength is measured by Euclidean distance (a longer distance means a weaker signal)
- Can it be solved by Nearest Neighbor Queries?
[Figure: 3-Nearest Neighbor Queries]

5 Case Study I
- Can it be solved by Reverse Nearest Neighbor Queries?
[Figure: Reverse Nearest Neighbor Queries]

6 Case Study I
- Can it be solved by Closest Pair Queries?
[Figure: 6-Closest Pairs (2 routers * 3 capacity units)]

7 Case Study I
- Can it be solved by Spatial Matching (Exclusive Closest Pair, ECP)?
- ECP matching, where each router's capacity is 3. To find the ECPs between sets A and B:
  1. Find the closest pair (a, b) from A x B
  2. Report (a, b) as an ECP and set a.k = a.k - 1, b.k = b.k - 1 (k is the capacity value)
  3. Remove a from A if a.k = 0 and remove b from B if b.k = 0; go to step 1 until A or B is empty
[Figure: ECP matching; the capacity counters count down 3, 2, 1, 0]
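The three ECP steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `ecp_matching`, the router and laptop coordinates, and the choice to give every customer a capacity of 1 are all assumptions made for the example. Instead of repeatedly searching for the closest remaining pair, the sketch sorts all pairs once and accepts each pair that is still available, which produces the same greedy result when distances are distinct.

```python
from math import dist

def ecp_matching(providers, customers, capacity):
    """Greedy exclusive-closest-pair (ECP) matching.

    providers, customers: dict name -> (x, y)
    capacity: dict provider name -> k (each customer has capacity 1 here)
    """
    remaining = dict(capacity)   # capacity left per provider
    unserved = set(customers)    # customers that are not matched yet
    # Scan all pairs by increasing distance; taking every pair that is still
    # available mimics "find the closest pair, decrement, repeat".
    pairs = sorted(
        (dist(providers[a], customers[b]), a, b)
        for a in providers
        for b in customers
    )
    matching = []
    for d, a, b in pairs:
        if remaining[a] > 0 and b in unserved:
            matching.append((a, b, d))
            remaining[a] -= 1
            unserved.remove(b)
    return matching

# Hypothetical data: router r1 has a single slot left, r2 has three.
routers = {"r1": (0, 0), "r2": (10, 0)}
laptops = {"u1": (1, 0), "u2": (2, 0), "u3": (9, 0)}
result = ecp_matching(routers, laptops, {"r1": 1, "r2": 3})
```

In this toy input the greedy choice gives u1 to r1 first, so u2 must travel to r2 at distance 8. ECP makes locally best choices and does not revisit them, which is why the deck moves on to optimal assignment.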

8 Case Study I
- Can it be solved by optimal assignment?
- Optimal assignment tries to serve as many users as possible while also minimizing the total cost (distance)
[Figure: Optimal assignment]

9 Related Work
- Optimal assignment computes the maximum-size matching with the minimum assignment cost
- Two popular algorithms:
  - Hungarian Algorithm
  - Successive Shortest Path Algorithm (SSPA)
- The worst-case time complexity of both algorithms is O(n^3), where n is the number of service providers or customers

10 Successive Shortest Path Algorithm (SSPA)
1. Find the shortest path (SP) from the source to the destination
2. Reverse the direction of the edges on the SP
3. Repeat steps 1-2 until no more paths can be found
[Figure: flow graph with source S, providers q1 and q2, customers p1 and p2, and destination D; edge weights 0, 1, 1, 3; shown at successive iterations]

11 Successive Shortest Path Algorithm (SSPA)
- SSPA is easy to implement with a capacity constraint
- Assume that data set A is our routers, each with capacity 2, and data set B is our users
[Figure: flow graph with source S, providers q1 and q2, customers p1 and p2, and destination D; edge weights 0, 1, 1, 3; shown at successive iterations]
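As a rough illustration of SSPA with a capacity constraint, the sketch below models the problem as a flow network: the source S feeds every provider with capacity k, every provider-customer edge has capacity 1, and every customer feeds the destination D with capacity 1. The function name `sspa`, the dense cost-matrix input, and the use of Bellman-Ford for the shortest-path step are assumptions made for this example (Bellman-Ford tolerates the negative weights of reversed edges without extra machinery). The 2x2 matrix places the weights 0, 1, 1, 3 from the slides on edges in one plausible way; the actual placement in the figure is not recoverable from the transcript.

```python
def sspa(cost, k):
    """Successive shortest paths for capacity-constrained assignment.

    cost[i][j]: cost of assigning customer j to provider i
    k: capacity of every provider
    Returns (total_cost, sorted list of (provider, customer) pairs).
    """
    nq, npts = len(cost), len(cost[0])
    S, D, n = nq + npts, nq + npts + 1, nq + npts + 2
    edges = []                      # [to, residual capacity, weight]
    adj = [[] for _ in range(n)]    # edge ids per node; id ^ 1 is the reverse

    def add_edge(u, v, cap, w):
        adj[u].append(len(edges)); edges.append([v, cap, w])
        adj[v].append(len(edges)); edges.append([u, 0, -w])

    for i in range(nq):
        add_edge(S, i, k, 0)
        for j in range(npts):
            add_edge(i, nq + j, 1, cost[i][j])
    for j in range(npts):
        add_edge(nq + j, D, 1, 0)

    total = 0
    while True:
        # Step 1: shortest path from S to D in the residual graph.
        dist = [float("inf")] * n
        prev = [-1] * n
        dist[S] = 0
        for _ in range(n - 1):
            changed = False
            for u in range(n):
                if dist[u] == float("inf"):
                    continue
                for e in adj[u]:
                    v, cap, w = edges[e]
                    if cap > 0 and dist[u] + w < dist[v]:
                        dist[v], prev[v], changed = dist[u] + w, e, True
            if not changed:
                break
        if dist[D] == float("inf"):
            break                   # Step 3: stop when no path is left
        # Step 2: push one unit along the path, i.e. reverse its edges.
        v = D
        while v != S:
            e = prev[v]
            edges[e][1] -= 1
            edges[e ^ 1][1] += 1
            v = edges[e ^ 1][0]
        total += dist[D]

    pairs = sorted(
        (i, edges[e][0] - nq)
        for i in range(nq)
        for e in adj[i]
        if e % 2 == 0 and edges[e][1] == 0   # saturated forward edges
    )
    return total, pairs
```

With `sspa([[0, 1], [1, 3]], 1)` the second iteration finds the path q2 -> p1 -> q1 -> p2 of weight 1 - 0 + 1 = 2, which undoes the first assignment and beats the direct edge of weight 3. Raising the capacity to 2 lets the first provider serve both customers for a total cost of 1.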

12 Preliminary Solution
- In our problem setting, we have a set of service providers (Q), each with capacity k, and a set of customers (P) indexed by an R-tree
- Let us analyze SSPA performance in detail:
  - Consider the case |Q| = |P| and k = 1
  - For every q in Q, we need to find an SP (N searches, where N = |Q|)
  - Each SP search runs on the bipartite graph between Q and P (time = |E_all|, where E_all is the set of all edges between Q and P)
  - So the time complexity is N * |E_all|
- The algorithm should do better if the bipartite graph is smaller: N * |E_sub| << N * |E_all| if |E_sub| << |E_all|

13 Preliminary Solution
- An SP can be determined from a sub-graph if the sub-graph is built in order
- Only add edges with weight ≤ 1 into our graph
[Figure: bipartite graph with source S, providers q1, q2, q3, customers p1, p2, p3, and destination D, before and after adding the edges]

Full distance matrix:
      p1  p2  p3
q1     0   1   5
q2     1  12  14
q3     4   3   8

Edges known after adding weight ≤ 1 ("-" marks entries left blank on the slide):
      p1  p2  p3
q1     0   1  >1
q2     1   -   -
q3     -   -   -

14 Solution - RIA
- The Range Incremental Algorithm (RIA) builds the bipartite graph incrementally, based on the previous observation
- Lemma 1: If all the edges with weight ≤ T have been added to the sub-graph (E_sub), then an SP in E_sub with weight ≤ T must also be an SP in E_QxP
- Example:
  - T = 1: only edges with weight ≤ T are added to the graph; the weight of the SP is 2
  - Increase the threshold, T = T + 1 => T = 2: this does not add any edge to the graph (PROBLEM)
[Figure: bipartite graph with source S, providers q1, q2, q3, customers p1, p2, p3, and destination D]

Known edges at T = 1 ("-" marks entries left blank on the slide):
      p1  p2  p3
q1     0   1  >1
q2     1   -   -
q3     -   -   -

Known edges at T = 2:
      p1  p2  p3
q1     0   1  >2
q2     1   -   -
q3     -   -   -
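The drawback flagged on this slide can be reproduced with the distance matrix from the running example. The helper below is only a stand-in: `edges_within` is a hypothetical name, and a linear scan over an in-memory dictionary replaces the range search that RIA would issue against the R-tree.

```python
# Distance matrix of the running example (providers q1..q3, customers p1..p3).
COST = {
    "q1": {"p1": 0, "p2": 1, "p3": 5},
    "q2": {"p1": 1, "p2": 12, "p3": 14},
    "q3": {"p1": 4, "p2": 3, "p3": 8},
}

def edges_within(cost, T):
    """Return the edge set E_sub that contains every edge of weight <= T."""
    return {
        (q, p)
        for q, row in cost.items()
        for p, w in row.items()
        if w <= T
    }

e_sub_t1 = edges_within(COST, 1)
e_sub_t2 = edges_within(COST, 2)
wasted_round = e_sub_t1 == e_sub_t2   # True: the range search added nothing
```

Raising T from 1 to 2 costs one range search per provider and returns no new edge, because no distance lies in the interval (1, 2]. A fixed increment can therefore waste rounds on sparse regions, or pull in many edges at once on dense ones.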

15 Solution - NIA
- The Nearest Neighbor Incremental Algorithm (NIA) grows E_sub one nearest-neighbor edge at a time
- Lemma 2: If the weight of the SP is ≤ H.top(), then it is also an SP in E_all
- Otherwise, add a new edge from H to E_sub
- Heap contents in the example:
  - H = {(q1, p1, 0), (q2, p1, 1), (q3, p2, 3)}
  - H = {(q1, p2, 1), (q2, p1, 1), (q3, p2, 3)}
  - H = {(q2, p1, 1), (q3, p2, 3), (q1, p3, 5)}
[Figure: bipartite graph with source S, providers q1, q2, q3, customers p1, p2, p3, and destination D, with partial distance matrices showing the known entries of row q1 and bounds such as ≥0 and ≥1]
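A compact sketch of the NIA loop is given below, under several assumptions: the function name `nia` is made up, a pre-sorted row of a dense cost matrix stands in for the incremental nearest-neighbor search on the R-tree, every customer has capacity 1, and Bellman-Ford is used for the shortest-path step. The heap H holds, for each provider, its next edge that has not yet been inserted into E_sub, and Lemma 2 decides whether the current shortest path can be trusted.

```python
import heapq

def nia(cost, k):
    """Returns (total assignment cost, number of edges inserted into E_sub)."""
    nq, npts = len(cost), len(cost[0])
    S, D, n = nq + npts, nq + npts + 1, nq + npts + 2
    edges, adj = [], [[] for _ in range(n)]   # edge id ^ 1 is its reverse

    def add_edge(u, v, cap, w):
        adj[u].append(len(edges)); edges.append([v, cap, w])
        adj[v].append(len(edges)); edges.append([u, 0, -w])

    for i in range(nq):
        add_edge(S, i, k, 0)
    for j in range(npts):
        add_edge(nq + j, D, 1, 0)

    # Stand-in for incremental NN search: each provider's edges, nearest first.
    nn = [iter(sorted((w, j) for j, w in enumerate(row))) for row in cost]
    heap = [(w, i, j) for i in range(nq) for w, j in [next(nn[i])]]
    heapq.heapify(heap)

    def shortest_path():
        dist, prev = [float("inf")] * n, [-1] * n
        dist[S] = 0
        for _ in range(n - 1):                # Bellman-Ford on E_sub
            changed = False
            for u in range(n):
                if dist[u] == float("inf"):
                    continue
                for e in adj[u]:
                    v, cap, w = edges[e]
                    if cap > 0 and dist[u] + w < dist[v]:
                        dist[v], prev[v], changed = dist[u] + w, e, True
            if not changed:
                break
        return dist[D], prev

    total = matched = inserted = 0
    while matched < min(nq * k, npts):
        d, prev = shortest_path()
        if heap and d > heap[0][0]:
            # Lemma 2 fails: a cheaper path may use an edge outside E_sub.
            w, i, j = heapq.heappop(heap)
            add_edge(i, nq + j, 1, w)
            inserted += 1
            nxt = next(nn[i], None)
            if nxt is not None:
                heapq.heappush(heap, (nxt[0], i, nxt[1]))
            continue
        v = D                                 # SP is valid: augment along it
        while v != S:
            e = prev[v]
            edges[e][1] -= 1
            edges[e ^ 1][1] += 1
            v = edges[e ^ 1][0]
        total += d
        matched += 1
    return total, inserted
```

On the 3x3 matrix of the running example with k = 1, `nia([[0, 1, 5], [1, 12, 14], [4, 3, 8]], 1)` returns `(9, 6)`: the optimal cost 9 is found after inserting 6 of the 9 possible edges, and the heap passes through the same states listed on the slide.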

16 Solution - IDA
- Lemma 3: If an object in Q (a service provider) is not reached from the source S, then it is not necessary to add its nearest neighbor to E_sub
- We develop a novel algorithm, the Incremental On-Demand Algorithm (IDA), which is based on this lemma
- Heap contents in the example:
  - H = {(q1, p2, 1), (q2, p1, 1), (q3, p2, 3)}
  - It is not necessary to add this edge in the current state, since it cannot help us find any new SP
  - H = {(q2, p1, 1), (q3, p2, 3), (q1, p3, 5)}
[Figure: bipartite graph with source S, providers q1, q2, q3, customers p1, p2, p3, and destination D, with a partial distance matrix]

17 Solution - IDA
- Heap contents in the example:
  - H = {(q1, p2, 1), (q2, p1, 1), (q3, p2, 3)}
  - H = {(q2, p1, 1), (q3, p2, 3)}
  - H = {(q3, p2, 3), (q2, p2, 12)}
  - H = {(q1, p2, 1), (q3, p2, 3), (q2, p2, 12)}
  - H = {(q3, p2, 3), (q1, p3, 5), (q2, p2, 12)}
- The weight of the SP is 1 - 0 + 1 = 2
- IDA only expands the graph when it is necessary
- It is expected to have a smaller sub-graph (smaller E_sub) when executing SP searches
[Figure: bipartite graph with source S, providers q1, q2, q3, customers p1, p2, p3, and destination D, with three successive partial distance matrices]

18 Experiments
- Number of service providers |Q| (in thousands): 0.25, 0.5, 1, 2.5, 5
- Number of customers |P| (in thousands): 25, 50, 100, 150, 200
- Capacity k: 20, 40, 80, 160, 320
- Both datasets were generated on the road map of San Francisco
- Language: C++
- Pentium D 3.0 GHz running Ubuntu 7.10

19 Experiments
- First, we test the performance on a small dataset for different values of the capacity k (|Q| = 0.25, |P| = 25, in thousands)

20 Experiments

21 Experiments

22 Approximate Solution
- Time-critical applications could favor fast answers over exact matching
- Our approximate solutions provide a tunable trade-off between result accuracy and response time, with theoretical guarantees for the assignment cost
- Three phases of our general method:
  - Partitioning phase
  - Concise matching phase
  - Refinement phase
[Figure: points a and b and the centroid of a group]

23 Service provider Approximation (SA)
- Service providers are sorted by Hilbert value and are grouped in this order
- Each point q is inserted into an existing group G so that the diagonal of G's MBR does not exceed δ
- If no such group is found, then a new group is formed to contain q
- The centroid of a group G is its capacity-weighted geometric centroid, e.g., the x-coordinate is sum(q.x * q.k) / sum(q.k) over all q in G
- The theoretical error bound is 2 * (number of assignments) * δ
[Figure: grouped service providers with capacities 3, 4, 1, 1 and an MBR diagonal of δ]
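The grouping rule and the centroid formula can be sketched as follows. The sketch makes several assumptions that the slide does not settle: the input list is taken to be already in Hilbert order (no Hilbert curve is computed here), every existing group is tried in creation order, and the names `group_providers` and `centroid` are made up for illustration.

```python
from math import hypot

def group_providers(providers, delta):
    """Partition providers into groups whose MBR diagonal is at most delta.

    providers: list of (x, y, k) tuples, assumed to be in Hilbert order.
    Returns a list of groups; each group is a list of (x, y, k) tuples.
    """
    groups = []   # each entry: {"members": [...], "mbr": (x0, y0, x1, y1)}
    for x, y, k in providers:
        for g in groups:
            x0, y0, x1, y1 = g["mbr"]
            # MBR of the group if this provider were added to it.
            grown = (min(x0, x), min(y0, y), max(x1, x), max(y1, y))
            if hypot(grown[2] - grown[0], grown[3] - grown[1]) <= delta:
                g["members"].append((x, y, k))
                g["mbr"] = grown
                break
        else:
            # No existing group can absorb the provider: open a new one.
            groups.append({"members": [(x, y, k)], "mbr": (x, y, x, y)})
    return [g["members"] for g in groups]

def centroid(members):
    """Capacity-weighted centroid and total capacity of one group."""
    total_k = sum(k for _, _, k in members)
    cx = sum(x * k for x, _, k in members) / total_k
    cy = sum(y * k for _, y, k in members) / total_k
    return cx, cy, total_k

# Hypothetical providers: two tight clusters far away from each other.
sample = [(0, 0, 3), (1, 0, 1), (10, 10, 4), (10, 11, 1)]
sample_groups = group_providers(sample, delta=2)
```

Each group then acts as a single representative provider located at its centroid, with the summed capacity of its members. A larger δ produces fewer groups and a faster concise matching, at the price of a looser error bound.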

24 Customer Approximation (CA)
- Unlike SA, CA can do the grouping in the R-tree
- The theoretical error bound is (number of assignments) * δ
[Figure: R-tree with entries e1 to e7 and the corresponding rectangles in space; diagonal bound δ]

25 Refinement Phase
- In the refinement phase, SA and CA only need to solve some smaller assignment problems
- We could run the exact algorithm for each of these smaller problems. This, however, is expensive
- Therefore, two heuristic methods are proposed:
  - NN-based refinement: find the NN customer for each service provider in round-robin fashion
  - Exclusive Closest Pair refinement: use ECP to make the assignment
[Figure: refinement example with assignment order 1, 2, 3, 4]
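A minimal sketch of the NN-based refinement heuristic is shown below. It assumes in-memory dictionaries rather than an R-tree, a fixed visiting order of the providers, and a linear scan for the nearest unserved customer; the name `nn_round_robin` and the sample coordinates are hypothetical.

```python
from math import dist

def nn_round_robin(providers, capacity, customers):
    """Providers take turns; each grabs its nearest unserved customer.

    providers, customers: dict name -> (x, y)
    capacity: dict provider name -> number of customers it may still take
    Returns the list of (provider, customer) pairs in assignment order.
    """
    remaining = dict(capacity)
    unserved = dict(customers)
    assignment = []
    while unserved and any(remaining[q] > 0 for q in providers):
        for q, loc in providers.items():      # one round over all providers
            if remaining[q] == 0 or not unserved:
                continue
            nearest = min(unserved, key=lambda c: dist(loc, unserved[c]))
            assignment.append((q, nearest))
            remaining[q] -= 1
            del unserved[nearest]
    return assignment

# Hypothetical sub-problem produced by the partitioning phase.
order = nn_round_robin(
    {"a": (0, 0), "b": (10, 0)},
    {"a": 2, "b": 1},
    {"u1": (1, 0), "u2": (9, 0), "u3": (4, 0)},
)
```

Taking turns prevents a single provider from claiming all nearby customers before the others get a pick. The result is a heuristic assignment, not necessarily the optimal one for the sub-problem.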

26 Experiments
- Quality = (sum of approximate cost) / (sum of optimal cost)
[Figure: quality results, panels SA and CA]

27 Experiments

28 Conclusion
- We proposed three algorithms which solve the capacity constrained assignment (CCA) problem efficiently
- All our methods try to:
  - Minimize I/O accesses
  - Minimize CPU time
- We also proposed two approximate solutions which achieve a good trade-off between execution time and matching quality
- Our next step is to investigate:
  - Incremental updates to the CCA solution
  - Continuous monitoring of CCA
  - Other types of matching (assignment) problems

29 Thank you! Any questions?

30 Hungarian Algorithm
1. Find the smallest value in each row and subtract it from every element in that row
2. Find the smallest value in each column and subtract it from every element in that column
3. Find the minimum number of lines needed to cover all zeros
4. Find the smallest value among all uncovered elements and subtract it from every uncovered element (also add it to each cell at the intersection of two covering lines)
5. Repeat steps 3-4 until the number of lines is equal to |A| or |B|
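Steps 1 and 2 are simple enough to show directly. The sketch below applies them to the 3x3 example matrix used on the Hungarian Algorithm slides of this deck; the function name `reduce_rows_and_columns` is made up, and steps 3 to 5 (line covering and adjustment) are deliberately left out.

```python
def reduce_rows_and_columns(matrix):
    """Steps 1 and 2 of the Hungarian algorithm on a square cost matrix."""
    # Step 1: subtract each row's minimum from every element of that row.
    rows = [[value - min(row) for value in row] for row in matrix]
    # Step 2: subtract each column's minimum from every element of that column.
    col_min = [min(col) for col in zip(*rows)]
    return [[value - m for value, m in zip(row, col_min)] for row in rows]

example = [
    [0, 1, 9],
    [1, 5, 7],
    [8, 7, 8],
]
reduced = reduce_rows_and_columns(example)
```

After both reductions every row and every column contains at least one zero, which is the starting point for the line-covering test in step 3. Here `reduced` is `[[0, 1, 8], [0, 4, 5], [1, 0, 0]]`.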

31 Hungarian Algorithm
- The Hungarian algorithm does not handle capacity constraints efficiently
  - Duplicating rows/columns is not a good solution
- The memory usage of the Hungarian algorithm is very high
  - Sum(a.k) x Sum(b.k), where a in A, b in B
- Step 3 of the Hungarian algorithm is hard to optimize further
  - Find the minimum number of lines needed to cover all zeros

Cost matrix with every row duplicated:
      b1  b2  b3
a1     0   1   9
a1     0   1   9
a2     1   5   7
a2     1   5   7
a3     8   7   8
a3     8   7   8

Reduced matrix:
      b1  b2  b3
a1     0   1   8
a2     0   4   5
a3     1   0   0

32 Optimization – Reducing Dijkstra Execution
- Some optimizations to Dijkstra can be done:
  - Dijkstra stops searching when the weight of a potential SP is higher than the top value in heap H
  - Once a new edge is added to E_sub, it only affects one vertex and the vertices that follow it
- Notice that Dijkstra cannot run with negative weights on the edges, but potential values can be used to solve this problem:
  - Each node has a potential value, which is changed when the graph is updated
  - The potential weight of an edge is computed from the edge weight and the potential values of its two vertices, and it is never negative
[Figure: vertices potentially affected by the newly added edge versus unaffected vertices]
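The slide does not spell out the formula, so the sketch below assumes the standard reweighting used with successive shortest paths: the potential of a node is its shortest-path distance from S in the previous round, and the reduced weight of an edge (u, v) is w + pot[u] - pot[v]. The 2x2 instance, its costs, and the names `dijkstra` and `reduce_weights` are all hypothetical.

```python
import heapq

def dijkstra(n, edges, src):
    """Plain Dijkstra; requires every edge weight to be non-negative."""
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        assert w >= 0, "Dijkstra needs non-negative weights"
        adj[u].append((v, w))
    dist = [float("inf")] * n
    dist[src] = 0
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in adj[u]:
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

def reduce_weights(edges, pot):
    """Reduced weight of (u, v): w + pot[u] - pot[v] (assumed convention)."""
    return [(u, v, w + pot[u] - pot[v]) for u, v, w in edges]

S, Q1, Q2, P1, P2, D = range(6)
# Round 1: all weights are non-negative, so Dijkstra can run directly.
initial = [(S, Q1, 0), (S, Q2, 0), (Q1, P1, 1), (Q1, P2, 2),
           (Q2, P1, 1), (Q2, P2, 5), (P1, D, 0), (P2, D, 0)]
pot = dijkstra(6, initial, S)             # potentials = distances from S

# Residual graph after augmenting along S -> q1 -> p1 -> D with k = 1.
# The reversed edge p1 -> q1 now has weight -1. Reverse arcs back into S
# and out of D are omitted because no S-to-D path can use them.
residual = [(S, Q2, 0), (Q1, P2, 2), (Q2, P1, 1), (Q2, P2, 5),
            (P1, Q1, -1), (P2, D, 0)]
reduced = reduce_weights(residual, pot)   # every weight is >= 0 again
reduced_dist = dijkstra(6, reduced, S)
true_cost = reduced_dist[D] + pot[D] - pot[S]   # undo the reweighting
```

The second shortest path is S -> q2 -> p1 -> q1 -> p2 -> D with true cost 1 - 1 + 2 = 2, and Dijkstra finds it on the reduced weights even though the residual graph contains a negative edge.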

33 Optimization – Incremental All Nearest Neighbor
- All three proposed algorithms issue numerous range/NN searches around the service providers against the R-tree that indexes the customers
- To reduce the I/O cost, we employ an incremental all-nearest-neighbor technique
[Figure: service provider q and customer p]

34 Case Study II
- In an election, a good arrangement between voters and polling stations can save a lot of human resources:
  - Maximize the polling station utilization
  - Minimize the total sum of travel distances to the polling stations

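35 Hungarian Algorithm

Initial cost matrix:
      b1  b2  b3
a1     0   1   9
a2     1   5   7
a3     8   7   8

After row reduction (step 1):
      b1  b2  b3
a1     0   1   9
a2     0   4   6
a3     1   0   1

After column reduction (step 2):
      b1  b2  b3
a1     0   1   8
a2     0   4   5
a3     1   0   0

After adjusting by the smallest uncovered value (step 4):
      b1  b2  b3
a1     0   0   7
a2     0   3   4
a3     2   0   0

Annotations on the slide: -0, -7, -0

A further cost matrix shown on the slide:
      b1  b2  b3
a1     0   1   5
a2     1  12  14
a3     4   3   8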

36 Optimization – Reducing Dijkstra Execution
- Heap H = {(a1, b2, 2.5), …, …}
- Dijkstra stops searching since the weight of the SP is higher than the top value in heap H
[Figure: bipartite graph with a1 to a4 and b1 to b4; edge weights 1, 0, 2, 1, 2, 2]

37 Optimization – Reducing Dijkstra Execution
- Heap contents in the example:
  - H = {(a3, b2, 3)}
  - H = {(a3, b1, 4)}
  - H = {(a3, b1, 4), (a1, b3, 5)}
  - H = {(a1, b3, 5), (a3, b3, 8), (a2, b2, 12)}
[Figure: bipartite graph with source S, a1, a2, a3, b1, b2, b3, and destination D]

Distance matrix:
      b1  b2  b3
a1     0   1   5
a2     1  12  14
a3     4   5   8

38 Related Works
- Other solutions:
  - Cost scaling algorithm
  - Capacity scaling algorithm
- These two algorithms are based on SSPA with different optimizations
- Neither of them is suitable for our problem setting, each for a different reason

