
On Optimal Worst-Case Matching. Cheng Long (Hong Kong University of Science and Technology), Raymond Chi-Wing Wong (Hong Kong University of Science and Technology), Philip S. Yu (University of Illinois at Chicago), Minhao Jiang (Hong Kong University of Science and Technology). Presented by Raymond Chi-Wing Wong. Prepared by Raymond Chi-Wing Wong.

Outline: 1. Introduction 2. Problem Definition 3. Related Work 4. Algorithm – Swap-Chain 5. Empirical Study 6. Conclusion

1. Introduction. P = {p1, p2, p3} is a set of hospitals and O = {o1, o2, o3} is a set of residential estates. Some existing studies consider the capacities of hospitals and the demands of customers. The task is to return an assignment between P and O such that the “condition” of the assignment is satisfied. Different applications have different conditions. Worst-case optimized condition: in the assignment, the maximum matching distance (mmd) between a residential estate o and a hospital p is minimized. [Figure: worst-case optimized assignment, all capacities and demands equal to 1, matching distances 4, 5 and 6; mmd = 6.]

1. Introduction. Many applications need the worst-case optimized assignment (mmd = 6 in the running example). 1. Emergency applications (e.g., allocation of hospitals, fire stations and police stations); in the Hong Kong ambulance service, the minimized maximum distance is 12 minutes (driving distance). 2. Logistics and data warehouse allocation. 3. Mail delivery. 4. Profile matching, where the worst-case dissatisfaction rate among customers is minimized.

1. Introduction. Different applications have different conditions. Fair condition [Wong et al, VLDB 2009]: in the assignment, each o ∈ O is allocated to the p ∈ P that (i) is as near to o as possible and (ii) has not exhausted its servicing capacity in serving other, closer estates. [Figure: fair assignment over the same hospitals and residential estates, all capacities and demands equal to 1, matching distances 10, 3 and 2.] Fair assignment: mmd = 10. Worst-case optimized assignment: mmd = 6.
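One naive way to read the fair condition is to scan all (o, p) pairs in ascending order of distance and match a pair whenever o still has unserved demand and p still has spare capacity. The sketch below does exactly that. It is only an illustration, not the algorithm of [Wong et al, VLDB 2009]; the function name `fair_assignment` is hypothetical and the matrix `DIST` is an assumption chosen to agree with the distances shown in the running example.

```python
from typing import List, Tuple

# Assumed distances for the running example (rows o1..o3, columns p1..p3).
DIST = [[5, 3, 9],
        [10, 7, 4],
        [11, 6, 2]]

def fair_assignment(dist: List[List[int]], demand: List[int],
                    capacity: List[int]) -> List[Tuple[int, int, int, int]]:
    """Greedy reading of the fair condition: the closest pairs are served first."""
    demand, capacity = list(demand), list(capacity)
    pairs = sorted((dist[i][j], i, j)
                   for i in range(len(demand)) for j in range(len(capacity)))
    matches = []  # (o index, p index, units served, distance)
    for d, i, j in pairs:
        units = min(demand[i], capacity[j])
        if units > 0:
            matches.append((i, j, units, d))
            demand[i] -= units
            capacity[j] -= units
    return matches

matches = fair_assignment(DIST, [1, 1, 1], [1, 1, 1])
fair_mmd = max(m[3] for m in matches)
print(matches, fair_mmd)
```

On the assumed matrix the greedy scan produces matching distances 2, 3 and 10, which agrees with mmd = 10 for the fair assignment.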

1. Introduction. Different applications have different conditions. Globally optimized condition [U et al, VLDBJ 2010]: the total “cost” of the assignment (i.e., the sum of the matching distances) is minimized. [Figure: globally optimized assignment over the same hospitals and residential estates, all capacities and demands equal to 1, matching distances 7, 5 and 2.] Fair assignment: mmd = 10. Globally optimized assignment: mmd = 7. Worst-case optimized assignment: mmd = 6.
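For a three-by-three instance the globally optimized condition can be checked by brute force. The following sketch enumerates every one-to-one assignment and keeps the one with the smallest total distance. The matrix `DIST` is an assumption chosen to agree with the distances in the running example, and `globally_optimized` is a hypothetical helper, not the method of [U et al, VLDBJ 2010].

```python
from itertools import permutations

# Assumed distances for the running example (rows o1..o3, columns p1..p3).
DIST = [[5, 3, 9],
        [10, 7, 4],
        [11, 6, 2]]

def globally_optimized(dist):
    """Return perm where perm[i] is the p matched to o_i, minimizing the sum of distances."""
    n = len(dist)
    return min(permutations(range(n)),
               key=lambda perm: sum(dist[i][perm[i]] for i in range(n)))

best = globally_optimized(DIST)
total = sum(DIST[i][p] for i, p in enumerate(best))
mmd = max(DIST[i][p] for i, p in enumerate(best))
print(best, total, mmd)
```

Under this assumption the minimum total cost is 14 with matching distances 5, 7 and 2, so the globally optimized assignment has mmd = 7 even though an assignment with mmd = 6 exists.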

1. Introduction. The assignment that satisfies the globally optimized condition is said to be a globally optimized assignment. Fair assignment: mmd = 10. Globally optimized assignment: mmd = 7. Worst-case optimized assignment: mmd = 6. Existing spatial assignment methods therefore do not solve the problem of finding the worst-case optimized assignment well.

2. Problem Definition. Each o ∈ O is associated with its demand o.w (a positive integer), and each p ∈ P is associated with its capacity p.w (a positive integer). Problem: find an assignment between P and O such that the maximum matching distance (mmd) is minimized. [Figure: P = {p1, p2, p3} (hospitals) and O = {o1, o2, o3} (residential estates), all demands and capacities equal to 1; the worst-case optimized assignment has matching distances 4, 5 and 6, so mmd = 6.]
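The objective can be made concrete with a few lines of code. This is a minimal sketch, assuming a dense distance matrix and an assignment given as (o index, p index) pairs; the matrix `DIST` is an assumption chosen to agree with the running example and `mmd` is a hypothetical helper name.

```python
# Assumed distances for the running example (rows o1..o3, columns p1..p3).
DIST = [[5, 3, 9],
        [10, 7, 4],
        [11, 6, 2]]

def mmd(dist, assignment):
    """Maximum matching distance of an assignment given as (o, p) index pairs."""
    return max(dist[o][p] for o, p in assignment)

fair = [(0, 1), (1, 0), (2, 2)]        # distances 3, 10, 2
worst_case = [(0, 0), (1, 2), (2, 1)]  # distances 5, 4, 6
print(mmd(DIST, fair), mmd(DIST, worst_case))
```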

3. Related Work. Bottleneck Matching Problem (BMP): given two sets of objects, A and B, and a matching distance (cost) between each object in A and each object in B, BMP is to find a perfect matching (or assignment) between A and B that minimizes the maximum matching distance. BMP assumes that the demand of each object in A and the capacity of each object in B are both equal to 1. Our problem allows the demand of each object and the capacity of each object to be any positive integer.

3. Related Work. Bottleneck Matching Problem (BMP): Threshold is the fastest algorithm. It has to materialize all pairwise distances, so it does not scale well.

3. Related Work. Spatial Assignment Problem: fair assignment and globally optimized assignment. As described before, they do not address our problem well.

3. Related Work. Major contribution: we propose an efficient and scalable algorithm, called Swap-Chain, for this problem. It is more efficient and scalable than the adapted algorithm for the bottleneck problem.

4. Algorithm – Swap-Chain. Swap-Chain involves the following 3 steps. Step 1 (Initialization). Step 2 (Assignment Adjustment). Step 3 (Iterative Step).

4. Algorithm – Swap-Chain. Step 1 (Initialization): find a full assignment A using a given condition (e.g., fair assignment, globally optimized assignment or random assignment). [Figure: fair assignment with matching distances 10, 3 and 2; mmd = 10.]

4. Algorithm – Swap-Chain. Step 2 (Assignment Adjustment): re-assign some matches in A to form another full assignment A’ such that the mmd value of A’ is smaller than that of A. [Figure: before the adjustment the matching distances are 10, 3 and 2 (mmd = 10); after the adjustment they are 2, 5 and 7 (mmd = 7).]

4. Algorithm – Swap-Chain. Step 3 (Iterative Step): repeat Step 2 until it is no longer possible to perform the assignment adjustment step. [Figure: a second adjustment changes the matching distances from 2, 5 and 7 (mmd = 7) to 6, 5 and 4 (mmd = 6).] After this adjustment, we cannot re-adjust the assignment again to make its mmd value smaller, so this is the final solution for our problem. Step 1 is easy. How can we perform Step 2 (i.e., how do we re-assign some matches in A such that the mmd value of the assignment is decreased)?

4. Algorithm – Swap-Chain. Swap-Chain makes use of extreme matches to re-adjust the assignment in Step 2. Given an assignment A, a match in A is called an extreme match if its matching distance is equal to the mmd value of A.
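The definition of an extreme match translates directly into code. This is a small sketch under the assumption that an assignment is a list of (o index, p index) pairs over a dense distance matrix; `extreme_matches` is a hypothetical helper and `DIST` is an assumed matrix chosen to agree with the running example.

```python
# Assumed distances for the running example (rows o1..o3, columns p1..p3).
DIST = [[5, 3, 9],
        [10, 7, 4],
        [11, 6, 2]]

def extreme_matches(dist, assignment):
    """All matches whose distance equals the mmd value of the assignment."""
    worst = max(dist[o][p] for o, p in assignment)
    return [(o, p) for o, p in assignment if dist[o][p] == worst]

fair = [(0, 1), (1, 0), (2, 2)]
print(extreme_matches(DIST, fair))  # the match with distance 10
```

An assignment can contain several extreme matches when distances tie, which is why the helper returns a list.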

4. Algorithm – Swap-Chain. Consider the assignment obtained just after Step 1 (matching distances 10, 3 and 2; mmd = 10). Step (a): break the extreme match (o, p), here the match with distance 10. Step (b): find a set of objects in O and P to be involved in the assignment adjustment. [Figure: a range query yields a chain from o2 with list p2, o1, p1; the matches with distances 10 and 3 are replaced by matches with distances 7 and 5, and the match with distance 2 is kept.]

4. Algorithm – Swap-Chain. We continue with these sub-steps to reduce the mmd value of the assignment further (matching distances 7, 5 and 2). Step (a): break the extreme match (o, p), here the match with distance 7. Step (b): find a set of objects in O and P to be involved in the assignment adjustment. [Figure: a range query yields a chain from o2 with list p3, o3, p2; the matches with distances 7 and 2 are replaced by matches with distances 4 and 6, and the match with distance 5 is kept.]

4. Algorithm – Swap-Chain. We cannot re-adjust the assignment anymore to reduce its mmd value. [Figure: final assignment with matching distances 5, 4 and 6; mmd = 6.] This is the final solution.
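The walkthrough above can be mimicked for the unit-capacity case with a short sketch. It is a simplification under stated assumptions, not the authors' Swap-Chain: the chain search is a plain alternating-path search over a dense distance matrix (standing in for the range queries on an index), demands and capacities are all 1, and the names `improve_matching` and `DIST` are hypothetical, with `DIST` chosen to agree with the distances on the slides.

```python
# Assumed distances for the running example (rows o1..o3, columns p1..p3).
DIST = [[5, 3, 9],
        [10, 7, 4],
        [11, 6, 2]]

def improve_matching(dist, match_of_o):
    """Repeatedly break an extreme match and re-match along a chain.

    match_of_o[i] is the p matched to o_i in a full one-to-one assignment.
    Returns the final assignment and its mmd value.
    """
    n = len(dist)
    match_of_o = list(match_of_o)
    while True:
        # Step (a): pick an extreme match (o, p) and break it.
        o = max(range(n), key=lambda i: dist[i][match_of_o[i]])
        bound = dist[o][match_of_o[o]]
        owner = {p: i for i, p in enumerate(match_of_o) if i != o}

        # Step (b): look for a chain from o that only adds pairs closer than bound.
        def extend(i, seen):
            for p in range(n):
                if dist[i][p] >= bound or p in seen:
                    continue
                seen.add(p)
                if p not in owner or extend(owner[p], seen):
                    owner[p] = i
                    return True
            return False

        if not extend(o, set()):
            return match_of_o, bound  # no chain exists, so bound cannot be improved
        for p, i in owner.items():
            match_of_o[i] = p

final, final_mmd = improve_matching(DIST, [1, 0, 2])  # start from the fair assignment
print(final, final_mmd)
```

Starting from the fair assignment (distances 3, 10, 2), the sketch passes through the assignment with distances 5, 7, 2 and stops at distances 5, 4, 6, the same sequence of mmd values (10, 7, 6) as in the walkthrough.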

4. Algorithm – Swap-Chain. The algorithm has to perform range queries on P, so we build an index on P. The time complexity of building the index on P is O(n log n).

4. Algorithm – Swap-Chain. The time complexity of Swap-Chain is O(R · n · (log n + k)), where R is the number of extreme matches found in Swap-Chain and k << n. R is typically a small number: in our experiments, R is equal to 500 on average when the dataset size is 1M.

4. Algorithm – Swap-Chain. The space occupied by Swap-Chain mainly comes from the index on P (which is O(n log n)).

4. Algorithm – Swap-Chain. Swap-Chain can be extended to handle the non-spatial problem. Due to the time limit, we do not discuss the details here.

5. Empirical Study. Synthetic dataset: P and O follow a uniform distribution. Real datasets: 4 datasets in Canada, namely AB (Alberta), BC (British Columbia), ON (Ontario) and QC (Quebec). For each dataset, O is a set of populated areas and P is a set of fire stations.

5. Empirical Study. Measurements: execution time and memory. Algorithm: our proposed Swap-Chain. Two sets of experiments: (1) comparison with existing spatial assignment methods; (2) comparison with an adapted algorithm for the bottleneck problem, Threshold-Adapt (TA).

5. Empirical Study. First set: comparison with existing spatial assignment. [Figures: results on the real dataset and on the synthetic dataset.]

5. Empirical Study. Second set: comparison with the adapted algorithm Threshold-Adapt (TA). [Figures: results on the real dataset and on the synthetic dataset; the synthetic plot has an axis labeled “(in thousands)”.]

6. Conclusion. Problem: finding the worst-case optimized assignment. Algorithm: Swap-Chain, which is efficient and scalable. Experiments.

Q&A

1. Introduction. Bichromatic Reverse Nearest Neighbor (BRNN or RNN). Given: P and O are two sets of objects in the same data space. Problem: given an object p ∈ P, a BRNN query finds all the objects o ∈ O whose nearest neighbor (NN) in P is p.
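A BRNN query can be answered by brute force when a distance matrix is available. The sketch below assumes such a matrix (rows are objects in O, columns are objects in P) instead of coordinates; `brnn` is a hypothetical helper and `DIST` is an assumed matrix chosen to agree with the running example. Ties for the nearest neighbor are broken in favor of the lower p index, which is also an assumption.

```python
# Assumed distances for the running example (rows o1..o3, columns p1..p3).
DIST = [[5, 3, 9],
        [10, 7, 4],
        [11, 6, 2]]

def brnn(dist, p):
    """Indices of the objects o whose nearest neighbor in P is p."""
    result = []
    for o, row in enumerate(dist):
        nearest = min(range(len(row)), key=row.__getitem__)
        if nearest == p:
            result.append(o)
    return result

print([brnn(DIST, p) for p in range(3)])
```

The query ignores capacities and demands: in this instance one p attracts two objects of O while another attracts none.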

1. Introduction. NN: nearest neighbor. RNN: reverse nearest neighbor. [Figure: P = {p1, p2, p3} (hospitals) and O = {o1, o2, o3} (residential estates), annotated with “NN in P = p3”, “NN in P = p2”, “RNN = {o2, o3}”, “RNN = {o1}” and “RNN = {}”.] Capacities of hospitals are not considered. Demands of customers are not considered. However, p3 has a serving capacity.

3. Related Work. Bottleneck Matching Problem (BMP): given two sets of objects, A and B, and a matching distance (cost) between each object in A and each object in B, BMP is to find a perfect matching (or assignment) between A and B that minimizes the maximum matching distance. BMP assumes that the demand of each object in A and the capacity of each object in B are both equal to 1, whereas our problem allows them to be any positive integer. One may come up with a straightforward solution to our problem as follows: duplicate each object p in P p.w times, duplicate each object o in O o.w times, and then use an existing algorithm for BMP. However, this approach is cumbersome and undesirable, especially when the capacities/demands are very large.
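The duplication step, and why it becomes cumbersome, can be sketched as follows. The helper name `expand_to_unit_objects` and the example weights are hypothetical; the point is only that the expanded instance has one copy per unit of demand or capacity, so the number of pairwise distances grows with the product of the total demand and the total capacity.

```python
def expand_to_unit_objects(names, weights):
    """Duplicate each object w times so that every copy has demand/capacity 1."""
    return [(name, k) for name, w in zip(names, weights) for k in range(w)]

# Hypothetical weights: two hospitals with capacities 2 and 3, two estates with demands 4 and 1.
p_copies = expand_to_unit_objects(["p1", "p2"], [2, 3])
o_copies = expand_to_unit_objects(["o1", "o2"], [4, 1])
pairwise = len(p_copies) * len(o_copies)  # distances a BMP solver would have to consider
print(len(p_copies), len(o_copies), pairwise)
```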

Outline: 1. Introduction 2. Problem Definition 3. Related Work 4. Algorithm – Threshold-Adapt 5. Algorithm – Swap-Chain 6. Empirical Study 7. Conclusion

4. Algorithm – Threshold-Adapt. Threshold-Adapt is an algorithm that searches for the “best” solution in the solution space. Threshold-Adapt (for demands/capacities equal to any positive integer) shares the same skeleton as Threshold (for demands/capacities equal to 1). However, this algorithm is not scalable.

4. Algorithm – Threshold-Adapt. Before we introduce this algorithm, we give two concepts. Concept 1: full assignment. Concept 2: feasibility.

4. Algorithm – Threshold-Adapt. Concept 1: full assignment. Suppose that the total demand from O is at most the total capacity from P. An assignment A between O and P is said to be full if each object in O is matched with an object in P. [Figure: the assignment with matching distances 10, 3 and 2 is a full assignment; the assignment that contains only the matches with distances 10 and 3 is a non-full assignment.] We want to find the full assignment with the smallest mmd value. The full assignment with matching distances 10, 3 and 2 has mmd = 10, while the full assignment with matching distances 4, 5 and 6 has mmd = 6; we choose the latter since it has the smallest mmd value.

4. Algorithm – Threshold-Adapt. Concept 2: feasibility. A value is said to be feasible for our problem if there exists a full assignment whose mmd value is at most this value. Consider the value 6: there exists a full assignment (matching distances 4, 5 and 6) whose mmd value is at most 6, so 6 is feasible. Consider the value 10: there exists a full assignment whose mmd value is at most 10, so 10 is feasible; there can be more than one such assignment (e.g., the one with matching distances 4, 5 and 6 and the one with matching distances 10, 3 and 2). Consider the value 1: there does not exist a full assignment whose mmd value is at most 1, so 1 is not feasible.
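For a tiny unit-capacity instance, feasibility can be checked straight from the definition by enumerating every full assignment. This brute-force sketch is only meant to illustrate the concept; `is_feasible` is a hypothetical helper and `DIST` is an assumed matrix chosen to agree with the running example.

```python
from itertools import permutations

# Assumed distances for the running example (rows o1..o3, columns p1..p3).
DIST = [[5, 3, 9],
        [10, 7, 4],
        [11, 6, 2]]

def is_feasible(dist, v):
    """True if some full one-to-one assignment has mmd at most v."""
    n = len(dist)
    return any(all(dist[o][perm[o]] <= v for o in range(n))
               for perm in permutations(range(n)))

print([v for v in (1, 5, 6, 10) if is_feasible(DIST, v)])
```

Feasibility is monotone: once a value is feasible, every larger value is feasible too, which is what makes a binary search over candidate values possible.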

4. Algorithm – Threshold-Adapt. We have described the two concepts (Concept 1: full assignment; Concept 2: feasibility). We present Threshold-Adapt next.

4. Algorithm – Threshold-Adapt. Let S be the set of all pairwise distances between O and P. In the running example, S = {5, 3, 9, 10, 7, 4, 11, 6, 2}. Observation: the optimal mmd value (here, 6) is in S. Step 1: for each value v in S, determine whether v is feasible for our problem. Step 2: find the smallest value v that is feasible. Step 3: return the full assignment whose mmd value is equal to v. There are two remaining issues. Issue 1: how to determine whether a value v is feasible; this can be done by a maximum-flow algorithm. Issue 2: how to improve the efficiency of this algorithm; it can be sped up by binary search.
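The steps above can be sketched as a binary search over the sorted distinct distances with a feasibility test at each probe. This is a minimal illustration under several assumptions, not the authors' implementation: the feasibility test uses a simple augmenting-path search over unit copies of each demand (standing in for a general maximum-flow algorithm), a demand may be split across several objects of P, only the optimal mmd value is returned rather than the assignment, and the names `served_units` and `threshold_adapt` are hypothetical.

```python
def served_units(dist, demand, capacity, v):
    """Number of demand units that can be served using only pairs with distance <= v."""
    n_p = len(capacity)
    units = [o for o in range(len(demand)) for _ in range(demand[o])]
    served = [[] for _ in range(n_p)]  # units currently assigned to each p

    def try_assign(u, seen):
        o = units[u]
        for p in range(n_p):
            if dist[o][p] > v or seen[p]:
                continue
            seen[p] = True
            if len(served[p]) < capacity[p]:
                served[p].append(u)
                return True
            for k, other in enumerate(served[p]):
                if try_assign(other, seen):  # move another unit elsewhere
                    served[p][k] = u
                    return True
        return False

    return sum(try_assign(u, [False] * n_p) for u in range(len(units)))

def threshold_adapt(dist, demand, capacity):
    """Smallest feasible value in S, or None if no full assignment exists."""
    values = sorted({d for row in dist for d in row})
    total = sum(demand)
    if served_units(dist, demand, capacity, values[-1]) < total:
        return None
    lo, hi = 0, len(values) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if served_units(dist, demand, capacity, values[mid]) == total:
            hi = mid
        else:
            lo = mid + 1
    return values[lo]

# Assumed distances for the running example (rows o1..o3, columns p1..p3).
DIST = [[5, 3, 9], [10, 7, 4], [11, 6, 2]]
print(threshold_adapt(DIST, [1, 1, 1], [1, 1, 1]))
```

Note that the sketch still materializes all pairwise distances to build S, which is the scalability concern raised for this family of algorithms.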

4. Algorithm – Threshold-Adapt. The time complexity of Threshold-Adapt is O(n^2 + F · log n), where F is the time complexity of the maximum-flow algorithm.

4. Algorithm – Threshold-Adapt. The space complexity of Threshold-Adapt is O(n^2). This algorithm is not scalable.

4. Algorithm – Swap-Chain. The time complexity of Swap-Chain is O(T1 + T2 + R · I), where T1 is the time complexity of Step 1 (T1 = O(n log n) if the fair assignment is used), T2 is the time complexity of building the index on P, R is the number of extreme matches found in Swap-Chain, and I is the time complexity of performing the re-matching operation for a given extreme match, with I = O(n (log n + k)) where k << n. R is typically a small number: in our experiments, R is equal to 500 on average when the dataset size is 1M.

6. Empirical Study. Non-spatial problem. [Figure: results on the real dataset.]
