Randomized Online Algorithm for Minimum Metric Bipartite Matching Adam Meyerson UCLA.

Presentation transcript:

A Recent Result Randomized Online Algorithm for Minimum Metric Bipartite Matching. –Joint work with two UCLA students (one recently graduated): Akash Nanavati and Laura Poplawski. –Recently presented at Symposium on Discrete Algorithms (SODA) 2006.

Bipartite Matching Pair up each red node with a blue node. One-to-one pairing. Each pair of nodes must have an edge between them.

Min-Cost Bipartite Matching Each edge has a cost. Find a matching of red nodes with blue nodes. Minimize the total cost of the edges between matched pairs.
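To make the offline objective concrete, here is a minimal sketch that finds a minimum-cost perfect matching by trying every pairing. The function name `min_cost_matching` and the cost matrix are illustrative assumptions, not part of the talk, and brute force is only practical for a handful of nodes.

```python
from itertools import permutations


def min_cost_matching(cost):
    """Return (total_cost, assignment) for a square cost matrix.

    cost[i][j] is the cost of the edge between red node i and blue node j.
    assignment[i] is the blue node paired with red node i.
    Tries all k! pairings, so this is only a sketch for tiny inputs.
    """
    k = len(cost)
    best_total, best_perm = None, None
    for perm in permutations(range(k)):
        total = sum(cost[i][perm[i]] for i in range(k))
        if best_total is None or total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical 3x3 instance: rows are red nodes, columns are blue nodes.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
total, assignment = min_cost_matching(cost)
print(total, assignment)
```

In practice a polynomial-time method (for example the Hungarian algorithm or a min-cost flow) would replace the brute-force loop.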

Importance of Matching Task assignment problems. Measuring data similarity. Relationship to network flow. Subroutine in many other algorithms.

Online Matching We’re given only the red nodes. Blue nodes are designated one at a time. As each blue node is designated we must match it to an unmatched red node.

Why Online Matching? Assigning tasks to consultants (or jobs to machines without migration). Updating existing matchings “on the fly” without making major modifications. Possible subroutine for other online problems.

Online Matching is Hard

Simplifying the Problem We can assume that distances between nodes form a metric. –Satisfy symmetry: d(x,y)=d(y,x) –Satisfy triangle inequality: d(x,y)+d(y,z) ≥ d(x,z) This assumption will hold for distances on a surface (for example) and is common for many problems (like traveling salesman).
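The two metric conditions can be checked mechanically on a finite point set. The sketch below is an illustration only; `is_metric`, the tolerance, and the sample points are assumptions introduced here.

```python
import math
from itertools import product


def is_metric(points, d, tol=1e-12):
    """Check symmetry and the triangle inequality for d on a finite point set."""
    # Symmetry: d(x, y) == d(y, x) for every pair.
    for x, y in product(points, repeat=2):
        if abs(d(x, y) - d(y, x)) > tol:
            return False
    # Triangle inequality: d(x, y) + d(y, z) >= d(x, z) for every triple.
    for x, y, z in product(points, repeat=3):
        if d(x, y) + d(y, z) < d(x, z) - tol:
            return False
    return True


# Euclidean distance in the plane satisfies both conditions.
plane_points = [(0, 0), (3, 4), (1, 1), (6, 2)]
print(is_metric(plane_points, math.dist))

# Squared distance on a line is symmetric but breaks the triangle inequality:
# d(0,1) + d(1,2) = 2, while d(0,2) = 4.
line_points = [0, 1, 2]
print(is_metric(line_points, lambda a, b: (a - b) ** 2))
```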

Measuring Success After all the nodes have arrived, we will have some final matching M. Let the cost of this matching be C(M). If we had known all the red and blue nodes initially, we could compute the minimum-cost matching M*. The competitive ratio of our algorithm is the maximum, over all sequences of blue nodes, of C(M)/C(M*). We would like this to be small.

Some Points about the Model We will allow co-located red and blue nodes. These can be matched for cost zero (they are distance zero apart). Not every node needs to be either red or blue; there could be nodes of the graph which are not supposed to be matched. The underlying metric could be infinite (for example the Euclidean plane) provided distances can be computed easily.

“Obvious” Solution: Greedy When a blue node is designated, match it to the closest unmatched red node. –Simple, easy to implement. –How could we do better? Yet, there is a sequence of inputs (a graph, set of red nodes, and ordered set of blue nodes) such that greedy gives a very bad solution.
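A minimal sketch of the greedy rule, assuming nodes are points on a line and distance is the absolute difference. The names and the sample instance are illustrative; ties here go to whichever red node appears first in the list.

```python
def greedy_online_matching(reds, blues, dist):
    """Match each arriving blue node to the closest unmatched red node.

    reds: list of red node positions (known up front).
    blues: list of blue node positions, in arrival order.
    dist: distance function on positions.
    Returns (total_cost, pairs) where pairs is a list of (blue, red).
    """
    unmatched = list(reds)
    pairs = []
    total = 0
    for b in blues:
        # Closest unmatched red node; min() keeps the first one on ties.
        r = min(unmatched, key=lambda x: dist(b, x))
        unmatched.remove(r)
        pairs.append((b, r))
        total += dist(b, r)
    return total, pairs


# Hypothetical instance on a line.
total, pairs = greedy_online_matching(
    reds=[0, 5, 10], blues=[4, 6, 1], dist=lambda a, b: abs(a - b)
)
print(total, pairs)
```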

Greed is Bad [Figure: example instance on which greedy performs badly; labels read 2, 2, 4, 8, …]

How bad is greedy? We pay … = 2^k in total. –k = number of red nodes The best (minimum-cost) matching would pay only a cost of 1. So greedy is really bad in the worst case!
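The exponential gap can be reproduced on a small instance. The sketch below builds its own instance on a line, which is an assumption and not necessarily the one drawn on the slides: greedy pays 2^k - 1 while the optimum pays 1. Ties are broken toward the leftmost red node, playing the role of an adversarial tie-break, and the optimum uses the fact that on a line, pairing points in sorted order is a minimum-cost matching.

```python
def greedy_cost(reds, blues):
    """Greedy on a line; ties go to the leftmost (smallest) red position."""
    unmatched = list(reds)
    total = 0
    for b in blues:
        r = min(unmatched, key=lambda x: (abs(b - x), x))
        unmatched.remove(r)
        total += abs(b - r)
    return total


def optimal_cost_on_line(reds, blues):
    """On a line, matching in sorted order minimizes total |red - blue|."""
    return sum(abs(r - b) for r, b in zip(sorted(reds), sorted(blues)))


def bad_instance(k):
    """Hypothetical bad instance with k red nodes.

    Reds sit at 1, -1, -3, -7, ...; the first blue arrives at 0 and every
    later blue arrives on top of the red node greedy has just used up.
    """
    tail = [-(2 ** i - 1) for i in range(1, k)]
    return [1] + tail, [0] + tail


for k in range(1, 7):
    reds, blues = bad_instance(k)
    print(k, greedy_cost(reds, blues), optimal_cost_on_line(reds, blues))
```

Each step forces greedy to pay double the previous step, and the last blue node is left with only the red node at position 1.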

Is there a better algorithm? Permutation algorithm –Khuller, Mitchell, Vazirani ToCS –At worst (2k-1) times the best matching. But isn’t a (2k-1) factor really bad? It seems that no algorithm can do better…

 (2k-1) Competitivity

Eventual Cost Comparison Our Algorithm pays 2k-1. Optimum pays only 1. How did this happen? –“Adversary” always knew which red node we would match up. –Next blue node designated is always really close to the last red node we matched.

Onward to New Results! This (2k-1)-competitive matching result is all that was previously known. People started to work on special kinds of graphs… We consider randomized algorithms. –“Adversary” does not know our coin flips in advance! –Randomization has helped in the past for other online problems (paging).

Our Result Joint work with Akash Nanavati and Laura Poplawski. We design a randomized online algorithm which obtains O(log^3 k) competitive matching. –Dramatically better for large k.

Greedy Returns! Our algorithm is a randomized greedy. –As each blue node is designated, match to the closest unmatched red node. –If there’s a tie, break it by choosing uniformly at random.
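A minimal sketch of randomized greedy, assuming distances can be compared exactly (integers here) so that ties are well defined. The function name, the `rng` parameter, and the uniform-metric example are illustrative assumptions.

```python
import random


def randomized_greedy(reds, blues, dist, rng=None):
    """Match each arriving blue node to a closest unmatched red node.

    When several unmatched red nodes are equally close, one of them is
    chosen uniformly at random. Returns (total_cost, pairs).
    """
    rng = rng or random.Random()
    unmatched = list(reds)
    pairs = []
    total = 0
    for b in blues:
        best = min(dist(b, r) for r in unmatched)
        # All unmatched red nodes at the minimum distance (exact comparison).
        ties = [r for r in unmatched if dist(b, r) == best]
        r = rng.choice(ties)
        unmatched.remove(r)
        pairs.append((b, r))
        total += dist(b, r)
    return total, pairs


def uniform(a, b):
    """Uniform metric: co-located nodes cost 0, any other pair costs 1."""
    return 0 if a == b else 1


# Hypothetical instance: the first blue node (0) ties with every red node.
total, pairs = randomized_greedy([1, 2, 3], [0, 1, 2], uniform, random.Random(0))
print(total, pairs)
```

On this instance the optimum costs 1 (match blue 0 to red 3), while randomized greedy pays between 1 and 3 depending on its coin flips.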

But greedy doesn’t work… The same bad example from before will kill us (again). But greedy does work on some special graphs. On the star example, randomized greedy will cost an expected O(log k) times the optimum.

Our Main Theorem Randomized greedy works on an α-HST. This is a tree where: –All children of a node are equidistant from that node. –Distance from child to parent is (1/α) times distance from parent to grandparent. [Figure: edge lengths 1, α, α^2 from the leaves toward the root]
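The tree distances described above can be computed directly. This sketch makes several assumptions of its own: the scaling factor is called `alpha`, all leaves sit at the same depth, a leaf is identified by its tuple of child indices on the path from the root, and edges touching the leaves have length `leaf_edge`.

```python
def hst_distance(u, v, alpha, leaf_edge=1):
    """Distance between two leaves of a hierarchically separated tree.

    u, v: equal-length tuples of child indices from the root to the leaf.
    Edges one level above the leaves are alpha times longer than the leaf
    edges, the next level up alpha times longer again, and so on.
    """
    if len(u) != len(v):
        raise ValueError("leaves must be at the same depth")
    depth = len(u)
    # Depth of the lowest common ancestor = length of the shared prefix.
    lca = 0
    while lca < depth and u[lca] == v[lca]:
        lca += 1
    # The edge from depth j to depth j + 1 has length
    # leaf_edge * alpha ** (depth - 1 - j).
    one_way = sum(leaf_edge * alpha ** (depth - 1 - j) for j in range(lca, depth))
    # The path climbs from u to the common ancestor and descends to v.
    return 2 * one_way


# Depth-2 tree with alpha = 2: top edges have length 2, leaf edges length 1.
print(hst_distance((0, 0), (0, 1), alpha=2))  # siblings
print(hst_distance((0, 0), (1, 0), alpha=2))  # different top-level subtrees
```

Leaves in different top-level subtrees are always farther apart than leaves sharing a subtree, which is what lets the analysis treat each subtree recursively.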

The Inductive Step Consider the root of the tree. Let m_i be the difference between the number of red and blue nodes in subtree i. OPT must match at least M = (∑_i m_i)/2 nodes outside their subtree. We will bound the number we match outside their subtree online.
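As a small worked example of this lower bound, the sketch below computes the per-subtree imbalances and M from red and blue counts. It assumes the "difference" is an absolute difference and that the total numbers of red and blue nodes are equal; the function name and the counts are illustrative.

```python
def forced_outside_matches(red_counts, blue_counts):
    """Return (imbalances, M) for the subtrees hanging off the root.

    red_counts[i], blue_counts[i]: number of red / blue nodes in subtree i.
    Each surplus node must be matched across the root, and every such
    match uses one surplus red and one surplus blue, hence the division by 2.
    """
    if sum(red_counts) != sum(blue_counts):
        raise ValueError("sketch assumes equally many red and blue nodes")
    imbalances = [abs(r - b) for r, b in zip(red_counts, blue_counts)]
    # The sum is even whenever the totals agree, so integer division is exact.
    return imbalances, sum(imbalances) // 2


# Hypothetical counts for three subtrees.
imbalances, M = forced_outside_matches([3, 1, 2], [1, 2, 3])
print(imbalances, M)
```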

The Key Lemma Let m_i* be the number of blue nodes from subtree i which our algorithm matches outside subtree i. We will show that ∑_i m_i* ≤ 2M ln k + 1. –Here k = total number of blue nodes. Pf: Let μ_t = number of blue nodes matched outside when t blue nodes have yet to arrive. Let Φ_t = ∑ 2 ln x. This sum is over i such that a future blue node will arrive which cannot be matched to a red node within its subtree.

Completing Lemma Proof Initially, μ_t + Φ_t ≤ 2M ln k. At each time step, the value of Φ_t can only change if we match a node outside its subtree. At this point we might “bump” a later node by matching to its designated red node (i.e. we pick the wrong subtree). The new value for the potential is: E[Φ_{t-1}] ≤ Φ_t - 1. We conclude that at termination, we have the required bound.

Using the Lemma We can bound the total cost of the matching by the cost of nodes matched outside their subtree, plus the cost of matching within the subtrees. This second value can be bounded using induction. The cost of matching within the subtrees cannot be directly compared to optimum, because outside nodes matched within the subtree might be matched to the “wrong” places.

Completing the Proof Cost_OPT(T) = ∑_i Cost_OPT(S_i) + 2Mβ. Cost_US(T) ≤ ∑_i Cost_US(S_i*) + ∑_i m_i*(2β). Here S_i* represents the set of red and blue nodes in S_i not matched outside the tree. β is the ratio of the sum of all distances from leaf to root between one level and the next (this can be bounded easily in terms of α).

Applying Induction If we have a competitive ratio of R, we can conclude that Cost_US(S_i*) ≤ R Cost_OPT(S_i*) inductively. It remains to relate the cost of S_i* to that of S_i. We could try to match to the same places as OPT does; the problem is “wrong” matches from outside. However, there are at most m_i* such wrong matches. Each costs no more than 2β to correct. Cost_US(S_i*) ≤ R(Cost_OPT(S_i) + 2β m_i*)

Finishing HST Result Now it’s just a matter of algebra and solving for R. We manage to show that: –O(log k)-competitive on α-HST –Require that α ≥ Ω(log k) –These bounds on performance of randomized greedy on k-HST are tight; to improve the result a new algorithm would be needed.

So what? An α-HST is not a very general graph. However, there is a series of recent results on metric embeddings. In particular, a result of Fakcharoenphol, Rao, and Talwar in STOC: –Any metric can be transformed via a randomized mapping into an α-HST, such that no distance is contracted, and the expected expansion of any distance is at most O(α log n).

Randomized Online Matching! Transform the original metric using the [FRT] result into a (log k)-HST. Use randomized greedy to match blue nodes with red nodes as they arrive, based on the (random) distances in the HST. Expected cost of matching will be within O(log^2 k log n) of optimum.

A Simple Trick We actually only care about maintaining the relative positions of red nodes. This enables us to use a simple trick (matching blue nodes via the nearest red node neighbor) to improve the result to O(log^3 k).

Future Work (in progress) We hope to design a good randomized algorithm for k-server. –Police cars at various locations, emergencies arise one at a time, must move a police car to each emergency. –Similar to matching, but can move the same police car multiple times. –Again 2k-1 result known (deterministic) but randomized may be able to do better!