Embedding Metrics into Geometric Spaces Anupam Gupta Carnegie Mellon University 3ème cycle romand de Recherche Opérationnelle, 2010 Seminar, Lecture #2
Metric space M = (V, d): a set V of points; symmetric, non-negative distances d(x,y); triangle inequality d(x,y) ≤ d(x,z) + d(z,y); d(x,x) = 0. [Figure: points x, y, z]
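The axioms on this slide can be checked mechanically on a finite distance table. A minimal sketch, assuming the metric is stored as a dict of dicts `d[x][y]`; that representation and the name `is_metric` are illustrative choices, not part of the lecture.

```python
from itertools import product

def is_metric(points, d, tol=1e-12):
    """Check the metric axioms on a finite point set.

    `d[x][y]` is the distance table (a hypothetical representation).
    """
    for x in points:
        if abs(d[x][x]) > tol:                   # d(x,x) = 0
            return False
    for x, y in product(points, repeat=2):
        if d[x][y] < -tol:                       # non-negativity
            return False
        if abs(d[x][y] - d[y][x]) > tol:         # symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d[x][y] > d[x][z] + d[z][y] + tol:    # triangle inequality
            return False
    return True

# Path x - z - y with unit edges: a metric.
path = {"x": {"x": 0, "y": 2, "z": 1},
        "y": {"x": 2, "y": 0, "z": 1},
        "z": {"x": 1, "y": 1, "z": 0}}
# Stretching d(x,y) to 3 breaks the triangle inequality through z.
bad = {"x": {"x": 0, "y": 3, "z": 1},
       "y": {"x": 3, "y": 0, "z": 1},
       "z": {"x": 1, "y": 1, "z": 0}}
```

Here `is_metric(["x", "y", "z"], path)` holds, while the table `bad` fails the triangle inequality.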
in the previous lecture… We saw: embeddings into distributions over trees; β-padded decompositions. Today: padded decompositions and approximation algorithms; embeddings into geometric spaces (ℓp spaces), in particular ℓ1 and ℓ2 (Euclidean) space.
Padded decompositions and approximations
recall from last lecture A metric (V,d) admits β-padded decompositions if, for every Δ, we can output a random partition V = V1 ⊎ V2 ⊎ … ⊎ Vk such that each Vj has diameter ≤ Δ, and Pr[ x and y in different clusters ] ≤ β · d(x,y) / Δ. Theorem: Every n-point metric admits O(log n)-padded decompositions.
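To make the definition concrete, here is a sketch of one standard way to produce such a random partition: pick a random radius and a random order on the points, and let each point join the first center in that order that is within the radius. This is an assumption about the construction (it follows the Calinescu-Karloff-Rabani style of partitioning); the slides do not say which procedure the previous lecture used. The names `random_partition` and `diameter` are illustrative.

```python
import random
from itertools import combinations

def random_partition(points, d, delta, rng=random):
    """Random partition with every cluster of diameter <= delta.

    Assumption: a CKR-style construction, not necessarily the one
    from the previous lecture. `d(x, y)` is the metric.
    """
    radius = rng.uniform(delta / 4, delta / 2)   # random radius R <= delta/2
    order = list(points)
    rng.shuffle(order)                           # random priority order on centers
    clusters = {}
    for v in points:
        # v joins the first center (in the random order) within distance R;
        # v itself is a candidate, so a center always exists.
        center = next(c for c in order if d(c, v) <= radius)
        clusters.setdefault(center, []).append(v)
    return list(clusters.values())

def diameter(cluster, d):
    """Largest pairwise distance inside a cluster (0 for singletons)."""
    return max((d(x, y) for x, y in combinations(cluster, 2)), default=0.0)
```

Any two points in a cluster are within R of the same center, so the cluster diameter is at most 2R ≤ Δ. The logarithmic bound on the separation probability needs a separate analysis and is not verified by this code.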
multi-cut Given graph G = (V,E) with k source-sink pairs Find the fewest edges to delete to separate all source-sink pairs NP-hard, APX-hard for k ≥ 3. Best known: O(log k) approximation [Garg Vazirani Yannakakis]
relaxation of multi-cut Suppose we assign lengths to edges such that shortest-path-distance(s_p, t_p) ≥ 1 for all p. One possible setting: length of each edge cut by OPT = 1, all others = 0; total length = OPT. So find the (fractional) setting of lengths that minimizes total length: its total length is at most OPT, and it can be found by linear programming.
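Written out, the relaxation described on this slide is the following linear program. The symbol \ell_e for the length of edge e is notation introduced here for illustration.

```latex
\begin{align*}
\text{minimize}   \quad & \sum_{e \in E} \ell_e \\
\text{subject to} \quad & \sum_{e \in P} \ell_e \ge 1
    && \text{for every path } P \text{ from } s_p \text{ to } t_p,\ \text{for every pair } p, \\
                        & \ell_e \ge 0 && \text{for every } e \in E.
\end{align*}
```

The path constraints say exactly that the shortest-path distance between s_p and t_p is at least 1. There can be exponentially many paths, but a violated constraint can be found with a shortest-path computation.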
algorithm idea Given such fractional edge-lengths (with total length L ≤ OPT), use these lengths to figure out which edges to cut. If the cut edges separate every source-sink pair and E[ number of edges cut ] ≤ O(log n) × L, we’d have a logarithmic approximation!
randomized algorithm for multi-cut Given lengths on edges with shortest-path-distance(s_p, t_p) ≥ 1 for all p. Take an O(log n)-padded decomposition of this metric with Δ = 0.99, and cut every edge whose endpoints lie in different clusters. Facts: Each terminal pair is separated. Pr[ edge e cut ] ≤ length(e) × O(log n)
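Both facts follow in a line or two from the padded-decomposition guarantee, taken here in the form Pr[x, y separated] ≤ β · d(x,y)/Δ with β = O(log n). A short derivation, where d is the shortest-path metric induced by the edge lengths:

```latex
\begin{align*}
\operatorname{diam}(V_j) \le \Delta = 0.99 < 1 \le d(s_p, t_p)
  &\;\Longrightarrow\; s_p, t_p \text{ lie in different clusters}, \\
\Pr[\,e = (u,v) \text{ cut}\,] \le \beta \cdot \frac{d(u,v)}{\Delta}
  \le \frac{\beta}{0.99}\,\mathrm{length}(e)
  &= O(\log n) \cdot \mathrm{length}(e), \\
\mathbb{E}[\#\text{edges cut}] = \sum_{e \in E} \Pr[\,e \text{ cut}\,]
  \le O(\log n) \sum_{e \in E} \mathrm{length}(e)
  &= O(\log n) \cdot L \le O(\log n) \cdot \mathrm{OPT}.
\end{align*}
```

The second line uses d(u,v) ≤ length(e), since the edge itself is one path between its endpoints.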
Embeddings into ℓp spaces
ℓp spaces Consider real space R^m with the ℓp metrics: for x, y in R^m and 1 ≤ p < ∞, ‖x – y‖_p = (Σ_i |x_i – y_i|^p)^{1/p}, and ‖x – y‖_∞ = max_i |x_i – y_i|. Since we will deal with finite (n-point) metrics, we can (and will) give embeddings into finite dimensions. In this lecture, we will ignore the dimensionality and just say ℓp.
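For reference, the ℓp distance is a one-liner to compute. A minimal sketch; the function name `lp_distance` and the convention of passing `float("inf")` for ℓ∞ are illustrative choices.

```python
def lp_distance(x, y, p):
    """Return ||x - y||_p for 1 <= p <= infinity (pass p=float('inf'))."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == float("inf"):
        return max(diffs)                        # largest coordinate gap
    return sum(t ** p for t in diffs) ** (1.0 / p)
```

For the points (0, 0) and (3, 4) this gives 7 in ℓ1, 5 in ℓ2 and 4 in ℓ∞.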
inter-relationships For 1 ≤ p ≤ q ≤ 2: ℓq embeds into ℓp. (Next lecture, we’ll see some ideas behind ℓ2 into ℓ1.) For 2 ≤ q ≤ ∞: ℓ2 embeds into ℓq. Everything embeds into ℓ∞.
Focus on ℓ1: the “Manhattan” or “taxicab” metric; ℓ2: Euclidean space. And introducing: ℓ2-squared, the square of the Euclidean distance (which does not in general satisfy the triangle inequality). Henceforth, consider only those ℓ2-squared distances that are metrics.
recall: distortion Let M = (V,d), let M’ = (V’,d’), and let f: V → V’. stretch(f) = max_{x,y in V} d’(f(x), f(y)) / d(x, y). contraction(f) = max_{x,y in V} d(x, y) / d’(f(x), f(y)). distortion(f) = stretch(f) × contraction(f). M is “close” to M’ if there exists a “low”-distortion map f: M → M’
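On a finite metric the distortion of a given map can be computed directly from the definition. A minimal sketch, assuming f is injective so that no image distance is zero; the function name and the example (a 4-cycle mapped to the corners of a unit square) are illustrative.

```python
from itertools import combinations

def distortion(points, d, d_prime, f):
    """stretch(f) * contraction(f) over all pairs of distinct points.

    Assumes f is injective, so d_prime(f(x), f(y)) > 0 for x != y.
    """
    pairs = list(combinations(points, 2))
    stretch = max(d_prime(f(x), f(y)) / d(x, y) for x, y in pairs)
    contraction = max(d(x, y) / d_prime(f(x), f(y)) for x, y in pairs)
    return stretch * contraction

# Example: shortest-path metric of a 4-cycle, mapped to a unit square in l2.
corners = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
cycle_d = lambda u, v: min((u - v) % 4, (v - u) % 4)
euclid = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
```

Adjacent vertices keep distance 1, while opposite vertices shrink from 2 to √2, so this particular map has stretch 1, contraction √2 and distortion √2.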
Some jargon We say that a metric (V,d) “embeds in ℓp” or “is in ℓp” if it embeds isometrically (i.e., with distortion 1) into ℓp Hence, we are using ℓp to denote both the metric space, and also the class of metrics that embed into ℓp. (I’ll be careful when it’s ambiguous.)
Is every metric in ℓ2?
Is every metric in ℓ1?
central question today Given a metric (V,d) how well does it embed into ℓp spaces?
general n-point metrics Upper bounds: Lower bounds:
tree metrics (the simplest class of metrics)
planar metrics (very common subclass of metrics)
algorithmic question These were all “uniform” results. What about the algorithmic question: Given a metric (X,d), what is the smallest distortion D possible for embedding this metric? ℓ2 : use semi-definite programming ℓ1 : NP-hard open question: o((log n)½) approximation for this problem.
today’s menu: padded decompositions and multicut; the sparsest cut and ℓ1 embeddings; general metric → ℓ1 with distortion O(log n); there exists a metric for which embedding → ℓ2 requires distortion Ω(log n)
finding balanced separators Given an edge-weighted graph G, divide the vertex-set into two parts such that 1) each part contains ≈ half the vertices 2) weight of edges between two parts is small. Useful for divide-and-conquer algorithms.
the sparsest cut problem Given an edge-weighted graph G, divide the vertex-set into two parts (S, V\S) such that the ratio (weight of edges between S and V\S) / (|S| × |V\S|) is minimized. Theorem: If every n-point metric embeds into ℓ1 with distortion α, then there is an α-approximation algorithm for sparsest cut using LPs. Theorem: If every n-point metric in ℓ2-squared embeds into ℓ1 with distortion α, then there is an α-approximation algorithm for sparsest cut using SDPs.
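For intuition about the objective, the sparsest cut of a tiny graph can be found by trying every subset. This brute-force sketch takes exponential time and is only meant to illustrate the ratio being minimized; it is not the LP-based algorithm of the theorem. The function name and the edge-list format are illustrative.

```python
from itertools import combinations

def sparsest_cut(n, weighted_edges):
    """Brute-force sparsest cut on vertices 0..n-1 (exponential time).

    `weighted_edges` is a list of (u, v, w) triples. Returns (ratio, S).
    """
    best = (float("inf"), None)
    # The ratio is symmetric in S and its complement, so |S| <= n/2 suffices.
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            s = set(subset)
            cut = sum(w for u, v, w in weighted_edges if (u in s) != (v in s))
            ratio = cut / (len(s) * (n - len(s)))
            if ratio < best[0]:
                best = (ratio, s)
    return best

# Two unit-weight triangles {0,1,2} and {3,4,5} joined by the edge (2, 3).
barbell = [(0, 1, 1), (1, 2, 1), (0, 2, 1),
           (3, 4, 1), (4, 5, 1), (3, 5, 1), (2, 3, 1)]
```

On the barbell example the best cut separates the two triangles: one edge is cut and |S| × |V\S| = 9, so the ratio is 1/9.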
the general idea Sparsest cut: min over cuts S of (weight of edges between S and V\S) / (|S| × |V\S|). Define x_uv = 1 if exactly one of u, v lies in S, and x_uv = 0 otherwise. Note that x is a metric. Then: min over cut-based x of (Σ_{uv in E} x_uv) / (Σ_{uv in V × V} x_uv) is NP-hard, whereas min over metrics x of (Σ_{uv in E} x_uv) / (Σ_{uv in V × V} x_uv) can be computed by a linear program.
the general idea min over cut-based x of (Σ_{uv in E} x_uv) / (Σ_{uv in V × V} x_uv) [NP-hard] = min over x in ℓ1 of (Σ_{uv in E} x_uv) / (Σ_{uv in V × V} x_uv) [Avis Deza 89]. Theorem: If every n-point metric embeds into ℓ1 with distortion α, then there is an α-approximation algorithm for sparsest cut using LPs. min over metrics x of (Σ_{uv in E} x_uv) / (Σ_{uv in V × V} x_uv) can be computed by a linear program.
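The theorem follows from a short chain of inequalities. In this sketch the symbols Φ, x*, x' and δ_S (the cut metric of S) are notation introduced here; the last step uses the standard fact that every ℓ1 metric is a non-negative combination of cut metrics.

```latex
\begin{align*}
\Phi(x) &:= \frac{\sum_{uv \in E} x_{uv}}{\sum_{uv \in V \times V} x_{uv}}, \qquad
x^{*} := \arg\min_{\text{metric } x} \Phi(x), \quad \Phi(x^{*}) \le \mathrm{OPT}, \\
x^{*}_{uv} &\le x'_{uv} \le \alpha\, x^{*}_{uv} \ \text{ for some } x' \in \ell_1
\;\Longrightarrow\; \Phi(x') \le \alpha\, \Phi(x^{*}) \le \alpha \cdot \mathrm{OPT}, \\
x' &= \sum_{S} \lambda_S\, \delta_S,\ \lambda_S \ge 0
\;\Longrightarrow\; \min_{S:\, \lambda_S > 0} \Phi(\delta_S) \le \Phi(x').
\end{align*}
```

The first line holds because cut metrics are metrics, so the LP minimizes over a larger set. The last line holds because a ratio of sums is at least the smallest of the individual ratios, so some cut S in the decomposition has ratio at most α · OPT.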
relaxations and LP rounding
today’s menu: padded decompositions and multicut; the sparsest cut and ℓ1 embeddings; general metric → ℓ1 with distortion O(log n); there exists a metric for which embedding → ℓ2 requires distortion Ω(log n)
trees into ℓ1 Use a new dimension for each edge.
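The one-dimension-per-edge idea can be sketched as follows: root the tree, and let the coordinate for an edge be that edge's length if the edge lies on the path from the root to the point, and 0 otherwise. The input format (`parent` and `length` dicts keyed by the child endpoint) and the function names are illustrative assumptions.

```python
def tree_to_l1(parent, length, root):
    """Embed a rooted tree into l1 with one coordinate per edge.

    `parent[v]` is v's parent (the root has no entry) and `length[v]` is
    the length of the edge (v, parent[v]); both names are illustrative.
    """
    edges = sorted(parent)                 # each non-root vertex names one edge
    index = {v: i for i, v in enumerate(edges)}
    embedding = {}
    for u in list(parent) + [root]:
        vec = [0.0] * len(edges)
        v = u
        while v != root:                   # walk up the root-to-u path
            vec[index[v]] = length[v]
            v = parent[v]
        embedding[u] = vec
    return embedding

def l1(a, b):
    return sum(abs(s - t) for s, t in zip(a, b))

# Root r with children a (length 2) and b (length 3); c hangs off a (length 1).
parent = {"a": "r", "b": "r", "c": "a"}
length = {"a": 2.0, "b": 3.0, "c": 1.0}
emb = tree_to_l1(parent, length, "r")
```

Two root paths differ exactly on the edges of the tree path between their endpoints, so the ℓ1 distance equals the tree distance; for instance c and b are at distance 1 + 2 + 3 = 6 in both.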
ℓ1 forms a convex cone
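The cone property has a one-line justification: scale and concatenate the coordinates. The maps f, g, h and the weights λ1, λ2 below are notation introduced for this sketch.

```latex
\begin{align*}
d_1(x,y) &= \lVert f(x) - f(y) \rVert_1, \qquad
d_2(x,y) = \lVert g(x) - g(y) \rVert_1, \qquad \lambda_1, \lambda_2 \ge 0, \\
h(x) &:= \bigl(\lambda_1 f(x),\ \lambda_2 g(x)\bigr)
\;\Longrightarrow\;
\lVert h(x) - h(y) \rVert_1 = \lambda_1 d_1(x,y) + \lambda_2 d_2(x,y).
\end{align*}
```

So any non-negative combination of ℓ1 metrics is again an ℓ1 metric, at the cost of more dimensions.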
now use FRT…
a different way If the metric admits a β-padded decomposition: easier result [Rao 99]: an O(β sqrt{log n})-distortion embedding into ℓ2; more involved [KLMN 03]: an O(sqrt{β log n})-distortion embedding into ℓ2.
today’s menu: padded decompositions and multicut; the sparsest cut and ℓ1 embeddings; general metric → ℓ1 with distortion O(log n); there exists a metric for which embedding → ℓ2 requires distortion Ω(log n)
a lower bound Big picture: suppose we have non-negative values a_ij and b_ij. For a metric d, define R(d) = (Σ a_ij d_ij²) / (Σ b_ij d_ij²). We will show a metric d for which we can set these a’s and b’s such that R(d) is O(1/(n log² n)), but R(d’) for any d’ in ℓ2 is Ω(1/n). Hence any embedding of this metric d → ℓ2 needs distortion Ω(log n).
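The step from the two bounds on R to the distortion bound can be spelled out. Here R(d) is the ratio of Σ a_ij d_ij² to Σ b_ij d_ij², and the constants c1, c2 and the scaling of the embedding are assumptions introduced for this sketch.

```latex
\begin{align*}
d_{ij} \le d'_{ij} \le D \cdot d_{ij} \ \ \forall i,j
&\;\Longrightarrow\;
R(d') = \frac{\sum a_{ij}\, d'^{\,2}_{ij}}{\sum b_{ij}\, d'^{\,2}_{ij}}
\le \frac{D^2 \sum a_{ij}\, d_{ij}^{2}}{\sum b_{ij}\, d_{ij}^{2}} = D^2 \cdot R(d), \\
R(d) \le \frac{c_1}{n \log^2 n},\ \ R(d') \ge \frac{c_2}{n}
&\;\Longrightarrow\;
D^2 \ge \frac{R(d')}{R(d)} \ge \frac{c_2}{c_1} \log^2 n
\;\Longrightarrow\; D = \Omega(\log n).
\end{align*}
```

Any embedding with distortion D can be rescaled to satisfy the first condition, and R is unchanged by rescaling, so the argument applies to every embedding into ℓ2.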
to start off, some background (edge)-expander graphs. logarithmic diameter most points are log distance apart spectral properties of expander graphs
expander graphs γ-expanders: graphs such that for any node subset S with |S| ≤ |V|/2, | edges from S to V \ S | > γ |S|. γ is called the “(edge)-expansion” of G. Interesting when degree(G) and γ are both constants independent of the size of G. Theorem: A random 3-regular graph is an Ω(1)-expander whp. Explicit constructions of O(1)-degree, Ω(1)-expanders are known.
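The edge-expansion of a small graph can be computed straight from the definition by enumerating all subsets of at most half the vertices. This brute-force sketch is exponential in n and is meant only to illustrate the quantity; the function name and edge-list format are illustrative.

```python
from itertools import combinations

def edge_expansion(n, edges):
    """Minimum of |E(S, V minus S)| / |S| over nonempty S with |S| <= n/2."""
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for subset in combinations(range(n), size):
            s = set(subset)
            boundary = sum(1 for u, v in edges if (u in s) != (v in s))
            best = min(best, boundary / size)
    return best

k4 = [(u, v) for u, v in combinations(range(4), 2)]   # complete graph on 4 nodes
cycle8 = [(i, (i + 1) % 8) for i in range(8)]          # cycle on 8 nodes
```

The complete graph on 4 nodes has expansion 2, while the 8-cycle has expansion 1/2 (take 4 consecutive nodes). On an n-cycle the same choice gives 4/n, which tends to 0, so cycles are not expanders.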
expander graphs facts(1) Fact #1: Any pair of nodes is at most O(log n) apart.
expander graphs facts(2) Fact #2: Most pairs (x,y) of nodes are Ω(log n) apart.
Metric d for which R(d) is small a_ij = 1 if edge (i,j) is present in the expander, 0 otherwise; b_ij = 1; d_ij = shortest-path distance between i and j in the expander. Fact #3: R(d) = (Σ a_ij d_ij²) / (Σ b_ij d_ij²) = O(1/(n log² n)).
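The ratio R(d) for a graph's shortest-path metric is easy to compute with breadth-first search. A minimal sketch, assuming a connected unweighted graph given as an adjacency dict and summing over ordered pairs (which only affects both sums by the same factor). The 3-dimensional hypercube below is a small stand-in for testing, not a member of a constant-degree expander family.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from `source` in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def ratio_R(adj):
    """R(d) with a_ij = 1 on edges and b_ij = 1 on all pairs.

    Assumes the graph is connected; sums run over ordered pairs (i, j).
    """
    numerator = denominator = 0
    for i in adj:
        dist = bfs_distances(adj, i)
        numerator += sum(dist[j] ** 2 for j in adj[i])    # edges have d_ij = 1
        denominator += sum(t ** 2 for t in dist.values())
    return numerator / denominator

# 3-dimensional hypercube: vertices 0..7, neighbors differ in one bit.
cube = {v: [v ^ (1 << k) for k in range(3)] for v in range(8)}
```

For the cube, each vertex contributes 3 to the numerator and 3·1 + 3·4 + 1·9 = 24 to the denominator, so R(d) = 24/192 = 0.125.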
R(d’) is large for Euclidean metrics Fact #4: R(d’) = Ω(1/n) for every Euclidean metric d’
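One standard route to Fact #4 goes through the spectral properties of expanders. The sketch below is an assumption about the argument, not a transcription of the slides: λ2 denotes the second-smallest eigenvalue of the graph Laplacian, x̄ the centroid of the points x_1, …, x_n realizing d’, and sums over pairs are over unordered pairs.

```latex
\begin{align*}
\sum_{ij \in E} \lVert x_i - x_j \rVert_2^2
\;\ge\; \lambda_2 \sum_{i} \lVert x_i - \bar{x} \rVert_2^2
\;=\; \frac{\lambda_2}{n} \sum_{i < j} \lVert x_i - x_j \rVert_2^2
\quad\Longrightarrow\quad
R(d') \;\ge\; \frac{\lambda_2}{n}.
\end{align*}
```

The inequality is the variational characterization of λ2 applied to each coordinate, and the equality is the usual identity relating pairwise squared distances to squared distances from the centroid. For a constant-degree graph with constant edge-expansion, Cheeger's inequality gives λ2 = Ω(1), hence R(d’) = Ω(1/n).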
today’s menu: the sparsest cut and ℓ1 embeddings; general metric → ℓ1 with distortion O(log n); general metric → ℓ2 with distortion O(log^{3/2} n); there exists a metric for which embedding → ℓ2 requires distortion Ω(log n). Finally, some (newer) extensions of these ideas…
better sparsest cut via SDPs Theorem: If every n-point metric embeds into ℓ1 with distortion α, then there is an α-approximation algorithm for sparsest cut using LPs. Theorem: If every n-point metric in ℓ2-squared embeds into ℓ1 with distortion α, then there is an α-approximation algorithm for sparsest cut using SDPs.
the general idea min over cut-based x of (Σ_{uv in E} x_uv) / (Σ_{uv in V × V} x_uv) [NP-hard] = min over x in ℓ1 of (Σ_{uv in E} x_uv) / (Σ_{uv in V × V} x_uv) [Avis Deza 89]. Theorem: If every n-point metric in ℓ2-squared embeds into ℓ1 with distortion α, then there is an α-approximation algorithm for sparsest cut using SDPs. min over metrics x of (Σ_{uv in E} x_uv) / (Σ_{uv in V × V} x_uv) can be computed by a linear program.
thank you!