What have we learnt about graph expansion in the new millennium?
Sanjeev Arora, Princeton University & Center for Computational Intractability
Overview
  Last millennium:
    Central role of expansion and expanders.
    Recognizing expander graphs via eigenvalues (Cheeger71, Alon-Milman85).
    O(log n)-approximation via flows (Leighton-Rao88); region-growing technique.
    Close connection to metric embeddings; O(log n)-approximation for general sparsest cut via Bourgain's Embedding Theorem (Linial-London-Rabinovich, Aumann-Rabani).
  This millennium (so far):
    O(√log n)-approximation (A., Rao, Vazirani 04) via both SDP and flows.
    Better metric embeddings; O(√log n)-approximation for general sparsest cut (Chawla-Gupta-Raecke05, A.-Lee-Naor06).
    Inapproximability results via Unique Games Conjecture (CKKRS06; KV06).
    Lower bounds for metric embeddings (inspired by PCPs) [KV06; others]; lower bounds for SDPs.
    Progress in relating full eigenvalue spectrum to (small-set) expansion (A., Barak, Steurer10).
  (Will not talk about: new understanding of expansion in Cayley graphs of groups, new algebra-free constructions of optimal expanders, etc.)
Graph Expansion
  G: d-regular graph; S: vertex set.
  expansion(S) = (# edges leaving S) / (d |S|)
  Important concept: derandomization, network routing, coding theory, Markov chains, differential geometry, group theory.
  α-expander: expansion(S) ≥ α for all S (co-NP-hard to recognize).
  (Often will restrict attention wlog to "balanced" sets: |S|, |S^c| > Ω(n).)
  α^2/2 ≤ λ ≤ 2α [Cheeger, Alon 85, Alon-Milman85], where λ = smallest nonzero eigenvalue of the Laplacian of G.
  This allows us to recognize graphs with α = Ω(1) ("expanders").
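To make the definition concrete, here is a minimal sketch that computes α by brute force on an 8-cycle and checks the inequality α^2/2 ≤ λ ≤ 2α. It assumes that sets are restricted to |S| ≤ n/2, that λ is taken from the normalized Laplacian I - A/d, and it uses the closed form 1 - cos(2π/n) for the cycle's smallest nonzero eigenvalue; the function names are illustrative only.

```python
import math
from itertools import combinations

def expansion(adj, d, S):
    """expansion(S) = (# edges leaving S) / (d * |S|) in a d-regular graph."""
    S = set(S)
    leaving = sum(1 for u in S for v in adj[u] if v not in S)
    return leaving / (d * len(S))

def min_expansion(adj, d):
    """Brute-force alpha: minimum expansion over nonempty S with |S| <= n/2.
    Exponential time, so only suitable for tiny graphs."""
    n = len(adj)
    return min(expansion(adj, d, S)
               for k in range(1, n // 2 + 1)
               for S in combinations(range(n), k))

# Example: the 8-cycle, a 2-regular graph.
n, d = 8, 2
cycle = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
alpha = min_expansion(cycle, d)

# Assumption: lambda is the smallest nonzero eigenvalue of I - A/d,
# which for the n-cycle equals 1 - cos(2*pi/n).
lam = 1 - math.cos(2 * math.pi / n)
print(alpha, lam, alpha ** 2 / 2 <= lam <= 2 * alpha)
```

The brute-force search is only there to keep the definition visible; the point of the eigenvalue bound is that λ can be computed in polynomial time while α cannot be searched this way on large graphs.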
Approximating expansion via flows [Leighton-Rao'88]
  O(log n)-approximation.
  Find the largest β s.t. we can simultaneously route β/n units of flow between every vertex pair ("embed a complete graph").
Why α ≥ β/2:
  There are β/n units of flow between each vertex pair, so the total flow out of each subset S with |S| ≤ n/2 is (β/n) × |S| (n - |S|) ≥ β|S|/2.
Why α ≤ O(log n) β:
  The LP expressing existence of the flow is feasible if the graph diameter is O(1/β) (uses the duality theorem).
  In a graph with expansion α, the diameter is O(log n/α).
  (Region-growing argument: do BFS from S one step at a time; the # of edges increases by a (1+α) factor each step, so we reach > 1/2 of the edges in O(log n/α) steps.)
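The two counting steps on this slide can be written out as follows. This is a sketch under the assumptions |S| ≤ n/2 and 0 < α ≤ 1, with m denoting the number of edges (a symbol introduced here for illustration).

```latex
% Flow leaving S when beta/n units are routed between every vertex pair:
\frac{\beta}{n}\,|S|\,(n-|S|) \;\ge\; \frac{\beta}{n}\,|S|\cdot\frac{n}{2} \;=\; \frac{\beta\,|S|}{2}
\qquad (|S| \le n/2).

% Region growing: after t BFS steps the number of edges reached is at least (1+\alpha)^t.
% Since \ln(1+\alpha) \ge \alpha/2 for 0 < \alpha \le 1,
(1+\alpha)^t \;\ge\; e^{\alpha t/2} \;\ge\; \frac{m}{2}
\quad\text{once}\quad t \;\ge\; \frac{2\ln m}{\alpha} \;=\; O\!\left(\frac{\log n}{\alpha}\right).
```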
Approximating expansion via expander flows (A., Rao, Vazirani 2004)
  β units of flow originate at each vertex.
  Route a flow with some demand graph W = (w_ij) (w_ij = flow between i and j) s.t. W is β-regular and has expansion 0.01 ("expander flow"). Maximise β.
  Easy: α ≥ 0.01 β. (The amount of flow leaving each set S is at least 0.01 β |S|.)
  Main claim: α ≤ O(β √log n).
  Next: Geometry of cuts and how efficiently they can be crossed.
Geometry of cuts
  Cut semimetric: d_S(i, j) = 1 if i, j are on opposite sides of the cut (S, S^c), = 0 else. (Gives an embedding into a line, with the two sides at 0 and 1.)
  Convex combination of cut semimetrics: d(i, j) = Σ_S α_S d_S(i, j).
  (Gives an embedding into ℓ_1: i ↦ v_i with |v_i - v_j|_1 = fraction of cuts that i, j are across.)
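The ℓ_1 embedding mentioned on this slide can be sketched directly: give each cut S one coordinate, and set that coordinate of v_i to α_S if i is in S and to 0 otherwise. The helper names below are illustrative, and the three-cut distribution is made up for the example.

```python
def l1_embedding(n, cuts):
    """cuts: list of (weight, S) pairs whose weights sum to 1.
    Returns one vector per vertex, with one coordinate per cut."""
    return [[w if i in S else 0.0 for (w, S) in cuts] for i in range(n)]

def l1_dist(u, v):
    """l_1 distance between two vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

def cut_metric(cuts, i, j):
    """d(i, j) = sum over cuts S of alpha_S * d_S(i, j)."""
    return sum(w for (w, S) in cuts if (i in S) != (j in S))

# Example distribution over three cuts of a 4-vertex set.
cuts = [(0.5, {0, 1}), (0.25, {0}), (0.25, {0, 2})]
vecs = l1_embedding(4, cuts)
for i in range(4):
    for j in range(i + 1, 4):
        print(i, j, l1_dist(vecs[i], vecs[j]), cut_metric(cuts, i, j))
```

Each printed row shows the ℓ_1 distance next to the weighted fraction of cuts separating the pair; the two agree coordinate by coordinate.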
Approximating expansion via flows (A., Rao, Vazirani 2004)
  β units of flow originate at each vertex.
  Route a flow with demand graph W = (w_ij) (w_ij = flow between i and j); W is β-regular and has expansion 0.01 (check by computing an eigenvalue: "separation oracle"). Maximise β.
  Main claim: α ≤ O(β √log n).
  LP formulation and Duality Thm: feasibility follows if for every distribution (α_S) on balanced cuts, there are Ω(n) disjoint vertex pairs (i_1, j_1), (i_2, j_2), … s.t.
    (a) d(i_r, j_r) = O(√log n/α)
    (b) i_r, j_r are across an Ω(1) fraction of cuts.
  (This condition is the 1st structure theorem.)
  Open: Replace √log n with o(√log n)? (Best lower bound: log log n [DKSV06].)
Thoughts on Structure Thm
  [ARV04] If G is an α-expander then for every distribution (α_S) on balanced cuts, there are Ω(n) disjoint vertex pairs (i_1, j_1), (i_2, j_2), … s.t.
    (a) d(i_r, j_r) = O(√log n/α)
    (b) i_r, j_r are across an Ω(1) fraction of cuts.
  Warmup: If max degree = O(1) and we are given a single balanced cut, the above is true with O(1/α) instead of O(√log n/α).
  Pf (Max-Flow Min-Cut Thm): Connect a source to each vertex of S and each vertex of S^c to a sink with edges of capacity 1; all other edges have capacity 4/α.
    By α-expansion, Min Cut = Ω(|S|) = Ω(n) = Max-Flow.
    Total capacity = O(n/α).
    Flow decomposition gives Ω(n) flow paths of length O(1/α) with one endpoint in S and one in S^c.
A flow-based O(√log n)-approximation algorithm for expansion
  For β = 1/n, 2/n, 4/n, … do:
    Try to solve the above LP to route a β-regular expander flow in G.
    If this succeeds, we have verified that expansion ≥ 0.01β.
    If it fails, use the [ARV] technique to find a cut of expansion < O(β√log n).
    (Note: before finding this cut we had already verified expansion ≥ 0.01 β/2.)
  (Note: can be implemented in O(n^1.5) time using the matrix multiplicative weight method [Sh09, AK07, AHK05]. See Satyen Kale's talk.)
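The outer loop of this algorithm is a plain doubling search. The sketch below shows only that loop: `try_route` is a hypothetical oracle standing in for the LP solver, and `beta_max` is an assumed cap that the slide does not specify.

```python
def doubling_search(n, try_route, beta_max=1.0):
    """Try beta = 1/n, 2/n, 4/n, ... until routing fails.

    try_route(beta) is a hypothetical oracle returning True when a
    beta-regular expander flow can be routed. Returns (last_ok, first_fail);
    either entry is None if that case never occurred."""
    beta, last_ok = 1.0 / n, None
    while beta <= beta_max:  # beta_max is an assumption, not from the slide
        if not try_route(beta):
            return last_ok, beta
        last_ok = beta
        beta *= 2
    return last_ok, None

# Toy oracle: pretend flows are routable exactly when beta <= 0.3.
last_ok, first_fail = doubling_search(16, lambda beta: beta <= 0.3)
print(last_ok, first_fail)
```

When the search stops at a failing β, the previous value β/2 has already succeeded, which is the reason the certified lower bound 0.01 β/2 and the cut of expansion O(β√log n) differ by only an O(√log n) factor.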
Suggested research directions
  Nothing special about routing an expander flow; could use any graph family whose expansion can be verified up to an O(1) factor. (Suffices to solve the LP.)
  Example: graphs with a few small nonzero eigenvalues (generalizes expanders, which have no small eigenvalues) [A., Barak, Steurer'10].
  Could also try for an o(√log n)-approximation in subexponential time. See David Steurer's talk.
View 2: Use of math programming relaxations
  Cut semimetric: d_S(i, j) = 1 if i, j are on opposite sides of the cut (S, S^c), = 0 else.
  Recall: integer program for c-balanced separator (expansion of sets of size ≥ cn).
  Linear relaxation [LR88]: O(log n)-approximation.
  Semidefinite relaxation [ARV04]: O(√log n)-approximation. (Main obstacle: understanding vectors satisfying the triangle inequality condition.)
How to round the SDP: 2nd Structure Theorem
  v_1, v_2, v_3, … : unit vectors in R^n, s.t.
    avg |v_i - v_j|^2 = Ω(1) ("well-spread")
    |v_i - v_j|^2 + |v_j - v_k|^2 ≥ |v_i - v_k|^2 (ℓ_2^2 property; the angle subtended by any two points on the third is nonobtuse; includes ℓ_1 as a subcase)
  THM: For Δ = Ω(1/√log n) there exist sets S, T of size Ω(n) s.t. |v_i - v_j|^2 ≥ Δ for all i in S, j in T (S, T are Δ-separated).
  NB: Implies a weak version of the 1st Structure Thm: Maxflow-Mincut applied to S, T yields Ω(n) paths of length O(1/α) that cross an Ω(1/√log n) fraction of cuts.
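A small check of the ℓ_2^2 property may help fix ideas. The sketch below tests the squared-distance triangle inequality on two made-up point sets: the scaled hypercube {±1}^3/√3 (unit vectors whose squared distances are proportional to Hamming distance, so the property holds) and three collinear points (where it fails). The function names are illustrative.

```python
from itertools import permutations, product

def sq_dist(u, v):
    """Squared Euclidean distance |u - v|^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def is_l22(points, tol=1e-12):
    """True if |a-b|^2 + |b-c|^2 >= |a-c|^2 for all ordered triples."""
    return all(sq_dist(a, b) + sq_dist(b, c) >= sq_dist(a, c) - tol
               for a, b, c in permutations(points, 3))

# Unit vectors at the corners of a scaled cube: squared distance = 4 * Hamming / 3.
cube = [tuple(x / 3 ** 0.5 for x in signs) for signs in product((-1, 1), repeat=3)]
# Three collinear points: 1 + 1 < 4, so the squared distances violate the inequality.
line = [(0.0,), (1.0,), (2.0,)]

print(is_l22(cube), is_l22(line))
```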
Rounding the SDP
  S, T: Δ-separated sets of size Ω(n).
  Do BFS wrt the distance function d(i, j) = |v_i - v_j|^2, starting from S and going until you hit T.
  Output the level of the BFS tree with least expansion.
  Claim: This gives a balanced cut (O, O^c) s.t. |E(O, O^c)| ≤ SDP_OPT/Δ = O(√log n) SDP_OPT.
  Why: For an edge (i, j), d(S, j) - d(S, i) ≤ |v_i - v_j|^2. So edge (i, j) contributes to the cut for ≤ |v_i - v_j|^2 levels, and each level cuts at least |E(O, O^c)| edges.
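The "output the best level" step can be sketched as a sweep over thresholds on d(S, i). This is a simplification under stated assumptions: the distances `dist[i]` = d(S, i) are taken as given rather than computed from an SDP solution, the candidate thresholds are supplied by the caller, and the score is the raw number of cut edges (reasonable when every level set is balanced). The graph and distances are invented for the example.

```python
def best_level_cut(edges, dist, thresholds):
    """For each threshold t, form O = {i : dist[i] <= t} and count the edges
    crossing (O, O^c); return (cut_size, t, O) for the smallest cut found."""
    best = None
    for t in thresholds:
        O = {i for i, x in enumerate(dist) if x <= t}
        cut = sum(1 for (i, j) in edges if (i in O) != (j in O))
        if best is None or cut < best[0]:
            best = (cut, t, O)
    return best

# Two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
dist = [0.0, 0.1, 0.2, 0.6, 0.8, 1.0]       # hypothetical values of d(S, i)
cut, t, O = best_level_cut(edges, dist, [0.05, 0.15, 0.4, 0.7, 0.9])
print(cut, t, sorted(O))
```

The averaging argument on the slide says that some level in a range of width Δ must cut few edges, because each edge is cut by levels spanning at most |v_i - v_j|^2 of that range.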
O(√log n)-approximation for other cut-like problems
  MIN-2-CNF deletion and several graph deletion problems [Agarwal, Charikar, Makarychev, Makarychev'04].
  [’04]. Weighted version of S
  MIN-LINEAR ARRANGEMENT [Charikar, Karloff, Rao'05].
  General SPARSEST CUT [Chawla-Gupta-Raecke05, A.-Lee-Naor'06].
  Min-ratio VERTEX SEPARATORS and Balanced VERTEX SEPARATORS [Feige, Hajiaghayi, Lee'04].
  Method for all: SDP rounding using a generalized structure theorem.
Suggested future direction
  SDP with triangle inequality corresponds to level 2 of Lasserre, Lovasz-Schrijver, etc. (see Madhur Tulsiani's talk).
  Use more powerful SDP relaxations from higher levels?
    * May need to allow superpolynomial time (r-th level takes n^r time).
    * Not currently ruled out under UGC.
Cut problems and embeddings
  General Sparsest Cut: cost matrix (c_ij), c_ij ≥ 0; demand matrix (d_ij), d_ij ≥ 0. Find the cut (S, S^c) minimizing Σ_ij c_ij d_S(i, j) / Σ_ij d_ij d_S(i, j).
  SDP relaxation [LLR94, AR94]: integrality gap = minimum distortion incurred when embedding ℓ_2^2 metrics into ℓ_1 (= convex combinations of cut semimetrics).
Geometric Embeddings of Metric Spaces
  Embedding f of a finite metric space (X, d) into a geometric space, e.g. ℓ_1: x ↦ f(x), y ↦ f(y).
  Distortion of f: minimum C s.t. d(x, y) ≤ |f(x) - f(y)| ≤ C d(x, y).
  [Bourgain'85, LLR94]: distortion O(log n) into ℓ_1, ℓ_2.
  What if X is itself geometric?
  [Chawla-Gupta-Raecke05, A.-Lee-Naor06]: distortion O(√log n log log n) for embedding ℓ_2^2 into ℓ_1, and for embedding ℓ_1 into ℓ_2.
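As an illustration of the definition, the sketch below computes distortion as the ratio of the largest to the smallest stretch |f(x) - f(y)| / d(x, y), which equals the minimum C in the definition once f is rescaled to be non-contracting. The 4-cycle metric and the two embeddings are made-up examples, and the function names are illustrative.

```python
from itertools import combinations

def l1_norm_dist(u, v):
    """l_1 distance between two coordinate tuples."""
    return sum(abs(a - b) for a, b in zip(u, v))

def distortion(points, d, f, dist=l1_norm_dist):
    """max stretch / min stretch over all pairs.
    Assumes f is injective, so that no stretch is zero."""
    ratios = [dist(f(x), f(y)) / d(x, y) for x, y in combinations(points, 2)]
    return max(ratios) / min(ratios)

points = [0, 1, 2, 3]
cycle_d = lambda x, y: min(abs(x - y), 4 - abs(x - y))   # shortest-path metric on C_4

on_line = lambda i: (float(i),)                          # place the cycle on a line
corners = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}   # corners of a unit square
on_square = lambda i: corners[i]

print(distortion(points, cycle_d, on_line))    # the pair (0, 3) is stretched 3x
print(distortion(points, cycle_d, on_square))  # isometric in l_1
```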
Embedding theorems in one slide
  Tool 1: Padded decompositions [Krauthgamer, Lee, Mendel, Naor04]. Given a metric space (X, d), scale S, and padding parameter p: partition probabilistically into pieces of diameter ≤ S, s.t. for all x, Pr[x's partition contains Ball(x, S/p)] ≥ ½.
  Line embedding: map each block to 0 with probability ¼; map x to d(x, zero-block).
  Tool 2: Use the ARV structure theorem to produce padded decompositions at different scales; combine line embeddings into a single embedding using "measured descent."
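The line-embedding step can be sketched as follows, assuming the partition into blocks is already given (the padded decomposition itself is not constructed here). Mapping everything to 0 when no block is selected is a convention chosen for this sketch, not something the slide states, and the names are illustrative.

```python
import random

def line_embedding(points, d, blocks, rng, p_zero=0.25):
    """Select each block as a 'zero block' independently with probability
    p_zero, then map x to its distance from the nearest point in a zero block."""
    zero_pts = [z for b in blocks if rng.random() < p_zero for z in b]
    if not zero_pts:
        # Assumption for this sketch: with no zero block, map everything to 0.
        return {x: 0.0 for x in points}
    return {x: min(d(x, z) for z in zero_pts) for x in points}

# Toy metric: 8 points on a line, partitioned into blocks of two.
points = list(range(8))
d = lambda x, y: abs(x - y)
blocks = [[0, 1], [2, 3], [4, 5], [6, 7]]
f = line_embedding(points, d, blocks, random.Random(0))
print(f)
```

Whatever blocks are selected, the map never expands distances, since distance to a fixed set is 1-Lipschitz. The padding property is what ensures that a point well inside a non-zero block ends up far from 0 with constant probability.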
Proving lower bounds on distortion
  [Khot-Vishnoi05]: log log n lower bound; construction inspired by PCPs (hypercontractivity of noisy hypercubes).
  [Lee-Naor], [Cheeger, Kleiner, Naor]: (log n)^ε lower bound; construction based upon the Heisenberg group; new notion of differentiation.
  [Lee-Moharrami]: √log n lower bound; only for embedding weak ℓ_2^2 spaces into ℓ_1. Elementary construction and analysis.
  Open: √log n lower bound for ℓ_2^2 spaces; (log n)^ε lower bound for the SDP integrality gap of uniform sparsest cut (i.e., edge expansion).
Proof of Structure Theorems
  Recall: v_1, v_2, v_3, … : unit vectors in R^d, s.t.
    avg |v_i - v_j|^2 = Ω(1) ("well-spread")
    |v_i - v_j|^2 + |v_j - v_k|^2 ≥ |v_i - v_k|^2
  (Recall: the 1st Structure Theorem concerned distributions of cuts, which correspond to ℓ_1 metrics, which are a subcase of ℓ_2^2.)
  For Δ = Ω(1/√log n) there exist sets S, T of size Ω(n) s.t. |v_i - v_j|^2 ≥ Δ for all i in S, j in T (S, T are Δ-separated).
Algorithm to produce two Δ-separated sets (Δ = 1/√log n)
  Pick a random unit vector u; let S_u = {i : ⟨v_i, u⟩ ≤ -0.01/√d} and T_u = {j : ⟨v_j, u⟩ ≥ 0.01/√d}.
  Easy: S_u and T_u are likely to have size Ω(n).
  Delete any v_i in S_u and v_j in T_u s.t. |v_i - v_j|^2 < Δ (repeat until no such pair remains).
  If S_u and T_u still have size Ω(n), output them.
  Main difficulty: Show that whp only o(n) points get deleted.
  Obs: Deleted pairs were "stretched", i.e., |v_i - v_j|^2 < Δ yet |⟨v_i - v_j, u⟩| > 0.01/√d.
  Fact: For a pair with |v_i - v_j|^2 = Δ, Pr[|⟨v_i - v_j, u⟩| > 0.01/√d] = exp(-1/Δ) = exp(-√log n). Too large for a union bound.
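A minimal sketch of the projection-and-pruning step, under stated assumptions: the random direction is drawn by normalizing a Gaussian vector, the thresholds are ±0.01/√d, and pairs are deleted greedily in one pass (T only shrinks, so a surviving i has no close partner left). The constant `sigma`, the function names, and the test points are illustrative; the sketch makes no attempt to reproduce the analysis that only o(n) points are deleted.

```python
import math
import random
from itertools import product

def sq_dist(u, v):
    """Squared Euclidean distance |u - v|^2."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

def separated_sets(vecs, delta, rng, sigma=0.01):
    """Project onto a random direction, keep the two tails, then delete
    close cross pairs until S and T are delta-separated."""
    d = len(vecs[0])
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(x * x for x in g))
    u = [x / norm for x in g]                      # random unit vector
    proj = [sum(a * b for a, b in zip(v, u)) for v in vecs]
    thr = sigma / math.sqrt(d)
    S = {i for i, p in enumerate(proj) if p <= -thr}
    T = {j for j, p in enumerate(proj) if p >= thr}
    for i in sorted(S):
        for j in sorted(T):
            if sq_dist(vecs[i], vecs[j]) < delta:  # a "stretched" pair
                S.discard(i)
                T.discard(j)
                break
    return S, T

# Toy input: the 16 corners of {-1/2, 1/2}^4 (unit vectors; squared distance = Hamming distance).
pts = [tuple(s / 2 for s in signs) for signs in product((-1, 1), repeat=4)]
S, T = separated_sets(pts, delta=1.5, rng=random.Random(1))
print(sorted(S), sorted(T))
```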
Walks in ℓ_2^2 space
  |v_i - v_j|^2 + |v_j - v_k|^2 ≥ |v_i - v_k|^2
  r steps of squared length Δ only take you a total squared distance rΔ (i.e., distance √r √Δ).
  Main proof step: Use measure concentration to prove that for most directions u there is a walk of length r on stretched edges (v_1, v_2), (v_2, v_3), …, (v_r, v_{r+1}) so that |⟨v_{r+1} - v_1, u⟩| > 0.001 r/√d.
  But Pr[such v_1, v_{r+1} exist in the point set] < exp(-r/Δ) < 1/n^2, a contradiction.
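The two estimates behind this slide can be spelled out as below. This is a sketch: constants are absorbed into Ω(·), and the tail bound is the standard Gaussian-type concentration for the projection of a fixed vector w onto a random unit vector u in R^d.

```latex
% Squared length of an r-step walk, by repeated use of the l_2^2 triangle inequality:
|v_{r+1} - v_1|^2 \;\le\; \sum_{k=1}^{r} |v_{k+1} - v_k|^2 \;\le\; r\Delta .

% Concentration of projections onto a random unit vector u in R^d:
\Pr_u\!\left[\, |\langle w, u\rangle| \ge \frac{t\,|w|}{\sqrt{d}} \,\right] \;\le\; \exp\!\big(-\Omega(t^2)\big).

% Apply with w = v_{r+1} - v_1, |w| \le \sqrt{r\Delta}, and threshold 0.001\, r/\sqrt{d},
% so that t \ge 0.001\,\sqrt{r/\Delta}:
\Pr_u\!\left[\, |\langle v_{r+1} - v_1, u\rangle| > \frac{0.001\, r}{\sqrt{d}} \,\right]
\;\le\; \exp\!\big(-\Omega(r/\Delta)\big).
```

With Δ = 1/√log n, taking r on the order of √log n makes r/Δ of order log n, which is what allows a union bound over all n^2 pairs.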
Unique Games Conjecture [Khot03]
  Given m equations in n variables x_1, x_2, …, x_n of the type a x_i + b x_j = c (mod 113) s.t. a (1-ε) fraction are simultaneously satisfiable, it is NP-hard to satisfy ½ of them simultaneously.
  Used to prove the best inapproximability results for a host of problems, including expansion problems. Inspired SDP integrality gaps (aka embedding lower bounds). (See Khot's talk.)
  (Expansion strikes back) The Achilles heel of UGC appears to be expansion. Better understanding of small-set expansion may disprove UGC. (See Steurer's talk.)
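To make the equation format concrete, here is a small sketch that measures which fraction of a system of two-variable equations mod 113 a given assignment satisfies. The tuple layout (a, i, b, j, c) and the three-equation instance are invented for illustration.

```python
def satisfied_fraction(equations, x, q=113):
    """equations: list of (a, i, b, j, c) meaning a*x[i] + b*x[j] = c (mod q).
    Returns the fraction satisfied by the assignment x."""
    ok = sum(1 for (a, i, b, j, c) in equations
             if (a * x[i] + b * x[j] - c) % q == 0)
    return ok / len(equations)

# x0 - x1 = 5, x1 - x2 = 3, x0 - x2 = 9 (mod 113); note that 112 = -1 (mod 113).
# The system is inconsistent (5 + 3 != 9), so at most two equations can hold at once.
equations = [(1, 0, 112, 1, 5), (1, 1, 112, 2, 3), (1, 0, 112, 2, 9)]
frac = satisfied_fraction(equations, [8, 3, 0])
print(frac)
```

Each equation is "unique" in the sense that fixing the value of one variable determines the value of the other; the difficulty lies entirely in satisfying many equations at the same time.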
Looking forward to more insight in the next decade! Thank you!