
1 Graph Sparsifiers: A Survey Nick Harvey Based on work by: Batson, Benczur, de Carli Silva, Fung, Hariharan, Harvey, Karger, Panigrahi, Sato, Spielman, Srivastava and Teng

2 Approximating Dense Objects by Sparse Ones. Examples: floor joists, image compression.

3 Approximating Dense Graphs by Sparse Ones
Spanners: approximate all distances to within a factor α using only O(n^(1+2/α)) edges.
Low-stretch trees: approximate most distances to within O(log n) using only n−1 edges. (n = # vertices)

4 Overview
– Definitions: cut & spectral sparsifiers
– Cut sparsifiers: a combinatorial construction
– Spectral sparsifiers: a random sampling construction; derandomization

5 Cut Sparsifiers (Karger '94)
Input: an undirected graph G=(V,E) with weights u : E → ℝ₊.
Output: a subgraph H=(V,F) of G with weights w : F → ℝ₊ such that |F| is small and
  w(δ_H(U)) = (1±ε) · u(δ_G(U)) for all U ⊆ V,
where u(δ_G(U)) is the weight of the edges between U and V∖U in G, and w(δ_H(U)) is the weight of the edges between U and V∖U in H.
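To make the definition concrete, here is a brute-force verifier in Python (my own toy sketch, not part of the talk); it enumerates all 2^n cuts, so it is only feasible for tiny graphs:

```python
from itertools import combinations

def cut_weight(weights, U):
    """Total weight of edges with exactly one endpoint in U.
    weights: {(s, t): w} on hashable vertices; U: set of vertices."""
    return sum(w for (s, t), w in weights.items() if (s in U) != (t in U))

def is_cut_sparsifier(u_G, w_H, vertices, eps):
    """Check w(delta_H(U)) = (1 +/- eps) * u(delta_G(U)) for every U."""
    for r in range(1, len(vertices)):
        for U in map(set, combinations(vertices, r)):
            g, h = cut_weight(u_G, U), cut_weight(w_H, U)
            if not (1 - eps) * g <= h <= (1 + eps) * g:
                return False
    return True
```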

7 Generic Application of Cut Sparsifiers
Slow route: (dense) input graph G → (slow) algorithm A for some problem P → exact/approximate output.
Fast route: (dense) input graph G → (efficient) sparsification algorithm S → sparse graph H, approximately preserving the solution of P → algorithm A (now faster) → approximate output.
Example problems P: min s-t cut, sparsest cut, max cut, …

8 Relation to Expander Graphs
A graph H on V is an expander if, for some constant c, |δ_H(U)| ≥ c·|U| for all U ⊆ V with |U| ≤ n/2.
Let G be the complete graph on V. Note that |δ_G(U)| = |U|·|V∖U| ≤ n·|U|.
If we give all edges of H weight w = n, then w(δ_H(U)) ≥ c·|δ_G(U)| for all U ⊆ V with |U| ≤ n/2.
So expanders are similar to sparsifiers of the complete graph.

9 Relation to Expander Graphs
Fact: pick a random graph where each edge appears independently with probability p = Θ(log(n)/n). This gives an expander with O(n log n) edges with high probability.

10 Spectral Sparsifiers (Spielman-Teng '04)
Input: an undirected graph G=(V,E) with weights u : E → ℝ₊.
Def: the Laplacian is the matrix L_G such that x^T L_G x = Σ_{st∈E} u_st (x_s − x_t)² for all x ∈ ℝ^V. L_G is positive semidefinite since this quadratic form is ≥ 0.
Example: electrical networks.
– View edge st as a resistor of resistance 1/u_st.
– Impose voltage x_v at every vertex v.
– Ohm's power law: P = V²/R.
– Power consumed on edge st is u_st (x_s − x_t)².
– Total power consumed is x^T L_G x.
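A minimal numpy sketch (my own illustration) that assembles L_G from a weighted edge list and confirms that x^T L_G x equals the total power Σ_{st} u_st (x_s − x_t)²:

```python
import numpy as np

def laplacian(n, weights):
    """Laplacian L of a weighted graph, so that x @ L @ x equals
    the sum over edges st of u_st * (x_s - x_t)**2."""
    L = np.zeros((n, n))
    for (s, t), w in weights.items():
        L[s, s] += w; L[t, t] += w
        L[s, t] -= w; L[t, s] -= w
    return L

weights = {(0, 1): 2.0, (1, 2): 1.0, (0, 2): 0.5}  # toy 3-vertex network
L = laplacian(3, weights)
x = np.array([1.0, 0.0, -1.0])                     # voltages at the vertices
power = sum(w * (x[s] - x[t]) ** 2 for (s, t), w in weights.items())
assert np.isclose(x @ L @ x, power)                # total power = x^T L_G x
```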

11 Spectral Sparsifiers (Spielman-Teng '04)
Input: an undirected graph G=(V,E) with weights u : E → ℝ₊.
Def: the Laplacian is the matrix L_G such that x^T L_G x = Σ_{st∈E} u_st (x_s − x_t)² for all x ∈ ℝ^V.
Output: a subgraph H=(V,F) of G with weights w : F → ℝ₊ such that |F| is small and
  x^T L_H x = (1±ε) · x^T L_G x for all x ∈ ℝ^V.
Taking x to be the 0-1 indicator vector of a set U shows w(δ_H(U)) = (1±ε) · u(δ_G(U)) for all U ⊆ V. That is, spectral sparsifier ⇒ cut sparsifier.

12 Cut vs Spectral Sparsifiers
Number of constraints:
– Cut: w(δ_H(U)) = (1±ε) · u(δ_G(U)) for all U ⊆ V (2^n constraints)
– Spectral: x^T L_H x = (1±ε) · x^T L_G x for all x ∈ ℝ^V (infinitely many constraints)
Spectral constraints are SDP feasibility constraints:
  (1−ε) x^T L_G x ≤ x^T L_H x ≤ (1+ε) x^T L_G x for all x ∈ ℝ^V ⟺ (1−ε) L_G ⪯ L_H ⪯ (1+ε) L_G,
where X ⪯ Y means Y−X is positive semidefinite.
Spectral constraints are actually easier to handle:
– Checking "Is H a spectral sparsifier of G?" is in P.
– Checking "Is H a cut sparsifier of G?" is non-uniform sparsest cut, so NP-hard.
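To illustrate the in-P claim: the spectral condition holds iff all relative eigenvalues of L_H against L_G lie in [1−ε, 1+ε], which one eigendecomposition can check. A dense toy sketch (my own, assuming G is connected so both Laplacians share the all-ones kernel):

```python
import numpy as np

def is_spectral_sparsifier(L_G, L_H, eps, tol=1e-9):
    """Check (1-eps) L_G <= L_H <= (1+eps) L_G in the PSD order."""
    vals, vecs = np.linalg.eigh(L_G)
    keep = vals > tol                           # project out the kernel of L_G
    S = vecs[:, keep] / np.sqrt(vals[keep])     # scaled basis of range(L_G)
    rel = np.linalg.eigvalsh(S.T @ L_H @ S)     # relative eigenvalues of the pencil
    return (rel.min() >= 1 - eps - tol) and (rel.max() <= 1 + eps + tol)
```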

13 Application of Spectral Sparsifiers
Consider the linear system L_G x = b. The actual solution is x := L_G⁻¹ b. Instead, compute y := L_H⁻¹ b, where H is a spectral sparsifier of G.
We know (1−ε) L_G ⪯ L_H ⪯ (1+ε) L_G, which implies y has low multiplicative error: ‖y−x‖_{L_G} ≤ 2ε · ‖x‖_{L_G}.
Computing y is fast since H is sparse: the conjugate gradient method takes O(n|F|) time (where |F| = # nonzero entries of L_H).
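A toy numpy sketch of this substitution (the graph and weights are my own invented example, not guaranteed to be an ε-sparsifier, and pseudoinverse solves stand in for the conjugate gradient method):

```python
import numpy as np

def laplacian(n, weights):
    L = np.zeros((n, n))
    for (s, t), w in weights.items():
        L[s, s] += w; L[t, t] += w
        L[s, t] -= w; L[t, s] -= w
    return L

# G = 5-cycle plus two chords; H = the 5-cycle with boosted weights.
L_G = laplacian(5, {(0,1): 1, (1,2): 1, (2,3): 1, (3,4): 1, (4,0): 1,
                    (0,2): 1, (1,3): 1})
L_H = laplacian(5, {(0,1): 1.4, (1,2): 1.4, (2,3): 1.4, (3,4): 1.4, (4,0): 1.4})

b = np.array([1.0, -1.0, 0.0, 0.0, 0.0])     # b orthogonal to the all-ones vector
x = np.linalg.lstsq(L_G, b, rcond=None)[0]   # exact:     x = L_G^+ b
y = np.linalg.lstsq(L_H, b, rcond=None)[0]   # surrogate: y = L_H^+ b
err = np.sqrt((y - x) @ L_G @ (y - x) / (x @ L_G @ x))
print(f"relative error in the L_G-norm: {err:.3f}")
```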

14 Application of Spectral Sparsifiers
Same setup: solve L_G x = b approximately via y := L_H⁻¹ b for a spectral sparsifier H, giving ‖y−x‖_{L_G} ≤ 2ε · ‖x‖_{L_G}.
Theorem [Spielman-Teng '04, Koutis-Miller-Peng '10]: a vector y with low multiplicative error can be computed in O(m log n (log log n)²) time (m = # edges of G).

15 Results on Sparsifiers
Cut sparsifiers (combinatorial constructions): Karger '94; Benczur-Karger '96; Fung-Hariharan-Harvey-Panigrahi '11. These construct sparsifiers with n·log^O(1) n / ε² edges.
Spectral sparsifiers (linear-algebraic constructions): Spielman-Teng '04 and Spielman-Srivastava '08 construct sparsifiers with n·log^O(1) n / ε² edges; Batson-Spielman-Srivastava '09 and de Carli Silva-Harvey-Sato '11 construct sparsifiers with O(n/ε²) edges.

16 Sparsifiers by Random Sampling
The complete graph is easy! Random sampling gives an expander (i.e., a sparsifier) with O(n log n) edges.

17 Sparsifiers by Random Sampling
For general graphs we can't sample all edges with the same probability! Idea [BK'96]: sample low-connectivity edges with high probability, and high-connectivity edges with low probability. (Figure: keep the few edges crossing a sparse cut; eliminate most of the edges inside the dense parts.)

18 Non-uniform sampling algorithm [BK'96]
Input: graph G=(V,E), weights u : E → ℝ₊.
Output: a subgraph H=(V,F) with weights w : F → ℝ₊.
  Choose parameter ρ.
  Compute probabilities { p_e : e ∈ E }.
  For i = 1 to ρ:
    For each edge e ∈ E:
      With probability p_e: add e to F and increase w_e by u_e/(ρ·p_e).
Note: E[|F|] ≤ ρ·Σ_e p_e, and E[w_e] = u_e for every e ∈ E, so for every U ⊆ V, E[w(δ_H(U))] = u(δ_G(U)).
Can we do this so that the cut values are tightly concentrated and E[|F|] = n·log^O(1) n?
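The loop above translates directly into Python; a minimal sketch (my own, with edges as hashable ids and u, p as dicts):

```python
import random
from collections import defaultdict

def sample_sparsifier(edges, u, p, rho, rng=random.Random(0)):
    """Non-uniform sampling as in the pseudocode: rho rounds; in each
    round, edge e is kept with probability p[e] and its weight grows
    by u[e] / (rho * p[e]), so that E[w_e] = u[e]."""
    w = defaultdict(float)
    for _ in range(rho):
        for e in edges:
            if rng.random() < p[e]:
                w[e] += u[e] / (rho * p[e])
    return dict(w)   # support F = {e : w[e] > 0}
```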

19 Benczur-Karger '96
Run the sampling algorithm of the previous slide with ρ = O(log n/ε²) and p_e = 1/("strength" of edge e).
Then cuts are preserved to within (1±ε) and E[|F|] = O(n log n/ε²).
All the strengths can be approximated in m·log^O(1) n time.

20 Fung-Hariharan-Harvey-Panigrahi '11
Run the same sampling algorithm with ρ = O(log² n/ε²) and p_st = 1/(min cut separating s and t).
Then cuts are preserved to within (1±ε) and E[|F|] = O(n log² n/ε²).
All these connectivity values can be approximated in O(m + n log n) time.

21 Let k_uv = the min size of a cut separating u and v. Recall the sampling probability is p_e = 1/k_e.
Partition edges into connectivity classes E = E₁ ∪ E₂ ∪ … ∪ E_{log n}, where E_i = { e : 2^(i−1) ≤ k_e < 2^i }.
Prove that the weight of sampled edges that each cut takes from each connectivity class has low error.
Key point: edges in δ(U) ∩ E_i have roughly the same p_e. This yields a sparsifier.
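Computing the connectivity classes is a one-liner per edge; a small sketch (my own), assuming the k_e values are already known:

```python
import math

def connectivity_classes(edges, k):
    """Partition edges into E_i = { e : 2^(i-1) <= k_e < 2^i }."""
    classes = {}
    for e in edges:
        i = math.floor(math.log2(k[e])) + 1   # 2^(i-1) <= k[e] < 2^i
        classes.setdefault(i, []).append(e)
    return classes
```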

22 Prove that the weight of sampled edges that each cut takes from each connectivity class has low error.
Notation: C = δ(U) is a cut; C_i = δ(U) ∩ E_i is a cut-induced set.
Need to prove, for every cut-induced set C_i, that its sampled weight w(C_i) is concentrated around u(C_i). (Figure: one cut decomposed into C₁, C₂, C₃, C₄.)

23 Notation: C_i = δ(U) ∩ E_i is a cut-induced set. Goal: prove, for every cut-induced set C_i, that its sampled weight has low error.
Key ingredients:
– Hoeffding bound: prove that each individual deviation event has small probability.
– Bound on # small cut-induced sets: for most of these events, u(C) is large; in other words, #{ cut-induced sets C_i induced by a small cut C } is small.

24 Counting Small Cut-Induced Sets
Theorem [Fung-Hariharan-Harvey-Panigrahi '11]: let G=(V,E) be a graph and fix any B ⊆ E. Suppose k_e ≥ K for all e in B (where k_uv = the min size of a cut separating u and v). Then, for every α ≥ 1,
  |{ δ(U) ∩ B : |δ(U)| ≤ α·K }| < n^(2α).
Corollary [Karger '93]: let G=(V,E) be a graph and let K be the edge-connectivity of G (i.e., the global min cut value). Then, for every α ≥ 1,
  |{ δ(U) : |δ(U)| ≤ α·K }| < n^(2α).

25 Summary for Cut Sparsifiers
– Do non-uniform sampling of edges, with probabilities based on "connectivity".
– The analysis involves: decomposing the graph; Hoeffding bounds to analyze each "cut"; and a cut-counting theorem ("few small cuts").
– BK'96 had a weaker cut-counting theorem, but a more complicated "connectivity" notion.
– Can get sparsifiers with O(n log n / ε²) edges, which is optimal for any independent sampling algorithm.

26 Spectral Sparsification
Input: graph G=(V,E), weights u : E → ℝ₊.
Recall: x^T L_G x = Σ_{st∈E} u_st (x_s − x_t)²; call the term u_st (x_s − x_t)² = x^T L_st x.
Goal: find weights w : E → ℝ₊ such that very few w_e are non-zero, and
  (1−ε) x^T L_G x ≤ Σ_{e∈E} w_e x^T L_e x ≤ (1+ε) x^T L_G x for all x ∈ ℝ^V,
i.e., (1−ε) L_G ⪯ Σ_{e∈E} w_e L_e ⪯ (1+ε) L_G.
General problem: given matrices L_e satisfying Σ_e L_e = L_G, find coefficients w_e, mostly zero, such that (1−ε) L_G ⪯ Σ_e w_e L_e ⪯ (1+ε) L_G.

27 The General Problem: Sparsifying Sums of PSD Matrices
General problem: given PSD matrices L_e s.t. Σ_e L_e = L_G, find coefficients w_e, mostly zero, such that (1−ε) L_G ⪯ Σ_e w_e L_e ⪯ (1+ε) L_G.
Theorem [Ahlswede-Winter '02]: a randomized algorithm gives w with O(n log n/ε²) non-zeros.
Theorem [de Carli Silva-Harvey-Sato '11], building on [Batson-Spielman-Srivastava '09]: a deterministic algorithm gives w with O(n/ε²) non-zeros.
– Cut & spectral sparsifiers with O(n/ε²) edges [BSS'09]
– Sparsifiers with more properties and O(n/ε²) edges [dHS'11]

28 Vector Case
General problem: given PSD matrices L_e s.t. Σ_e L_e = L, find coefficients w_e, mostly zero, such that (1−ε) L ⪯ Σ_e w_e L_e ⪯ (1+ε) L.
Vector problem: given vectors v_e ∈ [0,1]^n s.t. Σ_e v_e = v, find coefficients w_e, mostly zero, such that ‖Σ_e w_e v_e − v‖_∞ ≤ ε.
Theorem [Althofer '94, Lipton-Young '94]: there is such a w with O(log n/ε²) non-zeros. Proof: random sampling & the Hoeffding inequality.
Multiplicative version: there is a w with O(n log n/ε²) non-zeros such that (1−ε)·v ≤ Σ_e w_e v_e ≤ (1+ε)·v.
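A sketch of the sampling proof in the convex-combination normalization (my own toy: v is taken to be the average of the v_e, matching the Lipton-Young setting, and k comes from a Hoeffding plus union bound):

```python
import numpy as np

def sparse_average(V, eps, fail=0.01, rng=np.random.default_rng(0)):
    """V: (m, n) array with entries in [0, 1]; target v = V.mean(axis=0).
    Returns weights w with O(log n / eps^2) non-zeros such that
    ||V.T @ w - v||_inf <= eps with probability >= 1 - 2*fail."""
    m, n = V.shape
    k = int(np.ceil(np.log(n / fail) / (2 * eps**2)))  # 2n*exp(-2k eps^2) <= 2*fail
    w = np.zeros(m)
    np.add.at(w, rng.integers(0, m, size=k), 1.0 / k)  # empirical distribution
    return w

rng = np.random.default_rng(1)
V = rng.random((5000, 8))
w = sparse_average(V, eps=0.05)
print(np.count_nonzero(w), np.abs(V.T @ w - V.mean(axis=0)).max())
```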

29 Concentration Inequalities
Theorem [Chernoff '52, Hoeffding '63]: let Y₁,…,Y_k be i.i.d. random non-negative real numbers s.t. E[Y_i] = Z and Y_i ≤ u·Z. Then (up to constants in the exponent)
  Pr[ (1/k)·Σ_i Y_i ∉ [(1−ε)Z, (1+ε)Z] ] ≤ 2·exp(−Ω(k ε²/u)).
Theorem [Ahlswede-Winter '02]: let Y₁,…,Y_k be i.i.d. random PSD n×n matrices s.t. E[Y_i] = Z and Y_i ⪯ u·Z. Then
  Pr[ not (1−ε)Z ⪯ (1/k)·Σ_i Y_i ⪯ (1+ε)Z ] ≤ 2n·exp(−Ω(k ε²/u)).
The only difference: the dimension factor n in the matrix bound.

30 "Balls & Bins" Example
Problem: throw k balls into n bins. Want max load / min load ≤ 1+ε. How big should k be?
AW Theorem: let Y₁,…,Y_k be i.i.d. random PSD matrices such that E[Y_i] = Z and Y_i ⪯ u·Z; then the bound of the previous slide holds.
Solution: let Y_i be all zeros, except for a single n in a random diagonal entry. Then E[Y_i] = I =: Z, and λ_max(Y_i Z⁻¹) = n =: u. Set k = Θ(n log n / ε²). Then, with high probability, every diagonal entry of Σ_i Y_i/k is in [1−ε, 1+ε].
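The diagonal entries of Σ_i Y_i/k are exactly the normalized bin loads, so this is easy to simulate; a small sketch (my own, with an illustrative constant in k):

```python
import numpy as np

rng = np.random.default_rng(0)
n, eps, C = 100, 0.1, 4.0
k = int(C * n * np.log(n) / eps**2)           # k = Theta(n log n / eps^2)
loads = np.bincount(rng.integers(0, n, size=k), minlength=n)
normalized = loads * n / k                    # diagonal of (1/k) * sum_i Y_i
print(normalized.min(), normalized.max())     # whp both inside [1-eps, 1+eps]
```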

31 Solving the General Problem
Recall the general problem: given PSD matrices L_e with Σ_e L_e = L_G, find mostly-zero coefficients w_e with (1−ε) L_G ⪯ Σ_e w_e L_e ⪯ (1+ε) L_G.
The AW theorem yields a solution with O(n log n/ε²) non-zeros:
  Repeat k := Θ(n log n / ε²) times:
    Pick an edge e with probability p_e := Tr(L_e L_G⁻¹) / n.
    Increment w_e by 1/(k·p_e).
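A dense toy sketch of this sampler (my own; it normalizes by Σ_e Tr(L_e L_G⁺), which equals n up to the rank correction from the pseudoinverse):

```python
import numpy as np

def sample_general_problem(Ls, k, rng=np.random.default_rng(0)):
    """Ls: list of PSD matrices summing to L. Sample k terms with
    p_e proportional to Tr(L_e L^+) and reweight by 1/(k p_e),
    so that E[sum_e w_e L_e] = L."""
    L = sum(Ls)
    L_pinv = np.linalg.pinv(L)
    p = np.array([np.trace(Le @ L_pinv) for Le in Ls])
    p /= p.sum()
    w = np.zeros(len(Ls))
    for e in rng.choice(len(Ls), size=k, p=p):
        w[e] += 1.0 / (k * p[e])
    return w
```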

32 Derandomization
Vector problem (as on slide 28). Theorem [Young '94]: the multiplicative weights method deterministically gives w with O(log n/ε²) non-zeros. (Alternatively, use pessimistic estimators on the Hoeffding proof.)
General problem (as on slide 27). Theorem [de Carli Silva-Harvey-Sato '11]: the matrix multiplicative weights method (Arora-Kale '07) deterministically gives w with O(n log n/ε²) non-zeros. (Alternatively, use matrix pessimistic estimators (Wigderson-Xiao '06).)

33 MWUM for "Balls & Bins"
(Figure: a number line of λ values, with a lower barrier l and an upper barrier u.)
Let λ_i = load in bin i. Initially λ = 0. Goal: keep every λ_i between the barriers l and u.
Introduce penalty functions exp(l − λ_i) and exp(λ_i − u).
Find a bin λ_i to throw a ball into such that, after increasing l by δ_l and u by δ_u, the penalties don't grow:
  Σ_i exp(l + δ_l − λ_i') ≤ Σ_i exp(l − λ_i)  and  Σ_i exp(λ_i' − (u + δ_u)) ≤ Σ_i exp(λ_i − u).
Careful analysis shows O(n log n/ε²) balls is enough.
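A runnable sketch of this barrier process (the step sizes are my own, chosen so that filling the least-loaded bin provably keeps both penalties from growing; the talk does not specify these constants):

```python
import numpy as np

n, eps = 50, 0.2
h = eps                               # height of each ball
d_lo = (1 - np.exp(-h)) / n           # lower-barrier step: penalty can't grow
d_hi = (np.exp(h) - 1) / n            # upper-barrier step: penalty can't grow
lo, hi = -np.log(n), np.log(n)        # initial barriers with small penalties
lam = np.zeros(n)
for _ in range(int(4 * n * np.log(n) / eps**2)):   # O(n log n / eps^2) balls
    lam[np.argmin(lam)] += h          # least-loaded bin keeps both sums down
    lo += d_lo
    hi += d_hi
print(lam.max() / lam.min())          # max load / min load, close to 1 + O(eps)
```

The least-loaded bin works for both penalties at once: its term dominates the lower sum (so shrinking it offsets the e^{δ_l} growth) and is the smallest term in the upper sum (so inflating it is affordable).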

34 MMWUM for General Problem
Let A = 0 and let λ denote its eigenvalues. Goal: keep every eigenvalue λ_i between the barriers l and u.
Use penalty functions Tr exp(l·I − A) and Tr exp(A − u·I).
Find a matrix L_e such that, adding α·L_e to A and increasing l by δ_l and u by δ_u, the penalties don't grow:
  Tr exp((l+δ_l)·I − (A+α·L_e)) ≤ Tr exp(l·I − A)  and  Tr exp((A+α·L_e) − (u+δ_u)·I) ≤ Tr exp(A − u·I).
Careful analysis shows O(n log n/ε²) matrices is enough.

35 Beating Sampling & MMWUM
To get a better bound, try changing the penalty functions to be steeper!
Use penalty functions Tr (A − l·I)⁻¹ and Tr (u·I − A)⁻¹.
Find a matrix L_e such that, adding α·L_e to A and increasing l by δ_l and u by δ_u, the penalties don't grow:
  Tr ((A+α·L_e) − (l+δ_l)·I)⁻¹ ≤ Tr (A − l·I)⁻¹  and  Tr ((u+δ_u)·I − (A+α·L_e))⁻¹ ≤ Tr (u·I − A)⁻¹.
All eigenvalues stay within [l, u].

36 Beating Sampling & MMWUM
With the steeper penalties Tr (A − l·I)⁻¹ and Tr (u·I − A)⁻¹ and the same barrier-shifting step as the previous slide:
Theorem [Batson-Spielman-Srivastava '09] in the rank-1 case, [de Carli Silva-Harvey-Sato '11] for the general case: this solves the general problem (given PSD matrices L_e with Σ_e L_e = L_G, find mostly-zero w_e with (1−ε) L_G ⪯ Σ_e w_e L_e ⪯ (1+ε) L_G) with O(n/ε²) non-zeros.
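The steeper penalties are simple to evaluate; a small sketch (my own helper names) of the quantities and the acceptance test on this slide:

```python
import numpy as np

def inverse_barriers(A, lo, hi):
    """Tr (A - lo*I)^(-1) and Tr (hi*I - A)^(-1); finite exactly when
    all eigenvalues of A lie strictly inside (lo, hi)."""
    evals = np.linalg.eigvalsh(A)
    assert lo < evals.min() and evals.max() < hi
    return np.sum(1.0 / (evals - lo)), np.sum(1.0 / (hi - evals))

def step_ok(A, Le, alpha, lo, hi, d_lo, d_hi):
    """Does adding alpha*Le and shifting the barriers keep both
    penalties from growing?"""
    p_lo, p_hi = inverse_barriers(A, lo, hi)
    q_lo, q_hi = inverse_barriers(A + alpha * Le, lo + d_lo, hi + d_hi)
    return q_lo <= p_lo and q_hi <= p_hi
```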

37 Applications
Theorem [de Carli Silva-Harvey-Sato '11]: given PSD matrices L_e s.t. Σ_e L_e = L, there is an algorithm to find w with O(n/ε²) non-zeros such that (1−ε) L ⪯ Σ_e w_e L_e ⪯ (1+ε) L.
Application 1: spectral sparsifiers with costs. Given costs on the edges of G, one can find a sparsifier H whose cost is at most (1+ε) times the cost of G.
Application 2: simultaneous spectral sparsifiers. Given two graphs G₁ & G₂ with a bijection between their edges, one can choose edges that simultaneously sparsify G₁ & G₂.
Application 3: sparse SDP solutions. min { cᵀy : Σ_i y_i A_i ⪰ B, y ≥ 0 }, where the A_i's and B are PSD, has a nearly optimal solution with O(n/ε²) non-zeros.

38 Open Questions
– Use of sparsifiers in other areas (infoviz, etc.)
– Sparsifiers for directed graphs
– Construction of expander graphs
– More control of the weights w_e
– A combinatorial proof of spectral sparsifiers
– More applications of our general theorem

