1
Optimization on Graphs
Speaker: Rasmus Kyng, ETH Zurich, Dept. of Computer Science (inf.ethz.ch)
2
Optimization on Graphs
Graph $G = (V, E)$, with pairwise interactions on $x(u), x(v)$ for $(u, v) \in E$; variables $x \in \mathbb{R}^V$.
Edge constraints: for all $(u, v) \in E$, $c_{u,v}(x(u), x(v)) \le 0$.
Objective: $\min_{x \in \mathbb{R}^V} \sum_{(u,v) \in E} c_{u,v}(x(u), x(v))$, subject to the edge constraints.
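To make this setup concrete, here is a minimal sketch in Julia (the deck's own language, via Laplacians.jl). The `GraphProblem` type, the `edge_cost` helper, and the quadratic cost are illustrative assumptions, not from the talk:

```julia
# A graph-structured objective: each edge (u, v) contributes a cost that
# depends only on the pair x[u], x[v].
struct GraphProblem
    n::Int                         # number of vertices
    edges::Vector{Tuple{Int,Int}}  # edge list
end

# Illustrative convex edge cost; any convex c_{u,v} fits the framework.
edge_cost(xu, xv) = (xu - xv)^2

function objective(p::GraphProblem, x::Vector{Float64})
    total = 0.0
    for (u, v) in p.edges
        total += edge_cost(x[u], x[v])
    end
    return total
end

# Usage: a path graph on 4 vertices.
p = GraphProblem(4, [(1, 2), (2, 3), (3, 4)])
println(objective(p, [0.0, 1.0, 1.5, 3.0]))   # 1.0 + 0.25 + 2.25 = 3.5
```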
4
Optimization on Graphs: An Example
5
Isotonic Regression
Predict a child's height from the mother's height. What model? An increasing function?
[Scatter plot: height of mother on the x-axis, height of child on the y-axis]
8
Isotonic Regression
Build a graph $G = (V, E)$ with a vertex per data point, and fit a value $f(v)$ at each vertex.
[Plot: height of mother vs. height of child, with vertices $v_1, v_2, \dots$ ordered by mother's height]
10
Isotonic Regression
Constraint: for all $(u, v) \in E$, $f(u) \le f(v)$.
Cost: $\min_f \sum_{v \in V} \big(f(v) - \text{child\_height}(v)\big)^2$
[Plot: height of mother vs. height of child, with the monotone fit $f(v)$ on vertices $v_1, v_2, \dots$]
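For the chain graph above (data points ordered by mother's height), this least-squares isotonic regression has a classic exact algorithm, pool-adjacent-violators (PAVA). A minimal sketch in Julia; the name `isotonic_pava` is mine, and this handles only the chain case, not the general constraint graphs discussed later:

```julia
# Pool-adjacent-violators: least-squares isotonic regression on a chain.
# Maintains blocks of pooled observations; merges adjacent blocks whenever
# their means violate monotonicity. Each block's fit is its mean.
function isotonic_pava(y::Vector{Float64})
    sums = Float64[]   # sum of observations in each block
    counts = Int[]     # number of observations in each block
    for yi in y
        push!(sums, yi); push!(counts, 1)
        # Merge while the last block's mean is below the previous one's.
        while length(sums) > 1 &&
              sums[end] / counts[end] < sums[end-1] / counts[end-1]
            s = pop!(sums); c = pop!(counts)
            sums[end] += s
            counts[end] += c
        end
    end
    # Expand block means back to one fitted value per observation.
    f = Float64[]
    for (s, c) in zip(sums, counts)
        append!(f, fill(s / c, c))
    end
    return f
end

# Usage: child heights sorted by mother's height; the fit is non-decreasing.
println(isotonic_pava([150.0, 162.0, 158.0, 170.0]))
# [150.0, 160.0, 160.0, 170.0]
```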
15
Isotonic Regression
Constraint graph $G = (V, E)$, now with two features: height of mother and age of child.
An edge $(u, v) \in E$ requires $f(u) \le f(v)$ whenever $v$ has both a taller mother AND an older child.
[Plot: height of mother on one axis, age of child on the other, with an edge from $u$ to $v$]
17
Isotonic Regression
Constraint: for all $(u, v) \in E$, $f(u) \le f(v)$.
Cost: $\min_f \sum_{v \in V} \big(f(v) - \text{child\_height}(v)\big)^2$
[Plot: constraint graph over height of mother and age of child, with observed child heights at the vertices]
19
Optimization on Graphs
Graph $G = (V, E)$, $f \in \mathbb{R}^V$; functions & constraints on pairs $f(u), f(v)$ for $(u, v) \in E$.
$f(v)$: predicted height of child.
Constraint: for all $(u, v) \in E$, $f(u) \le f(v)$.
Cost: $\min_f \sum_{v \in V} \big(f(v) - \text{observed\_child\_height}(v)\big)^2$
20
Optimization on Graphs
Graph $G = (V, E)$, $x \in \mathbb{R}^V$; functions & constraints on pairs $x(u), x(v)$ for $(u, v) \in E$.
(+ barrier trick) Unconstrained problem with modified objective: $\min_{x \in \mathbb{R}^V} \sum_{(u,v) \in E} c_{u,v}(x(u), x(v))$
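The barrier trick replaces a hard constraint such as $f(u) \le f(v)$ by a log-barrier term added to the objective. A minimal sketch, with an illustrative barrier weight $1/t$ and toy data; this is not the talk's actual interior-point scheme:

```julia
# Log-barrier reformulation: replace the hard constraint f[u] <= f[v]
# by adding -(1/t) * log(f[v] - f[u]) to the objective. As t grows,
# the unconstrained minimizer approaches the constrained one.
function barrier_objective(f::Vector{Float64}, y::Vector{Float64},
                           edges::Vector{Tuple{Int,Int}}, t::Float64)
    # Least-squares data term.
    obj = sum((f[v] - y[v])^2 for v in eachindex(y))
    # Barrier terms: infinite outside the feasible region f[u] < f[v].
    for (u, v) in edges
        slack = f[v] - f[u]
        slack <= 0 && return Inf
        obj -= log(slack) / t
    end
    return obj
end

# Usage: a strictly feasible point on a 3-vertex chain.
y = [150.0, 162.0, 158.0]
edges = [(1, 2), (2, 3)]
println(barrier_objective([149.0, 158.0, 161.0], y, edges, 10.0))
```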
21
Optimization on Graphs: Fast Algorithms
22
Optimization Primer: $\min_{x \in K} c(x)$
[Plot: a convex cost $c(x)$ over $x$, with its minimizer marked]
23
Optimization Primer: $\min_{x \in K} c(x)$. Second Order Methods: "Newton Steps"
$c(x_0 + \Delta) \approx c(x_0) + \frac{dc}{dx}(x_0)\,\Delta + \frac{1}{2}\,\frac{d^2 c}{dx^2}(x_0)\,\Delta^2$ (first-order term, then second-order term)
[Plot: $c$ near $x_0$ with its quadratic approximation]
24
Optimization Primer: $\min_{x \in K} c(x)$. Second Order Methods: "Newton Steps"
In higher dimensions: $c(x_0 + \Delta) \approx c(x_0) + \sum_i \frac{\partial c}{\partial x_i}(x_0)\,\Delta_i + \frac{1}{2}\sum_{i,j} \frac{\partial^2 c}{\partial x_i \partial x_j}(x_0)\,\Delta_i \Delta_j$
25
Optimization Primer: $\min_{x \in K} c(x)$. Second Order Methods: "Newton Steps"
In vector notation: $c(x_0 + \Delta) \approx c(x_0) + \nabla c(x_0) \cdot \Delta + \frac{1}{2}\,\Delta \cdot \nabla^2 c(x_0)\,\Delta$
26
Optimization Primer: $\min_{x \in K} c(x)$. Second Order Methods: "Newton Steps"
$c(x_0 + \Delta) \approx c(x_0) + g \cdot \Delta + \frac{1}{2}\,\Delta \cdot H\Delta$, writing $g = \nabla c(x_0)$ and $H = \nabla^2 c(x_0)$.
Minimizing the quadratic model over $\Delta$: set $H\Delta = -g$.
27
Optimization Primer: $\min_{x \in K} c(x)$. Second Order Methods: "Newton Steps"
Solving $H\Delta = -g$ gives $\Delta = -H^{-1}g$, so
$c(x_0 + \Delta) \approx c(x_0) - g \cdot H^{-1}g + \frac{1}{2}\,\Delta \cdot H\Delta$
28
Optimization Primer: $\min_{x \in K} c(x)$. Second Order Methods: "Newton Steps"
$c(x_0 + \Delta) \approx c(x_0) - g \cdot H^{-1}g + \frac{1}{2}\,(H^{-1}g) \cdot H\,(H^{-1}g)$
29
Optimization Primer: $\min_{x \in K} c(x)$. Second Order Methods: "Newton Steps"
The two terms combine: $c(x_0 + \Delta) \approx c(x_0) - \frac{1}{2}\,g \cdot H^{-1}g$
30
Optimization Primer: $\min_{x \in K} c(x)$. Second Order Methods: "Newton Steps"
$c(x_0 + \Delta) \approx c(x_0) + g \cdot \Delta + \frac{1}{2}\,\Delta \cdot H\Delta$, with $H\Delta = -g$.
$x_1 = x_0 + \Delta$: a good update step!
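A minimal Newton's-method loop in Julia, repeating the step above. The toy objective $c(x) = e^{x_1} - x_1 + x_2^2$ and the iteration cap are illustrative:

```julia
using LinearAlgebra

# Newton's method: repeatedly minimize the local quadratic model
# c(x + Δ) ≈ c(x) + g·Δ + ½ Δ·HΔ by solving HΔ = -g.
function newton(grad, hess, x0::Vector{Float64}; iters::Int = 20)
    x = copy(x0)
    for _ in 1:iters
        g = grad(x)
        norm(g) < 1e-12 && break
        x += -(hess(x) \ g)   # Newton step: Δ solves HΔ = -g
    end
    return x
end

# Toy convex objective c(x) = exp(x₁) - x₁ + x₂², minimized at (0, 0).
grad(x) = [exp(x[1]) - 1, 2x[2]]
hess(x) = [exp(x[1]) 0.0; 0.0 2.0]
println(newton(grad, hess, [2.0, 3.0]))   # ≈ [0.0, 0.0]
```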
31
Second Order Methods
Usually, finding the step $\Delta$ that solves the linear equation $H\Delta = -g$ is too expensive!
But for optimization on graphs, we can find the Newton step much faster than we can solve general linear equations.
32
Graphs and Hessian Linear Equations
Newton Step: find $\Delta$ s.t. $H\Delta = -g$.
Gaussian Elimination: $O(n^3)$ time for an $n \times n$ matrix $H$. "Faster" methods: $O(n^\omega)$ time.
A Hessian $H$ arising from a sum of convex functions on two variables each = symmetric M-matrix ≈ Laplacian.
Spielman-Teng '04: Laplacian linear equations can be solved in $\tilde{O}(\#\text{edges})$ time.
33
Graphs and Hessian Linear Equations
Newton Step: find $\Delta$ s.t. $H\Delta = -g$.
Gaussian Elimination: $O(n^3)$ time for an $n \times n$ matrix $H$. "Faster" methods: $O(n^\omega)$ time.
A Hessian $H$ arising from a sum of convex functions on two variables each = symmetric M-matrix ≈ Laplacian.
Daitch-Spielman '08: Symmetric M-matrix linear equations can be solved in $\tilde{O}(\#\text{edges})$ time.
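For intuition, a plain-Julia sketch that builds a graph Laplacian and solves a system with dense linear algebra. The nearly-linear-time solvers cited above replace this direct solve; the 4-cycle example is illustrative:

```julia
using LinearAlgebra

# Build the graph Laplacian L = D - A: degrees on the diagonal,
# -1 in the off-diagonal slot for each edge.
function laplacian(n::Int, edges::Vector{Tuple{Int,Int}})
    L = zeros(n, n)
    for (u, v) in edges
        L[u, u] += 1; L[v, v] += 1
        L[u, v] -= 1; L[v, u] -= 1
    end
    return L
end

# Solve Lx = b on a 4-cycle. L is singular (constant vectors are in its
# nullspace), so we use the pseudoinverse and a b that sums to zero.
L = laplacian(4, [(1, 2), (2, 3), (3, 4), (4, 1)])
b = [1.0, 0.0, -1.0, 0.0]
x = pinv(L) * b
println(norm(L * x - b))   # ≈ 0
```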
34
Convex Functions
Quadratic model: $c(x_0 + \Delta) \approx c(x_0) + g \cdot \Delta + \frac{1}{2}\,\Delta \cdot H\Delta$
Convexity: $c(x_0 + \Delta) - \big(c(x_0) + g \cdot \Delta\big) \ge 0$, so $\Delta \cdot H\Delta \ge 0$ for every $\Delta$.
$H$ has no negative eigenvalues!
[Plot: a convex $c$ lies above its tangent line at $x_0$]
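A quick numerical check of this fact, on an illustrative convex function $c(y, z) = e^{y-z} + y^2$ (my example, not from the talk):

```julia
using LinearAlgebra

# Hessian of c(y, z) = exp(y - z) + y^2 at a point:
# H = [exp(y-z)+2  -exp(y-z); -exp(y-z)  exp(y-z)].
y, z = 0.3, -0.1
e = exp(y - z)
H = [e+2 -e; -e e]
println(eigvals(H))   # both eigenvalues are non-negative
```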
36
Hessians & Graphs
$x = (y, z, w)$. Graph-Structured Cost Function: $c(x) = c_1(y, z) + c_2(y, w) + c_3(z, w)$
[Triangle graph on vertices $y$, $z$, $w$, with edge costs $c_1(y, z)$, $c_2(y, w)$, $c_3(z, w)$]
$\nabla^2 c(x) = \nabla^2 c_1(y, z) + \nabla^2 c_2(y, w) + \nabla^2 c_3(z, w)$
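A sketch of this assembly in Julia: each edge scatters a 2-by-2 Hessian block into the full Hessian. The quadratic edge costs $w_e (x_u - x_v)^2$ are illustrative stand-ins for general convex $c_i$:

```julia
# Assemble ∇²c from per-edge 2x2 Hessian blocks. Each edge (u, v) with
# cost w * (x[u] - x[v])^2 contributes w * [2 -2; -2 2] on rows/cols (u, v).
function assemble_hessian(n::Int, edges::Vector{Tuple{Int,Int}},
                          weights::Vector{Float64})
    H = zeros(n, n)
    for ((u, v), w) in zip(edges, weights)
        H[u, u] += 2w; H[v, v] += 2w   # ∂²/∂x_u², ∂²/∂x_v²
        H[u, v] -= 2w; H[v, u] -= 2w   # mixed partials
    end
    return H
end

# Triangle on (y, z, w) = variables 1, 2, 3 with unit weights: the result
# is an M-matrix (twice the triangle's graph Laplacian).
H = assemble_hessian(3, [(1, 2), (1, 3), (2, 3)], [1.0, 1.0, 1.0])
display(H)   # [4 -2 -2; -2 4 -2; -2 -2 4]
```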
37
Second Derivatives
$\nabla^2 c(x) = \nabla^2 c_1(y, z) + \nabla^2 c_2(y, w) + \nabla^2 c_3(z, w)$
If every term is a 2-by-2 PSD block (non-negative eigenvalues) with non-positive off-diagonal entries, e.g. $\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$ padded with zeros, then the sum is an M-matrix.
38
Second Derivatives
$\nabla^2 c(x) = \nabla^2 c_1(y, z) + \nabla^2 c_2(y, w) + \nabla^2 c_3(z, w)$, with rows and columns ordered $y, z, w$:
$\nabla^2 c_1(y, z) = \begin{pmatrix} 1 & -1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $\nabla^2 c_2(y, w) = \begin{pmatrix} 4 & 0 & -2 \\ 0 & 0 & 0 \\ -2 & 0 & 1 \end{pmatrix}$, $\nabla^2 c_3(z, w) = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & -1 & 1 \end{pmatrix}$
Adding the first two terms gives $\begin{pmatrix} 5 & -1 & -2 \\ -1 & 1 & 0 \\ -2 & 0 & 1 \end{pmatrix}$; adding the third gives
$\nabla^2 c(x) = \begin{pmatrix} 5 & -1 & -2 \\ -1 & 2 & -1 \\ -2 & -1 & 2 \end{pmatrix}$
49
Second Derivatives
$\nabla^2 c(x) = \nabla^2 c_1(y, z) + \nabla^2 c_2(y, w) + \nabla^2 c_3(z, w)$
If every term is a 2-by-2 PSD block (non-negative eigenvalues) with non-positive off-diagonals, the sum is an M-matrix, and the Newton step can be computed in nearly-linear time!
51
Optimization on Graphs
Variants of this framework have been used for many optimization problems:
Maximum flow [DS08, CKMST11, KMP12, Mad13, Mad16]
Minimum cost flow [LS14]
Negative weight shortest paths [CMSV16]
Isotonic regression [KRS15]
Regularized Lipschitz learning on graphs [KRSS15]
p-norm flow [KPSW19]
52
Optimization on Graphs
M-matrix & Laplacian matrix solvers are used for many other problems in TCS:
Learning on graphs [ZGL03, ZS04, ZBLWS04]
Graph partitioning [OSV12]
Sampling random spanning trees [KM09, MST15, DKPRS17, DPR17, S18]
Graph sparsification [SS08, LKP12, KPPS17]
53
Solving M-matrices
Daitch-Spielman '08 gave a reduction to Laplacian matrices; Spielman-Teng '04 gave a fast Laplacian solver.
Since then, much work on trying to get a practical solver: [KMP10, KMP11, KOSZ13, LS13, CKM+14, PS14, LPS16]
Did we finally succeed? Kyng-Sachdeva '16
54
Optimization on Graphs: Toward Practical Algorithms
55
Julia Package: Laplacians.jl

Graph              Apx Elim   CMG    LAMG   Other
1000x1000 grid     6.3        3.0    15
100x100x100 grid   8.7        3.4
Rand 4-regular     17.2       12     13.7   1.7 (CG)
Pref attach        9.5        10.8   5.1    3.2 (ICC)
Chimera ( ,11)     28         350    74
Min Cost Flow 1    5.9        4.3    >60
Min Cost Flow 2    6.8        23     29

Apx Elim = Approximate Elimination (K & Spielman), CMG = Combinatorial Multigrid (Koutis), LAMG = Lean Algebraic Multigrid (Livne & Brandt), CG = Conjugate Gradient, ICC = Incomplete Cholesky
Summary: Approximate Elimination processes between 300k and 500k entries per second, for 8-digit accuracy. Others vary widely.
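A hedged usage sketch for Laplacians.jl. The entry points `chimera` and `approxchol_lap` are as described in the package README, but the exact signatures should be checked against the current docs; the graph size and tolerance are illustrative:

```julia
# Usage sketch: build the approximate-elimination Laplacian solver and
# apply it to a random right-hand side.
using Laplacians, Statistics

n = 100_000
a = chimera(n, 11)                     # adjacency matrix of a random test graph
b = randn(n)
b = b .- mean(b)                       # right-hand side must sum to zero
solver = approxchol_lap(a; tol=1e-8)   # approximate-elimination solver for lap(a)
x = solver(b)                          # approximately solves L x = b
```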
62
Thanks!
github.com/danspielman/Laplacians.jl/
Slides: rasmuskyng.com/#talks
References:
Approximate Gaussian Elimination for Laplacians (R. Kyng, S. Sachdeva, 2016)
Faster Approximate Lossy Generalized Flow via Interior Point Algorithms (S. Daitch, D. Spielman, 2008)
Fast, Provable Algorithms for Isotonic Regression in all L_p norms (R. Kyng, A. B. Rao, S. Sachdeva, 2015)
Nearly-Linear Time Algorithms for Preconditioning and Solving SDD Linear Systems (D. Spielman, S.-H. Teng, 2004)