Optimization on Graphs


1 Optimization on Graphs
Speaker: Rasmus Kyng, ETH Zurich, Dept. of Computer Science (inf.ethz.ch)

2 Optimization on Graphs
Graph 𝐺=(𝑉,𝐸): pairwise interactions on 𝒙(𝑢), 𝒙(𝑣) for (𝑢,𝑣)∈𝐸, where 𝒙∈ℝ^𝑉.
Objective: min_{𝒙∈ℝ^𝑉} Σ_{(𝑢,𝑣)∈𝐸} 𝑓_{𝑢,𝑣}(𝒙(𝑢), 𝒙(𝑣))
Edge constraints: for all (𝑢,𝑣)∈𝐸, 𝑔_{𝑢,𝑣}(𝒙(𝑢), 𝒙(𝑣)) ≤ 0


4 Optimization on Graphs: An Example

5 Isotonic Regression
Predict a child's height from the mother's height. Model?
[Scatter plot: height of child vs. height of mother]

6 Isotonic Regression
Predict a child's height from the mother's height. Model? An increasing function?
[Scatter plot: height of child vs. height of mother]

8 Isotonic Regression
Model the data as a graph 𝐺=(𝑉,𝐸): one vertex per data point, ordered by mother's height (𝑣₁, 𝑣₂, …); fit a value 𝒙(𝑣), the predicted child height, at each vertex 𝑣.
[Plot: fitted values 𝒙(𝑣) over vertices ordered by height of mother]


10 Isotonic Regression
Constraint: for all (𝑢,𝑣)∈𝐸, 𝒙(𝑢) ≤ 𝒙(𝑣)
Cost: min_𝒙 Σ_{𝑣∈𝑉} (𝒙(𝑣) − child_height(𝑣))²
[Plot: fitted values 𝒙(𝑣) over vertices 𝑣₁, 𝑣₂, … ordered by height of mother]

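For this one-dimensional case (vertices on a path, ordered by mother's height), the classical pool-adjacent-violators algorithm solves the least-squares isotonic regression exactly. A minimal sketch of that baseline, not the graph-based method discussed later in the talk:

```python
def isotonic_pava(y):
    """Least-squares isotonic fit x to observations y, with x[0] <= x[1] <= ...
    Pool adjacent violators: merge adjacent blocks whose means are out of
    order, replacing each merged block by its weighted mean."""
    blocks = []  # each block is [mean, weight = number of pooled points]
    for v in y:
        blocks.append([float(v), 1])
        # Pool while the last two block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            w = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w, w])
    # Expand pooled blocks back to one fitted value per observation.
    x = []
    for mean, w in blocks:
        x.extend([mean] * w)
    return x

print(isotonic_pava([1.0, 3.0, 2.0, 4.0]))  # [1.0, 2.5, 2.5, 4.0]
```

The output is monotone and pools the out-of-order pair (3, 2) to their mean 2.5.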

15 Isotonic Regression
Constraint graph 𝐺=(𝑉,𝐸): now each data point has two features, height of mother and age of child.
[Plot: data points in the (height of mother, age of child) plane; predict height of child]

16 Isotonic Regression
Constraint graph 𝐺=(𝑉,𝐸): add an edge (𝑢,𝑣) with 𝒙(𝑢) ≤ 𝒙(𝑣) when 𝑣 has both a taller mother AND an older child.

17 Isotonic Regression
Constraint: for all (𝑢,𝑣)∈𝐸, 𝒙(𝑢) ≤ 𝒙(𝑣)
Cost: min_𝒙 Σ_{𝑣∈𝑉} (𝒙(𝑣) − child_height(𝑣))²


19 Optimization on Graphs
Graph 𝐺=(𝑉,𝐸), 𝒙∈ℝ^𝑉; functions & constraints on pairs (𝒙(𝑢), 𝒙(𝑣)) for (𝑢,𝑣)∈𝐸.
Here 𝒙(𝑣) is the predicted height of child 𝑣.
Constraint: for all (𝑢,𝑣)∈𝐸, 𝒙(𝑢) ≤ 𝒙(𝑣)
Cost: min_𝒙 Σ_{𝑣∈𝑉} (𝒙(𝑣) − observed_child_height(𝑣))²

20 Optimization on Graphs
Graph 𝐺=(𝑉,𝐸), 𝒙∈ℝ^𝑉; functions & constraints on pairs (𝒙(𝑢), 𝒙(𝑣)) for (𝑢,𝑣)∈𝐸.
With a barrier trick, the edge constraints fold into the objective, leaving an unconstrained problem with a modified objective:
min_{𝒙∈ℝ^𝑉} Σ_{(𝑢,𝑣)∈𝐸} 𝑓_{𝑢,𝑣}(𝒙(𝑢), 𝒙(𝑣))

21 Optimization on Graphs: Fast Algorithms

22 Optimization Primer
min_{𝒙∈𝒟} c(𝒙)
[Figure: domain 𝒟 with minimizer 𝒙*]

23 Optimization Primer
Second Order Methods – "Newton Steps"
In one variable, Taylor-expand around 𝒙₀:
c(𝒙₀+Δ) ≈ c(𝒙₀) + (dc/dx)(𝒙₀)·Δ (1st order) + ½·(d²c/dx²)(𝒙₀)·Δ² (2nd order)
[Figure: iterates 𝒙₀, 𝒙₁, 𝒙₂ approaching 𝒙* in 𝒟]

24 Optimization Primer
Second Order Methods – "Newton Steps"
In many variables:
c(𝒙₀+𝚫) ≈ c(𝒙₀) + Σᵢ (∂c/∂xᵢ)(𝒙₀)·𝚫(i) + ½·Σ_{i,j} (∂²c/∂xᵢ∂xⱼ)(𝒙₀)·𝚫(i)·𝚫(j)

25 Optimization Primer
Second Order Methods – "Newton Steps"
In vector form: c(𝒙₀+𝚫) ≈ c(𝒙₀) + ∇c(𝒙₀)·𝚫 + ½·𝚫·∇²c(𝒙₀)𝚫
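The quadratic model above can be checked numerically on a one-dimensional example. A small sketch using c(x) = eˣ, for which the gradient and Hessian both equal eˣ:

```python
import math

def quad_model(c0, g, H, d):
    """Second-order Taylor model: c(x0) + g*d + 0.5*H*d^2."""
    return c0 + g * d + 0.5 * H * d * d

x0, d = 1.0, 0.01
c0 = math.exp(x0)              # c(x0); for c = exp, g = H = c0 as well
approx = quad_model(c0, c0, c0, d)
exact = math.exp(x0 + d)
print(exact - approx)          # error is O(d^3), here roughly 4.5e-7
```

For small steps the quadratic model tracks the function to third-order accuracy, which is why Newton steps are so effective near the optimum.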

26 Optimization Primer
Second Order Methods – "Newton Steps"
Write 𝒈 = ∇c(𝒙₀) and 𝐻 = ∇²c(𝒙₀): c(𝒙₀+𝚫) ≈ c(𝒙₀) + 𝒈·𝚫 + ½·𝚫·𝐻𝚫
Minimizing the quadratic model over 𝚫 gives 𝐻𝚫 = −𝒈

27 Optimization Primer
Second Order Methods – "Newton Steps"
Substituting 𝚫 = −𝐻⁻¹𝒈: c(𝒙₀+𝚫) ≈ c(𝒙₀) − 𝒈·𝐻⁻¹𝒈 + ½·𝚫·𝐻𝚫

28 Optimization Primer
Second Order Methods – "Newton Steps"
c(𝒙₀+𝚫) ≈ c(𝒙₀) − 𝒈·𝐻⁻¹𝒈 + ½·(𝐻⁻¹𝒈)·𝐻(𝐻⁻¹𝒈)

29 Optimization Primer
Second Order Methods – "Newton Steps"
c(𝒙₀+𝚫) ≈ c(𝒙₀) − ½·𝒈·𝐻⁻¹𝒈

30 Optimization Primer
Second Order Methods – "Newton Steps"
c(𝒙₀+𝚫) ≈ c(𝒙₀) + 𝒈·𝚫 + ½·𝚫·𝐻𝚫; solve 𝐻𝚫 = −𝒈 and set 𝒙₁ = 𝒙₀ + 𝚫. A good update step!
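The loop above (build the quadratic model, solve 𝐻𝚫 = −𝒈, update) is ordinary Newton's method. A minimal one-dimensional sketch on an illustrative convex function, c(x) = x² + eˣ, not an example from the talk:

```python
import math

def newton_1d(grad, hess, x, iters=20):
    """Newton's method: at each step solve H * delta = -g, then update x."""
    for _ in range(iters):
        g, H = grad(x), hess(x)
        delta = -g / H          # the Newton step; in n dimensions, solve H @ delta = -g
        x += delta
        if abs(delta) < 1e-12:  # step size negligible: converged
            break
    return x

# Minimize c(x) = x^2 + e^x: gradient 2x + e^x, Hessian 2 + e^x > 0 (strictly convex).
x_star = newton_1d(lambda x: 2 * x + math.exp(x), lambda x: 2 + math.exp(x), 0.0)
print(x_star)  # the gradient vanishes here: 2*x_star + exp(x_star) == 0
```

Because the Hessian is strictly positive, each step is well defined and the iterates converge quadratically to the minimizer.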

31 Second Order Methods
Usually, finding the step 𝚫 that solves the linear equation 𝐻𝚫 = −𝒈 is too expensive! But for optimization on graphs, we can find the Newton step much faster than we can solve general linear equations.

32 Graphs and Hessian Linear Equations
Newton step: find 𝚫 s.t. 𝐻𝚫 = −𝒈
Gaussian elimination: O(n³) time for an n×n matrix 𝐻; "faster" methods via fast matrix multiplication: O(n^ω) time, ω ≈ 2.37.
Hessian 𝐻 from a sum of convex functions on two variables = symmetric M-matrix ≈ Laplacian.
Spielman-Teng '04: Laplacian linear equations can be solved in nearly-linear Õ(#edges) time.

33 Graphs and Hessian Linear Equations
Daitch-Spielman '08: symmetric M-matrix linear equations can also be solved in nearly-linear Õ(#edges) time (by reduction to Laplacians).

34 Convex Functions
1st order: c(𝒙₀+𝚫) ≈ c(𝒙₀) + 𝒈·𝚫
2nd order: c(𝒙₀+𝚫) ≈ c(𝒙₀) + 𝒈·𝚫 + ½·𝚫·𝐻𝚫, with ½·𝚫·𝐻𝚫 ≥ 0 when c is convex: 𝐻 has no negative eigenvalues!


36 Hessians & Graphs
𝒙 = (𝑦, 𝑧, 𝑤)
Graph-structured cost function: c(𝒙) = c₁(𝑦,𝑧) + c₂(𝑦,𝑤) + c₃(𝑧,𝑤), one term per edge of the triangle on 𝑦, 𝑧, 𝑤.
∇²c(𝒙) = ∇²c₁(𝑦,𝑧) + ∇²c₂(𝑦,𝑤) + ∇²c₃(𝑧,𝑤)

37 Second Derivatives
∇²c(𝒙) = ∇²c₁(𝑦,𝑧) + ∇²c₂(𝑦,𝑤) + ∇²c₃(𝑧,𝑤)
Each term is 2-by-2 PSD (non-negative eigenvalues). If every term looks like
( 1 −1 ; −1 1 ),
i.e. PSD with non-positive off-diagonal entries, the sum is an M-matrix.

38 Second Derivatives
∇²c(𝒙) = ∇²c₁(𝑦,𝑧) + ∇²c₂(𝑦,𝑤) + ∇²c₃(𝑧,𝑤)
Each 2-by-2 edge Hessian embeds into its pair of coordinates of the 3-by-3 matrix ∇²c(𝒙), indexed by (𝑦, 𝑧, 𝑤):
∇²c₁(𝑦,𝑧) = ( 1 −1 ; −1 1 ) contributes ( 1 −1 0 ; −1 1 0 ; 0 0 0 )
∇²c₂(𝑦,𝑤) = ( 4 −2 ; −2 1 ) contributes ( 4 0 −2 ; 0 0 0 ; −2 0 1 )
∇²c₃(𝑧,𝑤) = ( 1 −1 ; −1 1 ) contributes ( 0 0 0 ; 0 1 −1 ; 0 −1 1 )
Sum: ∇²c(𝒙) = ( 5 −1 −2 ; −1 2 −1 ; −2 −1 2 ), an M-matrix.
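The assembly of the full Hessian from 2-by-2 edge Hessians can be reproduced in a few lines: embed each edge Hessian into its pair of coordinates, sum, and check the M-matrix sign pattern (non-positive off-diagonal entries). A pure-Python sketch using the (𝑦, 𝑧, 𝑤) example values from the slides:

```python
def embed(H2, i, j, n):
    """Place a 2x2 edge Hessian H2 on coordinates (i, j) of an n x n matrix."""
    M = [[0.0] * n for _ in range(n)]
    (a, b), (c, d) = H2
    M[i][i], M[i][j], M[j][i], M[j][j] = a, b, c, d
    return M

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Coordinates: y = 0, z = 1, w = 2. Edge Hessians as on the slides.
H = [[0.0] * 3 for _ in range(3)]
for H2, (i, j) in [(((1, -1), (-1, 1)), (0, 1)),   # Hessian of c1(y, z)
                   (((4, -2), (-2, 1)), (0, 2)),   # Hessian of c2(y, w)
                   (((1, -1), (-1, 1)), (1, 2))]:  # Hessian of c3(z, w)
    H = mat_add(H, embed(H2, i, j, 3))

print(H)  # [[5.0, -1.0, -2.0], [-1.0, 2.0, -1.0], [-2.0, -1.0, 2.0]]
# M-matrix sign pattern: every off-diagonal entry is <= 0.
assert all(H[i][j] <= 0 for i in range(3) for j in range(3) if i != j)
```

The resulting matrix has the same sparsity pattern as the graph itself, which is exactly what the fast Laplacian/M-matrix solvers exploit.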


50 Second Derivatives
If every 2-by-2 edge Hessian is PSD with non-positive off-diagonal entries, the sum is an M-matrix, and the Newton step can be computed in nearly linear time!

51 Optimization on Graphs
Variants of this framework have been used for many optimization problems:
Maximum flow [DS08, CKMST11, KMP12, Mad13, Mad16]
Minimum cost flow [LS14]
Negative weight shortest paths [CMSV16]
Isotonic regression [KRS15]
Regularized Lipschitz learning on graphs [KRSS15]
p-norm flow [KPSW19]

52 Optimization on Graphs
M-matrix & Laplacian matrix solvers are used for many other problems in TCS:
Learning on graphs [ZGL03, ZS04, ZBLWS04]
Graph partitioning [OSV12]
Sampling random spanning trees [KM09, MST15, DKPRS17, DPR17, S18]
Graph sparsification [SS08, LKP12, KPPS17]

53 Solving M-matrices
Daitch-Spielman '08 gave a reduction to Laplacian matrices; Spielman-Teng '04 gave a fast Laplacian solver.
Since then, much work on trying to get a practical solver: [KMP10, KMP11, KOSZ13, LS13, CKM+14, PS14, LPS16]
We finally succeeded? Kyng-Sachdeva '16

54 Optimization on Graphs: Toward Practical Algorithms

55 Julia Package: Laplacians.jl
Benchmark results:

Graph              Apx Elim   CMG    LAMG   Other
1000x1000 grid     6.3        3.0    15
100x100x100 grid   8.7        3.4
Rand 4-regular     17.2       12     13.7   1.7 (CG)
Pref attach        9.5        10.8   5.1    3.2 (ICC)
Chimera ( ,11)     28         350    74
Min Cost Flow 1    5.9        4.3    >60
Min Cost Flow 2    6.8        23     29

Apx Elim = Approximate Elimination (K & Spielman), CMG = Combinatorial Multigrid (Koutis), LAMG = Lean Algebraic Multigrid (Livne & Brandt), CG = Conjugate Gradient, ICC = Incomplete Cholesky
Summary: Approximate Elimination processes between 300k and 500k entries per second, for 8-digit accuracy; the others vary widely.

62 Thanks!
github.com/danspielman/Laplacians.jl/
Slides: rasmuskyng.com/#talks
References:
Approximate Gaussian Elimination for Laplacians (R. Kyng, S. Sachdeva, 2016)
Faster Approximate Lossy Generalized Flow via Interior Point Algorithms (S. Daitch, D. Spielman, 2008)
Fast, Provable Algorithms for Isotonic Regression in all Lp norms (R. Kyng, A. B. Rao, S. Sachdeva, 2015)
Nearly-Linear Time Algorithms for Preconditioning and Solving SDD Linear Systems (D. Spielman, S.-H. Teng, 2004)

