Discrete Optimization in Computer Vision M. Pawan Kumar Slides will be available online

About the Tutorial Emphasis on ‘optimization’. A bit math-y [M. Ade-Up, N. Ames. A large-scale study of receptivity to math. The Fake Journal of Convenient Results, September]. Best time for math is now (9am – 11am). Ask questions anytime.

Outline Problem Formulation –Energy Function –Energy Minimization Graph Cuts Algorithms – Part I Message Passing Algorithms – Part II

Markov Random Field [figure: variables V_a, V_b, V_c, V_d, each taking Label l_0 or Label l_1] Random Variables V = {V_a, V_b, …} Labels L = {l_0, l_1, …} Labelling f: {a, b, …} → {0, 1, …}

Energy Function Q(f) = ∑_a θ_a;f(a) (Unary Potential). Easy to minimize on its own. Neighbourhood [figure: edges between adjacent variables].

Energy Function E: (a,b) ∈ E iff V_a and V_b are neighbours. Here E = { (a,b), (b,c), (c,d) }.

Energy Function Q(f) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) (Pairwise Potential).

Energy Function Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b), where θ is the Parameter.

Outline Problem Formulation –Energy Function –Energy Minimization Graph Cuts Algorithms – Part I Message Passing Algorithms – Part II

Energy Minimization Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) [figure: an example labelling]

Energy Minimization Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) = 13 for the labelling shown

Energy Minimization Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) [figure: another labelling]

Energy Minimization Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) = 27 for the labelling shown

Energy Minimization f* = arg min_f Q(f; θ). In general, NP-hard.

Energy Minimization [table: all 16 possible labellings (f(a), f(b), f(c), f(d)) with their energies Q(f; θ)] f* = {1, 0, 0, 1}
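
To make the exhaustive search concrete, here is a minimal Python sketch for the 4-variable, 2-label chain in the figures. The potential values are invented for illustration (the slides' actual numbers are not recoverable from this transcript); they are chosen so that the minimiser happens to match the f* = {1, 0, 0, 1} above.

```python
import itertools

# Illustrative (made-up) potentials for a 4-variable, 2-label chain
# V_a - V_b - V_c - V_d: theta_a;i = unary[a][i], theta_ab;ik = pairwise[(a,b)][i][k].
unary = {'a': [3, 0], 'b': [0, 3], 'c': [0, 3], 'd': [3, 0]}
pairwise = {('a', 'b'): [[0, 1], [1, 0]],
            ('b', 'c'): [[0, 1], [1, 0]],
            ('c', 'd'): [[0, 1], [1, 0]]}

def energy(f):
    """Q(f; theta) = sum_a theta_a;f(a) + sum_(a,b) theta_ab;f(a)f(b)."""
    return (sum(unary[a][f[a]] for a in unary) +
            sum(t[f[a]][f[b]] for (a, b), t in pairwise.items()))

# Exhaustive search over all 2^4 = 16 labellings; exponential in general,
# which is why the problem is NP-hard and the rest of the tutorial does better.
f_star = min((dict(zip('abcd', ls)) for ls in itertools.product([0, 1], repeat=4)),
             key=energy)
print(f_star, energy(f_star))   # {'a': 1, 'b': 0, 'c': 0, 'd': 1} 2
```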

Outline Problem Formulation Graph Cuts Algorithms – Part I Message Passing Algorithms – Part II

Interactive Binary Segmentation

Foreground histogram of RGB values: FG. Background histogram of RGB values: BG. ‘1’ indicates foreground and ‘0’ indicates background.

Interactive Binary Segmentation [figure] More likely to be foreground than background.

More likely to be background than foreground. θ_a;0 proportional to -log(BG(d_a)); θ_a;1 proportional to -log(FG(d_a)).

Interactive Binary Segmentation [figure] More likely to belong to the same label.

Interactive Binary Segmentation Less likely to belong to the same label. θ_ab;ik proportional to exp(-(d_a - d_b)^2) if i ≠ k; θ_ab;ik = 0 if i = k.
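
A sketch of how these potentials might be assembled in code, assuming FG and BG are lookup functions built from user scribbles, and treating the pixel value d_a as a scalar intensity for simplicity; all names here are hypothetical stand-ins.

```python
import math

# Hypothetical stand-ins for the foreground/background histograms FG and BG,
# each mapping a pixel value d_a to a probability; real code would build
# these from the user-marked foreground and background regions.
def FG(d): return 0.7 if d > 128 else 0.3
def BG(d): return 1.0 - FG(d)

def unary_potentials(d_a):
    """theta_a;0 = -log BG(d_a), theta_a;1 = -log FG(d_a) (up to a scale)."""
    return [-math.log(BG(d_a)), -math.log(FG(d_a))]

def pairwise_potential(d_a, d_b, i, k, scale=1.0):
    """Contrast-sensitive potential: 0 for equal labels, and a cost
    proportional to exp(-(d_a - d_b)^2) for unequal labels, so cutting
    across a strong image edge (large |d_a - d_b|) is cheap."""
    return 0.0 if i == k else scale * math.exp(-float(d_a - d_b) ** 2)
```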

Outline – Graph Cuts Algorithms Minimum Cut Problem Two-Label Submodular Energy Functions Move-Making Algorithms

Directed Graph D = (N, A) [figure: four nodes n_1, n_2, n_3, n_4 joined by arcs with positive arc lengths]

Cut D = (N, A). Let N_1 and N_2 be such that N_1 "union" N_2 = N and N_1 "intersection" N_2 = Φ. C is the set of arcs (n_1, n_2) ∈ A such that n_1 ∈ N_1 and n_2 ∈ N_2. Then C is a cut in the digraph D.

Cut [figure: a partition into N_1 and N_2] What is C? {(n_1,n_2), (n_1,n_4)}? {(n_1,n_4), (n_3,n_2)}? {(n_1,n_4)}? ✓

Cut [figure: another partition into N_1 and N_2] What is C? {(n_1,n_2), (n_1,n_4), (n_3,n_2)}? {(n_1,n_4), (n_3,n_2)}? {(n_4,n_3)}? ✓

Cut [figure: another partition into N_2 and N_1] What is C? {(n_1,n_2), (n_1,n_4), (n_3,n_2)}? {(n_1,n_4), (n_3,n_2)}? {(n_3,n_2)}? ✓

Cut D = (N, A). Let N_1 and N_2 be such that N_1 "union" N_2 = N and N_1 "intersection" N_2 = Φ. C is the set of arcs (n_1, n_2) ∈ A such that n_1 ∈ N_1 and n_2 ∈ N_2. Then C is a cut in the digraph D.

Weight of a Cut Sum of the lengths of all arcs in C.

Weight of a Cut w(C) = Σ_{(n_1,n_2) ∈ C} l(n_1, n_2)

Weight of a Cut [figure] What is w(C)? 3

Weight of a Cut [figure] What is w(C)? 5

Weight of a Cut [figure] What is w(C)? 15

st-Cut D = (N, A) with a source "s" and a sink "t". C is a cut such that s ∈ N_1 and t ∈ N_2. Then C is an st-cut.

Weight of an st-Cut w(C) = Σ_{(n_1,n_2) ∈ C} l(n_1, n_2)

Weight of an st-Cut [figure] What is w(C)? 3

Weight of an st-Cut [figure] What is w(C)? 15

Minimum Cut Problem Find a cut with the minimum weight !! C* = arg min_C w(C)

Solvers for the Minimum-Cut Problem: Augmenting Path and Push-Relabel algorithms (n: #nodes, m: #arcs, U: maximum arc length). [Slide credit: Andrew Goldberg]
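
As a concrete, if naive, reference point, here is a self-contained augmenting-path solver (Edmonds-Karp) in Python. It just makes the definitions above executable; production vision code uses the specialised augmenting-path (e.g. Boykov-Kolmogorov) or push-relabel implementations this slide refers to.

```python
from collections import deque

def min_cut(n, arcs, s, t):
    """Edmonds-Karp max-flow. Returns (weight of a minimum st-cut, N_1),
    where N_1 is the set of nodes on the source side of the cut.
    arcs: iterable of (u, v, length) with positive lengths; nodes are 0..n-1."""
    cap = [[0] * n for _ in range(n)]
    for u, v, l in arcs:
        cap[u][v] += l
    cut_weight = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if cap[u][v] > 0 and parent[v] == -1:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:        # no augmenting path left: flow is maximum
            break
        # Find the bottleneck capacity along the path, then augment.
        bottleneck, v = float('inf'), t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        cut_weight += bottleneck   # max-flow value = min-cut weight
    # N_1 = nodes reachable from s in the final residual graph.
    N_1, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if cap[u][v] > 0 and v not in N_1:
                N_1.add(v)
                queue.append(v)
    return cut_weight, N_1
```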

Cut D = (N, A). Let N_1 and N_2 be such that N_1 "union" N_2 = N and N_1 "intersection" N_2 = Φ. C is the set of arcs (n_1, n_2) ∈ A such that n_1 ∈ N_1 and n_2 ∈ N_2. Then C is a cut in the digraph D.

st-Cut D = (N, A) with a source "s" and a sink "t". C is a cut such that s ∈ N_1 and t ∈ N_2. Then C is an st-cut.

Minimum Cut Problem Find a cut with the minimum weight !! C* = arg min_C w(C), where w(C) = Σ_{(n_1,n_2) ∈ C} l(n_1, n_2)

Remember … Positive arc lengths

Outline – Graph Cuts Algorithms Minimum Cut Problem Two-Label Submodular Energy Functions Move-Making Algorithms Hammer, 1965; Kolmogorov and Zabih, 2004

Overview Energy Q → Digraph D (one node per random variable, plus additional nodes "s" and "t") → Compute Minimum Cut (N = N_1 ∪ N_2) → Labeling f* (n_a ∈ N_1 implies f(a) = 0; n_a ∈ N_2 implies f(a) = 1)

Outline Minimum Cut Problem Two-Label Submodular Energy Functions Unary Potentials Pairwise Potentials Move-Making Algorithms

Digraph for Unary Potentials V_a with θ_a;0 = P (for f(a) = 0) and θ_a;1 = Q (for f(a) = 1).

Digraph for Unary Potentials [figure: node n_a between source s and sink t; n_a on the source side means f(a) = 0, on the sink side f(a) = 1]

Digraph for Unary Potentials Let P ≥ Q. Rewrite (P, Q) as (P-Q, 0) plus a Constant Q: an arc n_a → t with length P-Q.

Digraph for Unary Potentials Let P ≥ Q. If f(a) = 1, then w(C) = 0.

Digraph for Unary Potentials Let P ≥ Q. If f(a) = 0, then w(C) = P-Q.

Digraph for Unary Potentials Let P < Q. Rewrite (P, Q) as (0, Q-P) plus a Constant P: an arc s → n_a with length Q-P.

Digraph for Unary Potentials Let P < Q. If f(a) = 1, then w(C) = Q-P.

Digraph for Unary Potentials Let P < Q. If f(a) = 0, then w(C) = 0.

Outline – Graph Cuts Algorithms Minimum Cut Problem Two-Label Submodular Energy Functions Unary Potentials Pairwise Potentials Move-Making Algorithms

Digraph for Pairwise Potentials V_a, V_b with θ_ab;00 = P, θ_ab;01 = Q, θ_ab;10 = R, θ_ab;11 = S (θ_ab;ik, where i = f(a) and k = f(b)).

Digraph for Pairwise Potentials The table (P, Q, R, S) decomposes into a Constant P plus three non-constant pieces, each of which becomes one arc of the s-t graph.

Digraph for Pairwise Potentials Unary piece: cost Q-P when f(b) = 1, giving an arc s → n_b with length Q-P.

Digraph for Pairwise Potentials Unary piece: cost S-Q when f(a) = 1, giving an arc s → n_a with length S-Q.

Digraph for Pairwise Potentials Pairwise piece: cost R+Q-S-P when f(a) = 1 and f(b) = 0, giving an arc n_b → n_a with length R+Q-S-P.

Digraph for Pairwise Potentials The construction requires R+Q-S-P ≥ 0 (submodularity). General 2-label MAP estimation is NP-hard.
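
Putting the last few slides together, here is a sketch of the whole pipeline for a two-label submodular energy, reusing the min_cut() sketch from the minimum-cut slide. It assumes unary[a] = (θ_a;0, θ_a;1) and pairwise[(a,b)] = (P, Q, R, S) = (θ_ab;00, θ_ab;01, θ_ab;10, θ_ab;11) with every edge submodular; constant terms are dropped, so the cut weight equals the minimum energy only up to those constants.

```python
def minimize_two_label(variables, unary, pairwise):
    """Build the s-t graph described above and read the labeling off a
    minimum cut (n_a in N_1 => f(a) = 0, n_a in N_2 => f(a) = 1)."""
    idx = {a: i + 2 for i, a in enumerate(variables)}   # one node per variable
    s, t = 0, 1
    arcs = []
    # Fold the unary pieces of each pairwise decomposition into the unaries.
    cost0 = {a: unary[a][0] for a in variables}   # paid if f(a) = 0
    cost1 = {a: unary[a][1] for a in variables}   # paid if f(a) = 1
    for (a, b), (P, Q, R, S) in pairwise.items():
        cost1[b] += Q - P                             # piece (Q-P) for f(b) = 1
        cost1[a] += S - Q                             # piece (S-Q) for f(a) = 1
        arcs.append((idx[b], idx[a], R + Q - S - P))  # >= 0 iff submodular
    # Normalised unary arcs, as in the P >= Q / P < Q slides.
    for a in variables:
        P, Q = cost0[a], cost1[a]
        if P >= Q:
            arcs.append((idx[a], t, P - Q))           # paid when f(a) = 0
        else:
            arcs.append((s, idx[a], Q - P))           # paid when f(a) = 1
    _, N_1 = min_cut(len(variables) + 2, arcs, s, t)
    return {a: 0 if idx[a] in N_1 else 1 for a in variables}
```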

Overview Energy Q → Digraph D (one node per random variable, plus additional nodes "s" and "t") → Compute Minimum Cut (N = N_1 ∪ N_2) → Labeling f* (n_a ∈ N_1 implies f(a) = 0; n_a ∈ N_2 implies f(a) = 1)

Outline – Graph Cuts Algorithms Minimum Cut Problem Two-Label Submodular Energy Functions Move-Making Algorithms

Stereo Correspondence Disparity Map

Stereo Correspondence L = {disparities}. Pixel (x_a, y_a) in the left image corresponds to pixel (x_a + v_a, y_a) in the right image.

Stereo Correspondence L = {disparities}. θ_a;i is proportional to the difference in RGB values.

Stereo Correspondence L = {disparities}. θ_ab;ik = w_ab d(i,k), with w_ab proportional to exp(-(d_a - d_b)^2).

Move-Making Algorithms Space of All Labelings f

Expansion Algorithm Variables take label l_α or retain their current label. Slide courtesy Pushmeet Kohli.

Expansion Algorithm [figure: labels Sky, House, Tree, Ground; initialize with Tree, then Expand Ground, Expand House, Expand Sky] Variables take label l_α or retain their current label. Slide courtesy Pushmeet Kohli.

Expansion Algorithm Initialize labeling f = f_0 (say f_0(a) = 0 for all V_a). For α = 0, 1, …, h-1: f_α = arg min_f' Q(f'; θ) s.t. f'(a) ∈ {f(a)} ∪ {l_α}; update f = f_α. Repeat until convergence. Boykov, Veksler and Zabih, 2001.

Expansion Algorithm Restriction on pairwise potentials? θ_ab;ik + θ_ab;αα ≤ θ_ab;iα + θ_ab;αk (Metric Labeling).
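
A sketch of the expansion outer loop. The inner step, solve_binary(f, alpha), is a hypothetical helper assumed to return the exact minimiser over labellings with f'(a) ∈ {f(a), l_α}; for metric potentials that two-label subproblem is submodular and can be solved with the graph-cut construction above.

```python
def expansion(variables, labels, Q, solve_binary):
    """Outer loop of the expansion algorithm (Boykov, Veksler and Zabih, 2001).
    Q(f) evaluates the energy; solve_binary(f, alpha) solves the two-label
    move f'(a) in {f(a), alpha} exactly (e.g. via the st-mincut above)."""
    f = {a: labels[0] for a in variables}       # f_0(a) = first label for all a
    while True:
        improved = False
        for alpha in labels:                    # alpha = l_0, l_1, ..., l_{h-1}
            f_alpha = solve_binary(f, alpha)
            if Q(f_alpha) < Q(f):
                f, improved = f_alpha, True
        if not improved:                        # no expansion move helps: done
            return f
```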

Outline Problem Formulation –Energy Function –Energy Minimization Graph Cuts Algorithms – Part I Message Passing Algorithms – Part II

Pose Estimation Courtesy Pedro Felzenszwalb

Pose Estimation Courtesy Pedro Felzenszwalb

Pose Estimation Variables are body parts. Labels are positions.

Pose Estimation Unary potentials: θ_a;i proportional to the fraction of foreground pixels. Variables are body parts; labels are positions.

Pose Estimation Pairwise potentials: θ_ab;ik proportional to d^2, where d is the distance between the joint location according to the ‘head’ part and the joint location according to the ‘torso’ part.

Pose Estimation Pairwise potentials: θ_ab;ik proportional to d^2 [figure: two head/torso configurations, one with small d and one with larger d, hence larger cost].

Outline – Message Passing Preliminaries Energy Minimization for Trees Dual Decomposition

Energy Function Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b) [figure: variables V_a, V_b, V_c, V_d with labels l_0, l_1]

Outline – Message Passing Preliminaries –Min-Marginals –Reparameterization Energy Minimization for Trees Dual Decomposition

Min-Marginals Min-marginal q_a;i = min_f Q(f; θ) such that f(a) = i. [figure: variables V_a, V_b, V_c, V_d with labels l_0, l_1]

Min-Marginals [table: 16 possible labellings with their energies Q(f; θ)] q_a;0 = 15

Min-Marginals [table: 16 possible labellings with their energies Q(f; θ)] q_a;1 = 13

Min-Marginals and MAP Minimum min-marginal of any variable = energy of the MAP labelling: min_i q_a;i = min_i ( min_f Q(f; θ) s.t. f(a) = i ) = min_f Q(f; θ), since V_a has to take one label.
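
A brute-force check of the definition, exponential and for intuition only (belief propagation, later, computes min-marginals efficiently on trees); Q and the argument names are placeholders.

```python
import itertools

def min_marginal(a, i, variables, labels, Q):
    """q_a;i = min over all labellings f of Q(f) subject to f(a) = i."""
    best = float('inf')
    for ls in itertools.product(labels, repeat=len(variables)):
        f = dict(zip(variables, ls))
        if f[a] == i:
            best = min(best, Q(f))
    return best

# The slide's identity: V_a must take exactly one label, so
#   min over i of min_marginal(a, i, ...) == min over f of Q(f).
```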

Outline – Message Passing Preliminaries –Computing min-marginals –Reparameterization Energy Minimization for Trees Dual Decomposition

Reparameterization [table: labellings (f(a), f(b)) and energies Q(f; θ)] Add a constant to all θ_a;i and subtract that constant from all θ_b;k.

Reparameterization Add a constant to all θ_a;i and subtract that constant from all θ_b;k: Q(f; θ') = Q(f; θ).

Reparameterization Add a constant to one θ_b;k and subtract that constant from θ_ab;ik for all ‘i’.

Reparameterization Add a constant to one θ_b;k and subtract that constant from θ_ab;ik for all ‘i’: Q(f; θ') = Q(f; θ).

Reparameterization General form: θ'_a;i = θ_a;i + M_ba;i, θ'_b;k = θ_b;k + M_ab;k, θ'_ab;ik = θ_ab;ik - M_ab;k - M_ba;i. Then Q(f; θ') = Q(f; θ).

Reparameterization θ' is a reparameterization of θ, written θ' ≡ θ, iff Q(f; θ') = Q(f; θ) for all f. Equivalently, θ'_a;i = θ_a;i + M_ba;i, θ'_b;k = θ_b;k + M_ab;k, θ'_ab;ik = θ_ab;ik - M_ab;k - M_ba;i. Kolmogorov, PAMI, 2006.
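
A sketch that applies the general reparameterization and checks the invariance numerically for a single edge (a, b); the data layouts are assumptions for illustration, not prescribed.

```python
def reparameterize(unary, pairwise, a, b, M_ab, M_ba, labels):
    """theta'_a;i = theta_a;i + M_ba;i, theta'_b;k = theta_b;k + M_ab;k,
    theta'_ab;ik = theta_ab;ik - M_ab;k - M_ba;i."""
    u = {v: list(vals) for v, vals in unary.items()}
    p = {e: [row[:] for row in mat] for e, mat in pairwise.items()}
    for i in labels:
        u[a][i] += M_ba[i]
    for k in labels:
        u[b][k] += M_ab[k]
    for i in labels:
        for k in labels:
            p[(a, b)][i][k] -= M_ab[k] + M_ba[i]
            # Invariance: the edge's total contribution to Q is unchanged,
            # so Q(f; theta') = Q(f; theta) for every labelling f.
            assert abs((u[a][i] + u[b][k] + p[(a, b)][i][k]) -
                       (unary[a][i] + unary[b][k] + pairwise[(a, b)][i][k])) < 1e-9
    return u, p
```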

Recap Energy Minimization: f* = arg min_f Q(f; θ), with Q(f; θ) = ∑_a θ_a;f(a) + ∑_(a,b) θ_ab;f(a)f(b). Min-marginals: q_a;i = min_f Q(f; θ) s.t. f(a) = i. Reparameterization: Q(f; θ') = Q(f; θ) for all f, written θ' ≡ θ.

Outline – Message Passing Preliminaries Energy Minimization for Trees Dual Decomposition Pearl, 1988

Belief Propagation Some problems are easy: Belief Propagation is exact for chains, and gives the exact MAP for trees, via clever reparameterization.

Outline – Message Passing Preliminaries Energy Minimization for Trees –Two Variables –Three Variables –Chains –Trees Dual Decomposition

Two Variables [figure: V_a, V_b before and after] Add a constant to one θ_b;k and subtract that constant from θ_ab;ik for all ‘i’. Choose the right constant: θ'_b;k = q_b;k.

Two Variables Choose the right constant: θ'_b;k = q_b;k, with M_ab;0 = min { θ_a;0 + θ_ab;00, θ_a;1 + θ_ab;10 }.

Two Variables Choose the right constant: θ'_b;k = q_b;k. [figure: reparameterized potentials]

Two Variables Choose the right constant: θ'_b;k = q_b;k. With the minimizer f(a) = 1, θ'_b;0 = q_b;0: the potentials along the red path add up to 0.

Two Variables Choose the right constant: θ'_b;k = q_b;k, with M_ab;1 = min { θ_a;0 + θ_ab;01, θ_a;1 + θ_ab;11 }.

Two Variables With f(a) = 1, θ'_b;0 = q_b;0 and θ'_b;1 = q_b;1. Minimum of min-marginals = MAP estimate.

Two Variables With f(a) = 1, θ'_b;0 = q_b;0 and θ'_b;1 = q_b;1; hence f*(b) = 0 and f*(a) = 1.

Two Variables We get all the min-marginals of V_b.

Recap We only need to know two sets of equations. General form of Reparameterization: θ'_a;i = θ_a;i + M_ba;i, θ'_b;k = θ_b;k + M_ab;k, θ'_ab;ik = θ_ab;ik - M_ab;k - M_ba;i. Reparameterization of (a,b) in Belief Propagation: M_ab;k = min_i { θ_a;i + θ_ab;ik }, M_ba;i = 0.

Outline – Message Passing Preliminaries Energy Minimization for Trees –Two Variables –Three Variables –Chains –Trees Dual Decomposition

Three Variables [figure: chain V_a, V_b, V_c with labels l_0, l_1] Reparameterize the edge (a,b) as before.

Three Variables Reparameterize the edge (a,b) as before; the minimizing label for V_a is f(a) = 1.

Three Variables Reparameterize the edge (a,b) as before, with f(a) = 1: the potentials along the red path add up to 0.

Three Variables Now reparameterize the edge (b,c) as before; with f(a) = 1, the potentials along the red path still add up to 0.

Three Variables Reparameterize the edge (b,c) as before; the minimizing labels of V_b are f(b) = 1 and f(b) = 0 for the two labels of V_c.

Three Variables After reparameterizing (b,c), the unary potentials of V_c are its min-marginals q_c;0 and q_c;1.

Three Variables Reading off the minimizers: f*(c) = 0, f*(b) = 0, f*(a) = 1. Generalizes to any length chain.

Outline – Message Passing Preliminaries Energy Minimization for Trees –Two Variables –Three Variables –Chains –Trees Dual Decomposition

Belief Propagation on Chains Start from the left, go to the right. Reparameterize the current edge (a,b): M_ab;k = min_i { θ_a;i + θ_ab;ik }; θ'_ab;ik = θ_ab;ik - M_ab;k; θ'_b;k = θ_b;k + M_ab;k. Repeat till the end of the chain.

Belief Propagation on Chains A way of computing reparameterization constants; generalizes to chains of any length. Forward Pass (start to end): MAP estimate and min-marginals of the final variable. Backward Pass (end to start): all other min-marginals.
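
The two passes translate directly into code. A sketch for a chain with variables indexed 0..n-1 and labels assumed to be range(h); for brevity the backward pass here only backtracks the MAP labelling rather than computing all the other min-marginals.

```python
def bp_chain(unary, pairwise, labels):
    """Min-sum belief propagation on a chain V_0 - V_1 - ... - V_{n-1}.
    unary[a][i] = theta_a;i; pairwise[a][i][k] = theta_ab;ik for b = a + 1.
    Returns the MAP labelling and its energy."""
    n = len(unary)
    theta = [list(u) for u in unary]             # working copy of the unaries
    argmins = []
    for a in range(n - 1):                       # forward pass, left to right
        arg = []
        for k in labels:
            # M_ab;k = min_i { theta_a;i + theta_ab;ik }
            i_star = min(labels, key=lambda i: theta[a][i] + pairwise[a][i][k])
            arg.append(i_star)
            theta[a + 1][k] += theta[a][i_star] + pairwise[a][i_star][k]
        argmins.append(arg)
    # theta[n-1] now holds the min-marginals of the final variable.
    f = [0] * n
    f[n - 1] = min(labels, key=lambda k: theta[n - 1][k])
    for a in range(n - 2, -1, -1):               # backtrack, right to left
        f[a] = argmins[a][f[a + 1]]
    return f, theta[n - 1][f[n - 1]]
```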

Computational Complexity Each constant takes O(|L|); the number of constants is O(|E||L|), so the total time is O(|E||L|^2). Memory required? O(|E||L|).

Outline – Message Passing Preliminaries Energy Minimization for Trees –Two Variables –Three Variables –Chains –Trees Dual Decomposition

Belief Propagation on Trees [figure: a tree over V_a, V_b, V_c, V_d, V_e, V_g, V_h] Forward Pass: Leaf → Root. Backward Pass: Root → Leaf. All min-marginals are computed.

Outline – Message Passing Preliminaries Energy Minimization for Trees Dual Decomposition

min_x ∑_i g_i(x) s.t. x ∈ C. Minimize the energy of an MRF; each variable is assigned exactly one label.

Dual Decomposition min_{x, x_i} ∑_i g_i(x_i) s.t. x_i ∈ C, x_i = x

Dual Decomposition min_{x, x_i} ∑_i g_i(x_i) s.t. x_i ∈ C

Dual Decomposition max_{λ_i} min_{x, x_i} ∑_i g_i(x_i) + ∑_i λ_i^T (x_i - x) s.t. x_i ∈ C. KKT Condition: ∑_i λ_i = 0.

Dual Decomposition max_{λ_i} min_{x_i} ∑_i g_i(x_i) + ∑_i λ_i^T x_i s.t. x_i ∈ C

Dual Decomposition max_{λ_i} min_{x_i} ∑_i (g_i(x_i) + λ_i^T x_i) s.t. x_i ∈ C. Solved by Projected Supergradient Ascent. A supergradient s of h(z) at z_0 satisfies h(z) - h(z_0) ≤ s^T (z - z_0) for all z in the feasible region.

Dual Decomposition Initialize λ_i^0 = 0.

Dual Decomposition Compute supergradients: s_i = argmin_{x_i} (g_i(x_i) + (λ_i^t)^T x_i).

Dual Decomposition Project supergradients: p_i = s_i - ∑_j s_j / m, where ‘m’ = number of subproblems (slaves).

Dual Decomposition Update dual variables: λ_i^{t+1} = λ_i^t + η_t p_i, where η_t = learning rate, e.g. 1/(t+1).

Dual Decomposition Initialize λ_i^0 = 0. REPEAT: compute projected supergradients s_i = argmin_{x_i} (g_i(x_i) + (λ_i^t)^T x_i), p_i = s_i - ∑_j s_j / m; update dual variables λ_i^{t+1} = λ_i^t + η_t p_i.
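
A direct transcription of this loop, as a sketch. Each slave is modelled as a function that returns the 0/1 indicator vector (a dict over variable-label keys) of its minimiser given its current dual variables; that interface and all names are assumptions for illustration.

```python
def dual_decomposition(slaves, keys, num_iters=100):
    """Projected supergradient ascent. slaves[i](lam) returns s_i, the 0/1
    indicator dict (one entry per key in `keys`) of
    argmin_{x_i in C} (g_i(x_i) + lam^T x_i)."""
    m = len(slaves)
    lams = [{k: 0.0 for k in keys} for _ in range(m)]    # lambda_i^0 = 0
    for t in range(num_iters):
        s = [slave(lam) for slave, lam in zip(slaves, lams)]
        eta = 1.0 / (t + 1)                              # learning rate eta_t
        for k in keys:
            mean = sum(s_i[k] for s_i in s) / m
            for i in range(m):
                p_ik = s[i][k] - mean                    # projected supergradient
                lams[i][k] += eta * p_ik                 # lambda_i^{t+1} update
    return lams
```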

Outline – Message Passing Preliminaries Energy Minimization for Trees Dual Decomposition –Example 1 –Example 2 –Energy Minimization –Choice of Subproblems

DD – Example 1 [figure: three tree subproblems over (V_a, V_b), (V_b, V_c), (V_c, V_a) with labels l_0, l_1] Strong Tree Agreement: f_1(a) = 0, f_1(b) = 0; f_2(b) = 0, f_2(c) = 0; f_3(c) = 0, f_3(a) = 0.

DD – Example 1 [figure: the three subproblems with their potential values]

DD – Example 1 [figure: optimal solution indicators x_a;0, x_a;1, x_b;0, x_b;1, x_c;0, x_c;1 for each subproblem]

Supergradients [figure: slave solutions s_a;0, s_a;1, s_b;0, s_b;1, s_c;0, s_c;1]

Projected Supergradients [figure: projections p_a;0, p_a;1, p_b;0, p_b;1, p_c;0, p_c;1]

Objective [figure: objective values 6.5 and 7] No further increase in the dual objective.

DD Strong Tree Agreement implies DD stops: no further increase in the dual objective.

Outline – Message Passing Preliminaries Energy Minimization for Trees Dual Decomposition –Example 1 –Example 2 –Energy Minimization –Choice of Subproblems

DD – Example 2 [figure: three tree subproblems with labels l_0, l_1] Weak Tree Agreement: f_1(a) = 1, f_1(b) = 1; f_2(b) = 1, f_2(c) = 0; f_3(c) = 1, f_3(a) = 1. Slave 2 also has a second optimal labelling with f_2(b) = 0.

DD – Example 2 [figure: the three subproblems with their potential values]

DD – Example 2 [figure: optimal solution indicators x_a;0, x_a;1, x_b;0, x_b;1, x_c;0, x_c;1]

Supergradients [figure: slave solutions s_a;0, s_a;1, s_b;0, s_b;1, s_c;0, s_c;1]

Projected Supergradients [figure: projections p_a;0, p_a;1, p_b;0, p_b;1, p_c;0, p_c;1]

Update with Learning Rate η_t = 1 [figure: updated potentials]

Objective [figure] Decrease in the dual objective.

Supergradients [figure: slave solutions s_a;0, s_a;1, s_b;0, s_b;1, s_c;0, s_c;1]

Projected Supergradients [figure: projections p_a;0, p_a;1, p_b;0, p_b;1, p_c;0, p_c;1]

Update with Learning Rate η_t = 1/2 [figure: updated potentials]

Updated Subproblems [figure]

Objective [figure] Increase in the dual objective.

DD [figure] Increase in the dual objective.

DD DD eventually converges and satisfies weak tree agreement.

Outline – Message Passing Preliminaries Energy Minimization for Trees Dual Decomposition –Example 1 –Example 2 –Energy Minimization –Choice of Subproblems

Dual Decomposition (Komodakis et al., 2007) [figure: 3×3 grid of variables V_a … V_i decomposed into row and column chain subproblems with dual variables λ_1 … λ_6] Slaves agree on the label for V_a.

Dual Decomposition (Komodakis et al., 2007) [figure: slave solutions s^1_a and s^4_a with their projected supergradients p^1_a and p^4_a, which are zero when the slaves agree]

Dual Decomposition (Komodakis et al., 2007) [figure] Slaves disagree on the label for V_a.

Dual Decomposition (Komodakis et al., 2007) [figure: nonzero projected supergradients p^1_a, p^4_a] Unary cost increases.

Dual Decomposition (Komodakis et al., 2007) [figure: nonzero projected supergradients p^1_a, p^4_a] Unary cost decreases.

Dual Decomposition (Komodakis et al., 2007) [figure] Push the slaves towards agreement.

Outline – Message Passing Preliminaries Energy Minimization for Trees Dual Decomposition –Example 1 –Example 2 –Energy Minimization –Choice of Subproblems

Subproblems Binary labeling problem [figure: grid MRF over V_a … V_i; black edges submodular, red edges supermodular, split across subproblems]

Subproblems Binary labeling problem [figure: the black, submodular edges kept as one subproblem] That subproblem remains submodular over iterations.

Tighter Relaxations [figure: grid decomposed into the 4-cycles (V_a,V_b,V_d,V_e), (V_b,V_c,V_e,V_f), (V_d,V_e,V_g,V_h), (V_e,V_f,V_h,V_i)] A relaxation that is tight for the above 4-cycles.

High-Order Potentials [figure: grid decomposed into overlapping high-order cliques, e.g. (V_b,V_c,V_e,V_f)]

High-Order Potentials For a clique labeling y with potential θ_c;y, the subproblem is min_y θ_c;y + λ^T y: O(h^|C|) in general !!

Sparse High-Order Potentials [figure: a potential that only distinguishes Σ_a y_a;0 = 0 from Σ_a y_a;0 > 0] For such sparse potentials, the subproblem min_y θ_c;y + λ^T y costs only O(h|C|) !!

Sparse High-Order Potentials Many useful potentials are sparse: P^n Potts Model, Pattern-based Potentials, Uniqueness constraints, Covering constraints. And now you can solve them efficiently !!
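
For pattern-based potentials, one common sparse family, the slave reduces to comparing a handful of patterns against a default: assuming θ_c;y = gamma[p] when y equals patterns[p] and gamma_max otherwise (with gamma[p] ≤ gamma_max), min_y θ_c;y + λ^T y can be found in O(#patterns · |C| + h|C|) rather than O(h^|C|). A sketch with illustrative names:

```python
def minimize_pattern_clique(patterns, gamma, gamma_max, lam, num_vars, labels):
    """min_y theta_c;y + lam^T y for theta_c;y = gamma[p] if y == patterns[p],
    gamma_max otherwise. lam[a][i] is the dual cost of variable a taking
    label i within the clique; assumes gamma[p] <= gamma_max for all p."""
    def lam_cost(y):
        return sum(lam[a][y[a]] for a in range(num_vars))
    # Default case: each variable independently takes its cheapest label.
    y_free = [min(labels, key=lambda i: lam[a][i]) for a in range(num_vars)]
    best_y, best_val = y_free, gamma_max + lam_cost(y_free)
    # Compare against each explicit pattern (since gamma[p] <= gamma_max, a
    # pattern that coincides with y_free is still handled correctly).
    for p, y_p in enumerate(patterns):
        val = gamma[p] + lam_cost(y_p)
        if val < best_val:
            best_y, best_val = list(y_p), val
    return best_y, best_val
```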