MAP Estimation Algorithms in M. Pawan Kumar, University of Oxford Pushmeet Kohli, Microsoft Research Computer Vision - Part I.

MAP Estimation Algorithms in M. Pawan Kumar, University of Oxford Pushmeet Kohli, Microsoft Research Computer Vision - Part I

Aim of the Tutorial Description of some successful algorithms Computational issues Enough details to implement Some proofs will be skipped :-( But references to them will be given :-)

A Vision Application Binary Image Segmentation How ? Cost functionModels our knowledge about natural images Optimize cost function to obtain the segmentation

Object - white, Background - green/grey Graph G = (V,E) Each vertex corresponds to a pixel Edges define a 4-neighbourhood grid graph Assign a label to each vertex from L = {obj,bkg} A Vision Application Binary Image Segmentation

Graph G = (V,E) Cost of a labelling f : V  L Per Vertex Cost Cost of label ‘obj’ low Cost of label ‘bkg’ high Object - white, Background - green/grey A Vision Application Binary Image Segmentation

Graph G = (V,E) Cost of a labelling f : V  L Cost of label ‘obj’ high Cost of label ‘bkg’ low Per Vertex Cost UNARY COST Object - white, Background - green/grey A Vision Application Binary Image Segmentation

Graph G = (V,E) Cost of a labelling f : V  L Per Edge Cost Cost of same label low Cost of different labels high Object - white, Background - green/grey A Vision Application Binary Image Segmentation

Graph G = (V,E) Cost of a labelling f : V  L Cost of same label high Cost of different labels low Per Edge Cost PAIRWISE COST Object - white, Background - green/grey A Vision Application Binary Image Segmentation

Graph G = (V,E) Problem: Find the labelling with minimum cost f* Object - white, Background - green/grey A Vision Application Binary Image Segmentation

Graph G = (V,E) Problem: Find the labelling with minimum cost f* A Vision Application Binary Image Segmentation

Another Vision Application Object Detection using Parts-based Models How ? Once again, by defining a good cost function

HT L1 Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’ 1 Edges define a TREE Assign a label to each vertex from L = {positions} Graph G = (V,E) L2L3L4 Another Vision Application Object Detection using Parts-based Models

2 Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’ Assign a label to each vertex from L = {positions} Graph G = (V,E) Edges define a TREE HT L1L2L3L4 Another Vision Application Object Detection using Parts-based Models

3 Each vertex corresponds to a part - ‘Head’, ‘Torso’, ‘Legs’ Assign a label to each vertex from L = {positions} Graph G = (V,E) Edges define a TREE HT L1L2L3L4 Another Vision Application Object Detection using Parts-based Models

Cost of a labelling f : V  L Unary cost : How well does part match image patch? Pairwise cost : Encourages valid configurations Find best labelling f* Graph G = (V,E) 3 HT L1L2L3L4 Another Vision Application Object Detection using Parts-based Models

Yet Another Vision Application Stereo Correspondence Disparity Map How ? Minimizing a cost function

Yet Another Vision Application Stereo Correspondence Graph G = (V,E) Vertex corresponds to a pixel Edges define grid graph L = {disparities}

Yet Another Vision Application Stereo Correspondence Cost of labelling f : Unary cost + Pairwise Cost Find minimum cost f*

The General Problem b a e d c f Graph G = ( V, E ) Discrete label set L = {1,2,…,h} Assign a label to each vertex f: V  L 1 12 22 3 Cost of a labelling Q(f) Unary CostPairwise Cost Find f* = arg min Q(f)

Outline Problem Formulation –Energy Function –MAP Estimation –Computing min-marginals Reparameterization Belief Propagation Tree-reweighted Message Passing

Energy Function VaVa VbVb VcVc VdVd Label l 0 Label l 1 DaDa DbDb DcDc DdDd Random Variables V = {V a, V b, ….} Labels L = {l 0, l 1, ….}Data D Labelling f: {a, b, …. }  {0,1, …}

Energy Function VaVa VbVb VcVc VdVd DaDa DbDb DcDc DdDd Q(f) = ∑ a  a;f(a) Unary Potential 2 5 4 2 6 3 3 7 Label l 0 Label l 1 Easy to minimize Neighbourhood

Energy Function VaVa VbVb VcVc VdVd DaDa DbDb DcDc DdDd E : (a,b)  E iff V a and V b are neighbours E = { (a,b), (b,c), (c,d) } 2 5 4 2 6 3 3 7 Label l 0 Label l 1

Energy Function VaVa VbVb VcVc VdVd DaDa DbDb DcDc DdDd +∑ (a,b)  ab;f(a)f(b) Pairwise Potential 0 1 1 0 0 2 1 1 41 0 3 2 5 4 2 6 3 3 7 Label l 0 Label l 1 Q(f) = ∑ a  a;f(a)

Energy Function VaVa VbVb VcVc VdVd DaDa DbDb DcDc DdDd 0 1 1 0 0 2 1 1 41 0 3 Parameter 2 5 4 2 6 3 3 7 Label l 0 Label l 1 +∑ (a,b)  ab;f(a)f(b) Q(f;  )= ∑ a  a;f(a)

MAP Estimation VaVa VbVb VcVc VdVd 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 41 0 3 Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) Label l 0 Label l 1

MAP Estimation VaVa VbVb VcVc VdVd 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 41 0 3 Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) 2 + 1 + 2 + 1 + 3 + 1 + 3 = 13 Label l 0 Label l 1

MAP Estimation VaVa VbVb VcVc VdVd 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 41 0 3 Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) Label l 0 Label l 1

MAP Estimation VaVa VbVb VcVc VdVd 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 41 0 3 Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) 5 + 1 + 4 + 0 + 6 + 4 + 7 = 27 Label l 0 Label l 1

MAP Estimation VaVa VbVb VcVc VdVd 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 41 0 3 Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) f* = arg min Q(f;  ) q* = min Q(f;  ) = Q(f*;  ) Label l 0 Label l 1

MAP Estimation f(a)f(b)f(c)f(d) Q(f;  ) 000018 000115 001027 001120 010022 010119 011027 011120 16 possible labellings f(a)f(b)f(c)f(d) Q(f;  ) 100016 100113 101025 101118 1100 110115 111023 111116 f* = {1, 0, 0, 1} q* = 13

Computational Complexity Segmentation 2 |V| |V| = number of pixels ≈ 320 * 480 = 153600

Computational Complexity |L| = number of pixels ≈ 153600 Detection |L| |V|

Computational Complexity |V| = number of pixels ≈ 153600 Stereo |L| |V| Can we do better than brute-force? MAP Estimation is NP-hard !!

Computational Complexity |V| = number of pixels ≈ 153600 Stereo |L| |V| Exact algorithms do exist for special cases Good approximate algorithms for general case But first … two important definitions

Min-Marginals VaVa VbVb VcVc VdVd 2 5 4 2 6 3 3 7 0 1 1 0 0 2 1 1 41 0 3 f* = arg min Q(f;  ) such that f(a) = i Min-marginal q a;i Label l 0 Label l 1 Not a marginal (no summation)

Min-Marginals 16 possible labellings q a;0 = 15 f(a)f(b)f(c)f(d) Q(f;  ) 000018 000115 001027 001120 010022 010119 011027 011120 f(a)f(b)f(c)f(d) Q(f;  ) 100016 100113 101025 101118 1100 110115 111023 111116

Min-Marginals 16 possible labellings q a;1 = 13 f(a)f(b)f(c)f(d) Q(f;  ) 100016 100113 101025 101118 1100 110115 111023 111116 f(a)f(b)f(c)f(d) Q(f;  ) 000018 000115 001027 001120 010022 010119 011027 011120

Min-Marginals and MAP Minimum min-marginal of any variable = energy of MAP labelling min f Q(f;  ) such that f(a) = i q a;i min i min i ( ) V a has to take one label min f Q(f;  )

Summary MAP Estimation f* = arg min Q(f;  ) Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) Min-marginals q a;i = min Q(f;  ) s.t. f(a) = i Energy Function

Outline Problem Formulation Reparameterization Belief Propagation Tree-reweighted Message Passing

Reparameterization VaVa VbVb 2 5 4 2 0 1 1 0 f(a)f(b) Q(f;  ) 007 0110 105 116 2 + - 2 Add a constant to all  a;i Subtract that constant from all  b;k

Reparameterization f(a)f(b) Q(f;  ) 007 + 2 - 2 0110 + 2 - 2 105 + 2 - 2 116 + 2 - 2 Add a constant to all  a;i Subtract that constant from all  b;k Q(f;  ’) = Q(f;  ) VaVa VbVb 2 5 4 2 0 0 2 + - 2 1 1

Reparameterization VaVa VbVb 2 5 4 2 0 1 1 0 f(a)f(b) Q(f;  ) 007 0110 105 116 - 3 + 3 Add a constant to one  b;k Subtract that constant from  ab;ik for all ‘i’ - 3

Reparameterization VaVa VbVb 2 5 4 2 0 1 1 0 f(a)f(b) Q(f;  ) 007 0110 - 3 + 3 105 116 - 3 + 3 - 3 + 3 - 3 Q(f;  ’) = Q(f;  ) Add a constant to one  b;k Subtract that constant from  ab;ik for all ‘i’

Reparameterization VaVa VbVb 2 5 4 2 31 0 1 2 VaVa VbVb 2 5 4 2 31 1 0 1 - 2 + 2 + 1 - 1 VaVa VbVb 2 5 4 2 31 2 1 0 - 4+ 4 - 4  ’ a;i =  a;i  ’ b;k =  b;k  ’ ab;ik =  ab;ik + M ab;k - M ab;k + M ba;i - M ba;i Q(f;  ’) = Q(f;  )

Reparameterization Q(f;  ’) = Q(f;  ), for all f  ’ is a reparameterization of , iff  ’    ’ b;k =  b;k  ’ a;i =  a;i  ’ ab;ik =  ab;ik + M ab;k - M ab;k + M ba;i - M ba;i Equivalently Kolmogorov, PAMI, 2006 VaVa VbVb 2 5 4 2 0 0 2 + - 2 11

Recap MAP Estimation f* = arg min Q(f;  ) Q(f;  ) = ∑ a  a;f(a) + ∑ (a,b)  ab;f(a)f(b) Min-marginals q a;i = min Q(f;  ) s.t. f(a) = i Q(f;  ’) = Q(f;  ), for all f  ’   Reparameterization

Outline Problem Formulation Reparameterization Belief Propagation – Exact MAP for Chains and Trees – Approximate MAP for general graphs – Computational Issues and Theoretical Properties Tree-reweighted Message Passing

Belief Propagation Belief Propagation gives exact MAP for chains Remember, some MAP problems are easy Exact MAP for trees Clever Reparameterization

Two Variables VaVa VbVb 2 5 2 1 0 VaVa VbVb 2 5 40 1 Choose the right constant  ’ b;k = q b;k Add a constant to one  b;k Subtract that constant from  ab;ik for all ‘i’

VaVa VbVb 2 5 2 1 0 VaVa VbVb 2 5 40 1 Choose the right constant  ’ b;k = q b;k  a;0 +  ab;00 = 5 + 0  a;1 +  ab;10 = 2 + 1 min M ab;0 = Two Variables

VaVa VbVb 2 5 5 -2 -3 VaVa VbVb 2 5 40 1 Choose the right constant  ’ b;k = q b;k Two Variables

VaVa VbVb 2 5 5 -2 -3 VaVa VbVb 2 5 40 1 Choose the right constant  ’ b;k = q b;k f(a) = 1  ’ b;0 = q b;0 Two Variables Potentials along the red path add up to 0

VaVa VbVb 2 5 5 -2 -3 VaVa VbVb 2 5 40 1 Choose the right constant  ’ b;k = q b;k  a;0 +  ab;01 = 5 + 1  a;1 +  ab;11 = 2 + 0 min M ab;1 = Two Variables

VaVa VbVb 2 5 5 -2 -3 VaVa VbVb 2 5 6-2 Choose the right constant  ’ b;k = q b;k f(a) = 1  ’ b;0 = q b;0 f(a) = 1  ’ b;1 = q b;1 Minimum of min-marginals = MAP estimate Two Variables

VaVa VbVb 2 5 5 -2 -3 VaVa VbVb 2 5 6-2 Choose the right constant  ’ b;k = q b;k f(a) = 1  ’ b;0 = q b;0 f(a) = 1  ’ b;1 = q b;1 f*(b) = 0f*(a) = 1 Two Variables

VaVa VbVb 2 5 5 -2 -3 VaVa VbVb 2 5 6-2 Choose the right constant  ’ b;k = q b;k f(a) = 1  ’ b;0 = q b;0 f(a) = 1  ’ b;1 = q b;1 We get all the min-marginals of V b Two Variables

Recap We only need to know two sets of equations General form of Reparameterization  ’ a;i =  a;i  ’ ab;ik =  ab;ik + M ab;k - M ab;k + M ba;i - M ba;i  ’ b;k =  b;k Reparameterization of (a,b) in Belief Propagation M ab;k = min i {  a;i +  ab;ik } M ba;i = 0

Three Variables VaVa VbVb 2 5 2 1 0 VcVc 460 1 0 1 3 2 3 Reparameterize the edge (a,b) as before l0l0 l1l1

VaVa VbVb 2 5 5 -3 VcVc 660 1 -2 3 Reparameterize the edge (a,b) as before f(a) = 1 -22 3 Three Variables l0l0 l1l1

VaVa VbVb 2 5 5 -3 VcVc 660 1 -2 3 Reparameterize the edge (a,b) as before f(a) = 1 Potentials along the red path add up to 0 -22 3 Three Variables l0l0 l1l1

VaVa VbVb 2 5 5 -3 VcVc 660 1 -2 3 Reparameterize the edge (b,c) as before f(a) = 1 Potentials along the red path add up to 0 -22 3 Three Variables l0l0 l1l1

VaVa VbVb 2 5 5 -3 VcVc 612-6 -5 -2 9 Reparameterize the edge (b,c) as before f(a) = 1 Potentials along the red path add up to 0 f(b) = 1 f(b) = 0 -2-4 -3 Three Variables l0l0 l1l1

VaVa VbVb 2 5 5 -3 VcVc 612-6 -5 -2 9 Reparameterize the edge (b,c) as before f(a) = 1 Potentials along the red path add up to 0 f(b) = 1 f(b) = 0 q c;0 q c;1 -2-4 -3 Three Variables l0l0 l1l1

VaVa VbVb 2 5 5 -3 VcVc 612-6 -5 -2 9 f(a) = 1 f(b) = 1 f(b) = 0 q c;0 q c;1 f*(c) = 0 f*(b) = 0f*(a) = 1 Generalizes to any length chain -2-4 -3 Three Variables l0l0 l1l1

VaVa VbVb 2 5 5 -3 VcVc 612-6 -5 -2 9 f(a) = 1 f(b) = 1 f(b) = 0 q c;0 q c;1 f*(c) = 0 f*(b) = 0f*(a) = 1 Only Dynamic Programming -2-4 -3 Three Variables l0l0 l1l1

Why Dynamic Programming? 3 variables  2 variables + book-keeping n variables  (n-1) variables + book-keeping Start from left, go to right Reparameterize current edge (a,b) M ab;k = min i {  a;i +  ab;ik }  ’ ab;ik =  ab;ik + M ab;k - M ab;k  ’ b;k =  b;k Repeat

Why Dynamic Programming? Start from left, go to right Reparameterize current edge (a,b) M ab;k = min i {  a;i +  ab;ik }  ’ ab;ik =  ab;ik + M ab;k - M ab;k  ’ b;k =  b;k Repeat MessagesMessage Passing Why stop at dynamic programming?

VaVa VbVb 2 5 5 -3 VcVc 612-6 -5 -2 9 Reparameterize the edge (c,b) as before -2-4 -3 Three Variables l0l0 l1l1

VaVa VbVb 2 5 9 -3 VcVc 1112-11 -9 -2 9 Reparameterize the edge (c,b) as before -2-9 -7  ’ b;i = q b;i Three Variables l0l0 l1l1

VaVa VbVb 2 5 9 -3 VcVc 1112-11 -9 -2 9 Reparameterize the edge (b,a) as before -2-9 -7 Three Variables l0l0 l1l1

VaVa VbVb 9 11 9 -9 VcVc 1112-11 -9 9 Reparameterize the edge (b,a) as before -9-7-9 -7  ’ a;i = q a;i Three Variables l0l0 l1l1

VaVa VbVb 9 11 9 -9 VcVc 1112-11 -9 9 Forward Pass   Backward Pass -9-7-9 -7 All min-marginals are computed Three Variables l0l0 l1l1

Belief Propagation on Chains Start from left, go to right Reparameterize current edge (a,b) M ab;k = min i {  a;i +  ab;ik }  ’ ab;ik =  ab;ik + M ab;k - M ab;k  ’ b;k =  b;k Repeat till the end of the chain Start from right, go to left Repeat till the end of the chain

Belief Propagation on Chains A way of computing reparam constants Generalizes to chains of any length Forward Pass - Start to End MAP estimate Min-marginals of final variable Backward Pass - End to start All other min-marginals Won’t need this.. But good to know

Computational Complexity Each constant takes O(|L|) Number of constants - O(|E||L|) O(|E||L| 2 ) Memory required ? O(|E||L|)

Belief Propagation on Trees VbVb VaVa Forward Pass: Leaf  Root All min-marginals are computed Backward Pass: Root  Leaf VcVc VdVd VeVe VgVg VhVh

Belief Propagation on Cycles VaVa VbVb VdVd VcVc Where do we start? Arbitrarily  a;0  a;1  b;0  b;1  d;0  d;1  c;0  c;1 Reparameterize (a,b)

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  a;0  a;1  ’ b;0  ’ b;1  d;0  d;1  c;0  c;1 Potentials along the red path add up to 0

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  a;0  a;1  ’ b;0  ’ b;1  d;0  d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  a;0  a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0 -  a;0 -  a;1  ’ a;0 -  a;0 = q a;0  ’ a;1 -  a;1 = q a;1

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Pick minimum min-marginal. Follow red path. -  a;0 -  a;1  ’ a;0 -  a;0 = q a;0  ’ a;1 -  a;1 = q a;1

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  a;0  a;1  ’ b;0  ’ b;1  d;0  d;1  c;0  c;1 Potentials along the red path add up to 0

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  a;0  a;1  ’ b;0  ’ b;1  d;0  d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  a;0  a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Potentials along the red path add up to 0 -  a;0 -  a;1  ’ a;1 -  a;1 = q a;1  ’ a;0 -  a;0 ≤ q a;0 ≤

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Problem Solved -  a;0 -  a;1  ’ a;1 -  a;1 = q a;1  ’ a;0 -  a;0 ≤ q a;0 ≤

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Problem Not Solved -  a;0 -  a;1  ’ a;1 -  a;1 = q a;1  ’ a;0 -  a;0 ≤ q a;0 ≥

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  ’ a;0  ’ a;1  ’ b;0  ’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 -  a;0 -  a;1 Reparameterize (a,b) again

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  ’ a;0  ’ a;1  ’’ b;0  ’’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Reparameterize (a,b) again But doesn’t this overcount some potentials?

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  ’ a;0  ’ a;1  ’’ b;0  ’’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Reparameterize (a,b) again Yes. But we will do it anyway

Belief Propagation on Cycles VaVa VbVb VdVd VcVc  ’ a;0  ’ a;1  ’’ b;0  ’’ b;1  ’ d;0  ’ d;1  ’ c;0  ’ c;1 Keep reparameterizing edges in some order Hope for convergence and a good solution

Belief Propagation Generalizes to any arbitrary random field Complexity per iteration ? O(|E||L| 2 ) Memory required ? O(|E||L|)

Computational Issues of BP Complexity per iteration O(|E||L| 2 ) Special Pairwise Potentials  ab;ik = w ab d(|i-k|) i - k d Potts i - k d Truncated Linear i - k d Truncated Quadratic O(|E||L|) Felzenszwalb & Huttenlocher, 2004

Computational Issues of BP Memory requirements O(|E||L|) Half of original BP Kolmogorov, 2006 Some approximations exist But memory still remains an issue Yu, Lin, Super and Tan, 2007 Lasserre, Kannan and Winn, 2007

Computational Issues of BP Order of reparameterization Randomly Residual Belief Propagation In some fixed order The one that results in maximum change Elidan et al., 2006

Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Run BP. Assume it converges.

Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Choose variables in a tree. Change their labels. Value of energy does not decrease

Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Choose variables in a cycle. Change their labels. Value of energy does not decrease

Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? For cycles, if BP converges then exact MAP Weiss and Freeman, 2001

Results Object Detection Felzenszwalb and Huttenlocher, 2004 H TA1A2 L1L2 Labels - Poses of parts Unary Potentials: Fraction of foreground pixels Pairwise Potentials: Favour Valid Configurations

Results Object Detection Felzenszwalb and Huttenlocher, 2004

Results Binary Segmentation Szeliski et al., 2008 Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels

Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al., 2008 Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels Belief Propagation

Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al., 2008 Global optimum Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels

Results Szeliski et al., 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels Stereo Correspondence

Results Szeliski et al., 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels Belief Propagation Stereo Correspondence

Results Szeliski et al., 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Global optimum Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels Stereo Correspondence

Summary of BP Exact for chains Exact for trees Approximate MAP for general cases Not even convergence guaranteed So can we do something better?

Outline Problem Formulation Reparameterization Belief Propagation Tree-reweighted Message Passing –Integer Programming Formulation –Linear Programming Relaxation and its Dual –Convergent Solution for Dual –Computational Issues and Theoretical Properties

TRW Message Passing Convex (not Combinatorial) Optimization A different look at the same problem A similar solution Combinatorial (not Convex) Optimization We will look at the most general MAP estimation Not trees No assumption on potentials

Things to Remember Forward-pass computes min-marginals of root BP is exact for trees Every iteration provides a reparameterization Basics of Mathematical Optimization

Mathematical Optimization min g 0 (x) subject to g i (x) ≤ 0 i=1, …, N Objective function Constraints Feasible region = {x | g i (x) ≤ 0} x* = arg Optimal Solution g 0 (x*) Optimal Value

Integer Programming min g 0 (x) subject to g i (x) ≤ 0 i=1, …, N Objective function Constraints Feasible region = {x | g i (x) ≤ 0} x* = arg Optimal Solution g 0 (x*) Optimal Value x k  Z

Feasible Region Generally NP-hard to optimize

Linear Programming min g 0 (x) subject to g i (x) ≤ 0 i=1, …, N Objective function Constraints Feasible region = {x | g i (x) ≤ 0} x* = arg Optimal Solution g 0 (x*) Optimal Value

Linear Programming min g 0 (x) subject to g i (x) ≤ 0 i=1, …, N Linear objective function Linear constraints Feasible region = {x | g i (x) ≤ 0} x* = arg Optimal Solution g 0 (x*) Optimal Value

Linear Programming min c T x subject to Ax ≤ b i=1, …, N Linear objective function Linear constraints Feasible region = {x | Ax ≤ b} x* = arg Optimal Solution c T x* Optimal Value Polynomial-time Solution

Feasible Region Polynomial-time Solution

Feasible Region Optimal solution lies on a vertex (obj func linear)

Integer Programming Formulation VaVa VbVb Label l 0 Label l 1 2 5 4 2 0 1 1 0 2 Unary Potentials  a;0 = 5  a;1 = 2  b;0 = 2  b;1 = 4 Labelling f(a) = 1 f(b) = 0 y a;0 = 0y a;1 = 1 y b;0 = 1y b;1 = 0 Any f(.) has equivalent boolean variables y a;i

Integer Programming Formulation VaVa VbVb 2 5 4 2 0 1 1 0 2 Unary Potentials  a;0 = 5  a;1 = 2  b;0 = 2  b;1 = 4 Labelling f(a) = 1 f(b) = 0 y a;0 = 0y a;1 = 1 y b;0 = 1y b;1 = 0 Find the optimal variables y a;i Label l 0 Label l 1

Integer Programming Formulation VaVa VbVb 2 5 4 2 0 1 1 0 2 Unary Potentials  a;0 = 5  a;1 = 2  b;0 = 2  b;1 = 4 Sum of Unary Potentials ∑ a ∑ i  a;i y a;i y a;i  {0,1}, for all V a, l i ∑ i y a;i = 1, for all V a Label l 0 Label l 1

Integer Programming Formulation VaVa VbVb 2 5 4 2 0 1 1 0 2 Pairwise Potentials  ab;00 = 0  ab;10 = 1  ab;01 = 1  ab;11 = 0 Sum of Pairwise Potentials ∑ (a,b) ∑ ik  ab;ik y a;i y b;k y a;i  {0,1} ∑ i y a;i = 1 Label l 0 Label l 1

Integer Programming Formulation VaVa VbVb 2 5 4 2 0 1 1 0 2 Pairwise Potentials  ab;00 = 0  ab;10 = 1  ab;01 = 1  ab;11 = 0 Sum of Pairwise Potentials ∑ (a,b) ∑ ik  ab;ik y ab;ik y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k Label l 0 Label l 1

Integer Programming Formulation min ∑ a ∑ i  a;i y a;i + ∑ (a,b) ∑ ik  ab;ik y ab;ik y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k

Integer Programming Formulation min  T y y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k  = [ …  a;i …. ; …  ab;ik ….] y = [ … y a;i …. ; … y ab;ik ….]

One variable, two labels y a;0 y a;1 y a;0  {0,1} y a;1  {0,1} y a;0 + y a;1 = 1 y = [ y a;0 y a;1 ]  = [  a;0  a;1 ]

Two variables, two labels  = [  a;0  a;1  b;0  b;1  ab;00  ab;01  ab;10  ab;11 ] y = [ y a;0 y a;1 y b;0 y b;1 y ab;00 y ab;01 y ab;10 y ab;11 ] y a;0  {0,1} y a;1  {0,1} y a;0 + y a;1 = 1 y b;0  {0,1} y b;1  {0,1} y b;0 + y b;1 = 1 y ab;00 = y a;0 y b;0 y ab;01 = y a;0 y b;1 y ab;10 = y a;1 y b;0 y ab;11 = y a;1 y b;1

In General Marginal Polytope

In General   R (|V||L| + |E||L| 2 ) y  {0,1} (|V||L| + |E||L| 2 ) Number of constraints |V||L| + |V| + |E||L| 2 y a;i  {0,1} ∑ i y a;i = 1y ab;ik = y a;i y b;k

Integer Programming Formulation min  T y y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k  = [ …  a;i …. ; …  ab;ik ….] y = [ … y a;i …. ; … y ab;ik ….]

Integer Programming Formulation min  T y y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k Solve to obtain MAP labelling y*

Integer Programming Formulation min  T y y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k But we can’t solve it in general

Linear Programming Relaxation min  T y y a;i  {0,1} ∑ i y a;i = 1 y ab;ik = y a;i y b;k Two reasons why we can’t solve this

Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 y ab;ik = y a;i y b;k One reason why we can’t solve this

Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 ∑ k y ab;ik = ∑ k y a;i y b;k One reason why we can’t solve this

Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 One reason why we can’t solve this = 1 ∑ k y ab;ik = y a;i ∑ k y b;k

Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 ∑ k y ab;ik = y a;i One reason why we can’t solve this

Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 ∑ k y ab;ik = y a;i No reason why we can’t solve this * * memory requirements, time complexity

One variable, two labels y a;0 y a;1 y a;0  {0,1} y a;1  {0,1} y a;0 + y a;1 = 1 y = [ y a;0 y a;1 ]  = [  a;0  a;1 ]

One variable, two labels y a;0 y a;1 y a;0  [0,1] y a;1  [0,1] y a;0 + y a;1 = 1 y = [ y a;0 y a;1 ]  = [  a;0  a;1 ]

Two variables, two labels  = [  a;0  a;1  b;0  b;1  ab;00  ab;01  ab;10  ab;11 ] y = [ y a;0 y a;1 y b;0 y b;1 y ab;00 y ab;01 y ab;10 y ab;11 ] y a;0  {0,1} y a;1  {0,1} y a;0 + y a;1 = 1 y b;0  {0,1} y b;1  {0,1} y b;0 + y b;1 = 1 y ab;00 = y a;0 y b;0 y ab;01 = y a;0 y b;1 y ab;10 = y a;1 y b;0 y ab;11 = y a;1 y b;1

Two variables, two labels  = [  a;0  a;1  b;0  b;1  ab;00  ab;01  ab;10  ab;11 ] y = [ y a;0 y a;1 y b;0 y b;1 y ab;00 y ab;01 y ab;10 y ab;11 ] y a;0  [0,1] y a;1  [0,1] y a;0 + y a;1 = 1 y b;0  [0,1] y b;1  [0,1] y b;0 + y b;1 = 1 y ab;00 = y a;0 y b;0 y ab;01 = y a;0 y b;1 y ab;10 = y a;1 y b;0 y ab;11 = y a;1 y b;1

Two variables, two labels  = [  a;0  a;1  b;0  b;1  ab;00  ab;01  ab;10  ab;11 ] y = [ y a;0 y a;1 y b;0 y b;1 y ab;00 y ab;01 y ab;10 y ab;11 ] y a;0  [0,1] y a;1  [0,1] y a;0 + y a;1 = 1 y b;0  [0,1] y b;1  [0,1] y b;0 + y b;1 = 1 y ab;00 + y ab;01 = y a;0 y ab;10 = y a;1 y b;0 y ab;11 = y a;1 y b;1

Two variables, two labels  = [  a;0  a;1  b;0  b;1  ab;00  ab;01  ab;10  ab;11 ] y = [ y a;0 y a;1 y b;0 y b;1 y ab;00 y ab;01 y ab;10 y ab;11 ] y a;0  [0,1] y a;1  [0,1] y a;0 + y a;1 = 1 y b;0  [0,1] y b;1  [0,1] y b;0 + y b;1 = 1 y ab;00 + y ab;01 = y a;0 y ab;10 + y ab;11 = y a;1

In General Marginal Polytope Local Polytope

In General   R (|V||L| + |E||L| 2 ) y  [0,1] (|V||L| + |E||L| 2 ) Number of constraints |V||L| + |V| + |E||L|

Linear Programming Relaxation min  T y y a;i  [0,1] ∑ i y a;i = 1 ∑ k y ab;ik = y a;i No reason why we can’t solve this

Linear Programming Relaxation Extensively studied Optimization Schlesinger, 1976 Koster, van Hoesel and Kolen, 1998 Theory Chekuri et al, 2001Archer et al, 2004 Machine Learning Wainwright et al., 2001

Linear Programming Relaxation Many interesting Properties Global optimal MAP for trees Wainwright et al., 2001 But we are interested in NP-hard cases Preserves solution for reparameterization

Linear Programming Relaxation Large class of problems Metric Labelling Semi-metric Labelling Many interesting Properties - Integrality Gap Manokaran et al., 2008 Most likely, provides best possible integrality gap

Linear Programming Relaxation A computationally useful dual Many interesting Properties - Dual Optimal value of dual = Optimal value of primal Easier-to-solve

Dual of the LP Relaxation Wainwright et al., 2001 VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi  min  T y y a;i  [0,1] ∑ i y a;i = 1 ∑ k y ab;ik = y a;i

Dual of the LP Relaxation Wainwright et al., 2001 VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi  VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi 11 22 33 44 55 66 11 22 33 44 55 66   i  i =   i ≥ 0

Dual of the LP Relaxation Wainwright et al., 2001 11 22 33 44 55 66 q*(  1 )   i  i =  q*(  2 ) q*(  3 ) q*(  4 )q*(  5 )q*(  6 )   i q*(  i ) Dual of LP  VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi  i ≥ 0 max

Dual of the LP Relaxation Wainwright et al., 2001 11 22 33 44 55 66 q*(  1 )  ii   ii   q*(  2 ) q*(  3 ) q*(  4 )q*(  5 )q*(  6 ) Dual of LP  VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi  i ≥ 0   i q*(  i ) max

Dual of the LP Relaxation Wainwright et al., 2001  ii   ii   max   i q*(  i ) I can easily compute q*(  i ) I can easily maintain reparam constraint So can I easily solve the dual?

TRW Message Passing Kolmogorov, 2006 VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi 11 22 33 11 22 33 44 55 66 44 55 66  ii   ii     i q*(  i ) Pick a variable VaVa

TRW Message Passing Kolmogorov, 2006  ii   ii     i q*(  i ) VcVc VbVb VaVa  1 c;0  1 c;1  1 b;0  1 b;1  1 a;0  1 a;1 VaVa VdVd VgVg  4 a;0  4 a;1  4 d;0  4 d;1  4 g;0  4 g;1

TRW Message Passing Kolmogorov, 2006  1  1 +  4  4 +  rest    1 q*(  1 ) +  4 q*(  4 ) + K VcVc VbVb VaVa VaVa VdVd VgVg Reparameterize to obtain min-marginals of V a  1 c;0  1 c;1  1 b;0  1 b;1  1 a;0  1 a;1  4 a;0  4 a;1  4 d;0  4 d;1  4 g;0  4 g;1

TRW Message Passing Kolmogorov, 2006  1  ’ 1 +  4  ’ 4 +  rest VcVc VbVb VaVa  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’ 1 a;0  ’ 1 a;1 VaVa VdVd VgVg  ’ 4 a;0  ’ 4 a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1 One pass of Belief Propagation  1 q*(  ’ 1 ) +  4 q*(  ’ 4 ) + K

TRW Message Passing Kolmogorov, 2006  1  ’ 1 +  4  ’ 4 +  rest   VcVc VbVb VaVa VaVa VdVd VgVg Remain the same  1 q*(  ’ 1 ) +  4 q*(  ’ 4 ) + K  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’ 1 a;0  ’ 1 a;1  ’ 4 a;0  ’ 4 a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1

TRW Message Passing Kolmogorov, 2006  1  ’ 1 +  4  ’ 4 +  rest    1 min{  ’ 1 a;0,  ’ 1 a;1 } +  4 min{  ’ 4 a;0,  ’ 4 a;1 } + K VcVc VbVb VaVa VaVa VdVd VgVg  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’ 1 a;0  ’ 1 a;1  ’ 4 a;0  ’ 4 a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1

TRW Message Passing Kolmogorov, 2006  1  ’ 1 +  4  ’ 4 +  rest   VcVc VbVb VaVa VaVa VdVd VgVg Compute weighted average of min-marginals of V a  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’ 1 a;0  ’ 1 a;1  ’ 4 a;0  ’ 4 a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  1 min{  ’ 1 a;0,  ’ 1 a;1 } +  4 min{  ’ 4 a;0,  ’ 4 a;1 } + K

TRW Message Passing Kolmogorov, 2006  1  ’ 1 +  4  ’ 4 +  rest   VcVc VbVb VaVa VaVa VdVd VgVg  ’’ a;0 =  1  ’ 1 a;0 +  4  ’ 4 a;0  1 +  4  ’’ a;1 =  1  ’ 1 a;1 +  4  ’ 4 a;1  1 +  4  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’ 1 a;0  ’ 1 a;1  ’ 4 a;0  ’ 4 a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  1 min{  ’ 1 a;0,  ’ 1 a;1 } +  4 min{  ’ 4 a;0,  ’ 4 a;1 } + K

TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest VcVc VbVb VaVa VaVa VdVd VgVg  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  1 min{  ’ 1 a;0,  ’ 1 a;1 } +  4 min{  ’ 4 a;0,  ’ 4 a;1 } + K  ’’ a;0 =  1  ’ 1 a;0 +  4  ’ 4 a;0  1 +  4  ’’ a;1 =  1  ’ 1 a;1 +  4  ’ 4 a;1  1 +  4

TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest   VcVc VbVb VaVa VaVa VdVd VgVg  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  1 min{  ’ 1 a;0,  ’ 1 a;1 } +  4 min{  ’ 4 a;0,  ’ 4 a;1 } + K  ’’ a;0 =  1  ’ 1 a;0 +  4  ’ 4 a;0  1 +  4  ’’ a;1 =  1  ’ 1 a;1 +  4  ’ 4 a;1  1 +  4

TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest   VcVc VbVb VaVa VaVa VdVd VgVg  1 min{  ’’ a;0,  ’’ a;1 } +  4 min{  ’’ a;0,  ’’ a;1 } + K  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  ’’ a;0 =  1  ’ 1 a;0 +  4  ’ 4 a;0  1 +  4  ’’ a;1 =  1  ’ 1 a;1 +  4  ’ 4 a;1  1 +  4

TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest   VcVc VbVb VaVa VaVa VdVd VgVg (  1 +  4 ) min{  ’’ a;0,  ’’ a;1 } + K  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1  ’’ a;0 =  1  ’ 1 a;0 +  4  ’ 4 a;0  1 +  4  ’’ a;1 =  1  ’ 1 a;1 +  4  ’ 4 a;1  1 +  4

TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest   VcVc VbVb VaVa VaVa VdVd VgVg (  1 +  4 ) min{  ’’ a;0,  ’’ a;1 } + K  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1 min {p 1 +p 2, q 1 +q 2 }min {p 1, q 1 } + min {p 2, q 2 } ≥

TRW Message Passing Kolmogorov, 2006  1  ’’ 1 +  4  ’’ 4 +  rest   VcVc VbVb VaVa VaVa VdVd VgVg Objective function increases or remains constant  ’ 1 c;0  ’ 1 c;1  ’ 1 b;0  ’ 1 b;1  ’’ a;0  ’’ a;1  ’’ a;0  ’’ a;1  ’ 4 d;0  ’ 4 d;1  ’ 4 g;0  ’ 4 g;1 (  1 +  4 ) min{  ’’ a;0,  ’’ a;1 } + K

TRW Message Passing Initialize  i. Take care of reparam constraint Choose random variable V a Compute min-marginals of V a for all trees Node-average the min-marginals REPEAT Kolmogorov, 2006 Can also do edge-averaging

Example 1 VaVa VbVb 0 1 1 0 2 5 4 2 l0l0 l1l1 VbVb VcVc 0 2 3 1 4 2 6 3 VcVc VaVa 1 4 1 0 6 3 6 4  2 =1  3 =1  1 =1 56 7 Pick variable V a. Reparameterize.

Example 1 VaVa VbVb -3 -2 -2 5 7 4 2 VbVb VcVc 0 2 3 1 4 2 6 3 VcVc VaVa -3 1 6 3 10 7  2 =1  3 =1  1 =1 56 7 Average the min-marginals of V a l0l0 l1l1

Example 1 VaVa VbVb -3 -2 -2 7.5 7 4 2 VbVb VcVc 0 2 3 1 4 2 6 3 VcVc VaVa -3 1 6 3 7.5 7  2 =1  3 =1  1 =1 76 7 Pick variable V b. Reparameterize. l0l0 l1l1

Example 1 VaVa VbVb -7.5 -7 -5.5 -7 7.5 7 8.5 7 VbVb VcVc -5 -3 -3 9 6 6 3 VcVc VaVa 1 6 3 7.5 7  2 =1  3 =1  1 =1 76 7 Average the min-marginals of V b l0l0 l1l1

Example 1 VaVa VbVb -7.5 -7 -5.5 -7 7.5 7 8.75 6.5 VbVb VcVc -5 -3 -3 8.75 6.5 6 3 VcVc VaVa -3 1 6 3 7.5 7  2 =1  3 =1  1 =1 6.5 7 Value of dual does not increase l0l0 l1l1

Example 1 VaVa VbVb -7.5 -7 -5.5 -7 7.5 7 8.75 6.5 VbVb VcVc -5 -3 -3 8.75 6.5 6 3 VcVc VaVa -3 1 6 3 7.5 7  2 =1  3 =1  1 =1 6.5 7 Maybe it will increase for V c NO l0l0 l1l1

Example 1 VaVa VbVb -7.5 -7 -5.5 -7 7.5 7 8.75 6.5 VbVb VcVc -5 -3 -3 8.75 6.5 6 3 VcVc VaVa -3 1 6 3 7.5 7  2 =1  3 =1  1 =1 Strong Tree Agreement Exact MAP Estimate f 1 (a) = 0f 1 (b) = 0f 2 (b) = 0f 2 (c) = 0f 3 (c) = 0f 3 (a) = 0 l0l0 l1l1

Example 2 VaVa VbVb 0 1 1 0 2 5 2 2 VbVb VcVc 1 0 0 1 0 0 0 0 VcVc VaVa 0 1 1 0 0 3 4 8  2 =1  3 =1  1 =1 40 4 Pick variable V a. Reparameterize. l0l0 l1l1

Example 2 VaVa VbVb -2 -2 4 7 2 2 VbVb VcVc 1 0 0 1 0 0 0 0 VcVc VaVa 0 0 1 0 3 4 9  2 =1  3 =1  1 =1 40 4 Average the min-marginals of V a l0l0 l1l1

Example 2 VaVa VbVb -2 -2 4 8 2 2 VbVb VcVc 1 0 0 1 0 0 0 0 VcVc VaVa 0 0 1 0 3 4 8  2 =1  3 =1  1 =1 40 4 Value of dual does not increase l0l0 l1l1

Example 2 VaVa VbVb -2 -2 4 8 2 2 VbVb VcVc 1 0 0 1 0 0 0 0 VcVc VaVa 0 0 1 0 3 4 8  2 =1  3 =1  1 =1 40 4 Maybe it will decrease for V b or V c NO l0l0 l1l1

Example 2 VaVa VbVb -2 -2 4 8 2 2 VbVb VcVc 1 0 0 1 0 0 0 0 VcVc VaVa 0 0 1 0 3 4 8  2 =1  3 =1  1 =1 f 1 (a) = 1f 1 (b) = 1f 2 (b) = 1f 2 (c) = 0f 3 (c) = 1f 3 (a) = 1 f 2 (b) = 0f 2 (c) = 1 Weak Tree Agreement Not Exact MAP Estimate l0l0 l1l1

Example 2 VaVa VbVb -2 -2 4 8 2 2 VbVb VcVc 1 0 0 1 0 0 0 0 VcVc VaVa 0 0 1 0 3 4 8  2 =1  3 =1  1 =1 Weak Tree Agreement Convergence point of TRW l0l0 l1l1 f 1 (a) = 1f 1 (b) = 1f 2 (b) = 1f 2 (c) = 0f 3 (c) = 1f 3 (a) = 1 f 2 (b) = 0f 2 (c) = 1

Obtaining the Labelling Only solves the dual. Primal solutions? VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi  ’ =   i  i   Fix the label Of V a

Obtaining the Labelling Only solves the dual. Primal solutions? VaVa VbVb VcVc VdVd VeVe VfVf VgVg VhVh ViVi  ’ =   i  i   Fix the label Of V b Continue in some fixed order Meltzer et al., 2006

Computational Issues of TRW Speed-ups for some pairwise potentials Basic Component is Belief Propagation Felzenszwalb & Huttenlocher, 2004 Memory requirements cut down by half Kolmogorov, 2006 Further speed-ups using monotonic chains Kolmogorov, 2006

Theoretical Properties of TRW Always converges, unlike BP Kolmogorov, 2006 Strong tree agreement implies exact MAP Wainwright et al., 2001 Optimal MAP for two-label submodular problems Kolmogorov and Wainwright, 2005  ab;00 +  ab;11 ≤  ab;01 +  ab;10

Results Binary Segmentation Szeliski et al., 2008 Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels

Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al., 2008 Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels TRW

Results Binary Segmentation Labels - {foreground, background} Unary Potentials: -log(likelihood) using learnt fg/bg models Szeliski et al., 2008 Belief Propagation Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels

Results Stereo Correspondence Szeliski et al., 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels

Results Szeliski et al., 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels TRW Stereo Correspondence

Results Szeliski et al., 2008 Labels - {disparities} Unary Potentials: Similarity of pixel colours Belief Propagation Pairwise Potentials: 0, if same labels 1 - exp(|D a - D b |), if different labels Stereo Correspondence

Results Non-submodular problems Kolmogorov, 2006 BP TRW-S 30x30 grid K 50 BP TRW-S BP outperforms TRW-S

Summary Trees can be solved exactly - BP No guarantee of convergence otherwise - BP Strong Tree Agreement - TRW-S Submodular energies solved exactly - TRW-S TRW-S solves an LP relaxation of MAP estimation Loopier graphs give worse results Rother and Kolmogorov, 2006

Related New(er) Work Solving the Dual Globerson and Jaakkola, 2007 Komodakis, Paragios and Tziritas 2007 Weiss et al., 2006 Schlesinger and Giginyak, 2007 Solving the Primal Ravikumar, Agarwal and Wainwright, 2008

Related New(er) Work More complex relaxations Sontag and Jaakkola, 2007 Komodakis and Paragios, 2008 Kumar, Kolmogorov and Torr, 2007 Werner, 2008 Sontag et al., 2008 Kumar and Torr, 2008

Questions on Part I ? Code + Standard Data http://vision.middlebury.edu/MRF

MAP Estimation Algorithms in M. Pawan Kumar, University of Oxford Pushmeet Kohli, Microsoft Research Computer Vision - Part I.

Similar presentations

Presentation on theme: "MAP Estimation Algorithms in M. Pawan Kumar, University of Oxford Pushmeet Kohli, Microsoft Research Computer Vision - Part I."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

MAP Estimation Algorithms in M. Pawan Kumar, University of Oxford Pushmeet Kohli, Microsoft Research Computer Vision - Part I.

Similar presentations

Presentation on theme: "MAP Estimation Algorithms in M. Pawan Kumar, University of Oxford Pushmeet Kohli, Microsoft Research Computer Vision - Part I."— Presentation transcript:

Similar presentations

About project

Feedback