1 Probabilistic Inference Lecture 5 M. Pawan Kumar pawan.kumar@ecp.fr Slides available online http://cvc.centrale-ponts.fr/personnel/pawan/

2 What to Expect in the Final Exam Open book: textbooks, research papers, course slides. No electronic devices. Easy questions – 10 points. Hard questions – 10 points.

3 Easy Question – BP Compute the reparameterization constants for (a,b) and (c,b) such that the unary potentials of V_b are equal to its min-marginals. [figure: chain MRF over V_a, V_b, V_c with unary and pairwise potentials]

4 Hard Question – BP Provide an O(h) algorithm to compute the reparameterization constants of BP for an edge whose pairwise potentials are specified by a truncated linear model.

5 Easy Question – Minimum Cut Provide the graph corresponding to the MAP estimation problem in the following MRF. [figure: chain MRF over V_a, V_b, V_c with unary and pairwise potentials]

6 Hard Question – Minimum Cut Show that the expansion algorithm provides a bound of 2M for the truncated linear metric, where M is the value of the truncation.

7 Easy Question – Relaxations Using an example, show that the LP-S relaxation is not tight for a frustrated cycle (cycle with an odd number of supermodular pairwise potentials).

8 Hard Question – Relaxations Prove or disprove that the LP-S and SOCP-MS relaxations are invariant to reparameterization.

9 Recap

10 Integer Programming Formulation min ∑_a ∑_i θ_a;i y_a;i + ∑_(a,b) ∑_ik θ_ab;ik y_ab;ik s.t. y_a;i ∈ {0,1}, ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k

11 Integer Programming Formulation min θᵀy s.t. y_a;i ∈ {0,1}, ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k, where θ = [… θ_a;i …; … θ_ab;ik …] and y = [… y_a;i …; … y_ab;ik …]
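As a concrete illustration of the θᵀy encoding (not from the slides — the potential values below are made up), here is a tiny two-variable, two-label MRF whose energy is recovered as a dot product with the indicator vector y:

```python
# Hypothetical 2-variable, 2-label MRF (values chosen for illustration only).
theta_a = [2.0, 5.0]                       # theta_a;i
theta_b = [4.0, 2.0]                       # theta_b;k
theta_ab = [[0.0, 3.0], [3.0, 0.0]]        # theta_ab;ik (Potts-like)

# Flatten into the vector theta = [... theta_a;i ...; ... theta_ab;ik ...]
theta = theta_a + theta_b + [theta_ab[i][k] for i in range(2) for k in range(2)]

def y_vector(xa, xb):
    """Indicator encoding of a labelling: y_a;i, y_b;k and y_ab;ik = y_a;i * y_b;k."""
    ya = [1.0 if i == xa else 0.0 for i in range(2)]
    yb = [1.0 if k == xb else 0.0 for k in range(2)]
    yab = [ya[i] * yb[k] for i in range(2) for k in range(2)]
    return ya + yb + yab

def energy(xa, xb):
    # theta^T y reproduces the MRF energy of the labelling (xa, xb)
    return sum(t * y for t, y in zip(theta, y_vector(xa, xb)))

print(energy(0, 1))   # 2 + 2 + 3 = 7.0
```

Minimizing θᵀy over all integral y is exactly the MAP problem of these slides.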

12 Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ {0,1}, ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k Two reasons why we can’t solve this

13 Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, y_ab;ik = y_a;i y_b;k One reason why we can’t solve this

14 Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, ∑_k y_ab;ik = ∑_k y_a;i y_b;k One reason why we can’t solve this

15 Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, ∑_k y_ab;ik = y_a;i ∑_k y_b;k, where ∑_k y_b;k = 1. One reason why we can’t solve this

16 Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, ∑_k y_ab;ik = y_a;i One reason why we can’t solve this

17 Linear Programming Relaxation min θᵀy s.t. y_a;i ∈ [0,1], ∑_i y_a;i = 1, ∑_k y_ab;ik = y_a;i No reason why we can’t solve this* (*up to memory requirements and time complexity)
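Since all constraints are now linear, the relaxation can be handed to any off-the-shelf LP solver. A minimal sketch (the MRF values are hypothetical, and I assume SciPy's `linprog` as the solver): one edge between two binary variables, which forms a tree, so the relaxation is tight and the LP value equals the MAP energy.

```python
from scipy.optimize import linprog

# Tiny illustrative MRF: two variables (a, b), two labels each, one edge.
# Variable order: [y_a;0, y_a;1, y_b;0, y_b;1, y_ab;00, y_ab;01, y_ab;10, y_ab;11]
theta = [2, 5, 4, 2, 0, 3, 3, 0]

A_eq = [
    [1, 1, 0, 0, 0, 0, 0, 0],     # sum_i y_a;i = 1
    [0, 0, 1, 1, 0, 0, 0, 0],     # sum_k y_b;k = 1
    [-1, 0, 0, 0, 1, 1, 0, 0],    # sum_k y_ab;0k = y_a;0
    [0, -1, 0, 0, 0, 0, 1, 1],    # sum_k y_ab;1k = y_a;1
    [0, 0, -1, 0, 1, 0, 1, 0],    # sum_i y_ab;i0 = y_b;0
    [0, 0, 0, -1, 0, 1, 0, 1],    # sum_i y_ab;i1 = y_b;1
]
b_eq = [1, 1, 0, 0, 0, 0]

res = linprog(theta, A_eq=A_eq, b_eq=b_eq, bounds=[(0, 1)] * 8)
print(res.fun)   # 6.0 — the MAP energy, since a single edge is a tree
```

The best integral labelling here is (a=0, b=0) with energy 2 + 4 + 0 = 6, and the LP attains the same value.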

18 Dual of the LP Relaxation Wainwright et al., 2001 Decompose the potentials over the trees of the grid: ∑_i ρ_i θ^i = θ. [figure: 3×3 grid MRF over V_a … V_i and its six trees θ^1, …, θ^6]

19 Dual of the LP Relaxation Wainwright et al., 2001 Dual of LP: max ∑_i ρ_i q*(θ^i) over tree potentials θ^1, …, θ^6 satisfying ∑_i ρ_i θ^i = θ. [figure: trees θ^1, …, θ^6 with optimal tree energies q*(θ^1), …, q*(θ^6)]

20 Dual of the LP Relaxation Wainwright et al., 2001 Relax the constraint to a reparameterization: ∑_i ρ_i θ^i ≡ θ. Dual of LP: max ∑_i ρ_i q*(θ^i). [figure: trees θ^1, …, θ^6 with optimal tree energies q*(θ^1), …, q*(θ^6)]

21 Dual of the LP Relaxation Wainwright et al., 2001 max ∑_i ρ_i q*(θ^i) s.t. ∑_i ρ_i θ^i ≡ θ. I can easily compute q*(θ^i). I can easily maintain the reparameterization constraint. So can I easily solve the dual?

22 TRW Message Passing Dual Decomposition Outline

23 Things to Remember The forward pass computes the min-marginals of the root. BP is exact for trees. Every iteration provides a reparameterization.
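The first two points can be checked on a tiny chain (a minimal sketch — the potentials below are made up): one forward pass of min-sum BP toward the root reproduces exactly the root's min-marginals computed by brute force.

```python
import itertools

unary = [[0.0, 2.0], [1.0, 0.0], [3.0, 1.0]]   # 3 chain nodes, 2 labels each
pair = [[0.0, 2.0], [2.0, 0.0]]                # same pairwise cost on both edges

def min_marginals_root(unary, pair):
    # forward pass of min-sum BP on the chain 0-1-2, with node 0 as the root
    msg = [0.0, 0.0]
    for node in (2, 1):
        # message to the next node toward the root, indexed by that node's label i
        msg = [min(unary[node][j] + pair[i][j] + msg[j] for j in range(2))
               for i in range(2)]
    return [unary[0][i] + msg[i] for i in range(2)]

def brute_force(unary, pair):
    # exact min-marginals of node 0 by enumerating all labellings
    best = [float('inf')] * 2
    for labels in itertools.product(range(2), repeat=3):
        e = sum(unary[n][labels[n]] for n in range(3))
        e += pair[labels[0]][labels[1]] + pair[labels[1]][labels[2]]
        best[labels[0]] = min(best[labels[0]], e)
    return best

print(min_marginals_root(unary, pair), brute_force(unary, pair))
```

Both computations agree, which is the "BP is exact for trees" point specialized to a chain.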

24 TRW Message Passing Kolmogorov, 2006 ∑_i ρ_i θ^i ≡ θ, objective ∑_i ρ_i q*(θ^i). Pick a variable, say V_a. [figure: 3×3 grid MRF over V_a … V_i and its trees θ^1, …, θ^6]

25 TRW Message Passing Kolmogorov, 2006 ∑_i ρ_i θ^i ≡ θ, objective ∑_i ρ_i q*(θ^i). Tree 1 is the chain V_c–V_b–V_a with unary potentials θ^1_c;0, θ^1_c;1, θ^1_b;0, θ^1_b;1, θ^1_a;0, θ^1_a;1; tree 4 is the chain V_a–V_d–V_g with unary potentials θ^4_a;0, θ^4_a;1, θ^4_d;0, θ^4_d;1, θ^4_g;0, θ^4_g;1.

26 TRW Message Passing Kolmogorov, 2006 θ^1 + θ^4 + θ_rest ≡ θ, objective q*(θ^1) + q*(θ^4) + K. Reparameterize to obtain the min-marginals of V_a. [figure: trees 1 and 4 with their unary potentials]

27 TRW Message Passing Kolmogorov, 2006 One pass of Belief Propagation in trees 1 and 4 gives reparameterized potentials θ'^1 + θ'^4 + θ_rest, with objective q*(θ'^1) + q*(θ'^4) + K. [figure: trees with updated unary potentials θ'^1_c;0, …, θ'^1_a;1 and θ'^4_a;0, …, θ'^4_g;1]

28 TRW Message Passing Kolmogorov, 2006 θ'^1 + θ'^4 + θ_rest ≡ θ, objective q*(θ'^1) + q*(θ'^4) + K. These remain the same under reparameterization.

29 TRW Message Passing Kolmogorov, 2006 θ'^1 + θ'^4 + θ_rest ≡ θ. Since the unary potentials of V_a now equal its min-marginals, the objective is min{θ'^1_a;0, θ'^1_a;1} + min{θ'^4_a;0, θ'^4_a;1} + K.

30 TRW Message Passing Kolmogorov, 2006 θ'^1 + θ'^4 + θ_rest ≡ θ, objective min{θ'^1_a;0, θ'^1_a;1} + min{θ'^4_a;0, θ'^4_a;1} + K. Compute the average of the min-marginals of V_a.

31 TRW Message Passing Kolmogorov, 2006 θ''_a;0 = (θ'^1_a;0 + θ'^4_a;0)/2, θ''_a;1 = (θ'^1_a;1 + θ'^4_a;1)/2. Objective min{θ'^1_a;0, θ'^1_a;1} + min{θ'^4_a;0, θ'^4_a;1} + K.

32 TRW Message Passing Kolmogorov, 2006 Replace the unary potentials of V_a in both trees by the averages θ''_a;0, θ''_a;1, giving θ''^1 + θ''^4 + θ_rest. Objective min{θ'^1_a;0, θ'^1_a;1} + min{θ'^4_a;0, θ'^4_a;1} + K.

33 TRW Message Passing Kolmogorov, 2006 θ''^1 + θ''^4 + θ_rest ≡ θ still holds, with θ''_a;0 = (θ'^1_a;0 + θ'^4_a;0)/2 and θ''_a;1 = (θ'^1_a;1 + θ'^4_a;1)/2. Objective min{θ'^1_a;0, θ'^1_a;1} + min{θ'^4_a;0, θ'^4_a;1} + K.

34 TRW Message Passing Kolmogorov, 2006 θ''^1 + θ''^4 + θ_rest ≡ θ. The new objective is 2 min{θ''_a;0, θ''_a;1} + K.

35 TRW Message Passing Kolmogorov, 2006 θ''^1 + θ''^4 + θ_rest ≡ θ, new objective 2 min{θ''_a;0, θ''_a;1} + K. This is at least the old objective, since min{p_1 + p_2, q_1 + q_2} ≥ min{p_1, q_1} + min{p_2, q_2}.
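The inequality on this slide is easy to sanity-check numerically (a throwaway sketch, not part of the lecture): each of p_1 + p_2 and q_1 + q_2 is at least min{p_1, q_1} + min{p_2, q_2}, so their minimum is too.

```python
import random

random.seed(0)
for _ in range(1000):
    p1, p2, q1, q2 = (random.uniform(-10, 10) for _ in range(4))
    # the min of the sums dominates the sum of the mins
    assert min(p1 + p2, q1 + q2) >= min(p1, q1) + min(p2, q2)
print("inequality holds on 1000 random samples")
```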

36 TRW Message Passing Kolmogorov, 2006 θ''^1 + θ''^4 + θ_rest ≡ θ, objective 2 min{θ''_a;0, θ''_a;1} + K. The objective function increases or remains constant.

37 TRW Message Passing Kolmogorov, 2006 Initialize θ^i, taking care of the reparameterization constraint. REPEAT: choose a random variable V_a; compute the min-marginals of V_a for all trees; node-average the min-marginals. Can also do edge-averaging.
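The node-averaging step of slides 30–36 can be sketched in a few lines (the min-marginal values below are hypothetical, standing in for the q* computations of two trees sharing V_a):

```python
def node_average(mm1, mm4):
    """Average the min-marginals of the shared variable across the two trees."""
    return [(p + q) / 2.0 for p, q in zip(mm1, mm4)]

# Hypothetical min-marginals of V_a in trees 1 and 4 (illustrative numbers).
mm1 = [2.0, 5.0]
mm4 = [4.0, 1.0]

before = min(mm1) + min(mm4)    # contribution to the dual before averaging
avg = node_average(mm1, mm4)
after = 2 * min(avg)            # both trees now carry the averaged unaries

print(before, after)            # the dual contribution cannot decrease
```

Here the trees initially disagree on the best label of V_a, and averaging strictly increases the bound, exactly the mechanism of slide 35.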

38 Example 1 [figure: three tree slaves over V_a, V_b, V_c with unary and pairwise potentials, labels l_0, l_1] Pick variable V_a. Reparameterize.

39 Example 1 [figure: reparameterized potentials] Average the min-marginals of V_a.

40 Example 1 [figure: averaged potentials] Pick variable V_b. Reparameterize.

41 Example 1 [figure: reparameterized potentials] Average the min-marginals of V_b.

42 Example 1 [figure: averaged potentials] Value of the dual does not increase.

43 Example 1 [figure: same potentials] Maybe it will increase for V_c? NO.

44 Example 1 [figure: final potentials] f_1(a) = 0, f_1(b) = 0, f_2(b) = 0, f_2(c) = 0, f_3(c) = 0, f_3(a) = 0. Strong Tree Agreement. Exact MAP Estimate.

45 Example 2 [figure: three tree slaves over V_a, V_b, V_c with unary and pairwise potentials, labels l_0, l_1] Pick variable V_a. Reparameterize.

46 Example 2 [figure: reparameterized potentials] Average the min-marginals of V_a.

47 Example 2 [figure: averaged potentials] Value of the dual does not increase.

48 Example 2 [figure: same potentials] Maybe it will increase for V_b or V_c? NO.

49 Example 2 [figure: final potentials] f_1(a) = 1, f_1(b) = 1, f_2(b) = 1, f_2(c) = 0, f_3(c) = 1, f_3(a) = 1; also f_2(b) = 0, f_2(c) = 1. Weak Tree Agreement. Not the exact MAP estimate.

50 Example 2 [figure: final potentials] Weak Tree Agreement. Convergence point of TRW.

51 Obtaining the Labelling TRW only solves the dual. Primal solutions? θ' = ∑_i ρ_i θ^i. Fix the label of V_a. [figure: 3×3 grid MRF over V_a … V_i]

52 Obtaining the Labelling TRW only solves the dual. Primal solutions? θ' = ∑_i ρ_i θ^i. Fix the label of V_b. Continue in some fixed order. Meltzer et al., 2006 [figure: 3×3 grid MRF over V_a … V_i]

53 Computational Issues of TRW Basic component is Belief Propagation. Speed-ups for some pairwise potentials (Felzenszwalb & Huttenlocher, 2004). Memory requirements cut down by half (Kolmogorov, 2006). Further speed-ups using monotonic chains (Kolmogorov, 2006).

54 Theoretical Properties of TRW Always converges, unlike BP (Kolmogorov, 2006). Strong tree agreement implies exact MAP (Wainwright et al., 2001). Optimal MAP for two-label submodular problems (Kolmogorov and Wainwright, 2005), i.e. θ_ab;00 + θ_ab;11 ≤ θ_ab;01 + θ_ab;10.
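The submodularity condition is a one-line check on any two-label pairwise potential. A small helper (illustrative, with made-up Potts-style values):

```python
def is_submodular(theta_ab):
    """Two-label pairwise potential, theta_ab[i][k] = cost of labels (i, k).

    Submodular iff theta_ab;00 + theta_ab;11 <= theta_ab;01 + theta_ab;10.
    """
    return theta_ab[0][0] + theta_ab[1][1] <= theta_ab[0][1] + theta_ab[1][0]

print(is_submodular([[0, 3], [3, 0]]))   # Potts-style: submodular
print(is_submodular([[3, 0], [0, 3]]))   # inverted costs: supermodular
```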

55 Results Binary Segmentation Szeliski et al., 2008 Labels: {foreground, background}. Unary potentials: -log(likelihood) using learnt fg/bg models. Pairwise potentials: 0 if same labels, 1 - exp(|d_a - d_b|) if different labels.

56 Results Binary Segmentation (same model) [image: TRW result]

57 Results Binary Segmentation (same model) [image: Belief Propagation result]

58 Results Stereo Correspondence Szeliski et al., 2008 Labels: {disparities}. Unary potentials: similarity of pixel colours. Pairwise potentials: 0 if same labels, 1 - exp(|d_a - d_b|) if different labels.

59 Results Stereo Correspondence (same model) [image: TRW result]

60 Results Stereo Correspondence (same model) [image: Belief Propagation result]

61 Results Non-submodular problems Kolmogorov, 2006 [figure: energy curves for BP and TRW-S on a 30×30 grid, K = 50] BP outperforms TRW-S.

62 Code + Standard Data http://vision.middlebury.edu/MRF

63 TRW Message Passing Dual Decomposition Outline

64 Dual Decomposition min_x ∑_i g_i(x) s.t. x ∈ C

65 Dual Decomposition min_{x, x_i} ∑_i g_i(x_i) s.t. x_i ∈ C, x_i = x

66 Dual Decomposition min_{x, x_i} ∑_i g_i(x_i) s.t. x_i ∈ C

67 Dual Decomposition max_{λ_i} min_{x, x_i} ∑_i g_i(x_i) + ∑_i λ_iᵀ(x_i − x) s.t. x_i ∈ C. KKT condition: ∑_i λ_i = 0.

68 Dual Decomposition max_{λ_i} min_{x_i} ∑_i g_i(x_i) + ∑_i λ_iᵀ x_i s.t. x_i ∈ C

69 Dual Decomposition max_{λ_i} min_{x_i} ∑_i (g_i(x_i) + λ_iᵀ x_i) s.t. x_i ∈ C. Solved by Projected Supergradient Ascent. A supergradient s of h(z) at z_0 satisfies h(z) − h(z_0) ≤ sᵀ(z − z_0) for all z in the feasible region.

70 Dual Decomposition max_{λ_i} min_{x_i} ∑_i (g_i(x_i) + λ_iᵀ x_i) s.t. x_i ∈ C. Initialize λ_i⁰ = 0.

71 Dual Decomposition max_{λ_i} min_{x_i} ∑_i (g_i(x_i) + λ_iᵀ x_i) s.t. x_i ∈ C. Compute supergradients: s_i = argmin_{x_i} (g_i(x_i) + (λ_iᵗ)ᵀ x_i), one independent problem per slave.

72 Dual Decomposition max_{λ_i} min_{x_i} ∑_i (g_i(x_i) + λ_iᵀ x_i) s.t. x_i ∈ C. Project the supergradients: p_i = s_i − (∑_j s_j)/m, where m = number of subproblems (slaves).

73 Dual Decomposition max_{λ_i} min_{x_i} ∑_i (g_i(x_i) + λ_iᵀ x_i) s.t. x_i ∈ C. Update the dual variables: λ_iᵗ⁺¹ = λ_iᵗ + η_t p_i, where η_t is the learning rate, e.g. 1/(t+1).

74 Dual Decomposition Initialize λ_i⁰ = 0. REPEAT: compute projected supergradients s_i = argmin_{x_i} (g_i(x_i) + (λ_iᵗ)ᵀ x_i), p_i = s_i − (∑_j s_j)/m; update dual variables λ_iᵗ⁺¹ = λ_iᵗ + η_t p_i.
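The full loop fits in a short sketch. This is a toy instance I made up, not from the lecture: one binary variable shared by two slaves with hypothetical costs `c`, where x is encoded as a 2-dimensional indicator vector. The projected supergradient updates push the slaves toward agreement, and the best dual value reaches the joint optimum min_x (c_1[x] + c_2[x]) = 2.

```python
# Toy dual decomposition: one binary variable shared by two slaves.
c = [[0.0, 2.0], [3.0, 0.0]]                 # hypothetical slave costs over labels {0, 1}
m = len(c)                                   # number of slaves
lam = [[0.0, 0.0] for _ in range(m)]         # dual variables, one vector per slave

def slave_argmin(cost, l):
    # MAP estimate of one slave: argmin_x (g_i(x) + lambda_i^T x), as an indicator
    k_best = min(range(2), key=lambda k: cost[k] + l[k])
    return [1.0 if k == k_best else 0.0 for k in range(2)]

best_dual = float('-inf')
for t in range(200):
    s = [slave_argmin(c[i], lam[i]) for i in range(m)]               # supergradients
    dual = sum(min(c[i][k] + lam[i][k] for k in range(2)) for i in range(m))
    best_dual = max(best_dual, dual)
    mean = [sum(s[i][k] for i in range(m)) / m for k in range(2)]
    p = [[s[i][k] - mean[k] for k in range(2)] for i in range(m)]    # projection: sum_i p_i = 0
    eta = 1.0 / (t + 1)                                              # decaying learning rate
    lam = [[lam[i][k] + eta * p[i][k] for k in range(2)] for i in range(m)]

print(best_dual)   # 2.0 = min_x (c_1[x] + c_2[x])
```

Note that the projection keeps ∑_i λ_i = 0, the KKT condition of slide 67, and by weak duality the dual value never exceeds the primal optimum.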

75 Dual Decomposition Komodakis et al., 2007 s¹_a = [1 0]ᵀ, s⁴_a = [1 0]ᵀ: the slaves agree on the label for V_a. [figure: 3×3 grid MRF over V_a … V_i and its trees θ^1, …, θ^6]

76 Dual Decomposition Komodakis et al., 2007 s¹_a = [1 0]ᵀ, s⁴_a = [1 0]ᵀ, so p¹_a = [0 0]ᵀ and p⁴_a = [0 0]ᵀ.

77 Dual Decomposition Komodakis et al., 2007 s¹_a = [1 0]ᵀ, s⁴_a = [0 1]ᵀ: the slaves disagree on the label for V_a.

78 Dual Decomposition Komodakis et al., 2007 s¹_a = [1 0]ᵀ, s⁴_a = [0 1]ᵀ, so p¹_a = [0.5 −0.5]ᵀ and p⁴_a = [−0.5 0.5]ᵀ. The unary cost of each slave's chosen label increases.

79 Dual Decomposition Komodakis et al., 2007 s¹_a = [1 0]ᵀ, s⁴_a = [0 1]ᵀ, p¹_a = [0.5 −0.5]ᵀ, p⁴_a = [−0.5 0.5]ᵀ. The unary cost of the other label decreases.

80 Dual Decomposition Komodakis et al., 2007 s¹_a = [1 0]ᵀ, s⁴_a = [0 1]ᵀ, p¹_a = [0.5 −0.5]ᵀ, p⁴_a = [−0.5 0.5]ᵀ. This pushes the slaves towards agreement.

81 Comparison
TRW: fast; local maximum; requires min-marginals.
DD: slow; global maximum; requires MAP estimate; allows other forms of slaves, tighter relaxations, and sparse high-order potentials.

