MAP Estimation Algorithms in Computer Vision - Part I. M. Pawan Kumar, University of Oxford; Pushmeet Kohli, Microsoft Research.


1 MAP Estimation Algorithms in Computer Vision - Part I. M. Pawan Kumar, University of Oxford; Pushmeet Kohli, Microsoft Research

2 Aim of the Tutorial Description of some successful algorithms Computational issues Enough details to implement Some proofs will be skipped :-( But references to them will be given :-)

3 A Vision Application: Binary Image Segmentation. How? Cost function: models our knowledge about natural images. Optimize cost function to obtain the segmentation

4 Object - white, Background - green/grey Graph G = (V,E) Each vertex corresponds to a pixel Edges define a 4-neighbourhood grid graph Assign a label to each vertex from L = {obj,bkg} A Vision Application Binary Image Segmentation

5 A Vision Application: Binary Image Segmentation. Graph G = (V,E). Cost of a labelling $f : V \to L$. Per Vertex Cost: cost of label obj low, cost of label bkg high. Object - white, Background - green/grey

6 A Vision Application: Binary Image Segmentation. Graph G = (V,E). Cost of a labelling $f : V \to L$. Per Vertex Cost (UNARY COST): cost of label obj high, cost of label bkg low. Object - white, Background - green/grey

7 A Vision Application: Binary Image Segmentation. Graph G = (V,E). Cost of a labelling $f : V \to L$. Per Edge Cost: cost of same label low, cost of different labels high. Object - white, Background - green/grey

8 A Vision Application: Binary Image Segmentation. Graph G = (V,E). Cost of a labelling $f : V \to L$. Per Edge Cost (PAIRWISE COST): cost of same label high, cost of different labels low. Object - white, Background - green/grey

9 Graph G = (V,E) Problem: Find the labelling with minimum cost f* Object - white, Background - green/grey A Vision Application Binary Image Segmentation

10 Graph G = (V,E) Problem: Find the labelling with minimum cost f* A Vision Application Binary Image Segmentation

11 Another Vision Application Object Detection using Parts-based Models How ? Once again, by defining a good cost function

12 Another Vision Application: Object Detection using Parts-based Models. Graph G = (V,E). Each vertex corresponds to a part - Head, Torso, Legs. Edges define a TREE. Assign a label to each vertex from L = {positions}. [Figure: tree with nodes H, T, L1, L2, L3, L4]

13 Another Vision Application: Object Detection using Parts-based Models. Graph G = (V,E). Each vertex corresponds to a part - Head, Torso, Legs. Edges define a TREE. Assign a label to each vertex from L = {positions}

14 Another Vision Application: Object Detection using Parts-based Models. Graph G = (V,E). Each vertex corresponds to a part - Head, Torso, Legs. Edges define a TREE. Assign a label to each vertex from L = {positions}

15 Another Vision Application: Object Detection using Parts-based Models. Graph G = (V,E). Cost of a labelling $f : V \to L$. Unary cost: How well does part match image patch? Pairwise cost: Encourages valid configurations. Find best labelling $f^*$

16 Another Vision Application: Object Detection using Parts-based Models. Graph G = (V,E). Cost of a labelling $f : V \to L$. Unary cost: How well does part match image patch? Pairwise cost: Encourages valid configurations. Find best labelling $f^*$

17 Yet Another Vision Application Stereo Correspondence Disparity Map How ? Minimizing a cost function

18 Yet Another Vision Application Stereo Correspondence Graph G = (V,E) Vertex corresponds to a pixel Edges define grid graph L = {disparities}

19 Yet Another Vision Application Stereo Correspondence Cost of labelling f : Unary cost + Pairwise Cost Find minimum cost f*

20 The General Problem. Graph G = (V, E) with vertices a, b, c, d, e, f. Discrete label set L = {1, 2, …, h}. Assign a label to each vertex, $f : V \to L$. Cost of a labelling Q(f) = Unary Cost + Pairwise Cost. Find $f^* = \arg\min_f Q(f)$

21 Outline Problem Formulation –Energy Function –MAP Estimation –Computing min-marginals Reparameterization Belief Propagation Tree-reweighted Message Passing

22 Energy Function. Random variables $V = \{V_a, V_b, \ldots\}$. Labels $L = \{l_0, l_1, \ldots\}$. Data $D$. Labelling $f : \{a, b, \ldots\} \to \{0, 1, \ldots\}$. [Figure: chain $V_a, V_b, V_c, V_d$ with data $D_a, D_b, D_c, D_d$; each variable takes label $l_0$ or $l_1$]

23 Energy Function. $Q(f) = \sum_a \theta_{a;f(a)}$ (Unary Potential). Easy to minimize. Neighbourhood

24 Energy Function. $E$: $(a,b) \in E$ iff $V_a$ and $V_b$ are neighbours. $E = \{(a,b), (b,c), (c,d)\}$

25 Energy Function. $Q(f) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)}$ (second term: Pairwise Potential)

26 Energy Function. $Q(f;\theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)}$, with parameter $\theta$
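
The energy on slide 26 can be evaluated directly once the potentials are stored in tables. The following is a minimal sketch, assuming a four-variable chain a-b-c-d with two labels; the numeric potentials and the names `unary`, `pairwise` and `energy` are illustrative assumptions, not values taken from the slides.

```python
# Hypothetical potentials for a chain a-b-c-d with labels {0, 1}.
# unary[a][i] plays the role of theta_{a;i};
# pairwise[(a, b)][i][k] plays the role of theta_{ab;ik}.
unary = {"a": [5, 2], "b": [2, 4], "c": [3, 6], "d": [7, 3]}
pairwise = {
    ("a", "b"): [[0, 1], [1, 0]],
    ("b", "c"): [[0, 3], [2, 0]],
    ("c", "d"): [[0, 2], [1, 0]],
}


def energy(f, unary, pairwise):
    """Q(f; theta): sum of unary terms plus sum of pairwise terms."""
    q = sum(unary[a][f[a]] for a in unary)
    q += sum(table[f[a]][f[b]] for (a, b), table in pairwise.items())
    return q


f = {"a": 1, "b": 0, "c": 0, "d": 1}
print(energy(f, unary, pairwise))
```

Storing the pairwise terms per edge keeps the evaluation cost at O(|V| + |E|) for any labelling.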

27 Outline Problem Formulation –Energy Function –MAP Estimation –Computing min-marginals Reparameterization Belief Propagation Tree-reweighted Message Passing

28 MAP Estimation. $Q(f;\theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)}$. [Figure: chain $V_a, V_b, V_c, V_d$ with labels $l_0$, $l_1$]

29 MAP Estimation. $Q(f;\theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)} = 13$

30 MAP Estimation. $Q(f;\theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)}$

31 MAP Estimation. $Q(f;\theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)} = 27$

32 MAP Estimation. $Q(f;\theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)}$. $f^* = \arg\min_f Q(f;\theta)$. $q^* = \min_f Q(f;\theta) = Q(f^*;\theta)$

33 MAP Estimation. [Table: $Q(f;\theta)$ for each possible labelling $(f(a), f(b), f(c), f(d))$] $f^* = \{1, 0, 0, 1\}$, $q^* = 13$
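
The enumeration on slide 33 can be reproduced by brute force. This is only a sketch under assumed potentials: the tables `UNARY` and `PAIRWISE` are hypothetical (the slide's figure values are not recoverable from the transcript) and were chosen so that the optimum matches the slide's reported f* and q*.

```python
from itertools import product

# Hypothetical potentials for the chain a-b-c-d (variables indexed 0..3).
UNARY = [[5, 2], [2, 4], [3, 6], [7, 3]]
# PAIRWISE[a][i][k] couples variable a (label i) with variable a+1 (label k).
PAIRWISE = [[[0, 1], [1, 0]], [[0, 3], [2, 0]], [[0, 2], [1, 0]]]


def energy(f):
    """Q(f; theta) for a labelling given as a tuple of labels."""
    q = sum(UNARY[a][f[a]] for a in range(len(f)))
    q += sum(PAIRWISE[a][f[a]][f[a + 1]] for a in range(len(f) - 1))
    return q


def brute_force_map(num_vars=4, num_labels=2):
    """Enumerate all |L|^|V| labellings and return the cheapest one."""
    labellings = list(product(range(num_labels), repeat=num_vars))
    f_star = min(labellings, key=energy)
    return f_star, energy(f_star), len(labellings)


f_star, q_star, count = brute_force_map()
print(f_star, q_star, count)
```

The loop visits every labelling, which is exactly the exponential cost that the following slides set out to avoid.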

34 Computational Complexity. Segmentation: $2^{|V|}$ labellings. $|V|$ = number of pixels, 320 × 480 = 153600

35 Computational Complexity. Detection: $|L|^{|V|}$ labellings. $|L|$ = number of pixels

36 Computational Complexity. Stereo: $|L|^{|V|}$ labellings. $|V|$ = number of pixels. Can we do better than brute-force? MAP Estimation is NP-hard!!

37 Computational Complexity. Stereo: $|L|^{|V|}$ labellings. $|V|$ = number of pixels. Exact algorithms do exist for special cases. Good approximate algorithms for general case. But first … two important definitions

38 Outline Problem Formulation –Energy Function –MAP Estimation –Computing min-marginals Reparameterization Belief Propagation Tree-reweighted Message Passing

39 Min-Marginals. Min-marginal $q_{a;i}$: the energy of $f^* = \arg\min_f Q(f;\theta)$ such that $f(a) = i$. Not a marginal (no summation). [Figure: chain $V_a, V_b, V_c, V_d$ with labels $l_0$, $l_1$]

40 Min-Marginals. 16 possible labellings. $q_{a;0} = 15$. [Table: $Q(f;\theta)$ for each labelling $(f(a), f(b), f(c), f(d))$]

41 Min-Marginals. 16 possible labellings. $q_{a;1} = 13$. [Table: $Q(f;\theta)$ for each labelling $(f(a), f(b), f(c), f(d))$]

42 Min-Marginals and MAP. Minimum min-marginal of any variable = energy of MAP labelling: $\min_i q_{a;i} = \min_i \left( \min_f Q(f;\theta) \text{ such that } f(a) = i \right) = \min_f Q(f;\theta)$, because $V_a$ has to take one label

43 Summary. Energy Function: $Q(f;\theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)}$. MAP Estimation: $f^* = \arg\min_f Q(f;\theta)$. Min-marginals: $q_{a;i} = \min_f Q(f;\theta)$ s.t. $f(a) = i$
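
The definition of a min-marginal, and the claim that the smallest min-marginal of any variable equals the MAP energy, can be checked by enumeration on a small problem. The sketch below assumes hypothetical chain potentials (`UNARY`, `PAIRWISE`); with these particular values the min-marginals of the first variable happen to agree with the numbers quoted on slides 40 and 41.

```python
from itertools import product

# Hypothetical potentials for a 4-variable, 2-label chain.
UNARY = [[5, 2], [2, 4], [3, 6], [7, 3]]
PAIRWISE = [[[0, 1], [1, 0]], [[0, 3], [2, 0]], [[0, 2], [1, 0]]]
ALL_LABELLINGS = list(product(range(2), repeat=4))


def energy(f):
    q = sum(UNARY[a][f[a]] for a in range(4))
    return q + sum(PAIRWISE[a][f[a]][f[a + 1]] for a in range(3))


def min_marginal(a, i):
    """q_{a;i}: minimum energy over labellings with f(a) = i."""
    return min(energy(f) for f in ALL_LABELLINGS if f[a] == i)


q_a = [min_marginal(0, i) for i in range(2)]
q_star = min(energy(f) for f in ALL_LABELLINGS)
print(q_a, q_star)
```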

44 Outline Problem Formulation Reparameterization Belief Propagation Tree-reweighted Message Passing

45 Reparameterization. Add a constant to all $\theta_{a;i}$. Subtract that constant from all $\theta_{b;k}$. [Figure: two variables $V_a$, $V_b$; table of $Q(f;\theta)$ for each $(f(a), f(b))$]

46 Reparameterization. Add a constant to all $\theta_{a;i}$. Subtract that constant from all $\theta_{b;k}$. $Q(f;\theta') = Q(f;\theta)$

47 Reparameterization. Add a constant to one $\theta_{b;k}$. Subtract that constant from $\theta_{ab;ik}$ for all $i$

48 Reparameterization. Add a constant to one $\theta_{b;k}$. Subtract that constant from $\theta_{ab;ik}$ for all $i$. $Q(f;\theta') = Q(f;\theta)$

49 Reparameterization. $\theta'_{a;i} = \theta_{a;i} + M_{ba;i}$, $\theta'_{b;k} = \theta_{b;k} + M_{ab;k}$, $\theta'_{ab;ik} = \theta_{ab;ik} - M_{ab;k} - M_{ba;i}$. $Q(f;\theta') = Q(f;\theta)$

50 Reparameterization. $\theta'$ is a reparameterization of $\theta$ iff $Q(f;\theta') = Q(f;\theta)$ for all $f$. Equivalently: $\theta'_{a;i} = \theta_{a;i} + M_{ba;i}$, $\theta'_{b;k} = \theta_{b;k} + M_{ab;k}$, $\theta'_{ab;ik} = \theta_{ab;ik} - M_{ab;k} - M_{ba;i}$. Kolmogorov, PAMI, 2006

51 Recap. MAP Estimation: $f^* = \arg\min_f Q(f;\theta)$, with $Q(f;\theta) = \sum_a \theta_{a;f(a)} + \sum_{(a,b)} \theta_{ab;f(a)f(b)}$. Min-marginals: $q_{a;i} = \min_f Q(f;\theta)$ s.t. $f(a) = i$. Reparameterization: $Q(f;\theta') = Q(f;\theta)$ for all $f$
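
A quick numerical check makes the reparameterization property concrete: whatever constants are moved between the unary and pairwise terms, every labelling keeps its energy. The sketch below assumes two variables with two labels each; the function names and the particular constants `m_ab`, `m_ba` are arbitrary illustrative choices.

```python
from itertools import product


def reparameterize(theta_a, theta_b, theta_ab, m_ab, m_ba):
    """Move m_ab[k] onto theta_b[k] and m_ba[i] onto theta_a[i],
    subtracting both from the pairwise term theta_ab[i][k]."""
    new_a = [theta_a[i] + m_ba[i] for i in range(2)]
    new_b = [theta_b[k] + m_ab[k] for k in range(2)]
    new_ab = [[theta_ab[i][k] - m_ab[k] - m_ba[i] for k in range(2)]
              for i in range(2)]
    return new_a, new_b, new_ab


def energy(f, theta_a, theta_b, theta_ab):
    i, k = f
    return theta_a[i] + theta_b[k] + theta_ab[i][k]


theta = ([5, 2], [2, 4], [[0, 1], [1, 0]])
theta_new = reparameterize(*theta, m_ab=[3, -7], m_ba=[10, -4])
for f in product(range(2), repeat=2):
    print(f, energy(f, *theta), energy(f, *theta_new))
```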

52 Outline Problem Formulation Reparameterization Belief Propagation – Exact MAP for Chains and Trees – Approximate MAP for general graphs – Computational Issues and Theoretical Properties Tree-reweighted Message Passing

53 Belief Propagation Belief Propagation gives exact MAP for chains Remember, some MAP problems are easy Exact MAP for trees Clever Reparameterization

54 Two Variables. Add a constant to one $\theta_{b;k}$. Subtract that constant from $\theta_{ab;ik}$ for all $i$. Choose the right constant so that $\theta'_{b;k} = q_{b;k}$. [Figure: variables $V_a$, $V_b$ before and after reparameterization]

55 Two Variables. Choose the right constant: $\theta'_{b;k} = q_{b;k}$. $M_{ab;0} = \min\{\theta_{a;0} + \theta_{ab;00},\ \theta_{a;1} + \theta_{ab;10}\}$

56 Two Variables. Choose the right constant: $\theta'_{b;k} = q_{b;k}$

57 Two Variables. Choose the right constant: $\theta'_{b;k} = q_{b;k}$. $f(a) = 1$, $\theta'_{b;0} = q_{b;0}$. Potentials along the red path add up to 0

58 Two Variables. Choose the right constant: $\theta'_{b;k} = q_{b;k}$. $M_{ab;1} = \min\{\theta_{a;0} + \theta_{ab;01},\ \theta_{a;1} + \theta_{ab;11}\}$

59 Two Variables. Choose the right constant: $\theta'_{b;k} = q_{b;k}$. $f(a) = 1$, $\theta'_{b;0} = q_{b;0}$; $f(a) = 1$, $\theta'_{b;1} = q_{b;1}$. Minimum of min-marginals = MAP estimate

60 Two Variables. Choose the right constant: $\theta'_{b;k} = q_{b;k}$. $f(a) = 1$, $\theta'_{b;0} = q_{b;0}$; $f(a) = 1$, $\theta'_{b;1} = q_{b;1}$. $f^*(b) = 0$, $f^*(a) = 1$

61 Two Variables. Choose the right constant: $\theta'_{b;k} = q_{b;k}$. $f(a) = 1$, $\theta'_{b;0} = q_{b;0}$; $f(a) = 1$, $\theta'_{b;1} = q_{b;1}$. We get all the min-marginals of $V_b$

62 Recap. We only need to know two sets of equations. General form of reparameterization: $\theta'_{a;i} = \theta_{a;i} + M_{ba;i}$, $\theta'_{b;k} = \theta_{b;k} + M_{ab;k}$, $\theta'_{ab;ik} = \theta_{ab;ik} - M_{ab;k} - M_{ba;i}$. Reparameterization of (a,b) in Belief Propagation: $M_{ab;k} = \min_i \{\theta_{a;i} + \theta_{ab;ik}\}$, $M_{ba;i} = 0$
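
The two-variable reparameterization can be sketched in a few lines. The potentials below are an assumption (borrowed from the numeric example on the integer-programming slides, since the values in this figure are not recoverable), and `bp_reparameterize` is a hypothetical helper name.

```python
def bp_reparameterize(theta_a, theta_b, theta_ab):
    """Pass the message M_{ab;k} = min_i (theta_{a;i} + theta_{ab;ik})
    from V_a to V_b and return the reparameterized potentials."""
    num_a, num_b = len(theta_a), len(theta_b)
    m_ab = [min(theta_a[i] + theta_ab[i][k] for i in range(num_a))
            for k in range(num_b)]
    new_b = [theta_b[k] + m_ab[k] for k in range(num_b)]
    new_ab = [[theta_ab[i][k] - m_ab[k] for k in range(num_b)]
              for i in range(num_a)]
    return m_ab, new_b, new_ab


theta_a, theta_b = [5, 2], [2, 4]
theta_ab = [[0, 1], [1, 0]]
m_ab, new_b, new_ab = bp_reparameterize(theta_a, theta_b, theta_ab)

# new_b now holds the min-marginals of V_b; its minimum is the MAP energy.
best_b = min(range(2), key=lambda k: new_b[k])
# Backtrack: the label of V_a whose path to best_b has total potential 0.
best_a = min(range(2), key=lambda i: theta_a[i] + new_ab[i][best_b])
print(m_ab, new_b, (best_a, best_b))
```

After the update, the cheapest path into each label of V_b has cost zero, which is the "red path adds up to 0" observation on the slides.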

63 Three Variables. Reparameterize the edge (a,b) as before. [Figure: chain $V_a, V_b, V_c$ with labels $l_0$, $l_1$]

64 Three Variables. Reparameterize the edge (a,b) as before. f(a) =

65 Three Variables. Reparameterize the edge (a,b) as before. $f(a) = 1$. Potentials along the red path add up to 0

66 Three Variables. Reparameterize the edge (b,c) as before. $f(a) = 1$. Potentials along the red path add up to 0

67 Three Variables. Reparameterize the edge (b,c) as before. $f(a) = 1$. Potentials along the red path add up to 0. $f(b) = 1$, $f(b) = 0$

68 Three Variables. Reparameterize the edge (b,c) as before. $f(a) = 1$. Potentials along the red path add up to 0. $f(b) = 1$, $f(b) = 0$. $q_{c;0}$, $q_{c;1}$

69 Three Variables. $f(a) = 1$; $f(b) = 1$, $f(b) = 0$; $q_{c;0}$, $q_{c;1}$. $f^*(c) = 0$, $f^*(b) = 0$, $f^*(a) = 1$. Generalizes to any length chain

70 Three Variables. $f(a) = 1$; $f(b) = 1$, $f(b) = 0$; $q_{c;0}$, $q_{c;1}$. $f^*(c) = 0$, $f^*(b) = 0$, $f^*(a) = 1$. Only Dynamic Programming

71 Why Dynamic Programming? 3 variables → 2 variables + book-keeping. n variables → (n-1) variables + book-keeping. Start from left, go to right. Reparameterize current edge (a,b): $M_{ab;k} = \min_i \{\theta_{a;i} + \theta_{ab;ik}\}$, $\theta'_{ab;ik} = \theta_{ab;ik} - M_{ab;k}$, $\theta'_{b;k} = \theta_{b;k} + M_{ab;k}$. Repeat

72 Why Dynamic Programming? Start from left, go to right. Reparameterize current edge (a,b): $M_{ab;k} = \min_i \{\theta_{a;i} + \theta_{ab;ik}\}$ (Messages), $\theta'_{ab;ik} = \theta_{ab;ik} - M_{ab;k}$, $\theta'_{b;k} = \theta_{b;k} + M_{ab;k}$. Repeat (Message Passing). Why stop at dynamic programming?

73 Three Variables. Reparameterize the edge (c,b) as before. [Figure: chain $V_a, V_b, V_c$ with labels $l_0$, $l_1$]

74 Three Variables. Reparameterize the edge (c,b) as before. $\theta'_{b;i} = q_{b;i}$

75 Three Variables. Reparameterize the edge (b,a) as before

76 Three Variables. Reparameterize the edge (b,a) as before. $\theta'_{a;i} = q_{a;i}$

77 Three Variables. Forward Pass, Backward Pass. All min-marginals are computed

78 Belief Propagation on Chains. Start from left, go to right. Reparameterize current edge (a,b): $M_{ab;k} = \min_i \{\theta_{a;i} + \theta_{ab;ik}\}$, $\theta'_{ab;ik} = \theta_{ab;ik} - M_{ab;k}$, $\theta'_{b;k} = \theta_{b;k} + M_{ab;k}$. Repeat till the end of the chain. Start from right, go to left. Repeat till the end of the chain

79 Belief Propagation on Chains. A way of computing reparam constants. Generalizes to chains of any length. Forward Pass - Start to End: MAP estimate, min-marginals of final variable. Backward Pass - End to Start: all other min-marginals (won't need this, but good to know)
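
The forward and backward passes can be sketched as plain min-sum message passing. The version below keeps the messages in separate arrays instead of overwriting the potentials, which is an equivalent bookkeeping choice rather than the slides' exact presentation; the potentials are hypothetical, and `chain_bp` is an illustrative name.

```python
def chain_bp(unary, pairwise):
    """Min-sum BP on a chain. pairwise[a][i][k] couples variable a
    (label i) with variable a+1 (label k). Returns all min-marginals."""
    n = len(unary)
    labels = range(len(unary[0]))
    fwd = [[0] * len(labels) for _ in range(n)]  # messages from the left
    bwd = [[0] * len(labels) for _ in range(n)]  # messages from the right
    for b in range(1, n):  # forward pass: start to end
        a = b - 1
        fwd[b] = [min(unary[a][i] + fwd[a][i] + pairwise[a][i][k]
                      for i in labels) for k in labels]
    for a in range(n - 2, -1, -1):  # backward pass: end to start
        b = a + 1
        bwd[a] = [min(unary[b][k] + bwd[b][k] + pairwise[a][i][k]
                      for k in labels) for i in labels]
    return [[unary[a][i] + fwd[a][i] + bwd[a][i] for i in labels]
            for a in range(n)]


# Hypothetical potentials for a 4-variable, 2-label chain.
UNARY = [[5, 2], [2, 4], [3, 6], [7, 3]]
PAIRWISE = [[[0, 1], [1, 0]], [[0, 3], [2, 0]], [[0, 2], [1, 0]]]

min_marginals = chain_bp(UNARY, PAIRWISE)
# With a unique optimum, taking the best label per variable gives the MAP.
f_star = [min(range(2), key=lambda i: q[i]) for q in min_marginals]
print(min_marginals, f_star)
```

Decoding each variable independently is only safe when the minimum is unique; with ties, backtracking along the stored argmins is the usual remedy.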

80 Computational Complexity. Each constant takes $O(|L|)$. Number of constants: $O(|E||L|)$. Total: $O(|E||L|^2)$. Memory required? $O(|E||L|)$

81 Belief Propagation on Trees. Forward Pass: Leaf → Root. Backward Pass: Root → Leaf. All min-marginals are computed. [Figure: tree with nodes $V_a, V_b, V_c, V_d, V_e, V_g, V_h$]

82 Outline Problem Formulation Reparameterization Belief Propagation – Exact MAP for Chains and Trees – Approximate MAP for general graphs – Computational Issues and Theoretical Properties Tree-reweighted Message Passing

83 Belief Propagation on Cycles. Where do we start? Arbitrarily. Reparameterize (a,b). [Figure: cycle $V_a, V_b, V_c, V_d$ with unary potentials $\theta_{a;0}, \theta_{a;1}, \theta_{b;0}, \theta_{b;1}, \theta_{c;0}, \theta_{c;1}, \theta_{d;0}, \theta_{d;1}$]

84 Belief Propagation on Cycles. Potentials along the red path add up to 0

85 Belief Propagation on Cycles. Potentials along the red path add up to 0

86 Belief Propagation on Cycles. Potentials along the red path add up to 0

87 Belief Propagation on Cycles. Potentials along the red path add up to 0

88 Belief Propagation on Cycles. Potentials along the red path add up to 0. Subtract $\theta_{a;0}$, $\theta_{a;1}$: $\theta'_{a;0} - \theta_{a;0} = q_{a;0}$, $\theta'_{a;1} - \theta_{a;1} = q_{a;1}$

89 Belief Propagation on Cycles. Pick minimum min-marginal. Follow red path. $\theta'_{a;0} - \theta_{a;0} = q_{a;0}$, $\theta'_{a;1} - \theta_{a;1} = q_{a;1}$

90 Belief Propagation on Cycles. Potentials along the red path add up to 0

91 Belief Propagation on Cycles. Potentials along the red path add up to 0

92 Belief Propagation on Cycles. Potentials along the red path add up to 0

93 Belief Propagation on Cycles. Potentials along the red path add up to 0. Subtract $\theta_{a;0}$, $\theta_{a;1}$: $\theta'_{a;1} - \theta_{a;1} = q_{a;1}$, $\theta'_{a;0} - \theta_{a;0} \le q_{a;0}$

94 Belief Propagation on Cycles. Problem Solved. $\theta'_{a;1} - \theta_{a;1} = q_{a;1}$, $\theta'_{a;0} - \theta_{a;0} \le q_{a;0}$

95 Belief Propagation on Cycles. Problem Not Solved. $\theta'_{a;1} - \theta_{a;1} = q_{a;1}$, $\theta'_{a;0} - \theta_{a;0} \le q_{a;0}$

96 Belief Propagation on Cycles. Reparameterize (a,b) again

97 Belief Propagation on Cycles. Reparameterize (a,b) again. But doesn't this overcount some potentials?

98 Belief Propagation on Cycles. Reparameterize (a,b) again. Yes. But we will do it anyway

99 Belief Propagation on Cycles. Keep reparameterizing edges in some order. Hope for convergence and a good solution

100 Belief Propagation. Generalizes to any arbitrary random field. Complexity per iteration? $O(|E||L|^2)$. Memory required? $O(|E||L|)$

101 Outline Problem Formulation Reparameterization Belief Propagation – Exact MAP for Chains and Trees – Approximate MAP for general graphs – Computational Issues and Theoretical Properties Tree-reweighted Message Passing

102 Computational Issues of BP. Complexity per iteration: $O(|E||L|^2)$. Special pairwise potentials $\theta_{ab;ik} = w_{ab}\, d(|i-k|)$, with $d$ Potts, Truncated Linear or Truncated Quadratic [plots of $d$ against $i - k$]: $O(|E||L|)$. Felzenszwalb & Huttenlocher, 2004
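
For the Potts case the saving is easy to see: the message to label k is either the sender's own cost at k, or the globally cheapest cost plus the penalty w. The sketch below compares a naive O(|L|^2) message with this O(|L|) shortcut, assuming w ≥ 0; the function names and test vector are illustrative, and the truncated linear and quadratic cases need the distance-transform machinery of the cited paper, which is not shown here.

```python
def potts_message_naive(h, w):
    """M_k = min_i (h_i + w * [i != k]), computed in O(|L|^2)."""
    n = len(h)
    return [min(h[i] + (0 if i == k else w) for i in range(n))
            for k in range(n)]


def potts_message_fast(h, w):
    """Same message in O(|L|): either stay at k, or pay w from the best label.
    Assumes w >= 0."""
    best = min(h) + w
    return [min(h_k, best) for h_k in h]


h = [4, 1, 7, 3]  # hypothetical values of theta_{a;i} plus incoming messages
print(potts_message_naive(h, 2), potts_message_fast(h, 2))
```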

103 Computational Issues of BP Memory requirements O(|E||L|) Half of original BP Kolmogorov, 2006 Some approximations exist But memory still remains an issue Yu, Lin, Super and Tan, 2007 Lasserre, Kannan and Winn, 2007

104 Computational Issues of BP. Order of reparameterization: randomly; in some fixed order; the one that results in maximum change (Residual Belief Propagation, Elidan et al., 2006)

105 Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Run BP. Assume it converges.

106 Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Choose variables in a tree. Change their labels. Value of energy does not decrease

107 Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? Choose variables in a cycle. Change their labels. Value of energy does not decrease

108 Theoretical Properties of BP Exact for Trees Pearl, 1988 What about any general random field? For cycles, if BP converges then exact MAP Weiss and Freeman, 2001

109 Results. Object Detection. Felzenszwalb and Huttenlocher, 2004. Labels - Poses of parts. Unary Potentials: Fraction of foreground pixels. Pairwise Potentials: Favour Valid Configurations. [Figure: tree with nodes H, T, A1, A2, L1, L2]

110 Results Object Detection Felzenszwalb and Huttenlocher, 2004

111 Results. Binary Segmentation. Szeliski et al., 2008. Labels - {foreground, background}. Unary Potentials: -log(likelihood) using learnt fg/bg models. Pairwise Potentials: 0, if same labels; $1 - \exp(|D_a - D_b|)$, if different labels

112 Results. Binary Segmentation. Szeliski et al., 2008. Labels - {foreground, background}. Unary Potentials: -log(likelihood) using learnt fg/bg models. Pairwise Potentials: 0, if same labels; $1 - \exp(|D_a - D_b|)$, if different labels. [Result shown: Belief Propagation]

113 Results. Binary Segmentation. Szeliski et al., 2008. Labels - {foreground, background}. Unary Potentials: -log(likelihood) using learnt fg/bg models. Pairwise Potentials: 0, if same labels; $1 - \exp(|D_a - D_b|)$, if different labels. [Result shown: Global optimum]

114 Results. Stereo Correspondence. Szeliski et al., 2008. Labels - {disparities}. Unary Potentials: Similarity of pixel colours. Pairwise Potentials: 0, if same labels; $1 - \exp(|D_a - D_b|)$, if different labels

115 Results. Stereo Correspondence. Szeliski et al., 2008. Labels - {disparities}. Unary Potentials: Similarity of pixel colours. Pairwise Potentials: 0, if same labels; $1 - \exp(|D_a - D_b|)$, if different labels. [Result shown: Belief Propagation]

116 Results. Stereo Correspondence. Szeliski et al., 2008. Labels - {disparities}. Unary Potentials: Similarity of pixel colours. Pairwise Potentials: 0, if same labels; $1 - \exp(|D_a - D_b|)$, if different labels. [Result shown: Global optimum]

117 Summary of BP Exact for chains Exact for trees Approximate MAP for general cases Not even convergence guaranteed So can we do something better?

118 Outline Problem Formulation Reparameterization Belief Propagation Tree-reweighted Message Passing –Integer Programming Formulation –Linear Programming Relaxation and its Dual –Convergent Solution for Dual –Computational Issues and Theoretical Properties

119 TRW Message Passing Convex (not Combinatorial) Optimization A different look at the same problem A similar solution Combinatorial (not Convex) Optimization We will look at the most general MAP estimation Not trees No assumption on potentials

120 Things to Remember Forward-pass computes min-marginals of root BP is exact for trees Every iteration provides a reparameterization Basics of Mathematical Optimization

121 Mathematical Optimization. $\min g_0(x)$ (objective function) subject to $g_i(x) \le 0$, $i = 1, \ldots, N$ (constraints). Feasible region $= \{x \mid g_i(x) \le 0\}$. Optimal solution $x^* = \arg\min g_0(x)$. Optimal value $g_0(x^*)$

122 Integer Programming. $\min g_0(x)$ (objective function) subject to $g_i(x) \le 0$, $i = 1, \ldots, N$ (constraints), $x_k \in \mathbb{Z}$. Feasible region $= \{x \mid g_i(x) \le 0\}$. Optimal solution $x^* = \arg\min g_0(x)$. Optimal value $g_0(x^*)$

123 Feasible Region Generally NP-hard to optimize

124 Linear Programming. $\min g_0(x)$ (objective function) subject to $g_i(x) \le 0$, $i = 1, \ldots, N$ (constraints). Feasible region $= \{x \mid g_i(x) \le 0\}$. Optimal solution $x^* = \arg\min g_0(x)$. Optimal value $g_0(x^*)$

125 Linear Programming. $\min g_0(x)$ (linear objective function) subject to $g_i(x) \le 0$, $i = 1, \ldots, N$ (linear constraints). Feasible region $= \{x \mid g_i(x) \le 0\}$. Optimal solution $x^* = \arg\min g_0(x)$. Optimal value $g_0(x^*)$

126 Linear Programming. $\min c^T x$ (linear objective function) subject to $Ax \le b$ (linear constraints). Feasible region $= \{x \mid Ax \le b\}$. Optimal solution $x^* = \arg\min c^T x$. Optimal value $c^T x^*$. Polynomial-time Solution

127 Feasible Region Polynomial-time Solution

128 Feasible Region Optimal solution lies on a vertex (obj func linear)

129 Outline Problem Formulation Reparameterization Belief Propagation Tree-reweighted Message Passing –Integer Programming Formulation –Linear Programming Relaxation and its Dual –Convergent Solution for Dual –Computational Issues and Theoretical Properties

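130 Integer Programming Formulation. Unary potentials $\theta_{a;0} = 5$, $\theta_{a;1} = 2$, $\theta_{b;0} = 2$, $\theta_{b;1} = 4$. Labelling $f(a) = 1$, $f(b) = 0$: $y_{a;0} = 0$, $y_{a;1} = 1$, $y_{b;0} = 1$, $y_{b;1} = 0$. Any $f(\cdot)$ has equivalent boolean variables $y_{a;i}$. [Figure: variables $V_a$, $V_b$ with labels $l_0$, $l_1$]

131 Integer Programming Formulation. Unary potentials $\theta_{a;0} = 5$, $\theta_{a;1} = 2$, $\theta_{b;0} = 2$, $\theta_{b;1} = 4$. Labelling $f(a) = 1$, $f(b) = 0$: $y_{a;0} = 0$, $y_{a;1} = 1$, $y_{b;0} = 1$, $y_{b;1} = 0$. Find the optimal variables $y_{a;i}$

132 Integer Programming Formulation. Unary potentials $\theta_{a;0} = 5$, $\theta_{a;1} = 2$, $\theta_{b;0} = 2$, $\theta_{b;1} = 4$. Sum of unary potentials: $\sum_a \sum_i \theta_{a;i}\, y_{a;i}$, with $y_{a;i} \in \{0,1\}$ for all $V_a$, $l_i$, and $\sum_i y_{a;i} = 1$ for all $V_a$

133 Integer Programming Formulation. Pairwise potentials $\theta_{ab;00} = 0$, $\theta_{ab;10} = 1$, $\theta_{ab;01} = 1$, $\theta_{ab;11} = 0$. Sum of pairwise potentials: $\sum_{(a,b)} \sum_{ik} \theta_{ab;ik}\, y_{a;i}\, y_{b;k}$, with $y_{a;i} \in \{0,1\}$, $\sum_i y_{a;i} = 1$

134 Integer Programming Formulation. Pairwise potentials $\theta_{ab;00} = 0$, $\theta_{ab;10} = 1$, $\theta_{ab;01} = 1$, $\theta_{ab;11} = 0$. Sum of pairwise potentials: $\sum_{(a,b)} \sum_{ik} \theta_{ab;ik}\, y_{ab;ik}$, with $y_{a;i} \in \{0,1\}$, $\sum_i y_{a;i} = 1$, $y_{ab;ik} = y_{a;i}\, y_{b;k}$

135 Integer Programming Formulation. $\min \sum_a \sum_i \theta_{a;i}\, y_{a;i} + \sum_{(a,b)} \sum_{ik} \theta_{ab;ik}\, y_{ab;ik}$ subject to $y_{a;i} \in \{0,1\}$, $\sum_i y_{a;i} = 1$, $y_{ab;ik} = y_{a;i}\, y_{b;k}$

136 Integer Programming Formulation. $\min \theta^T y$ subject to $y_{a;i} \in \{0,1\}$, $\sum_i y_{a;i} = 1$, $y_{ab;ik} = y_{a;i}\, y_{b;k}$, where $\theta = [\ldots \theta_{a;i} \ldots ; \ldots \theta_{ab;ik} \ldots]$ and $y = [\ldots y_{a;i} \ldots ; \ldots y_{ab;ik} \ldots]$

137 One variable, two labels. $y = [y_{a;0}\ y_{a;1}]$, $\theta = [\theta_{a;0}\ \theta_{a;1}]$. $y_{a;0} \in \{0,1\}$, $y_{a;1} \in \{0,1\}$, $y_{a;0} + y_{a;1} = 1$. [Plot: feasible points in the $(y_{a;0}, y_{a;1})$ plane]

138 Two variables, two labels. $\theta = [\theta_{a;0}\ \theta_{a;1}\ \theta_{b;0}\ \theta_{b;1}\ \theta_{ab;00}\ \theta_{ab;01}\ \theta_{ab;10}\ \theta_{ab;11}]$, $y = [y_{a;0}\ y_{a;1}\ y_{b;0}\ y_{b;1}\ y_{ab;00}\ y_{ab;01}\ y_{ab;10}\ y_{ab;11}]$. $y_{a;0}, y_{a;1} \in \{0,1\}$, $y_{a;0} + y_{a;1} = 1$. $y_{b;0}, y_{b;1} \in \{0,1\}$, $y_{b;0} + y_{b;1} = 1$. $y_{ab;00} = y_{a;0} y_{b;0}$, $y_{ab;01} = y_{a;0} y_{b;1}$, $y_{ab;10} = y_{a;1} y_{b;0}$, $y_{ab;11} = y_{a;1} y_{b;1}$

139 In General. Marginal Polytope

140 In General. $\theta \in \mathbb{R}^{|V||L| + |E||L|^2}$, $y \in \{0,1\}^{|V||L| + |E||L|^2}$. Number of constraints: $|V||L| + |V| + |E||L|^2$, from $y_{a;i} \in \{0,1\}$, $\sum_i y_{a;i} = 1$, $y_{ab;ik} = y_{a;i}\, y_{b;k}$

141 Integer Programming Formulation. $\min \theta^T y$ subject to $y_{a;i} \in \{0,1\}$, $\sum_i y_{a;i} = 1$, $y_{ab;ik} = y_{a;i}\, y_{b;k}$, where $\theta = [\ldots \theta_{a;i} \ldots ; \ldots \theta_{ab;ik} \ldots]$ and $y = [\ldots y_{a;i} \ldots ; \ldots y_{ab;ik} \ldots]$

142 Integer Programming Formulation. $\min \theta^T y$ subject to $y_{a;i} \in \{0,1\}$, $\sum_i y_{a;i} = 1$, $y_{ab;ik} = y_{a;i}\, y_{b;k}$. Solve to obtain MAP labelling $y^*$

143 Integer Programming Formulation. $\min \theta^T y$ subject to $y_{a;i} \in \{0,1\}$, $\sum_i y_{a;i} = 1$, $y_{ab;ik} = y_{a;i}\, y_{b;k}$. But we can't solve it in general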
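
The correspondence between a labelling and its boolean vector can be checked on the two-variable example, using the unary and pairwise potentials quoted on slides 130 and 133. The sketch below is illustrative only; the helper names `indicator_vector` and `objective` are assumptions, and the entries are ordered as in the vectors on slide 138.

```python
from itertools import product

# Order: [a;0, a;1, b;0, b;1, ab;00, ab;01, ab;10, ab;11]
THETA = [5, 2, 2, 4, 0, 1, 1, 0]


def indicator_vector(fa, fb):
    """Boolean vector y for the labelling f(a) = fa, f(b) = fb."""
    y_a = [int(fa == i) for i in range(2)]
    y_b = [int(fb == k) for k in range(2)]
    y_ab = [y_a[i] * y_b[k] for i in range(2) for k in range(2)]
    return y_a + y_b + y_ab


def objective(y):
    """theta^T y, which equals Q(f; theta) for an indicator vector y."""
    return sum(t * v for t, v in zip(THETA, y))


y = indicator_vector(1, 0)
print(y, objective(y))
best = min(product(range(2), repeat=2),
           key=lambda f: objective(indicator_vector(*f)))
print(best)
```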

144 Outline Problem Formulation Reparameterization Belief Propagation Tree-reweighted Message Passing –Integer Programming Formulation –Linear Programming Relaxation and its Dual –Convergent Solution for Dual –Computational Issues and Theoretical Properties

145 Linear Programming Relaxation. $\min \theta^T y$ subject to $y_{a;i} \in \{0,1\}$, $\sum_i y_{a;i} = 1$, $y_{ab;ik} = y_{a;i}\, y_{b;k}$. Two reasons why we can't solve this

146 Linear Programming Relaxation. $\min \theta^T y$ subject to $y_{a;i} \in [0,1]$, $\sum_i y_{a;i} = 1$, $y_{ab;ik} = y_{a;i}\, y_{b;k}$. One reason why we can't solve this

147 Linear Programming Relaxation. $\min \theta^T y$ subject to $y_{a;i} \in [0,1]$, $\sum_i y_{a;i} = 1$, $\sum_k y_{ab;ik} = \sum_k y_{a;i}\, y_{b;k}$. One reason why we can't solve this

148 Linear Programming Relaxation. $\min \theta^T y$ subject to $y_{a;i} \in [0,1]$, $\sum_i y_{a;i} = 1$, $\sum_k y_{ab;ik} = y_{a;i} \sum_k y_{b;k}$, where $\sum_k y_{b;k} = 1$. One reason why we can't solve this

149 Linear Programming Relaxation. $\min \theta^T y$ subject to $y_{a;i} \in [0,1]$, $\sum_i y_{a;i} = 1$, $\sum_k y_{ab;ik} = y_{a;i}$. One reason why we can't solve this

150 Linear Programming Relaxation. $\min \theta^T y$ subject to $y_{a;i} \in [0,1]$, $\sum_i y_{a;i} = 1$, $\sum_k y_{ab;ik} = y_{a;i}$. No reason why we can't solve this* (*memory requirements, time complexity)

151 One variable, two labels. $y = [y_{a;0}\ y_{a;1}]$, $\theta = [\theta_{a;0}\ \theta_{a;1}]$. $y_{a;0} \in \{0,1\}$, $y_{a;1} \in \{0,1\}$, $y_{a;0} + y_{a;1} = 1$

152 One variable, two labels. $y = [y_{a;0}\ y_{a;1}]$, $\theta = [\theta_{a;0}\ \theta_{a;1}]$. $y_{a;0} \in [0,1]$, $y_{a;1} \in [0,1]$, $y_{a;0} + y_{a;1} = 1$

153 Two variables, two labels. $\theta = [\theta_{a;0}\ \theta_{a;1}\ \theta_{b;0}\ \theta_{b;1}\ \theta_{ab;00}\ \theta_{ab;01}\ \theta_{ab;10}\ \theta_{ab;11}]$, $y = [y_{a;0}\ y_{a;1}\ y_{b;0}\ y_{b;1}\ y_{ab;00}\ y_{ab;01}\ y_{ab;10}\ y_{ab;11}]$. $y_{a;0}, y_{a;1} \in \{0,1\}$, $y_{a;0} + y_{a;1} = 1$. $y_{b;0}, y_{b;1} \in \{0,1\}$, $y_{b;0} + y_{b;1} = 1$. $y_{ab;00} = y_{a;0} y_{b;0}$, $y_{ab;01} = y_{a;0} y_{b;1}$, $y_{ab;10} = y_{a;1} y_{b;0}$, $y_{ab;11} = y_{a;1} y_{b;1}$

154 Two variables, two labels. $\theta$, $y$ as on the previous slide. $y_{a;0}, y_{a;1} \in [0,1]$, $y_{a;0} + y_{a;1} = 1$. $y_{b;0}, y_{b;1} \in [0,1]$, $y_{b;0} + y_{b;1} = 1$. $y_{ab;00} = y_{a;0} y_{b;0}$, $y_{ab;01} = y_{a;0} y_{b;1}$, $y_{ab;10} = y_{a;1} y_{b;0}$, $y_{ab;11} = y_{a;1} y_{b;1}$

155 Two variables, two labels. $\theta$, $y$ as before. $y_{a;0}, y_{a;1} \in [0,1]$, $y_{a;0} + y_{a;1} = 1$. $y_{b;0}, y_{b;1} \in [0,1]$, $y_{b;0} + y_{b;1} = 1$. $y_{ab;00} + y_{ab;01} = y_{a;0}$, $y_{ab;10} = y_{a;1} y_{b;0}$, $y_{ab;11} = y_{a;1} y_{b;1}$

156 Two variables, two labels. $\theta$, $y$ as before. $y_{a;0}, y_{a;1} \in [0,1]$, $y_{a;0} + y_{a;1} = 1$. $y_{b;0}, y_{b;1} \in [0,1]$, $y_{b;0} + y_{b;1} = 1$. $y_{ab;00} + y_{ab;01} = y_{a;0}$, $y_{ab;10} + y_{ab;11} = y_{a;1}$

157 In General. Marginal Polytope, Local Polytope

158 In General. $\theta \in \mathbb{R}^{|V||L| + |E||L|^2}$, $y \in [0,1]^{|V||L| + |E||L|^2}$. Number of constraints: $|V||L| + |V| + |E||L|$

159 Linear Programming Relaxation. $\min \theta^T y$ subject to $y_{a;i} \in [0,1]$, $\sum_i y_{a;i} = 1$, $\sum_k y_{ab;ik} = y_{a;i}$. No reason why we can't solve this
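
Relaxing the integrality constraint enlarges the feasible set, so the LP value can fall strictly below the best integer labelling. The sketch below illustrates this on a hypothetical three-variable cycle whose edges all penalise equal labels; the example, the helper names, and the symmetric marginalization constraint (summing over i as well as over k) are assumptions added for illustration and do not come from the slides.

```python
from itertools import product

EDGES = [("a", "b"), ("b", "c"), ("c", "a")]
SAME_LABEL_COST = [[1, 0], [0, 1]]  # theta_{ab;ik}: cost 1 when i == k


def integer_optimum():
    """Best energy over all integer labellings (unary potentials are zero)."""
    costs = []
    for labels in product(range(2), repeat=3):
        f = dict(zip("abc", labels))
        costs.append(sum(SAME_LABEL_COST[f[u]][f[v]] for u, v in EDGES))
    return min(costs)


def is_locally_consistent(y_node, y_edge, tol=1e-9):
    """Check sum_i y_{a;i} = 1 and that edge variables marginalize to nodes."""
    for v in y_node:
        if abs(sum(y_node[v]) - 1) > tol:
            return False
    for (u, v), t in y_edge.items():
        if any(abs(sum(t[i]) - y_node[u][i]) > tol for i in range(2)):
            return False
        if any(abs(t[0][k] + t[1][k] - y_node[v][k]) > tol for k in range(2)):
            return False
    return True


def lp_objective(y_edge):
    return sum(SAME_LABEL_COST[i][k] * t[i][k]
               for t in y_edge.values() for i in range(2) for k in range(2))


# A fractional point: every node is "half label 0, half label 1" and
# every edge puts all its mass on the two disagreeing label pairs.
y_node = {v: [0.5, 0.5] for v in "abc"}
y_edge = {e: [[0.0, 0.5], [0.5, 0.0]] for e in EDGES}
print(integer_optimum(), is_locally_consistent(y_node, y_edge), lp_objective(y_edge))
```

On an odd cycle at least one edge must carry equal labels, so the integer optimum is 1, while the fractional point is feasible for the relaxation and costs 0.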

160 Linear Programming Relaxation. Extensively studied. Optimization: Schlesinger, 1976; Koster, van Hoesel and Kolen, 1998. Theory: Chekuri et al., 2001; Archer et al., 2004. Machine Learning: Wainwright et al., 2001

161 Linear Programming Relaxation Many interesting Properties Global optimal MAP for trees Wainwright et al., 2001 But we are interested in NP-hard cases Preserves solution for reparameterization

162 Linear Programming Relaxation Large class of problems Metric Labelling Semi-metric Labelling Many interesting Properties - Integrality Gap Manokaran et al., 2008 Most likely, provides best possible integrality gap

163 Linear Programming Relaxation A computationally useful dual Many interesting Properties - Dual Optimal value of dual = Optimal value of primal Easier-to-solve

164 Dual of the LP Relaxation. Wainwright et al., 2001. $\min \theta^T y$ subject to $y_{a;i} \in [0,1]$, $\sum_i y_{a;i} = 1$, $\sum_k y_{ab;ik} = y_{a;i}$. [Figure: 3×3 grid with variables $V_a, V_b, \ldots, V_i$]

165 Dual of the LP Relaxation. Wainwright et al., 2001. $\sum_i \rho_i \theta^i = \theta$, $\rho_i \ge 0$. [Figure: the grid decomposed into its row and column trees]

166 Dual of the LP Relaxation. Wainwright et al., 2001. Tree energies $q^*(\theta^1), q^*(\theta^2), \ldots, q^*(\theta^6)$. Dual of LP: $\max \sum_i \rho_i\, q^*(\theta^i)$ subject to $\sum_i \rho_i \theta^i = \theta$, $\rho_i \ge 0$

167 Dual of the LP Relaxation. Wainwright et al., 2001. Tree energies $q^*(\theta^1), q^*(\theta^2), \ldots, q^*(\theta^6)$. Dual of LP: $\max \sum_i \rho_i\, q^*(\theta^i)$ subject to $\sum_i \rho_i \theta^i \equiv \theta$ (equal up to reparameterization), $\rho_i \ge 0$

168 Dual of the LP Relaxation. Wainwright et al., 2001. $\max \sum_i \rho_i\, q^*(\theta^i)$ subject to $\sum_i \rho_i \theta^i \equiv \theta$. I can easily compute $q^*(\theta^i)$. I can easily maintain the reparam constraint. So can I easily solve the dual?

169 Outline Problem Formulation Reparameterization Belief Propagation Tree-reweighted Message Passing –Integer Programming Formulation –Linear Programming Relaxation and its Dual –Convergent Solution for Dual –Computational Issues and Theoretical Properties

170 TRW Message Passing (Kolmogorov, 2006). Constraint $\sum_i \rho_i \theta^i \equiv \theta$, objective $\sum_i \rho_i\, q^*(\theta^i)$. Pick a variable $V_a$. [Figure: 3×3 grid $V_a, \ldots, V_i$ and its trees]

171 TRW Message Passing (Kolmogorov, 2006). Trees containing $V_a$: tree 1 ($V_c, V_b, V_a$) with potentials $\theta^1_{c;0}, \theta^1_{c;1}, \theta^1_{b;0}, \theta^1_{b;1}, \theta^1_{a;0}, \theta^1_{a;1}$, and tree 4 ($V_a, V_d, V_g$) with potentials $\theta^4_{a;0}, \theta^4_{a;1}, \theta^4_{d;0}, \theta^4_{d;1}, \theta^4_{g;0}, \theta^4_{g;1}$

172 TRW Message Passing (Kolmogorov, 2006). Objective: $\rho_1 q^*(\theta^1) + \rho_4 q^*(\theta^4) + K$, where $K$ collects the rest. Reparameterize to obtain min-marginals of $V_a$

173 TRW Message Passing (Kolmogorov, 2006). $\rho_1 q^*(\theta^1) + \rho_4 q^*(\theta^4) + K$. One pass of Belief Propagation

174 TRW Message Passing (Kolmogorov, 2006). $\rho_1 q^*(\theta^1) + \rho_4 q^*(\theta^4) + K$. Remain the same

175 TRW Message Passing (Kolmogorov, 2006). $\rho_1 \min\{\theta^1_{a;0}, \theta^1_{a;1}\} + \rho_4 \min\{\theta^4_{a;0}, \theta^4_{a;1}\} + K$

176 TRW Message Passing (Kolmogorov, 2006). $\rho_1 \min\{\theta^1_{a;0}, \theta^1_{a;1}\} + \rho_4 \min\{\theta^4_{a;0}, \theta^4_{a;1}\} + K$. Compute weighted average of min-marginals of $V_a$

177 TRW Message Passing (Kolmogorov, 2006). $\theta''_{a;0} = \dfrac{\rho_1 \theta^1_{a;0} + \rho_4 \theta^4_{a;0}}{\rho_1 + \rho_4}$, $\theta''_{a;1} = \dfrac{\rho_1 \theta^1_{a;1} + \rho_4 \theta^4_{a;1}}{\rho_1 + \rho_4}$. $\rho_1 \min\{\theta^1_{a;0}, \theta^1_{a;1}\} + \rho_4 \min\{\theta^4_{a;0}, \theta^4_{a;1}\} + K$

178 TRW Message Passing (Kolmogorov, 2006). Replace the potentials of $V_a$ in both trees by $\theta''_{a;0}$, $\theta''_{a;1}$. $\rho_1 \min\{\theta^1_{a;0}, \theta^1_{a;1}\} + \rho_4 \min\{\theta^4_{a;0}, \theta^4_{a;1}\} + K$

179 TRW Message Passing (Kolmogorov, 2006). Potentials of $V_a$ in both trees are now $\theta''_{a;0}$, $\theta''_{a;1}$. $\rho_1 \min\{\theta^1_{a;0}, \theta^1_{a;1}\} + \rho_4 \min\{\theta^4_{a;0}, \theta^4_{a;1}\} + K$

180 TRW Message Passing (Kolmogorov, 2006). $\rho_1 \min\{\theta''_{a;0}, \theta''_{a;1}\} + \rho_4 \min\{\theta''_{a;0}, \theta''_{a;1}\} + K$

181 TRW Message Passing (Kolmogorov, 2006). $(\rho_1 + \rho_4) \min\{\theta''_{a;0}, \theta''_{a;1}\} + K$

182 TRW Message Passing (Kolmogorov, 2006). $(\rho_1 + \rho_4) \min\{\theta''_{a;0}, \theta''_{a;1}\} + K$. $\min\{p_1 + p_2,\ q_1 + q_2\} \ge \min\{p_1, q_1\} + \min\{p_2, q_2\}$

183 TRW Message Passing (Kolmogorov, 2006). $(\rho_1 + \rho_4) \min\{\theta''_{a;0}, \theta''_{a;1}\} + K$. Objective function increases or remains constant

184 TRW Message Passing (Kolmogorov, 2006). Initialize $\theta^i$. Take care of reparam constraint. REPEAT: choose random variable $V_a$; compute min-marginals of $V_a$ for all trees; node-average the min-marginals. Can also do edge-averaging
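
The node-averaging step and its effect on the dual can be sketched for a single variable. The code assumes that belief propagation has already been run in every tree containing V_a, so that each tree's unary potentials at V_a are its min-marginals; the tree weights and min-marginal values are hypothetical, as are the function names.

```python
def node_average(min_marginals, rho):
    """Weighted average, over the trees containing V_a, of the
    min-marginals of V_a (one list per tree, one weight per tree)."""
    total = sum(rho)
    num_labels = len(min_marginals[0])
    return [sum(r * m[k] for r, m in zip(rho, min_marginals)) / total
            for k in range(num_labels)]


def dual_contribution(min_marginals, rho):
    """sum_i rho_i * min_k theta^i_{a;k}: these trees' share of the dual."""
    return sum(r * min(m) for r, m in zip(rho, min_marginals))


rho = [1, 1]                # hypothetical weights of trees 1 and 4
per_tree = [[4, 2], [1, 5]]  # hypothetical min-marginals of V_a per tree

before = dual_contribution(per_tree, rho)
averaged = node_average(per_tree, rho)
# After averaging, every tree uses the same unary potentials at V_a.
after = dual_contribution([averaged] * len(rho), rho)
print(before, averaged, after)
```

The weighted sum of the trees' potentials is unchanged by the averaging, so the reparameterization constraint still holds, while the inequality on slide 182 guarantees that the dual value cannot go down.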

185 Example 1. Pick variable $V_a$. Reparameterize. [Figure: trees $(V_a, V_b)$, $(V_b, V_c)$, $(V_c, V_a)$ with weights $\rho_1 = \rho_2 = \rho_3 = 1$ and labels $l_0$, $l_1$]

186 Example 1. Average the min-marginals of $V_a$

187 Example 1. Pick variable $V_b$. Reparameterize

188 Example 1. Average the min-marginals of $V_b$

189 Example 1. Value of dual does not increase

190 Example 1. Maybe it will increase for $V_c$. NO

191 Example 1. Strong Tree Agreement. Exact MAP Estimate. $f^1(a) = 0$, $f^1(b) = 0$, $f^2(b) = 0$, $f^2(c) = 0$, $f^3(c) = 0$, $f^3(a) = 0$

192 Example 2. Pick variable $V_a$. Reparameterize. [Figure: trees $(V_a, V_b)$, $(V_b, V_c)$, $(V_c, V_a)$ with weights $\rho_1 = \rho_2 = \rho_3 = 1$ and labels $l_0$, $l_1$]

193 Example 2. Average the min-marginals of $V_a$

194 Example 2. Value of dual does not increase

195 Example 2. Maybe it will decrease for $V_b$ or $V_c$. NO

196 Example 2. Weak Tree Agreement. Not Exact MAP Estimate. $f^1(a) = 1$, $f^1(b) = 1$; $f^2(b) = 1$, $f^2(c) = 0$ or $f^2(b) = 0$, $f^2(c) = 1$; $f^3(c) = 1$, $f^3(a) = 1$

197 Example 2. Weak Tree Agreement. Convergence point of TRW. $f^1(a) = 1$, $f^1(b) = 1$; $f^2(b) = 1$, $f^2(c) = 0$ or $f^2(b) = 0$, $f^2(c) = 1$; $f^3(c) = 1$, $f^3(a) = 1$

198 Obtaining the Labelling. Only solves the dual. Primal solutions? [Figure: 3x3 grid over $V_a, V_b, \dots, V_i$.] $\theta' = \sum_i \rho_i \theta^i$. Fix the label of $V_a$.

199 Obtaining the Labelling. Only solves the dual. Primal solutions? [Figure: 3x3 grid over $V_a, V_b, \dots, V_i$.] $\theta' = \sum_i \rho_i \theta^i$. Fix the label of $V_b$. Continue in some fixed order (Meltzer et al., 2006).
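
One way to read "fix the labels in some fixed order" is the greedy pass sketched below. This is an assumption about the details, not a transcription of Meltzer et al.: each variable takes the label that minimizes its unary term plus the pairwise terms to neighbours whose labels are already fixed. The function name, data layout, and numeric values are hypothetical.

```python
def extract_labelling(unary, pairwise, order):
    """Greedy sequential label fixing on combined parameters.

    unary[v][l]            : unary cost of label l at variable v
    pairwise[(u, v)][k][l] : cost of u taking label k and v taking label l
    order                  : variables in the order they are fixed
    """
    labelling = {}
    for v in order:
        costs = list(unary[v])
        for (u, w), table in pairwise.items():
            if w == v and u in labelling:
                # u is already fixed; condition on its label.
                for l in range(len(costs)):
                    costs[l] += table[labelling[u]][l]
            elif u == v and w in labelling:
                # w is already fixed; condition on its label.
                for l in range(len(costs)):
                    costs[l] += table[l][labelling[w]]
        labelling[v] = min(range(len(costs)), key=costs.__getitem__)
    return labelling


# Hypothetical chain a - b - c with two labels and Potts-like edges.
unary = {"a": [0.0, 1.0], "b": [0.5, 0.4], "c": [2.0, 0.0]}
pairwise = {
    ("a", "b"): [[0.0, 1.0], [1.0, 0.0]],
    ("b", "c"): [[0.0, 1.0], [1.0, 0.0]],
}
labels = extract_labelling(unary, pairwise, ["a", "b", "c"])
```

In this toy run `b` follows `a` to label 0 despite its slightly cheaper unary cost for label 1, because the edge to the already fixed `a` is taken into account.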

200 Outline: Problem Formulation; Reparameterization; Belief Propagation; Tree-reweighted Message Passing –Integer Programming Formulation –Linear Programming Relaxation and its Dual –Convergent Solution for Dual –Computational Issues and Theoretical Properties

201 Computational Issues of TRW. Basic component is Belief Propagation. Speed-ups for some pairwise potentials (Felzenszwalb & Huttenlocher, 2004). Memory requirements cut down by half (Kolmogorov, 2006). Further speed-ups using monotonic chains (Kolmogorov, 2006).
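
As an illustration of the kind of speed-up meant here, the sketch below computes a min-sum message under a Potts pairwise potential (cost 0 for equal labels, a constant `c` otherwise) in time linear in the number of labels, and compares it with the generic quadratic computation. The Potts model is one of the potentials treated by Felzenszwalb and Huttenlocher; the function names and numbers are hypothetical.

```python
def message_naive(h, pairwise):
    """Generic min-sum message: O(L^2) for L labels.

    h[k] is the unary cost plus incoming messages for label k of the sender;
    pairwise(k, l) is the cost of sender label k with receiver label l.
    """
    n = len(h)
    return [min(h[k] + pairwise(k, l) for k in range(n)) for l in range(n)]


def message_potts(h, c):
    """Same message for a Potts potential in O(L).

    Either the receiver keeps the sender's label (cost h[l]) or it differs,
    in which case the best sender label costs min(h) + c.
    """
    best_other = min(h) + c
    return [min(x, best_other) for x in h]


# Hypothetical sender costs and Potts penalty.
h = [3.0, 0.5, 2.0, 4.0]
c = 1.0
fast = message_potts(h, c)
slow = message_naive(h, lambda k, l: 0.0 if k == l else c)
```

Both functions return the same message; only the cost of computing it differs.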

202 Theoretical Properties of TRW. Always converges, unlike BP (Kolmogorov, 2006). Strong tree agreement implies exact MAP (Wainwright et al., 2001). Optimal MAP for two-label submodular problems (Kolmogorov and Wainwright, 2005): $\theta_{ab;00} + \theta_{ab;11} \le \theta_{ab;01} + \theta_{ab;10}$
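
The submodularity condition for a two-label pairwise potential is easy to check directly. The sketch below assumes the potential is stored as a 2x2 table indexed by the labels of $V_a$ and $V_b$; the function name and example tables are hypothetical.

```python
def is_submodular(theta_ab):
    """True if theta_ab[0][0] + theta_ab[1][1] <= theta_ab[0][1] + theta_ab[1][0]."""
    return theta_ab[0][0] + theta_ab[1][1] <= theta_ab[0][1] + theta_ab[1][0]


# Agreeing labels are cheap: submodular.
smooth = [[0.0, 1.0], [1.0, 0.0]]
# Disagreeing labels are cheap: not submodular.
repulsive = [[1.0, 0.0], [0.0, 1.0]]

smooth_ok = is_submodular(smooth)
repulsive_ok = is_submodular(repulsive)
```

An energy qualifies for the guarantee on the slide only if every pairwise term passes this check.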

203 Results: Binary Segmentation (Szeliski et al., 2008). Labels: {foreground, background}. Unary potentials: -log(likelihood) using learnt fg/bg models. Pairwise potentials: 0 if same labels; 1 - exp(|D_a - D_b|) if different labels.

204 Results: Binary Segmentation (Szeliski et al., 2008). Same labels and potentials as slide 203. [Image: TRW result.]

205 Results: Binary Segmentation (Szeliski et al., 2008). Same labels and potentials as slide 203. [Image: Belief Propagation result.]

206 Results: Stereo Correspondence (Szeliski et al., 2008). Labels: {disparities}. Unary potentials: similarity of pixel colours. Pairwise potentials: 0 if same labels; 1 - exp(|D_a - D_b|) if different labels.

207 Results: Stereo Correspondence (Szeliski et al., 2008). Same labels and potentials as slide 206. [Image: TRW result.]

208 Results: Stereo Correspondence (Szeliski et al., 2008). Same labels and potentials as slide 206. [Image: Belief Propagation result.]

209 Results: Non-submodular problems (Kolmogorov, 2006). [Plots: BP vs. TRW-S on a 30x30 grid and on $K_{50}$.] BP outperforms TRW-S.

210 Summary. Trees can be solved exactly (BP). No guarantee of convergence otherwise (BP). Strong Tree Agreement (TRW-S). Submodular energies solved exactly (TRW-S). TRW-S solves an LP relaxation of MAP estimation. Loopier graphs give worse results (Rother and Kolmogorov, 2006).

211 Related New(er) Work. Solving the Dual: Globerson and Jaakkola, 2007; Komodakis, Paragios and Tziritas, 2007; Weiss et al., 2006; Schlesinger and Giginyak, 2007. Solving the Primal: Ravikumar, Agarwal and Wainwright, 2008.

212 Related New(er) Work. More complex relaxations: Sontag and Jaakkola, 2007; Komodakis and Paragios, 2008; Kumar, Kolmogorov and Torr, 2007; Werner, 2008; Sontag et al., 2008; Kumar and Torr, 2008.

213 Questions on Part I ? Code + Standard Data

