Discrete Optimization Lecture 4 – Part 2 M. Pawan Kumar Slides available online

MRF. [figure: 3×3 grid MRF with variables V_1,…,V_9 and observed data d_1,…,d_9] A is conditionally independent of B given C if there is no path from A to B when C is removed.

MRF. [same grid figure] V_a is conditionally independent of V_b given V_a's neighbors.

Pairwise MRF. [same grid figure] Unary potential ψ_a(v_a,d_a), e.g. ψ_1(v_1,d_1); pairwise potential ψ_ab(v_a,v_b), e.g. ψ_56(v_5,v_6). Probability P(v,d) = (1/Z) Π_a ψ_a(v_a,d_a) Π_(a,b) ψ_ab(v_a,v_b). Z is known as the partition function.

Inference. P(v) = exp(-Q(v))/Z. Maximum a Posteriori (MAP) Estimation: max_v P(v), equivalently Energy Minimization: min_v Q(v). Computing Marginals: P(v_a = l_i) = Σ_v P(v) δ(v_a = l_i) and P(v_a = l_i, v_b = l_k) = Σ_v P(v) δ(v_a = l_i) δ(v_b = l_k).
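
Both tasks can be sanity-checked by brute-force enumeration on a tiny model, which is exactly what the rest of the lecture improves on. A minimal Python sketch; the chain size, the labels, and all potential values below are made-up assumptions for illustration, not the lecture's:

```python
import itertools

# A tiny pairwise MRF: 3 variables on a chain, 2 labels each.
# All potential values are illustrative assumptions.
unary = [[2.0, 5.0], [2.0, 4.0], [1.0, 3.0]]      # unary[a][i] = psi_a(l_i)
pairwise = {(0, 1): [[3.0, 1.0], [1.0, 3.0]],     # pairwise[(a,b)][i][k] = psi_ab(l_i,l_k)
            (1, 2): [[2.0, 1.0], [1.0, 2.0]]}

def unnormalised(v):
    """Product of all unary and pairwise potentials for labelling v."""
    p = 1.0
    for a, label in enumerate(v):
        p *= unary[a][label]
    for (a, b), psi in pairwise.items():
        p *= psi[v[a]][v[b]]
    return p

labellings = list(itertools.product(range(2), repeat=3))
Z = sum(unnormalised(v) for v in labellings)      # partition function

v_map = max(labellings, key=unnormalised)         # MAP estimate: argmax_v P(v)

# Marginal P(v_0 = l_1): sum P(v) over all labellings with v_0 = 1.
marginal = sum(unnormalised(v) for v in labellings if v[0] == 1) / Z
print(Z, v_map, marginal)
```

Enumeration visits all h^n labellings; the message-passing schemes below reach the same answers on chains and trees in O(nh²).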

Outline. Belief Propagation on Chains; Belief Propagation on Trees; Loopy Belief Propagation.

Overview. Chain V_a - V_b - V_c - V_d. Compute the marginal probability of V_d. P(v) = P(v_a|v_b) P(v_b|v_c) P(v_c|v_d) P(v_d). Compute the (unnormalized) distribution Σ_{v_a} ψ_a(v_a) ψ_ab(v_a,v_b): a function m(v_b).

Overview. Compute the (unnormalized) distribution Σ_{v_b} ψ_b(v_b) ψ_bc(v_b,v_c) m(v_b): a function m(v_c).

Overview. Compute the (unnormalized) distribution Σ_{v_c} ψ_c(v_c) ψ_cd(v_c,v_d) m(v_c): the (unnormalized) marginals!

Overview. Compute the marginal probability of V_c. P(v) = P(v_a|v_b) P(v_b|v_c) P(v_c|v_d) P(v_d) = P(v_a|v_b) P(v_b|v_c) P(v_d|v_c) P(v_c). Several common terms!

Overview. Compute the marginal probability of V_b. P(v) = P(v_a|v_b) P(v_b|v_c) P(v_c|v_d) P(v_d) = P(v_a|v_b) P(v_b|v_c) P(v_d|v_c) P(v_c) = P(v_a|v_b) P(v_c|v_b) P(v_d|v_c) P(v_b).

Overview. Compute the marginal probability of V_a. P(v) = P(v_a|v_b) P(v_b|v_c) P(v_c|v_d) P(v_d) = P(v_a|v_b) P(v_b|v_c) P(v_d|v_c) P(v_c) = P(v_a|v_b) P(v_c|v_b) P(v_d|v_c) P(v_b) = P(v_b|v_a) P(v_c|v_b) P(v_d|v_c) P(v_a).

Belief Propagation on Chains. Computes exact marginals; avoids re-computing the common terms.

Two Variables. Nodes V_a - V_b. Unary potentials ψ_a(l_i); pairwise potentials ψ_ab(l_i,l_k). [figure: tables of potential values; from the worked arithmetic on the following slides, ψ_a = (2, 5), ψ_b = (2, 4), ψ_ab(·,l_0) = (3, 1), ψ_ab(·,l_1) = (1, 3)]

Two Variables. Marginal probability P(v_b = l_j) = Σ_i ψ_a(l_i) ψ_b(l_j) ψ_ab(l_i,l_j) / Z.

Two Variables. Un-normalized marginal probability P'(v_b = l_j) = Σ_i ψ_a(l_i) ψ_b(l_j) ψ_ab(l_i,l_j); the 1/Z factor is dropped.

Two Variables. P'(v_b = l_j) = ψ_b(l_j) Σ_i ψ_a(l_i) ψ_ab(l_i,l_j).

Two Variables. M_ab;0 = Σ_i ψ_a(l_i) ψ_ab(l_i,l_0) = 2 × 3 + 5 × 1 = 11.

Two Variables. M_ab;1 = Σ_i ψ_a(l_i) ψ_ab(l_i,l_1) = 2 × 1 + 5 × 3 = 17.

Two Variables. With M_ab;0 = 11 and M_ab;1 = 17: P'(v_b = l_j) = ψ_b(l_j) Σ_i ψ_a(l_i) ψ_ab(l_i,l_j).

Two Variables. P'(v_b = l_j) = ψ_b(l_j) M_ab;j. P'(v_b = l_0) = 22, P'(v_b = l_1) = 68.

Two Variables. True marginal P(v_b = l_j) = ψ_b(l_j) M_ab;j / Z, where Z = Σ_j P'(v_b = l_j) = 22 + 68 = 90.

Two Variables. P(v_b = l_0) = 0.244…, P(v_b = l_1) = 0.755…, with Z = 90. Cost: O(h²) for h labels.

Two Variables. P(v_b = l_0) = 0.244…, P(v_b = l_1) = 0.755…. O(h²): the same as brute-force.
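
The numbers above are easy to reproduce in a few lines of Python. The potential tables themselves are not shown in the transcript, so the values below are recovered from the worked arithmetic on the slides (2 × 3 + 5 × 1 = 11, 2 × 1 + 5 × 3 = 17, 2 × 11 = 22, 4 × 17 = 68) and should be treated as an assumption:

```python
# Two-variable example; potential values recovered from the slides' arithmetic.
psi_a  = [2.0, 5.0]                      # psi_a(l_0), psi_a(l_1)
psi_b  = [2.0, 4.0]                      # psi_b(l_0), psi_b(l_1)
psi_ab = [[3.0, 1.0],                    # psi_ab(l_i, l_j): rows i, columns j
          [1.0, 3.0]]

# Message M_ab;j = sum_i psi_a(l_i) * psi_ab(l_i, l_j)
M_ab = [sum(psi_a[i] * psi_ab[i][j] for i in range(2)) for j in range(2)]
print(M_ab)                              # [11.0, 17.0]

# Unnormalised marginal P'(v_b = l_j) = psi_b(l_j) * M_ab;j
P_unnorm = [psi_b[j] * M_ab[j] for j in range(2)]
Z = sum(P_unnorm)                        # 22 + 68 = 90
print([p / Z for p in P_unnorm])         # [0.2444..., 0.7555...]
```

Running it reproduces M_ab = (11, 17), Z = 90, and the marginals 0.244…/0.755… from the slides.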

Three Variables. Chain V_a - V_b - V_c. P'(v_c = l_k) = Σ_j Σ_i ψ_a(l_i) ψ_b(l_j) ψ_c(l_k) ψ_ab(l_i,l_j) ψ_bc(l_j,l_k).

Three Variables. P'(v_c = l_k) = ψ_c(l_k) Σ_j Σ_i ψ_a(l_i) ψ_b(l_j) ψ_ab(l_i,l_j) ψ_bc(l_j,l_k).

Three Variables. P'(v_c = l_k) = ψ_c(l_k) Σ_j ψ_b(l_j) Σ_i ψ_a(l_i) ψ_ab(l_i,l_j) ψ_bc(l_j,l_k).

Three Variables. P'(v_c = l_k) = ψ_c(l_k) Σ_j ψ_b(l_j) ψ_bc(l_j,l_k) Σ_i ψ_a(l_i) ψ_ab(l_i,l_j), where the inner sum is M_ab;j (= 11 and 17 from before).

Three Variables. P'(v_c = l_k) = ψ_c(l_k) Σ_j ψ_b(l_j) ψ_bc(l_j,l_k) M_ab;j = ψ_c(l_k) M_bc;k.

Three Variables. Evaluating M_bc;k = Σ_j ψ_b(l_j) ψ_bc(l_j,l_k) M_ab;j term by term, with M_ab;0 = 11 and M_ab;1 = 17. [figure: the worked arithmetic, garbled in the transcript]

Three Variables P’(v c = l k ) ψ c (l k )Σ j ψ b (l j )ψ bc (l j,l k )M ab;j VaVa VbVb VcVc

Three Variables P’(v c = l k ) ψ c (l k )M bc;k VaVa VbVb VcVc NOTE: M bc;k “includes” M ab;j 146

Three Variables. P(v_c = 0) = 0.35, P(v_c = 1) = 0.65, with Z = Σ_k P'(v_c = l_k). [figure: the worked value of Z, garbled in the transcript]

Three Variables. O(nh²) for n variables with h labels each: better than brute-force, which is O(hⁿ).

Three Variables. What about P(v_b = l_j)?

Three Variables. P'(v_b = l_j) = Σ_k Σ_i ψ_a(l_i) ψ_b(l_j) ψ_c(l_k) ψ_ab(l_i,l_j) ψ_bc(l_j,l_k).

Three Variables. P'(v_b = l_j) = ψ_b(l_j) Σ_k Σ_i ψ_a(l_i) ψ_c(l_k) ψ_ab(l_i,l_j) ψ_bc(l_j,l_k).

Three Variables. P'(v_b = l_j) = ψ_b(l_j) Σ_k ψ_c(l_k) Σ_i ψ_a(l_i) ψ_ab(l_i,l_j) ψ_bc(l_j,l_k).

Three Variables. P'(v_b = l_j) = ψ_b(l_j) Σ_k ψ_c(l_k) ψ_bc(l_j,l_k) Σ_i ψ_a(l_i) ψ_ab(l_i,l_j), where the inner sum is M_ab;j.

Three Variables. P'(v_b = l_j) = ψ_b(l_j) M_ab;j Σ_k ψ_c(l_k) ψ_bc(l_j,l_k) = ψ_b(l_j) M_ab;j M_cb;j. NOTE: M_cb;j does not "include" M_bc;k.

Three Variables. P'(v_b = l_j) = ψ_b(l_j) M_ab;j M_cb;j.

Three Variables. P(v_b = 0) = 0.39, P(v_b = 1) = 0.61, with Z = Σ_j ψ_b(l_j) M_ab;j M_cb;j = 1344.

Three Variables. O(nh²): better than brute-force.

Three Variables. What about P(v_a = l_i)?

Three Variables. P'(v_a = l_i) = Σ_j Σ_k ψ_a(l_i) ψ_b(l_j) ψ_c(l_k) ψ_ab(l_i,l_j) ψ_bc(l_j,l_k).

Three Variables. P'(v_a = l_i) = ψ_a(l_i) Σ_j Σ_k ψ_b(l_j) ψ_c(l_k) ψ_ab(l_i,l_j) ψ_bc(l_j,l_k).

Three Variables. P'(v_a = l_i) = ψ_a(l_i) Σ_j ψ_b(l_j) Σ_k ψ_c(l_k) ψ_ab(l_i,l_j) ψ_bc(l_j,l_k).

Three Variables. P'(v_a = l_i) = ψ_a(l_i) Σ_j ψ_b(l_j) ψ_ab(l_i,l_j) Σ_k ψ_c(l_k) ψ_bc(l_j,l_k), where the inner sum is M_cb;j.

Three Variables. P'(v_a = l_i) = ψ_a(l_i) Σ_j ψ_b(l_j) ψ_ab(l_i,l_j) M_cb;j = ψ_a(l_i) M_ba;i. NOTE: M_ba;i "includes" M_cb;j.

Three Variables. P'(v_a = l_i) = ψ_a(l_i) M_ba;i. [figure value: 192]

Three Variables. P(v_a = 0) = 0.71, P(v_a = 1) = 0.29, with Z = Σ_i ψ_a(l_i) M_ba;i = 1344.

Three Variables. O(nh²): better than brute-force.

Belief Propagation on Chains. Start from the left, go to the right. For the current edge (a,b), compute M_ab;k = Σ_i ψ_a(l_i) ψ_ab(l_i,l_k) Π_{n≠b} M_na;i. Repeat till the end of the chain. Then start from the right, go to the left, and compute the messages M_ab;k in the same way. Repeat till the end of the chain.

Belief Propagation on Chains P’(v a = l i,v b = l j ) = ? Normalize to compute true marginals P’(v a = l i ) = ? ψ a (l i )ψ b (l j )ψ ab (l i,l j )Π n≠b M na;i Π n≠a M nb;j ψ a (l i )Π n M na;i

Outline Belief Propagation on Chains Belief Propagation on Trees Loopy Belief Propagation Pearl, 1988

Belief Propagation on Trees. Tree with leaves V_a and V_b, their parent V_c, and root V_d. P'(v_d = l_o) = Σ_k Σ_j Σ_i ψ_a(l_i) ψ_b(l_j) ψ_c(l_k) ψ_d(l_o) ψ_ac(l_i,l_k) ψ_bc(l_j,l_k) ψ_cd(l_k,l_o).

Belief Propagation on Trees. P'(v_d = l_o) = ψ_d(l_o) Σ_k Σ_j Σ_i ψ_a(l_i) ψ_b(l_j) ψ_c(l_k) ψ_ac(l_i,l_k) ψ_bc(l_j,l_k) ψ_cd(l_k,l_o).

Belief Propagation on Trees. P'(v_d = l_o) = ψ_d(l_o) Σ_k ψ_c(l_k) Σ_j Σ_i ψ_a(l_i) ψ_b(l_j) ψ_ac(l_i,l_k) ψ_bc(l_j,l_k) ψ_cd(l_k,l_o).

Belief Propagation on Trees. P'(v_d = l_o) = ψ_d(l_o) Σ_k ψ_c(l_k) ψ_cd(l_k,l_o) Σ_j Σ_i ψ_a(l_i) ψ_b(l_j) ψ_ac(l_i,l_k) ψ_bc(l_j,l_k).

Belief Propagation on Trees. P'(v_d = l_o) = ψ_d(l_o) Σ_k ψ_c(l_k) ψ_cd(l_k,l_o) Σ_j ψ_b(l_j) Σ_i ψ_a(l_i) ψ_ac(l_i,l_k) ψ_bc(l_j,l_k).

Belief Propagation on Trees. P'(v_d = l_o) = ψ_d(l_o) Σ_k ψ_c(l_k) ψ_cd(l_k,l_o) Σ_j ψ_b(l_j) ψ_bc(l_j,l_k) Σ_i ψ_a(l_i) ψ_ac(l_i,l_k), where the inner sum is M_ac;k.

Belief Propagation on Trees. P'(v_d = l_o) = ψ_d(l_o) Σ_k ψ_c(l_k) ψ_cd(l_k,l_o) M_ac;k Σ_j ψ_b(l_j) ψ_bc(l_j,l_k); the sum over j is M_bc;k.

Belief Propagation on Trees. P'(v_d = l_o) = ψ_d(l_o) Σ_k ψ_c(l_k) ψ_cd(l_k,l_o) M_ac;k M_bc;k; the sum over k is M_cd;o.

Belief Propagation on Trees. P'(v_d = l_o) = ψ_d(l_o) M_cd;o.

Belief Propagation on Trees. P'(v_c = l_k) = ψ_c(l_k) M_ac;k M_bc;k M_dc;k.

Belief Propagation on Trees. P'(v_b = l_j) = ψ_b(l_j) M_cb;j.

Belief Propagation on Trees. P'(v_a = l_i) = ψ_a(l_i) M_ca;i.

Belief Propagation on Trees. Start from the leaves, go towards the root. For the current edge (a,b), compute M_ab;k = Σ_i ψ_a(l_i) ψ_ab(l_i,l_k) Π_{n≠b} M_na;i. Repeat till the root is reached. Then start from the root, go towards the leaves, and compute the messages in the same way. Repeat till the leaves are reached.

Belief Propagation on Trees P’(v a = l i,v b = l j ) = ? Normalize to compute true marginals P’(v a = l i ) = ? ψ a (l i )ψ b (l j )ψ ab (l i,l j )Π n≠b M na;i Π n≠a M nb;j ψ a (l i )Π n M na;i

Outline Belief Propagation on Chains Belief Propagation on Trees Loopy Belief Propagation Pearl, 1988; Murphy et al., 1999

Loopy Belief Propagation. Initialize all messages to 1. In some order of the edges, update the messages M_ab;k = Σ_i ψ_a(l_i) ψ_ab(l_i,l_k) Π_{n≠b} M_na;i, until convergence: the rate of change of the messages falls below a threshold.

Loopy Belief Propagation. Cycle V_a - V_b - V_c - V_d - V_a. M_bc contains M_ab, M_cd contains M_bc, and M_da contains M_cd, which in turn feeds back into M_ab. Overcounting!!

Loopy Belief Propagation. Initialize all messages to 1. In some order of the edges, update the messages M_ab;k = Σ_i ψ_a(l_i) ψ_ab(l_i,l_k) Π_{n≠b} M_na;i, until convergence: the rate of change of the messages falls below a threshold. Convergence is not guaranteed!!

Loopy Belief Propagation B’ ab (i,j) = Normalize to compute beliefs B a (i), B ab (i,j) B’ a (i) = ψ a (l i )ψ b (l j )ψ ab (l i,l j )Π n≠b M na;i Π n≠a M nb;j ψ a (l i )Π n M na;i At convergence Σ j B ab (i,j) = B a (i)