Bayesian Belief Propagation for Image Understanding David Rosenberg.

Markov Random Fields
Let G be an undirected graph with nodes {1, …, n}. Associate a random variable X_t with each node t in G.
(X_1, …, X_n) is a Markov random field on G if
–every r.v. is independent of its non-neighbors conditioned on its neighbors:
–P(X_t = x_t | X_s = x_s for all s ≠ t) = P(X_t = x_t | X_s = x_s for all s in N(t)),
–where N(t) denotes the set of neighbors of node t.
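The conditional-independence definition above can be checked numerically. This is a minimal sketch on a hypothetical three-node chain 1 -- 2 -- 3 with arbitrary positive pairwise potentials: conditioned on its neighbor X_2, the variable X_1 should be independent of the non-neighbor X_3.

```python
import itertools

# Arbitrary positive pairwise potentials on the chain's edges
# (the specific values are illustrative, not from the slides).
V12 = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}
V23 = {(0, 0): 1.5, (0, 1): 0.5, (1, 0): 2.0, (1, 1): 1.0}

def joint(x1, x2, x3):
    # Unnormalized joint: product of the edge potentials.
    return V12[(x1, x2)] * V23[(x2, x3)]

Z = sum(joint(*xs) for xs in itertools.product([0, 1], repeat=3))

def p(x1, x2, x3):
    return joint(x1, x2, x3) / Z

def cond_x1_given_rest(x2, x3):
    # P(X_1 = 1 | X_2 = x2, X_3 = x3); by the Markov property this
    # must not depend on x3, since node 3 is not a neighbor of node 1.
    num = p(1, x2, x3)
    den = p(0, x2, x3) + p(1, x2, x3)
    return num / den

print(cond_x1_given_rest(0, 0))  # equals cond_x1_given_rest(0, 1)
print(cond_x1_given_rest(0, 1))
```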

Specifying a Markov Random Field
It would be nice if we could just specify P(X | N(X)) for all r.v.’s X (as with Bayesian networks). Unfortunately, this will overspecify the joint PMF.
–E.g., for the graph X_1 -- X_2: the joint PMF has 3 degrees of freedom, but the conditional PMFs X_1|X_2 and X_2|X_1 have 2 degrees of freedom each.
The Hammersley-Clifford Theorem helps to specify MRFs.

The Gibbs Distribution
A Gibbs distribution w.r.t. graph G is a probability mass function that can be expressed in the form
–P(x_1, …, x_n) = (1/Z) Prod_{cliques C} V_C(x_1, …, x_n),
–where V_C(x_1, …, x_n) depends only on those x_i in C, and Z is a normalizing constant.
We can combine potential functions into products over maximal cliques, so
–P(x_1, …, x_n) = (1/Z) Prod_{maximal cliques C} V_C(x_1, …, x_n).
–This may be better in certain circumstances because we don’t have to specify as many potential functions.
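A minimal sketch of a Gibbs distribution on a small chain, whose cliques are the edges {1,2} and {2,3}. The potential values are arbitrary; the point is that dividing the product of clique potentials by the normalizer Z yields a valid joint PMF.

```python
import itertools

STATES = [0, 1]
# Clique (edge) potentials for the chain 1 -- 2 -- 3; values are illustrative.
V12 = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}
V23 = {(0, 0): 1.5, (0, 1): 0.5, (1, 0): 2.0, (1, 1): 1.0}

def unnorm(x1, x2, x3):
    # Product over clique potentials (each depends only on its clique).
    return V12[(x1, x2)] * V23[(x2, x3)]

# The normalizing constant Z makes the product a probability mass function.
Z = sum(unnorm(*xs) for xs in itertools.product(STATES, repeat=3))
P = {xs: unnorm(*xs) / Z for xs in itertools.product(STATES, repeat=3)}
assert abs(sum(P.values()) - 1.0) < 1e-12  # a valid joint PMF
```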

Hammersley-Clifford Theorem
Let the r.v.’s {X_j} have a positive joint probability mass function. Then the Hammersley-Clifford Theorem says that {X_j} is a Markov random field on graph G iff it has a Gibbs distribution w.r.t. G.
–Side note: Hammersley and Clifford discovered this theorem in 1971, but they didn’t publish it because they kept thinking they should be able to remove or relax the positivity assumption. They couldn’t, and Clifford eventually published the result.
Specifying the potential functions is equivalent to specifying the joint probability distribution of all variables. Now it’s easy to specify a valid MRF,
–but still not easy to determine the degrees of freedom in the distribution (normalization).

A Typical MRF Vision Problem
We have
–hidden “scene” variables X_j
–observed “image” variables Y_j
Given X_j, Y_j is independent of everything else. [Show picture]
The problems:
–Given: some instantiations of the Y_j’s
–Find: the a posteriori distribution over the X_j’s
–Find the MAP estimate for each X_j
–Find the least squares estimate of each X_j

Straightforward Exact Inference
Given the joint PMF
–typically specified using potential functions,
we can just marginalize to get the a posteriori distribution for each X_j. We can then immediately extract the
–MAP estimate -- just the mode of the a posteriori distribution
–least squares estimate -- just the expected value of the a posteriori distribution
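A brute-force sketch of this exact inference on a hypothetical three-node chain: enumerate every hidden configuration, marginalize, then read off the MAP estimate (posterior mode) and least squares estimate (posterior mean) for each node. The potentials `psi`, `phi` and the observations `y_obs` are illustrative assumptions, not from the slides.

```python
import itertools

STATES = [0, 1, 2]

def psi(a, b):
    # Pairwise compatibility between neighboring hidden nodes.
    return 2.0 if a == b else 1.0

def phi(x, y):
    # Local evidence relating hidden x to observed y.
    return 3.0 if x == y else 1.0

y_obs = [0, 0, 2]  # observed image values y_1..y_3

def post_unnorm(xs):
    # Unnormalized posterior over hidden configuration xs given y_obs.
    p = 1.0
    for x, y in zip(xs, y_obs):
        p *= phi(x, y)
    for a, b in zip(xs, xs[1:]):
        p *= psi(a, b)
    return p

configs = list(itertools.product(STATES, repeat=3))
Z = sum(post_unnorm(xs) for xs in configs)

# Posterior marginal of each X_j, then MAP (mode) and mean per node.
for j in range(3):
    marg = {s: sum(post_unnorm(xs) for xs in configs if xs[j] == s) / Z
            for s in STATES}
    map_est = max(marg, key=marg.get)
    mean_est = sum(s * p for s, p in marg.items())
    print(j, map_est, round(mean_est, 3))
```

This costs |STATES|^n joint evaluations, which motivates the message-passing schemes on the following slides.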

Inference by Message Passing
The resulting a posteriori distributions are exact for graphs without loops (Pearl?). Weiss and Freeman show that for arbitrary graph topologies, when belief propagation converges, it gives the correct least squares estimate (i.e., the posterior mean). More results?

Derivation of belief propagation
(figure: a chain MRF with hidden nodes x_1, x_2, x_3 and observations y_1, y_2, y_3)

The posterior factorizes
(figure: the same chain MRF)

Propagation rules
(figure: message passing on the chain MRF)

Belief and message updates
–Belief at node i: b_i(x_i) ∝ phi_i(x_i) Prod_{j in N(i)} m_{ji}(x_i)
–Message from node j to node i: m_{ji}(x_i) = Sum_{x_j} psi_{ji}(x_j, x_i) phi_j(x_j) Prod_{k in N(j), k ≠ i} m_{kj}(x_j)
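The belief and message updates can be sketched on a three-node chain and checked against brute-force marginalization; on a chain (a tree) the beliefs are exact. The model here (binary states, agreement-favoring `psi`, evidence factors `phi`) is a hypothetical example, not the one in the slides.

```python
import itertools

STATES = [0, 1]
psi = {(a, b): (2.0 if a == b else 1.0) for a in STATES for b in STATES}
phi = [{0: 3.0, 1: 1.0}, {0: 1.0, 1: 1.0}, {0: 1.0, 1: 2.0}]  # evidence terms

edges = [(0, 1), (1, 2)]
nbrs = {0: [1], 1: [0, 2], 2: [1]}

def prod(it):
    p = 1.0
    for v in it:
        p *= v
    return p

def message(j, i, msgs):
    # m_{j->i}(x_i): sum out x_j, folding in j's evidence and the
    # messages into j from all neighbors except i.
    return {xi: sum(psi[(xj, xi)] * phi[j][xj] *
                    prod(msgs[(k, j)][xj] for k in nbrs[j] if k != i)
                    for xj in STATES)
            for xi in STATES}

# Initialize all messages to 1; two synchronous sweeps suffice on this chain.
msgs = {(j, i): {s: 1.0 for s in STATES} for j in nbrs for i in nbrs[j]}
for _ in range(2):
    msgs = {(j, i): message(j, i, msgs) for (j, i) in msgs}

def belief(i):
    # b_i(x_i) proportional to phi_i(x_i) times all incoming messages.
    b = {x: phi[i][x] * prod(msgs[(j, i)][x] for j in nbrs[i]) for x in STATES}
    z = sum(b.values())
    return {x: v / z for x, v in b.items()}

# Verify against brute-force marginalization (exact on a tree).
def joint(xs):
    p = prod(phi[i][xs[i]] for i in range(3))
    return p * prod(psi[(xs[a], xs[b])] for a, b in edges)

Z = sum(joint(xs) for xs in itertools.product(STATES, repeat=3))
for i in range(3):
    brute = {s: sum(joint(xs) for xs in itertools.product(STATES, repeat=3)
                    if xs[i] == s) / Z for s in STATES}
    assert all(abs(belief(i)[s] - brute[s]) < 1e-9 for s in STATES)
```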

Optimal solution in a chain or tree: Belief Propagation
–The “do the right thing” Bayesian algorithm.
–For Gaussian random variables over time: the Kalman filter.
–For hidden Markov models: the forward/backward algorithm (and the MAP variant is Viterbi).

No factorization with loops!
(figure: the chain MRF with an additional pairwise potential psi(x_1, x_3), which closes a loop)
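When the loop is closed, the same message updates can still be iterated ("loopy" belief propagation), but the resulting beliefs are only approximate marginals and convergence is not guaranteed. A minimal sketch on a triangle (the chain above plus an extra edge between nodes 0 and 2, with the same hypothetical potentials); messages are normalized each sweep to avoid numerical under/overflow.

```python
import itertools

STATES = [0, 1]
psi = {(a, b): (2.0 if a == b else 1.0) for a in STATES for b in STATES}
phi = [{0: 3.0, 1: 1.0}, {0: 1.0, 1: 1.0}, {0: 1.0, 1: 2.0}]
edges = [(0, 1), (1, 2), (0, 2)]          # the extra (0, 2) closes the loop
nbrs = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

def prod(it):
    p = 1.0
    for v in it:
        p *= v
    return p

msgs = {(j, i): {s: 1.0 for s in STATES} for j in nbrs for i in nbrs[j]}
for _ in range(50):                        # iterate; convergence not guaranteed
    new = {}
    for (j, i) in msgs:
        m = {xi: sum(psi[(xj, xi)] * phi[j][xj] *
                     prod(msgs[(k, j)][xj] for k in nbrs[j] if k != i)
                     for xj in STATES) for xi in STATES}
        z = sum(m.values())                # normalize each message
        new[(j, i)] = {x: v / z for x, v in m.items()}
    msgs = new

def belief(i):
    b = {x: phi[i][x] * prod(msgs[(j, i)][x] for j in nbrs[i]) for x in STATES}
    z = sum(b.values())
    return {x: v / z for x, v in b.items()}

# Exact marginal for comparison; loopy beliefs are close but not equal.
def joint(xs):
    return prod(phi[i][xs[i]] for i in range(3)) * \
           prod(psi[(xs[a], xs[b])] for a, b in edges)

Z = sum(joint(xs) for xs in itertools.product(STATES, repeat=3))
exact0 = {s: sum(joint(xs) for xs in itertools.product(STATES, repeat=3)
                 if xs[0] == s) / Z for s in STATES}
print(belief(0), exact0)
```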

First Toy Examples
–Show messages passed and beliefs at each stage.
–Show convergence in x steps.

Where does Evidence Fit In?

The Cost Functional Approach
We can state the solution to many problems in terms of minimizing a cost functional. How can we fit this into our MRF framework?
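One common way to do this (an assumption here, not stated on the slide) is to turn each cost term E into a potential V = exp(-E), so that minimizing total cost corresponds to finding the MAP configuration of the resulting Gibbs distribution. A tiny sketch with hypothetical quadratic data-fidelity and smoothness terms:

```python
import itertools
import math

def data_cost(x, y):
    # Hypothetical quadratic data-fidelity term.
    return (x - y) ** 2

def smooth_cost(a, b):
    # Hypothetical smoothness term between neighboring nodes.
    return 0.5 * (a - b) ** 2

y_obs = [0.0, 0.8]            # illustrative observations
STATES = [0.0, 0.5, 1.0]      # discretized hidden values

def energy(xs):
    e = sum(data_cost(x, y) for x, y in zip(xs, y_obs))
    e += sum(smooth_cost(a, b) for a, b in zip(xs, xs[1:]))
    return e

def unnorm_prob(xs):
    # Potentials V = exp(-E) multiply into exp(-total energy).
    return math.exp(-energy(xs))

configs = list(itertools.product(STATES, repeat=2))
# The minimum-cost configuration is exactly the MAP configuration.
best_by_cost = min(configs, key=energy)
best_by_prob = max(configs, key=unnorm_prob)
assert best_by_cost == best_by_prob
```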

Slide on Weiss’s interior/exterior example
–Show graphs of convergence speed.

Slide on Weiss’s Motion Detection

My own computer example taking the cost functional approach

Discussion of complexity issues with message passing
–How long are the messages?
–How many messages do we have to pass per iteration?
–How many iterations until convergence?
–The problem quickly becomes intractable.

Mention some approximate inference approaches

Slides on message passing with jointly Gaussian distributions???

EXTRA SLIDES

Incorporating Evidence Nodes into MRFs
We would like to have nodes that don’t change their beliefs -- they are just observations.
–Can we do this via the potential function on the non-maximal clique containing just that node? I think this is what they do in the Yair Weiss implementation.
–What if we don’t want to specify a potential function? Make it identically one, since it’s in a product.
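A sketch of that evidence trick (an assumption in the spirit of the implementation mentioned above): clamp an observed node with a single-node potential that is an indicator of the observed value, and check that this is equivalent to directly conditioning on it. Strictly, a zero-valued potential violates the positivity assumption of Hammersley-Clifford; in practice a sharply peaked positive potential serves the same purpose.

```python
import itertools

STATES = [0, 1]
psi = {(a, b): (2.0 if a == b else 1.0) for a in STATES for b in STATES}
y_observed = 1                              # evidence value for node 2

def evidence_potential(x):
    # Indicator potential clamping node 2 to its observed value.
    return 1.0 if x == y_observed else 0.0

# Joint over (x1, x2) on the edge 1 -- 2, with node 2 clamped.
weights = {(x1, x2): psi[(x1, x2)] * evidence_potential(x2)
           for x1, x2 in itertools.product(STATES, repeat=2)}
Z = sum(weights.values())
marg_x1 = {s: sum(w for (x1, _), w in weights.items() if x1 == s) / Z
           for s in STATES}

# Same answer by direct conditioning: P(x1 | x2 = y_observed).
cond = {s: psi[(s, y_observed)] / sum(psi[(t, y_observed)] for t in STATES)
        for s in STATES}
assert all(abs(marg_x1[s] - cond[s]) < 1e-12 for s in STATES)
```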

From cost functional to transition matrix

From cost functional to update rule

From update rule to transition matrix

The factorization into pairwise potentials -- good for general Markov networks

Other Stuff
For shorthand, we will write x = (x_1, …, x_n).