# CS774. Markov Random Field : Theory and Application Lecture 16 Kyomin Jung KAIST Nov 03 2009.


Markov Chain Monte Carlo

Let $\pi$ be a probability density function on $S=\{1,2,\dots,n\}$. $f(\cdot)$ is any function on $S$, and we want to estimate $I = E_\pi[f] = \sum_{i=1}^{n} f(i)\,\pi_i$. Construct $P=\{P_{ij}\}$, the transition matrix of an irreducible Markov chain with states $S$, where $P_{ij} = \Pr\{X_{t+1}=j \mid X_t=i\}$ and $\pi$ is its unique stationary distribution.

Markov Chain Monte Carlo

Run this Markov chain for times $t=1,\dots,N$ and calculate the Monte Carlo sum $\hat{I} = \frac{1}{N}\sum_{t=1}^{N} f(X_t)$; then $\hat{I} \to I$ as $N \to \infty$.
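The Monte Carlo sum can be illustrated with a minimal sketch. The 3-state transition matrix `P` and the function `f(i) = i` below are hypothetical choices, not taken from the lecture; this chain has stationary distribution (0.25, 0.5, 0.25), so the average should approach 1.

```python
import random

def ergodic_average(P, f, n_steps, x0=0, seed=0):
    """Average f along one trajectory of the chain with transition matrix P."""
    rng = random.Random(seed)
    states = range(len(P))
    x = x0
    total = 0.0
    for _ in range(n_steps):
        # Draw the next state from row x of the transition matrix.
        x = rng.choices(states, weights=P[x])[0]
        total += f(x)
    return total / n_steps

# Hypothetical irreducible chain; its stationary distribution is (0.25, 0.5, 0.25).
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]

estimate = ergodic_average(P, f=lambda i: i, n_steps=100_000)
exact = 0 * 0.25 + 1 * 0.5 + 2 * 0.25
```

The estimate is random, so it only agrees with `exact` up to Monte Carlo error, which shrinks as `n_steps` grows.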

Metropolis-Hastings Algorithm

The Metropolis-Hastings algorithm is an MCMC method. It can draw samples from any probability distribution $\pi(x)$, requiring only that a function proportional to the density can be calculated at $x$. It proceeds in three steps:

- Set up a Markov chain;
- Run the chain until stationary;
- Estimate the function with Monte Carlo methods.

Metropolis-Hastings Algorithm

In order to perform this method for a given distribution $\pi$, we must construct a Markov chain transition matrix $P$ with $\pi$ as its stationary distribution, i.e. $\pi P = \pi$. Suppose $P$ is constructed to satisfy the reversibility condition that for all $i$ and $j$, $\pi_i P_{ij} = \pi_j P_{ji}$. This property ensures that $\sum_i \pi_i P_{ij} = \pi_j$ for all $j$, and hence $\pi$ is the stationary distribution for $P$.
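The step from reversibility to stationarity is a one-line computation. Summing the reversibility condition $\pi_i P_{ij} = \pi_j P_{ji}$ over $i$ and using the fact that each row of $P$ sums to one gives:

```latex
\sum_{i} \pi_i P_{ij}
  \;=\; \sum_{i} \pi_j P_{ji}
  \;=\; \pi_j \sum_{i} P_{ji}
  \;=\; \pi_j .
```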

Metropolis-Hastings Algorithm

Goal: Draw from a distribution $\pi$ with support $S$.

1. Start with some value $x \in S$.
2. Draw a proposed value $y$ from some candidate density $q(x,y)=q(y|x)$.
3. Accept a move to the candidate with probability $\alpha(x,y) = \min\left\{1, \frac{\pi(y)\,q(y,x)}{\pi(x)\,q(x,y)}\right\}$ (assign $x=y$); otherwise, stay at state $x$ (assign $x=x$).
4. In either case, return to the proposal step.
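The loop above can be sketched for a small discrete target. Everything concrete here is an assumption made for illustration: the unnormalised target `weights`, the asymmetric proposal matrix `Q`, and the function name `metropolis_hastings`. The acceptance probability is the standard Metropolis-Hastings ratio, min(1, π(y) q(y,x) / (π(x) q(x,y))).

```python
import random
from collections import Counter

def metropolis_hastings(weights, Q, n_steps, x0=0, seed=0):
    """Sample from pi proportional to `weights` using proposal rows Q[x]."""
    rng = random.Random(seed)
    states = range(len(weights))
    x = x0
    samples = []
    for _ in range(n_steps):
        # Draw a candidate y from q(x, .).
        y = rng.choices(states, weights=Q[x])[0]
        # Hastings ratio; the normalising constant of pi cancels.
        ratio = (weights[y] * Q[y][x]) / (weights[x] * Q[x][y])
        if rng.random() < min(1.0, ratio):
            x = y          # accept the move
        samples.append(x)  # on rejection the chain stays at x
    return samples

# Hypothetical target (1/6, 1/3, 1/2) given only up to a constant.
weights = [1.0, 2.0, 3.0]
# Hypothetical asymmetric proposal with strictly positive entries.
Q = [[0.2, 0.5, 0.3],
     [0.4, 0.2, 0.4],
     [0.6, 0.3, 0.1]]

samples = metropolis_hastings(weights, Q, n_steps=100_000)
counts = Counter(samples)
estimate = [counts[i] / len(samples) for i in range(3)]
```

Only ratios of `weights` enter the acceptance step, which is why a function proportional to the density is enough.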

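Metropolis-Hastings Algorithm

Note that we do not need a symmetric $q$. The best case is when $q(x,y)$ is close to $\pi(y)$. We are creating a Markov chain with transition law $P(x,y) = q(x,y)\,\alpha(x,y)$ for $y \neq x$ and $P(x,x) = 1 - \sum_{y \neq x} q(x,y)\,\alpha(x,y)$, where $\alpha(x,y) = \min\left\{1, \frac{\pi(y)\,q(y,x)}{\pi(x)\,q(x,y)}\right\}$ is the acceptance probability.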

Metropolis-Hastings Algorithm

We know that this transition law $P$ will give a chain with stationary distribution $\pi$ if the detailed balance condition holds: $\pi(x)\,P(x,y) = \pi(y)\,P(y,x)$. Obviously, this holds for $y=x$.

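Metropolis-Hastings Algorithm

For $y \neq x$, write $P(x,y) = q(x,y)\,\alpha(x,y)$ with $\alpha(x,y) = \min\left\{1, \frac{\pi(y)\,q(y,x)}{\pi(x)\,q(x,y)}\right\}$. Then

$$\pi(x)\,P(x,y) = \pi(x)\,q(x,y)\,\min\left\{1, \frac{\pi(y)\,q(y,x)}{\pi(x)\,q(x,y)}\right\} = \min\{\pi(x)\,q(x,y),\ \pi(y)\,q(y,x)\}.$$

The right-hand side is symmetric in $x$ and $y$, so it also equals $\pi(y)\,P(y,x)$, and detailed balance holds.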
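Detailed balance for the Metropolis-Hastings transition law can also be checked numerically on a small example. The target `pi`, the proposal matrix `Q`, and the helper name `mh_kernel` are assumptions made for this sketch.

```python
def mh_kernel(pi, Q):
    """Build the Metropolis-Hastings transition matrix from target pi and proposal Q."""
    n = len(pi)
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if y != x and Q[x][y] > 0:
                alpha = min(1.0, (pi[y] * Q[y][x]) / (pi[x] * Q[x][y]))
                P[x][y] = Q[x][y] * alpha
        # Rejected proposals (and proposals of x itself) keep the chain at x.
        P[x][x] = 1.0 - sum(P[x])
    return P

# Hypothetical target and asymmetric proposal.
pi = [1 / 6, 1 / 3, 1 / 2]
Q = [[0.2, 0.5, 0.3],
     [0.4, 0.2, 0.4],
     [0.6, 0.3, 0.1]]
P = mh_kernel(pi, Q)

# Largest violation of pi(x) P(x,y) = pi(y) P(y,x) over all pairs.
max_violation = max(abs(pi[x] * P[x][y] - pi[y] * P[y][x])
                    for x in range(3) for y in range(3))
# pi P, which should reproduce pi.
pi_P = [sum(pi[x] * P[x][y] for x in range(3)) for y in range(3)]
```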

Metropolis-Hastings Algorithm

Some common choices for the proposal distribution:

- "independent" candidate: $q(x,y) = q(y)$
- symmetric candidate: $q(x,y)=q(y,x)$
- "random walk" candidate: $q(x,y)=q(x-y)=q(y-x)$
- etc.

The MH Algorithm: Independent Candidate

Example: Simulate a bivariate normal distribution with mean vector $\mu$ and covariance matrix $\Sigma$.

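The MH Algorithm: Independent Candidate

In this case, for $x \in \mathbb{R}^2$, with $\mu$ the mean vector and $\Sigma$ the covariance matrix, $\pi(x) \propto \exp\left\{-\tfrac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right\}$. We need a candidate distribution to draw values on $\mathbb{R}^2$, and since exponentials are so easy to draw, we use a candidate density $q$ built from exponential densities.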

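Simulation steps: consider the ratio $R = \frac{\pi(\text{candidate})\,q(\text{old})}{\pi(\text{old})\,q(\text{candidate})}$.

- If $R \ge 1$, new = candidate.
- If $R<1$, draw $U \sim \text{unif}(0,1)$:
  - if $U<R$, new = candidate;
  - if $U \ge R$, new = old.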
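A minimal sketch of these simulation steps follows. The numbers are assumptions, since the example's mean vector and covariance matrix are not available here: the target is taken to have zero mean, unit variances, and correlation `RHO = 0.5`, and the candidate is assumed to be a product of two standard double-exponential (Laplace) densities, each drawn as an exponential with a random sign.

```python
import math
import random

RHO = 0.5  # hypothetical correlation; zero mean and unit variances assumed

def log_target(x):
    """Log of the unnormalised bivariate normal density."""
    x1, x2 = x
    return -(x1 * x1 - 2 * RHO * x1 * x2 + x2 * x2) / (2 * (1 - RHO * RHO))

def log_candidate(x):
    """Log of the unnormalised product of two standard Laplace densities."""
    return -abs(x[0]) - abs(x[1])

def draw_candidate(rng):
    # An exponential draw with a random sign is a Laplace draw.
    return tuple(rng.choice((-1, 1)) * rng.expovariate(1.0) for _ in range(2))

def independence_sampler(n_steps, seed=0):
    rng = random.Random(seed)
    old = (0.0, 0.0)
    out = []
    for _ in range(n_steps):
        cand = draw_candidate(rng)
        # log R = log[ pi(cand) q(old) / (pi(old) q(cand)) ]
        log_r = (log_target(cand) + log_candidate(old)
                 - log_target(old) - log_candidate(cand))
        if log_r >= 0 or rng.random() < math.exp(log_r):
            old = cand  # new = candidate
        out.append(old)  # otherwise new = old
    return out

samples = independence_sampler(50_000)
```

Working with the logarithm of the ratio avoids overflow when a candidate lands far in the tails.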

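Total Variation Norm

The total variation norm distance between two probability measures $\mu_1$ and $\mu_2$ is $\|\mu_1 - \mu_2\| = \sup_{A \in \mathcal{B}} |\mu_1(A) - \mu_2(A)|$, where $\mathcal{B}$ denotes the Borel sets in $\mathbb{R}$.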
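On a finite state space the supremum over sets can be computed by brute force, and it coincides with half the L1 distance between the probability vectors. The sketch below checks this on two hypothetical distributions; the function names are illustrative only.

```python
from itertools import combinations

def tv_by_sup(mu, nu):
    """Total variation distance as the maximum of |mu(A) - nu(A)| over all subsets A."""
    n = len(mu)
    best = 0.0
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            best = max(best, abs(sum(mu[i] - nu[i] for i in subset)))
    return best

def tv_half_l1(mu, nu):
    """Equivalent formula on a finite space: half the L1 distance."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

# Hypothetical distributions on three states.
mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.3, 0.5]
d_sup = tv_by_sup(mu, nu)
d_l1 = tv_half_l1(mu, nu)
```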

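Rate of Convergence

A Markov chain with $n$-step transition probability $P^{(n)}(x,y)$ and stationary distribution $\pi$ is called uniformly ergodic if there exist constants $M < \infty$ and $r \in (0,1)$ such that $\|P^{(n)}(x,\cdot) - \pi\| \le M\,r^{n}$ for all $x$ in the state space.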

The MH Algorithm: Independent Candidate

Theorem (Mengersen and Tweedie): The Metropolis-Hastings algorithm with independent candidate density $q$ is uniformly ergodic if and only if there exists a constant $\beta > 0$ such that $\frac{q(x)}{\pi(x)} \ge \beta$ for all $x$ in the state space. In this case, $\|P^{(n)}(x,\cdot) - \pi\| \le (1-\beta)^{n}$.
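The convergence bound can be checked numerically on a finite state space, where the independence sampler's transition matrix can be written out explicitly. This is a sketch under assumed inputs: the target `pi`, the uniform candidate `q`, and all function names are hypothetical, and total variation is computed as half the L1 distance.

```python
def independence_kernel(pi, q):
    """Transition matrix of Metropolis-Hastings with independent candidate q."""
    n = len(pi)
    P = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if y != x:
                P[x][y] = q[y] * min(1.0, (pi[y] * q[x]) / (pi[x] * q[y]))
        P[x][x] = 1.0 - sum(P[x])
    return P

def tv(mu, nu):
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))

def worst_case_tv(P, pi, n_steps):
    """Maximum over starting states x of the TV distance between P^n(x, .) and pi."""
    size = len(pi)
    worst = 0.0
    for x in range(size):
        row = [1.0 if i == x else 0.0 for i in range(size)]
        for _ in range(n_steps):
            row = [sum(row[i] * P[i][j] for i in range(size)) for j in range(size)]
        worst = max(worst, tv(row, pi))
    return worst

pi = [0.1, 0.2, 0.7]       # hypothetical target
q = [1 / 3, 1 / 3, 1 / 3]  # hypothetical independent candidate
beta = min(qi / p for qi, p in zip(q, pi))
P = independence_kernel(pi, q)
distances = [worst_case_tv(P, pi, n) for n in range(1, 11)]
bounds = [(1 - beta) ** n for n in range(1, 11)]
```

Here `beta` is the smallest value of q(x)/π(x), so a candidate that puts too little mass where the target is large gives a weak bound.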

Gibbs Sampling

Gibbs sampling is a special case of single-site updating Metropolis.
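One way to see this, using notation introduced here for illustration: let $x_{-i}$ denote all coordinates of $x$ except site $i$, and take the proposal to be the full conditional, $q(x,y) = \pi(y_i \mid x_{-i})$ with $y_{-i} = x_{-i}$. The Metropolis-Hastings acceptance ratio is then identically one, so every proposal is accepted:

```latex
\frac{\pi(y)\, q(y,x)}{\pi(x)\, q(x,y)}
  = \frac{\pi(y_i \mid x_{-i})\, \pi(x_{-i}) \;\cdot\; \pi(x_i \mid x_{-i})}
         {\pi(x_i \mid x_{-i})\, \pi(x_{-i}) \;\cdot\; \pi(y_i \mid x_{-i})}
  = 1 .
```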

Swendsen-Wang Algorithm for Ising Models

Swendsen-Wang (1987) is a smart idea that flips a patch at a time. Each edge in the lattice $e=\langle s,t\rangle$ is associated with a probability $q=e^{-\beta}$.

1. If $s$ and $t$ have different labels at the current state, $e$ is turned off. If $s$ and $t$ have the same label, $e$ is turned off with probability $q$. Thus each object is broken into a number of connected components (subgraphs).
2. One or many components are chosen at random.
3. The collective label of each chosen component is changed randomly to any of the labels.
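The three steps can be sketched as one sweep on a small rectangular grid. The details below are assumptions made for illustration: labels are stored row by row in a flat list, the grid has free (non-wrapping) boundaries, every component is relabelled (the "many" case of step 2), and the function name `swendsen_wang_sweep` is hypothetical.

```python
import math
import random

def swendsen_wang_sweep(labels, rows, cols, beta, n_labels=2, rng=random):
    """One Swendsen-Wang sweep; returns a new flat list of labels."""
    parent = list(range(rows * cols))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    q_off = math.exp(-beta)  # probability of turning off a same-label edge
    for r in range(rows):
        for c in range(cols):
            s = r * cols + c
            neighbours = []
            if c + 1 < cols:
                neighbours.append(s + 1)     # right neighbour
            if r + 1 < rows:
                neighbours.append(s + cols)  # neighbour below
            for t in neighbours:
                # Step 1: edges between different labels are always off;
                # same-label edges stay on with probability 1 - q_off.
                if labels[s] == labels[t] and rng.random() >= q_off:
                    parent[find(s)] = find(t)

    # Steps 2 and 3: give each connected component a uniformly random label.
    new_label = {}
    result = []
    for i in range(rows * cols):
        root = find(i)
        if root not in new_label:
            new_label[root] = rng.randrange(n_labels)
        result.append(new_label[root])
    return result

# Usage on a hypothetical 3 x 4 grid that starts with a single label.
start = [1] * 12
after = swendsen_wang_sweep(start, rows=3, cols=4, beta=50.0, rng=random.Random(0))
```

With a very large `beta` almost every same-label edge stays on, so a uniform grid forms one component and is relabelled as a whole; with `beta = 0` every edge is turned off and each site is relabelled independently.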

Swendsen-Wang Algorithm

Pros:

- Computationally efficient in sampling the Ising models

Cons:

- Limited to Ising models
- Not informed by data; slows down in the presence of an external field (data term)

Cf. Swendsen-Wang Cuts:

- Generalizes Swendsen-Wang to arbitrary posterior probabilities
- Improves the clustering step by using the image data
