
1. PGM Tirgul 8: Markov Chains

2. Stochastic Sampling

- In the previous class, we examined methods that use independent samples to estimate P(X = x | e).
- Problem: it is difficult to sample from P(X_1, ..., X_n | e).
  - We had to use likelihood weighting to reweight our samples.
  - This introduced bias into the estimation.
  - In some cases, such as when the evidence is on the leaves, these methods are inefficient.

3. MCMC Methods

- We now discuss sampling methods that are based on a Markov chain: Markov Chain Monte Carlo (MCMC) methods.
- Key ideas:
  - The sampling process is a Markov chain: the next sample depends on the previous one.
  - Such chains can approximate any posterior distribution.
- We start by reviewing key ideas from the theory of Markov chains.

4. Markov Chains

- Suppose X_1, X_2, ... take values in some set; w.l.o.g. these values are 1, 2, ...
- A Markov chain is a process that corresponds to the network:

    X_1 -> X_2 -> X_3 -> ... -> X_n

- To quantify the chain, we need to specify:
  - Initial probability: P(X_1)
  - Transition probability: P(X_{t+1} | X_t)
- A Markov chain has stationary transition probabilities: P(X_{t+1} | X_t) is the same for all times t.
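
The two ingredients above (an initial state and a time-homogeneous transition probability) are enough to simulate a chain. A minimal sketch, using an illustrative two-state transition matrix that is not from the slides:

```python
import random

# Toy two-state chain (states 0 and 1) with a stationary (time-homogeneous)
# transition probability: P[i][j] = P(X_{t+1} = j | X_t = i).
# The matrix values are illustrative, not from the slides.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def simulate(n_steps, start, rng):
    """Set X_1 = start, then repeatedly sample X_{t+1} from P(X_{t+1} | X_t)."""
    x = start
    path = [x]
    for _ in range(n_steps):
        x = 0 if rng.random() < P[x][0] else 1
        path.append(x)
    return path

rng = random.Random(0)
path = simulate(10, start=0, rng=rng)
print(path)
```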

5. Irreducible Chains

- A state j is accessible from state i if there is an n such that P(X_n = j | X_1 = i) > 0, i.e., there is a positive probability of reaching j from i after some number of steps.
- A chain is irreducible if every state is accessible from every state.
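
For a finite chain, accessibility is just reachability in the directed graph whose edges are the positive-probability transitions, so irreducibility can be checked mechanically. A sketch with two illustrative matrices (not from the slides):

```python
from collections import deque

def reachable(P, start):
    """States j accessible from `start`: BFS over edges with P[i][j] > 0."""
    n = len(P)
    seen = {start}
    q = deque([start])
    while q:
        i = q.popleft()
        for j in range(n):
            if P[i][j] > 0 and j not in seen:
                seen.add(j)
                q.append(j)
    return seen

def is_irreducible(P):
    """A chain is irreducible iff every state is accessible from every state."""
    n = len(P)
    return all(len(reachable(P, i)) == n for i in range(n))

# Illustrative examples: the first chain can move between both states;
# the second gets stuck in state 0, so state 1 is not accessible from 0.
ok = is_irreducible([[0.5, 0.5], [0.5, 0.5]])
stuck = is_irreducible([[1.0, 0.0], [0.5, 0.5]])
print(ok, stuck)
```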

6. Ergodic Chains

- A state i is positively recurrent if the expected time to return to state i, after being in state i, is finite.
- If X has a finite number of states, it suffices that i is accessible from itself.
- A chain is ergodic if it is irreducible and every state is positively recurrent.

7. (A)periodic Chains

- A state i is periodic if there is an integer d > 1 such that P(X_n = i | X_1 = i) = 0 whenever n is not divisible by d.
- A chain is aperiodic if it contains no periodic state.

8. Stationary Probabilities

Thm: If a chain is ergodic and aperiodic, then the limit

    P*(X = j) = lim_{n -> infinity} P(X_n = j | X_1 = i)

exists and does not depend on i. Moreover, P*(X) is the unique probability distribution satisfying

    P*(X = j) = sum_i P*(X = i) P(X_{t+1} = j | X_t = i)
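
Both claims of the theorem can be checked numerically by repeatedly applying the transition matrix to a starting distribution. A minimal sketch, again on an illustrative two-state matrix (not from the slides); for this matrix the fixed-point equations give P* = [5/6, 1/6]:

```python
# Power iteration: apply the transition matrix to an initial distribution;
# for an ergodic, aperiodic chain this converges to the unique stationary
# probability P*, regardless of the starting distribution.
P = [[0.9, 0.1],
     [0.5, 0.5]]

def step(dist):
    # P(X_{t+1} = j) = sum_i P(X_t = i) * P(X_{t+1} = j | X_t = i)
    return [sum(dist[i] * P[i][j] for i in range(2)) for j in range(2)]

def stationary(start_dist, iters=200):
    d = start_dist
    for _ in range(iters):
        d = step(d)
    return d

pi_a = stationary([1.0, 0.0])   # start deterministically in state 0
pi_b = stationary([0.0, 1.0])   # start deterministically in state 1
print(pi_a, pi_b)               # both should be close to [5/6, 1/6]
```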

9. Stationary Probabilities (cont.)

- The probability P*(X) is the stationary probability of the process.
- Regardless of the starting point, the process converges to this probability.
- The rate of convergence depends on properties of the transition probability.

10. Sampling from the Stationary Probability

This theory suggests how to sample from the stationary probability:

- Set X_1 = i, for some random/arbitrary i
- For t = 1, 2, ..., n - 1:
  - Sample a value x_{t+1} for X_{t+1} from P(X_{t+1} | X_t = x_t)
- Return x_n

If n is large enough, then this is (approximately) a sample from P*(X).
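
The recipe above can be checked empirically: run the chain from an arbitrary start many times, keep only the final state of each run, and compare the resulting frequencies with P*. A sketch on the same style of illustrative two-state chain (for this matrix, P* puts probability 5/6 on state 0):

```python
import random

P = [[0.9, 0.1],
     [0.5, 0.5]]

def sample_stationary(n_steps, rng):
    x = rng.randrange(2)            # random/arbitrary start X_1 = i
    for _ in range(n_steps):        # sample x_{t+1} from P(X_{t+1} | X_t = x_t)
        x = 0 if rng.random() < P[x][0] else 1
    return x                        # return x_n

rng = random.Random(1)
n_runs = 20000
counts = [0, 0]
for _ in range(n_runs):
    counts[sample_stationary(50, rng)] += 1
freq0 = counts[0] / n_runs
print(freq0)   # should be close to 5/6 for this illustrative matrix
```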

11. Designing Markov Chains

- How do we construct the right chain to sample from?
  - Ensuring aperiodicity and irreducibility is usually easy.
  - The problem is ensuring the desired stationary probability.

12. Designing Markov Chains (cont.)

Key tool: if the transition probability satisfies the detailed-balance condition

    Q(x) P(X_{t+1} = x' | X_t = x) = Q(x') P(X_{t+1} = x | X_t = x')   for all x, x'

then P*(X) = Q(X).

- This gives a local criterion for checking that the chain will have the right stationary distribution.
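
Because the criterion is local (one equation per pair of states), it is easy to test mechanically. A sketch, pairing the illustrative two-state matrix with its stationary distribution [5/6, 1/6]; the numbers are a made-up example, not from the slides:

```python
# Detailed-balance check: if Q(x) * T(x -> x') == Q(x') * T(x' -> x) for
# every pair of states, then Q is the stationary probability of the chain
# with transition matrix T.
Q = [5 / 6, 1 / 6]              # candidate stationary distribution
T = [[0.9, 0.1],                # T[i][j] = P(X_{t+1} = j | X_t = i)
     [0.5, 0.5]]

def satisfies_detailed_balance(Q, T, tol=1e-12):
    n = len(Q)
    return all(abs(Q[i] * T[i][j] - Q[j] * T[j][i]) <= tol
               for i in range(n) for j in range(n))

ok = satisfies_detailed_balance(Q, T)
bad = satisfies_detailed_balance([0.5, 0.5], T)   # wrong candidate fails
print(ok, bad)
```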

13. MCMC Methods (cont.)

- We can use these results to sample from P(X_1, ..., X_n | e).
- Idea:
  - Construct an ergodic, aperiodic Markov chain such that P*(X_1, ..., X_n) = P(X_1, ..., X_n | e).
  - Simulate the chain for n steps to get a sample.

14. MCMC Methods: Notes

- The Markov chain variable Y takes as values assignments to all variables that are consistent with the evidence.
- For simplicity, we will denote such a state using the vector of variables.

15. Gibbs Sampler

- One of the simplest MCMC methods.
- At each transition, change the state of just one X_i.
- We can describe the transition probability as a stochastic procedure:
  - Input: a state x_1, ..., x_n
  - Choose i at random (with uniform probability)
  - Sample x'_i from P(X_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n, e)
  - Let x'_j = x_j for all j != i
  - Return x'_1, ..., x'_n
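
The procedure above can be sketched directly. The target posterior here is an arbitrary illustrative table over two binary variables (not from the slides); long-run state frequencies should approach it:

```python
import random

# Target distribution over (X_1, X_2), as an explicit joint table.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

def gibbs_step(state, rng):
    i = rng.randrange(2)                      # choose i uniformly
    # Conditional P(X_i = v | x_{-i}), read off the joint table.
    probs = []
    for v in (0, 1):
        s = list(state); s[i] = v
        probs.append(joint[tuple(s)])
    z = sum(probs)
    new_v = 0 if rng.random() < probs[0] / z else 1
    s = list(state); s[i] = new_v             # x'_j = x_j for all j != i
    return tuple(s)

rng = random.Random(2)
state = (0, 0)
counts = {k: 0 for k in joint}
for t in range(100000):
    state = gibbs_step(state, rng)
    if t >= 1000:                             # crude burn-in
        counts[state] += 1
n = sum(counts.values())
est = {k: counts[k] / n for k in joint}
print(est)   # should be close to the entries of `joint`
```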

16. Correctness of the Gibbs Sampler

- By the chain rule,

    P(x_1, ..., x_{i-1}, x_i, x_{i+1}, ..., x_n | e)
      = P(x_1, ..., x_{i-1}, x_{i+1}, ..., x_n | e) P(x_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n, e)

- Thus, for two states x and x' that differ only in the value of X_i,

    P(x | e) P(x'_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n, e)
      = P(x' | e) P(x_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n, e)

- Since we choose i from the same distribution at each stage, this procedure satisfies the ratio criterion of slide 12, so the chain's stationary distribution is P(X_1, ..., X_n | e).

17. Gibbs Sampling for Bayesian Networks

- Why is the Gibbs sampler "easy" in BNs?
- Recall that the Markov blanket of a variable separates it from the other variables in the network:

    P(X_i | X_1, ..., X_{i-1}, X_{i+1}, ..., X_n) = P(X_i | Mb_i)

- This property allows us to use local computations to perform the sampling at each transition.

18. Gibbs Sampling in Bayesian Networks (cont.)

- How do we evaluate P(X_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n)?
- Let Y_1, ..., Y_k be the children of X_i. By the definition of Mb_i, the parents of each Y_j are contained in Mb_i U {X_i}.
- It is easy to show that

    P(x_i | x_1, ..., x_{i-1}, x_{i+1}, ..., x_n) is proportional to P(x_i | pa(X_i)) * prod_j P(y_j | pa(Y_j))

  so the conditional involves only the CPT of X_i and the CPTs of its children.
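
A minimal sketch of this local computation, for a toy chain network U -> X -> Y with made-up illustrative CPTs (so Mb_X = {U, Y}): the Gibbs conditional for X multiplies P(x | u) by the CPT entry of X's only child Y, then normalizes.

```python
# CPTs, each stored as [P(var = 0 | parent), P(var = 1 | parent)].
# The numbers are illustrative assumptions, not from the slides.
P_X_given_U = {0: [0.8, 0.2], 1: [0.3, 0.7]}   # P(X | U = u)
P_Y_given_X = {0: [0.9, 0.1], 1: [0.4, 0.6]}   # P(Y | X = x)

def gibbs_conditional_X(u, y):
    """P(X | U=u, Y=y), computed only from X's Markov blanket {U, Y}:
    unnormalized weight P(x | u) * P(y | x), then normalize."""
    unnorm = [P_X_given_U[u][x] * P_Y_given_X[x][y] for x in (0, 1)]
    z = sum(unnorm)
    return [p / z for p in unnorm]

dist = gibbs_conditional_X(u=1, y=1)
print(dist)
```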

19. Sampling Strategy

- How do we collect the samples?
- Strategy I:
  - Run the chain M times, each run for N steps.
  - Each run starts from a different starting point.
  - Return the last state of each run.

  [figure: M independent chains, one sample taken from the end of each]

20. Sampling Strategy (cont.)

- Strategy II:
  - Run one chain for a long time.
  - After some "burn-in" period, record a sample every fixed number of steps.

  [figure: one long chain; after the burn-in prefix, M samples are taken at regular intervals]
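
Strategy II can be sketched directly: discard a burn-in prefix, then keep every k-th state. The chain here is the same style of illustrative two-state chain used earlier (stationary probability 5/6 for state 0), so the thinned samples should reproduce that frequency:

```python
import random

P = [[0.9, 0.1],
     [0.5, 0.5]]

def run_with_burn_in(n_samples, burn_in, thin, rng):
    """One long run: skip `burn_in` steps, then record every `thin`-th state."""
    x = rng.randrange(2)
    samples = []
    t = 0
    while len(samples) < n_samples:
        x = 0 if rng.random() < P[x][0] else 1
        t += 1
        if t > burn_in and t % thin == 0:
            samples.append(x)
    return samples

rng = random.Random(3)
samples = run_with_burn_in(n_samples=20000, burn_in=500, thin=5, rng=rng)
freq0 = samples.count(0) / len(samples)
print(freq0)   # should be close to 5/6 for this illustrative matrix
```

Thinning (the `thin` parameter) trades away raw sample count to weaken the correlation between consecutive recorded samples.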

21. Comparing Strategies

- Strategy I:
  - Better chance of "covering" the space of points, especially if the chain is slow to reach stationarity.
  - Has to perform the "burn-in" steps for each chain.
- Strategy II:
  - Performs "burn-in" only once.
  - Samples might be correlated (although only weakly).
- Hybrid strategy:
  - Run several chains, and take a few samples from each.
  - Combines the benefits of both strategies.

