Gibbs Sampling: A Little Bit of Theory


1 Gibbs Sampling: A little bit of theory
Outline:
- What is a Markov chain?
- When is it stationary? Properties
- Gibbs sampling is a Markov chain
- P(z) is its stationary distribution
Speaker notes: for structured learning (e.g., the structured perceptron), you need to know how to solve the inference problem; Gibbs sampling gives an inference method anyone can use with an MRF. Open points: ergodicity as a sufficient condition (what fails without it?); slice sampling (not sure whether to mention it).

2 Gibbs Sampling
Gibbs sampling from a distribution P(z), where z = (z1, ..., zN):
  Initialize z^(0) = (z1^(0), z2^(0), ..., zN^(0))
  For t = 1 to T:
    z1^(t) ~ P(z1 | z2 = z2^(t-1), z3 = z3^(t-1), z4 = z4^(t-1), ..., zN = zN^(t-1))
    z2^(t) ~ P(z2 | z1 = z1^(t), z3 = z3^(t-1), z4 = z4^(t-1), ..., zN = zN^(t-1))
    z3^(t) ~ P(z3 | z1 = z1^(t), z2 = z2^(t), z4 = z4^(t-1), ..., zN = zN^(t-1))
    ...
    zN^(t) ~ P(zN | z1 = z1^(t), z2 = z2^(t), z3 = z3^(t), ..., z(N-1) = z(N-1)^(t))
    Output z^(t) = (z1^(t), z2^(t), ..., zN^(t))
The outputs z^(1), z^(2), z^(3), ..., z^(T) behave as samples from P(z). Why?
(A Markov chain is a stochastic process in which future states are independent of past states given the present state.)
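The procedure above can be sketched in a few lines of Python. The target here is a toy standard bivariate normal with correlation rho (an assumption for illustration; the slide leaves P(z) abstract), chosen because its conditionals are themselves normal: z1 | z2 ~ N(rho*z2, 1 - rho^2), and symmetrically for z2 | z1.

```python
import math
import random

def gibbs_bivariate_normal(rho, T, seed=0):
    """Gibbs sampling for a standard bivariate normal with correlation rho.

    Each conditional is itself normal: z1 | z2 ~ N(rho*z2, 1 - rho^2),
    and symmetrically for z2 | z1. (This toy target is an assumption for
    illustration only.)
    """
    rng = random.Random(seed)
    sd = math.sqrt(1.0 - rho * rho)
    z1, z2 = 0.0, 0.0              # z^(0): arbitrary starting point
    samples = []
    for _ in range(T):
        z1 = rng.gauss(rho * z2, sd)   # z1^(t) ~ P(z1 | z2 = z2^(t-1))
        z2 = rng.gauss(rho * z1, sd)   # z2^(t) ~ P(z2 | z1 = z1^(t))
        samples.append((z1, z2))       # output z^(t)
    return samples

samples = gibbs_bivariate_normal(rho=0.8, T=20000)
burned = samples[1000:]                # discard a burn-in prefix
mean1 = sum(z1 for z1, _ in burned) / len(burned)      # should be near 0
corr = sum(z1 * z2 for z1, z2 in burned) / len(burned)  # E[z1*z2], near rho
print(round(mean1, 2), round(corr, 2))
```

After burn-in, the empirical mean and correlation of the chain's outputs track the target's (0 and rho), which is exactly the "as sampling from P(z)" claim the next slides justify.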

3 Markov Chain
Three cities: A, B, and C. Each day a traveler decides which city to visit next, with a probability distribution that depends only on the city he is currently in (a random walk). Reading the edge labels off the diagram: from A he stays at A with probability 2/3 and moves to B or C with probability 1/6 each; from B he moves to A or C with probability 1/2 each; from C he moves to A or B with probability 1/2 each. The traveler records the city he visits each day: A B C A A ... This sequence is a Markov chain, and the current city is its state. (The same idea applies to web pages, e.g., PageRank.)
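The traveler's random walk can be simulated directly. The transition probabilities below are the assignment of the diagram's edge labels (1/6, 1/2, 2/3, ...) that is consistent with the 0.6 : 0.2 : 0.2 stationary distribution given on a later slide:

```python
import random

# Transition probabilities read off the slide's diagram (this assignment is
# consistent with the stationary distribution A:B:C = 0.6:0.2:0.2).
TRANSITIONS = {
    "A": [("A", 2/3), ("B", 1/6), ("C", 1/6)],
    "B": [("A", 1/2), ("C", 1/2)],
    "C": [("A", 1/2), ("B", 1/2)],
}

def random_walk(start, days, seed=0):
    """Record the city visited each day; the current city is the state."""
    rng = random.Random(seed)
    city, diary = start, [start]
    for _ in range(days - 1):
        cities, probs = zip(*TRANSITIONS[city])
        city = rng.choices(cities, weights=probs)[0]
        diary.append(city)
    return diary

print("".join(random_walk("A", 10)))  # a diary such as "A B C A A ..."
```

Each day's city is drawn using only the current city, which is the Markov property in action.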

4 Markov Chain
With sufficient samples, the visit frequencies converge to A : B : C = 0.6 : 0.2 : 0.2, independent of the starting city. For example, three runs of 10,000 days each recorded roughly
  (A, B, C) = (5915, 2025, 2060), (6016, 2002, 1982), (5946, 2016, 2038)
visits, i.e., about 0.6 : 0.2 : 0.2 in every run.
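This convergence is easy to reproduce: count visits over 10,000 simulated days from each possible starting city (the transition table is the reconstruction used above; the exact counts will differ from the slide's, but the ratios match):

```python
import random
from collections import Counter

# Same three-city chain as on the previous slide.
TRANSITIONS = {
    "A": [("A", 2/3), ("B", 1/6), ("C", 1/6)],
    "B": [("A", 1/2), ("C", 1/2)],
    "C": [("A", 1/2), ("B", 1/2)],
}

def visit_counts(start, days, seed):
    """Count how many of `days` days were spent in each city."""
    rng = random.Random(seed)
    city, counts = start, Counter()
    for _ in range(days):
        counts[city] += 1
        nxt, probs = zip(*TRANSITIONS[city])
        city = rng.choices(nxt, weights=probs)[0]
    return counts

results = {start: visit_counts(start, 10000, seed=42) for start in "ABC"}
for start, c in results.items():
    # Frequencies land near 0.6 / 0.2 / 0.2 regardless of the start city.
    print(start, c["A"] / 10000, c["B"] / 10000, c["C"] / 10000)
```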

5 Markov Chain
If P(A) = 0.6, P(B) = 0.2, P(C) = 0.2, the distribution does not change after one more step of the chain: it is the stationary distribution. Writing P_T(s'|s) for the transition probability from city s to city s', stationarity means
  P_T(A|A)P(A) + P_T(A|B)P(B) + P_T(A|C)P(C) = P(A)
  P_T(B|A)P(A) + P_T(B|B)P(B) + P_T(B|C)P(C) = P(B)
  P_T(C|A)P(A) + P_T(C|B)P(B) + P_T(C|C)P(C) = P(C)
For example, for A: (2/3)(0.6) + (1/2)(0.2) + (1/2)(0.2) = 0.6. (The same idea applies to web pages, e.g., PageRank.)
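The three stationarity equations can be checked mechanically by applying one step of the chain to the claimed distribution:

```python
# One step of the chain applied to the stationary distribution:
# sum over s of P_T(s'|s) * P(s) should give back P(s') for every s'.
P = {"A": 0.6, "B": 0.2, "C": 0.2}
P_T = {  # P_T[s][s2] = probability of moving from city s to city s2
    "A": {"A": 2/3, "B": 1/6, "C": 1/6},
    "B": {"A": 1/2, "B": 0.0, "C": 1/2},
    "C": {"A": 1/2, "B": 1/2, "C": 0.0},
}
P_next = {s2: sum(P_T[s][s2] * P[s] for s in P) for s2 in P}
print(P_next)  # {'A': 0.6, 'B': 0.2, 'C': 0.2} up to float rounding
```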

6 Unique stationary distribution
A Markov chain can have multiple stationary distributions; in that case, which one is reached depends on the starting state (e.g., in the diagram, a state whose self-loop has probability 1 traps the chain forever). A Markov chain that fulfills certain conditions, namely being irreducible and aperiodic, has a unique stationary distribution. A simple sufficient (but not necessary) condition: P_T(s'|s) > 0 for all states s and s'.
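A minimal illustration of non-uniqueness (a hypothetical two-state chain, since the slide's diagram is only partially legible, echoing its probability-1 self-loop): if every state is absorbing, each point mass is stationary, and the start state decides which one you stay in.

```python
# A reducible chain: from A you stay in A forever, from B you stay in B
# forever (self-loop probability 1). Both point masses are stationary.
P_T = {
    "A": {"A": 1.0, "B": 0.0},
    "B": {"A": 0.0, "B": 1.0},
}

def step(dist):
    """Apply one step of the chain to a distribution over states."""
    return {s2: sum(P_T[s][s2] * dist[s] for s in dist) for s2 in dist}

delta_A = {"A": 1.0, "B": 0.0}
delta_B = {"A": 0.0, "B": 1.0}
print(step(delta_A) == delta_A, step(delta_B) == delta_B)  # True True
```

This chain violates the positivity condition above (P_T(B|A) = 0), which is why uniqueness fails.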

7 Markov Chain from Gibbs Sampling
Gibbs sampling from P(z) (with z = (z1, ..., zN)) starts from z^(0) and, in each iteration t, draws
  z1^(t) ~ P(z1 | z2 = z2^(t-1), z3 = z3^(t-1), ..., zN = zN^(t-1))
  z2^(t) ~ P(z2 | z1 = z1^(t), z3 = z3^(t-1), ..., zN = zN^(t-1))
  z3^(t) ~ P(z3 | z1 = z1^(t), z2 = z2^(t), z4 = z4^(t-1), ..., zN = zN^(t-1))
  ...
  zN^(t) ~ P(zN | z1 = z1^(t), z2 = z2^(t), ..., z(N-1) = z(N-1)^(t))
and outputs z^(t) = (z1^(t), z2^(t), ..., zN^(t)). The sequence z^(1), z^(2), z^(3), ... is a Markov chain: z^(t) depends only on z^(t-1), so future states are independent of past states given the present state.

8 Markov Chain from Gibbs Sampling
To justify treating z^(1), z^(2), z^(3), ..., z^(T) as samples from P(z), we must prove two things about this Markov chain: it has a unique stationary distribution, and that stationary distribution is P(z).

9 Markov Chain from Gibbs Sampling
Does the Markov chain from Gibbs sampling have a unique stationary distribution? Yes, provided P_T(z'|z) > 0 for any z and z'. In one sweep, each coordinate update
  zi^(t) ~ P(zi | z1 = z1^(t), ..., z(i-1) = z(i-1)^(t), z(i+1) = z(i+1)^(t-1), ..., zN = zN^(t-1))
can produce any value for zi^(t), so a single sweep can reach any z' from any z. The transition probability is therefore positive as long as none of the conditional probabilities is zero.

10 Markov Chain from Gibbs Sampling
Show that P(z) is a stationary distribution, i.e.,
  sum over z of P_T(z'|z) P(z) = P(z')
where one systematic sweep has transition probability
  P_T(z'|z) = P(z1' | z2, z3, z4, ..., zN)
            x P(z2' | z1', z3, z4, ..., zN)
            x P(z3' | z1', z2', z4, ..., zN)
            x ...
            x P(zN' | z1', z2', z3', ..., z(N-1)')
This can be proved, for both random-scan and systematic sweeps. Since the chain has only one stationary distribution, and P(z) is stationary, Gibbs sampling converges to P(z), so we are done.
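The stationarity identity above can be verified exactly on a tiny discrete target. The particular numbers below are an arbitrary strictly positive P over two binary variables (an assumption for illustration); the systematic-sweep kernel is built from the exact conditionals, and summing P_T(z'|z) P(z) over z returns P(z') for every z':

```python
from itertools import product

# Tiny target over z = (z1, z2) with z_i in {0, 1}. Any strictly positive
# P works; these values are an arbitrary choice for the check.
P = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def cond(i, fixed_other):
    """Exact conditional P(z_i | z_other): renormalize over coordinate i."""
    weights = {}
    for v in (0, 1):
        z = (v, fixed_other) if i == 0 else (fixed_other, v)
        weights[v] = P[z]
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}

def sweep_kernel(z, z_new):
    """P_T(z'|z) for one systematic sweep: update z1 first, then z2."""
    p1 = cond(0, z[1])[z_new[0]]       # z1' drawn given the old z2
    p2 = cond(1, z_new[0])[z_new[1]]   # z2' drawn given the new z1'
    return p1 * p2

# Stationarity: sum over z of P_T(z'|z) * P(z) equals P(z') for every z'.
lhs = {
    z_new: sum(sweep_kernel(z, z_new) * P[z]
               for z in product((0, 1), repeat=2))
    for z_new in product((0, 1), repeat=2)
}
for z_new, total in lhs.items():
    print(z_new, round(total, 6), P[z_new])  # the two columns agree
```

The same cancellation that makes this sum collapse (marginalize out z2, then z1) is the core of the general proof.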

11 Thank you for your attention!

