Gibbs sampler - simple properties It's not hard to show that this Markov chain is aperiodic. Moreover, the target distribution $\pi$ is a reversible (and therefore stationary) distribution for it. If in addition the chain is irreducible (which depends on which elements of the state space have non-zero $\pi$-probability), then this MC is a correct MCMC algorithm for simulating random variables with distribution $\pi$.

Systematic Gibbs sampler A commonly used variant of the Gibbs sampler is the MC obtained by cycling systematically through the vertex set. For instance, for $V = \{v_1, \dots, v_k\}$ we may decide to update vertex $v_{(n \bmod k)+1}$ at time $n$, so that the vertices are visited in the fixed order $v_1, v_2, \dots, v_k, v_1, v_2, \dots$

Back to the Boolean cube In this case $V = \{1, \dots, n\}$, $S = \{0, 1\}$, and $\pi$ is the uniform distribution on $\{0,1\}^n$. In each step we pick a coordinate $i \in \{1, \dots, n\}$ uniformly at random and flip the $i$-th coordinate of the current state with probability $1/2$ (equivalently, we replace it by a fresh uniform random bit).
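To make the dynamics concrete, here is a minimal Python sketch (mine, not from the slides) of both the random-scan step described here and the systematic sweep from the previous slide:

```python
import random

def gibbs_step_random_scan(x):
    """One random-scan update: pick a coordinate uniformly at random,
    then set it to 0 or 1 with probability 1/2 each
    (equivalently: flip it with probability 1/2)."""
    i = random.randrange(len(x))
    x[i] = random.randint(0, 1)

def gibbs_sweep_systematic(x):
    """One systematic sweep: update coordinates 0, 1, ..., n-1 in order."""
    for i in range(len(x)):
        x[i] = random.randint(0, 1)

# Example: run 100 sweeps starting from the all-zeros state.
x = [0] * 10
for _ in range(100):
    gibbs_sweep_systematic(x)
print(x)
```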

Stationary distribution for the Boolean Cube We will show that the uniform distribution is reversible (and therefore stationary). Denote by $P(x, y)$ the transition probability from state $x$ to state $y$. Since $\pi$ is uniform, we need to show that $P(x, y) = P(y, x)$ for any two states $x, y$. Denote by $d(x, y)$ the number of indices in which $x$ and $y$ differ: ◦ Case no.1: $d(x, y) = 0$, i.e. $x = y$; then trivially $P(x, y) = P(y, x)$. ◦ Case no.2: $d(x, y) = 1$; then $P(x, y) = \frac{1}{n} \cdot \frac{1}{2} = P(y, x)$ (the differing coordinate is chosen with probability $1/n$ and set to the required value with probability $1/2$). ◦ Case no.3: $d(x, y) \ge 2$; then $P(x, y) = 0 = P(y, x)$, since a single update changes at most one coordinate.
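As a sanity check of the three cases, this short script (an illustration, assuming the random-scan chain above) builds $P$ for a small $n$ and verifies both symmetry and that rows sum to 1:

```python
from itertools import product

n = 3
states = list(product([0, 1], repeat=n))

def P(x, y):
    d = sum(a != b for a, b in zip(x, y))  # number of differing coordinates
    if d == 0:
        return 0.5          # any coordinate is chosen, kept as-is w.p. 1/2
    if d == 1:
        return 1 / (2 * n)  # choose the differing coordinate, set it right
    return 0.0              # one update changes at most one coordinate

assert all(abs(P(x, y) - P(y, x)) < 1e-12 for x in states for y in states)
assert all(abs(sum(P(x, y) for y in states) - 1) < 1e-12 for x in states)
print("P is symmetric, so the uniform distribution is reversible")
```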

Reminder - Coupling For a given MC with state space $S$ and transition matrix $P$, a coupling for the MC is a MC $(X_n, Y_n)$ on the state space $S \times S$ with transition probabilities defined by the requirement that each marginal moves according to $P$: $\Pr(X_{n+1} = x' \mid X_n = x, Y_n = y) = P(x, x')$ and $\Pr(Y_{n+1} = y' \mid X_n = x, Y_n = y) = P(y, y')$.

Reminder (cont.) - The coupling lemma For a coupling $(X_n, Y_n)$ based on a ground MC on $S$: suppose $t : (0, 1] \to \mathbb{N}$ is a function satisfying the condition: for all initial states $x, y \in S$ and all $\varepsilon \in (0, 1]$: $\Pr(X_{t(\varepsilon)} \neq Y_{t(\varepsilon)} \mid X_0 = x, Y_0 = y) \le \varepsilon$; then the mixing time $\tau(\varepsilon)$ for the ground MC is bounded by $t(\varepsilon)$.

Fast convergence of the Gibbs sampler for the Boolean cube Coupling: we run two Markov chains $(X_n)$ and $(Y_n)$ simultaneously. We go through the coordinates systematically. If at the update of coordinate $i$ it holds that $X(i) = Y(i)$, we flip both of them together with probability $1/2$. Otherwise we flip each one separately and independently, with the same probability $1/2$.
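A simulation sketch of this coupling (assumptions as in the slide; code mine):

```python
import random

def coupled_sweep(x, y):
    """One systematic sweep of the coupled chains."""
    for i in range(len(x)):
        if x[i] == y[i]:
            b = random.randint(0, 1)
            x[i] = y[i] = b                  # shared randomness: stay equal
        else:
            x[i] = random.randint(0, 1)      # independent updates: agree
            y[i] = random.randint(0, 1)      # afterwards with probability 1/2

n = 20
x, y = [0] * n, [1] * n                      # start from opposite corners
sweeps = 0
while x != y:
    coupled_sweep(x, y)
    sweeps += 1
print("coalesced after", sweeps, "sweeps")   # typically about log2(n)
```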

Fast convergence of the Gibbs sampler for the Boolean cube For any coordinate $i$: once the chains agree there they stay agreed, so after one cycle through the coordinates: $\Pr(X_n(i) \neq Y_n(i)) \le 1/2$. After $t$ cycles through the chain: $\Pr(X_{tn}(i) \neq Y_{tn}(i)) \le 2^{-t}$. The event $X_{tn} \neq Y_{tn}$ implies that $X_{tn}(i) \neq Y_{tn}(i)$ for at least one coordinate $i$, meaning that (by a union bound): $\Pr(X_{tn} \neq Y_{tn}) \le n \cdot 2^{-t}$.

Fast convergence of the Gibbs sampler for the Boolean cube Setting $n \cdot 2^{-t} = \varepsilon$ and solving for $t$ gives: $t = \log_2(n / \varepsilon)$. Now, using the coupling lemma, it means that the mixing time is bounded by $\lceil \log_2(n/\varepsilon) \rceil$ sweeps, i.e. $n \lceil \log_2(n/\varepsilon) \rceil$ single-coordinate steps.
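As a worked example (numbers mine, not from the slides): for $n = 1000$ coordinates and $\varepsilon = 0.01$,

$$t = \lceil \log_2(n/\varepsilon) \rceil = \lceil \log_2(100{,}000) \rceil = \lceil 16.61 \rceil = 17 \text{ sweeps},$$

i.e. $17 \cdot 1000 = 17{,}000$ single-coordinate steps suffice.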

Gibbs sampler for q-colorings For a vertex $v$ and an assignment $\xi$ of colors to the vertices other than $v$, the conditional distribution of the color at $v$ is uniform over the set of all colors in $\{1, \dots, q\}$ that are not attained in $\xi$ at some neighbor of $v$. A Gibbs sampler for random q-colorings is therefore a $\{1, \dots, q\}^V$-valued MC where at each time $n+1$, transitions take place as follows: ◦ Pick a vertex $v \in V$ uniformly at random. ◦ Pick $X_{n+1}(v)$ according to the uniform distribution over the set of colors that are not attained at any neighbor of $v$. ◦ Leave the color unchanged at all other vertices, i.e. let $X_{n+1}(w) = X_n(w)$ for all vertices $w \neq v$.
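A minimal sketch of one such transition (code mine; `graph` is an adjacency map of vertex to neighbor set, colors are $0, \dots, q-1$, and $q$ is assumed large enough that the allowed set is never empty):

```python
import random

def gibbs_update(graph, coloring, q):
    """One Gibbs update for random proper q-colorings."""
    v = random.choice(list(graph))                 # uniform random vertex
    blocked = {coloring[u] for u in graph[v]}      # colors at v's neighbors
    allowed = [c for c in range(q) if c not in blocked]
    coloring[v] = random.choice(allowed)           # uniform over allowed colors
    # all other vertices keep their colors

# Example: a 4-cycle with q = 5 colors, starting from a proper coloring.
graph = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
coloring = {0: 0, 1: 1, 2: 0, 3: 1}
for _ in range(1000):
    gibbs_update(graph, coloring, 5)
print(coloring)
```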

Gibbs sampler for q-colorings It has the uniform distribution $\rho_{G,q}$ over proper q-colorings as a stationary distribution; the proof is similar to the hard-core model case, based on the fact that the chain is reversible with respect to $\rho_{G,q}$. Whether or not the chain is irreducible depends on $G$ and $q$, and it is a nontrivial question to determine this in general. For instance, an even cycle with $q = 2$ has exactly two proper colorings, and no single-vertex update can move between them. But for $q$ big enough ($q \ge d + 2$, where $d$ is the maximum degree, is always enough) it is.

Fast convergence of Gibbs sampler for q-coloring Theorem [HAG 8.1]: Let $G = (V, E)$ be a graph. Let $k$ be the number of vertices in $G$, and suppose any vertex has at most $d$ neighbors. Suppose furthermore that $q > 2d^2$. Then, for any fixed $\varepsilon > 0$, the number of iterations needed for the systematic sweep Gibbs sampler described above (starting from any fixed q-coloring $\xi$) to come within total variation distance $\varepsilon$ of the target distribution $\rho_{G,q}$ is at most: $k \left( \left\lceil \frac{\log k + \log \varepsilon^{-1}}{\log\left(\frac{q}{2d^2}\right)} \right\rceil + 1 \right)$.

Fast convergence of Gibbs sampler for q-coloring Comments: ◦ The theorem can be proved under the weaker condition $q > 2d$ (rather than $q > 2d^2$). ◦ The important feature of the bound is its form: at most $C\,k(\log k + \log \varepsilon^{-1})$ iterations for some constant $C = C(q, d)$, i.e. $O(k \log(k/\varepsilon))$.

Fast convergence of Gibbs sampler for q-coloring Coupling: we run two $\{1, \dots, q\}^V$-valued Markov chains $(X_n)$ and $(Y_n)$ simultaneously. When a vertex is chosen to be updated, we pick a new color for it uniformly at random among the set of colors that are not attained by any of its neighbors. Concretely: Let $R_1, R_2, \dots$ be an i.i.d. sequence of random permutations of $\{1, \dots, q\}$, each of them uniformly distributed on the set of the $q!$ such permutations.

Fast convergence of Gibbs sampler for q-coloring (cont.) At each time $n$, the updates of the 2 chains use the permutation $R_n$, and the vertex $v$ to be updated is assigned the new value $X_n(v) = R_n(i)$ where: $i = \min\{ j : R_n(j) \text{ is not attained at any neighbor of } v \text{ in } X_{n-1} \}$ in the first chain. In the second chain, we set $Y_n(v) = R_n(i')$ where: $i' = \min\{ j : R_n(j) \text{ is not attained at any neighbor of } v \text{ in } Y_{n-1} \}$. Since $R_n$ is a uniform permutation, each chain's marginal update is still uniform over its own allowed colors.
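A sketch of this coupled update (code mine; same conventions as the earlier coloring sketch, and $q > d$ assumed so an allowed color always exists):

```python
import random

def coupled_update(graph, colX, colY, q, v):
    """Update vertex v in both chains using one shared random permutation."""
    perm = list(range(q))
    random.shuffle(perm)                           # the shared permutation R_n
    blockedX = {colX[u] for u in graph[v]}
    blockedY = {colY[u] for u in graph[v]}
    colX[v] = next(c for c in perm if c not in blockedX)
    colY[v] = next(c for c in perm if c not in blockedY)
    return colX[v] == colY[v]                      # True iff the update succeeded
```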

Successful and not-successful updates If at some (random, hopefully not too large) time $n$ it holds that $X_n = Y_n$, then we will also have $X_m = Y_m$ for all $m \ge n$. First consider the probability that $X_n(v) = Y_n(v)$ for a given vertex $v$ updated at time $n$. We call an update of the 2 chains at time $n$ successful if it results in having $X_n(v) = Y_n(v)$. Otherwise - we say that the update failed.

Fast convergence of Gibbs sampler for q-coloring Notations: when vertex $v$ is updated at time $n$, let $B_X = \{X_{n-1}(u) : u \sim v\}$ and $B_Y = \{Y_{n-1}(u) : u \sim v\}$ be the sets of colors blocked at $v$ in the two chains, and let $m$ be the number of neighbors $u$ of $v$ with $X_{n-1}(u) \neq Y_{n-1}(u)$. The event of having a successful update has probability at least: $\frac{q - |B_X \cup B_Y|}{q - |B_X \cap B_Y|}$ (the update succeeds whenever the first color of $R_n$ outside $B_X \cap B_Y$ also lies outside $B_X \cup B_Y$). Respectively, the failure probability is at most: $\frac{|B_X \,\triangle\, B_Y|}{q - |B_X \cap B_Y|}$. Using that $|B_X \,\triangle\, B_Y| \le 2m$ and that $|B_X \cap B_Y| \le d$ we get: $\Pr(\text{failed update}) \le \frac{2m}{q-d} \le \frac{2dm}{q}$ (we only used $q > 2d^2$ and some algebra).
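Under this reconstruction (my reading of the garbled slide, assuming $d \ge 2$), the "some algebra" is:

$$\frac{2m}{q-d} \le \frac{2dm}{q} \iff q \le d(q-d) \iff q(d-1) \ge d^2,$$

which holds because $q > 2d^2$ gives $q(d-1) \ge 2d^2(d-1) \ge d^2$ for $d \ge 2$.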

Fast convergence of Gibbs sampler for q-coloring Hence, we have, after $k$ steps of the MC-s (one full sweep), that for each vertex $v$: $\Pr(X_k(v) \neq Y_k(v)) \le \frac{2d^2}{q}$ (take $m \le d$ above). Now, consider updates during the 2nd sweep of the Gibbs sampler, i.e. between times $k$ and $2k$. For an update at time $n$ during the 2nd sweep to fail, the configurations $X_{n-1}$ and $Y_{n-1}$ need to differ in at least one neighbor of $v$. Each neighbor has a discrepancy with probability at most $\frac{2d^2}{q}$. By summing over at most $d$ neighbors we get that: $\Pr(\text{failed update during 2nd sweep}) \le d \cdot \frac{2d}{q} \cdot \frac{2d^2}{q} = \left(\frac{2d^2}{q}\right)^2$.

Fast convergence of Gibbs sampler for q-coloring Here "discrepancy" at the update of $v$ means that there exists a neighbor $u$ of $v$ with $X_{n-1}(u) \neq Y_{n-1}(u)$. By repeating the arguments above we get: $\Pr(X_{2k}(v) \neq Y_{2k}(v)) \le \left(\frac{2d^2}{q}\right)^2$. Hence, after $2k$ steps of the Markov chains, each vertex has probability at most: $\left(\frac{2d^2}{q}\right)^2$ of disagreeing. By arguing the same for the third sweep: $\Pr(X_{3k}(v) \neq Y_{3k}(v)) \le \left(\frac{2d^2}{q}\right)^3$.

Fast convergence of Gibbs sampler for q-coloring and for any $t$: $\Pr(X_{tk}(v) \neq Y_{tk}(v)) \le \left(\frac{2d^2}{q}\right)^t$. The event $X_{tk} \neq Y_{tk}$ implies that $X_{tk}(v) \neq Y_{tk}(v)$ for at least one vertex $v$, meaning that: $\Pr(X_{tk} \neq Y_{tk}) \le k \left(\frac{2d^2}{q}\right)^t$.

Fast convergence of Gibbs sampler for q-coloring Summary: By setting $k \left(\frac{2d^2}{q}\right)^t = \varepsilon$ and solving for $t$, we get by using the coupling lemma that for: $t \ge \frac{\log k + \log \varepsilon^{-1}}{\log\left(\frac{q}{2d^2}\right)}$ it holds that the total variation distance after $t$ scans is at most $\varepsilon$, i.e. the mixing time is bounded by $t$ scans.

Fast convergence of Gibbs sampler for q-coloring To go from the number of scans to the number of steps of the MC, we have to multiply by $k$, giving that: $k \cdot \frac{\log k + \log \varepsilon^{-1}}{\log\left(\frac{q}{2d^2}\right)}$ steps should be enough. In order to make sure that $t$ gets an integer value we should take $t$ to be: $t = \left\lceil \frac{\log k + \log \varepsilon^{-1}}{\log\left(\frac{q}{2d^2}\right)} \right\rceil$, which recovers the bound stated in the theorem.
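As a worked example (numbers mine, not from the slides): for a graph with $k = 1000$ vertices, maximum degree $d = 3$, $q = 19 > 2d^2 = 18$ colors and $\varepsilon = 0.01$:

$$t = \left\lceil \frac{\log 1000 + \log 100}{\log(19/18)} \right\rceil = \left\lceil \frac{11.51}{0.0541} \right\rceil = 213 \text{ scans},$$

i.e. about $213 \cdot 1000 = 213{,}000$ single-vertex updates.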