Randomized Algorithms
The Monte Carlo Method
Speaker: Chuang-Chieh Lin
Advisor: Professor Maw-Shang Chang
National Chung Cheng University, 2018/11/21
References
Professor S. C. Tsai's slides.
[MR95] Rajeev Motwani and Prabhakar Raghavan, Randomized Algorithms.
[MU05] Michael Mitzenmacher and Eli Upfal, Probability and Computing: Randomized Algorithms and Probabilistic Analysis.
Wikipedia, the free encyclopedia.
Outline
Introduction
The Monte Carlo Method: PRAS and FPRAS
The DNF Counting Problem
DNF counting algorithms: a naïve approach and an FPRAS
Approximate counting from FPAUS
Introduction
The Monte Carlo method refers to a collection of tools for estimating values through sampling and simulation. Monte Carlo techniques are used extensively in almost all areas of the physical sciences and engineering.
Introduction (cont'd)
Let us first consider the following approach for estimating the value of the constant π.
Estimating π
Let (X, Y) be a point chosen uniformly at random from the 2 × 2 square centered at the origin; that is, X and Y are chosen independently and uniformly from [−1, 1]. The corners of the square are (1, 1), (−1, 1), (−1, −1), and (1, −1).
Estimating π (cont'd)
If we let Z = 1 if X² + Y² ≤ 1 and Z = 0 otherwise, then the probability that Z = 1 is exactly the ratio of the area of the unit circle to the area of the square. Hence
Pr[Z = 1] = π/4.
Estimating π (cont'd)
Assume we run this experiment m times, with Z_i being the value of Z at the i-th run. If W = Σ_{i=1}^m Z_i, then E[W] = Σ_{i=1}^m E[Z_i] = mπ/4. Hence W' = (4/m)W is a natural estimator for π.
Estimating π (cont'd)
Applying the Chernoff bound, we have
Pr[|W' − π| ≥ επ] = Pr[|W − mπ/4| ≥ εmπ/4] ≤ 2e^{−mπε²/12}.
Therefore, by using a sufficiently large number of samples we can obtain, with high probability, as tight an approximation of π as we wish.
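To make the estimator concrete, here is a small Python sketch (our addition, not part of the original slides; the function name and the sample count in the usage line are illustrative):

```python
import math
import random

def estimate_pi(m: int) -> float:
    """Estimate pi by sampling m points uniformly from the square [-1, 1]^2."""
    hits = 0
    for _ in range(m):
        x = random.uniform(-1.0, 1.0)
        y = random.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:   # the point falls inside the unit circle
            hits += 1
    return 4.0 * hits / m          # W' = (4/m) * W

# With m >= 12 ln(2/delta) / (pi * eps^2) samples, the estimate is within eps*pi
# of pi with probability at least 1 - delta (by the Chernoff bound above).
print(estimate_pi(1_000_000), math.pi)
```

With a million samples the output typically agrees with π to within a few thousandths.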
(ε, δ)-approximation
Definition: A randomized algorithm gives an (ε, δ)-approximation for the value V if the output X of the algorithm satisfies
Pr[|X − V| ≤ εV] ≥ 1 − δ.
The above method for estimating π gives an (ε, δ)-approximation, as long as ε < 1 and m is large enough; specifically, 2e^{−mπε²/12} ≤ δ whenever m ≥ (12 ln(2/δ)) / (πε²).
We may generalize the idea behind our technique for estimating π to provide a relation between the number of samples and the quality of the approximation. We use the following simple application of the Chernoff bound throughout our discussion.
Theorem 1
Let X_1, ..., X_m be independent and identically distributed 0–1 (indicator) random variables with μ = E[X_i]. If m ≥ (3 ln(2/δ)) / (ε²μ), then
Pr[ |(1/m) Σ_{i=1}^m X_i − μ| ≥ εμ ] ≤ δ.
That is, m samples provide an (ε, δ)-approximation for μ.
Proof: Exercise!
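As a quick sanity check (our addition, not on the original slide), plugging μ = Pr[Z = 1] = π/4 into Theorem 1 recovers exactly the sample bound quoted above for the π estimator:

```latex
m \;\ge\; \frac{3\ln(2/\delta)}{\varepsilon^{2}\mu}
  \;=\; \frac{3\ln(2/\delta)}{\varepsilon^{2}\,(\pi/4)}
  \;=\; \frac{12\ln(2/\delta)}{\pi\varepsilon^{2}}.
```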
Approximation Schemes
There are problems for which the existence of an efficient (polynomial time) algorithm that gives an exact answer would imply that P = NP. Hence it is unlikely that such an algorithm will be found, so we turn to approximation algorithms instead.
Approximation Schemes (cont'd)
For approximation algorithms, there are several important approximation schemes:
Polynomial time approximation schemes (PTAS)
Fully polynomial time approximation schemes (FPTAS)
Polynomial randomized approximation schemes (PRAS)
Fully polynomial randomized approximation schemes (FPRAS): we will focus on this scheme in this talk.
Notes
Here we are considering counting problems that map inputs x to values V(x). For example, given a graph, we might want to know an approximation to the number of independent sets in the graph.
PRAS
A PRAS for a problem is a randomized algorithm for which, given an input x and any parameters ε and δ with 0 < ε, δ < 1, the algorithm outputs an (ε, δ)-approximation to V(x), i.e., Pr[|V − V(x)| ≤ εV(x)] ≥ 1 − δ, in time polynomial in the size of x.
So, what is FPRAS?
FPRAS
An FPRAS for a problem is a randomized algorithm for which, given an input x and any parameters ε and δ with 0 < ε, δ < 1, the algorithm outputs an (ε, δ)-approximation to V(x) in time polynomial in 1/ε, ln(1/δ), and the size of x.
The DNF Counting Problem
Let us consider the problem of counting the number of satisfying assignments of a Boolean formula in disjunctive normal form (DNF).
The DNF Counting Problem (cont'd)
Definition: a DNF formula is a disjunction of clauses C_1 ∨ C_2 ∨ ... ∨ C_t, where each clause is a conjunction of literals. For example, the following is a DNF formula:
(x_1 ∧ ¬x_2 ∧ x_3) ∨ (x_2 ∧ ¬x_4).
The DNF Counting Problem (cont'd)
Counting the number of satisfying assignments of a DNF formula is actually #P-complete (#P is pronounced "sharp-P"). What is #P?
#P
A problem is in the class #P if there is a polynomial time, nondeterministic Turing machine such that, for any input I, the number of accepting computations equals the number of different solutions associated with the input I. Clearly, a #P problem must be at least as hard as the corresponding NP problem.
#P-complete
A problem is #P-complete if and only if it is in #P and every problem in #P can be reduced to it in polynomial time. Counting the number of Hamiltonian cycles in a graph and counting the number of perfect matchings in a bipartite graph are examples of #P-complete problems.
How Hard Is the DNF Counting Problem?
Given any CNF formula H, we can compute its negation F, which is a DNF formula, by applying De Morgan's laws. The number of satisfying assignments of H is then 2^n minus the number of satisfying assignments of F; in particular, H is satisfiable if and only if F has fewer than 2^n satisfying assignments. For example, the CNF formula (¬x_1 ∨ x_2 ∨ ¬x_3) ∧ (¬x_2 ∨ x_4) and the DNF formula (x_1 ∧ ¬x_2 ∧ x_3) ∨ (x_2 ∧ ¬x_4) from the previous slide are negations of each other.
How Hard Is the DNF Counting Problem? (cont'd)
So the DNF counting problem is at least as hard as solving the NP-complete problem SAT. In fact, the DNF counting problem is a #P-complete problem. Why?
How Hard Is the DNF Counting Problem? (cont'd)
It is unlikely that there is a polynomial time algorithm that computes the exact number of solutions of a #P-complete problem; such an algorithm would imply that P = NP. It is therefore interesting to find an approximation scheme, such as an FPRAS, for the number of satisfying assignments of a DNF formula.
A Naïve Algorithm
Let c(F) be the number of satisfying assignments of a DNF formula F. Here we assume that c(F) > 0, since it is easy to check whether c(F) = 0 before running our sampling algorithm.
DNF counting algorithm I
Input: A DNF formula F with n variables.
Output: Y = an approximation of c(F).
1. X ← 0.
2. For k = 1 to m, do:
   (a) Generate a uniformly random assignment for the n variables.
   (b) If the assignment satisfies F, then X ← X + 1.
3. Return Y ← (X/m) · 2^n.
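For concreteness, here is a Python sketch of Algorithm I (our illustration; the clause representation, a list of (variable index, sign) pairs, is an assumption we make here, not something fixed by the slides):

```python
import random

# A DNF formula is a list of clauses; a clause is a list of (variable_index, is_positive).
Formula = list[list[tuple[int, bool]]]

def satisfies(assignment: list[bool], formula: Formula) -> bool:
    """True if the assignment satisfies at least one clause of the DNF formula."""
    return any(all(assignment[v] == pos for v, pos in clause) for clause in formula)

def dnf_count_naive(formula: Formula, n: int, m: int) -> float:
    """Algorithm I: sample m uniform assignments and scale the hit rate by 2^n."""
    hits = 0
    for _ in range(m):
        assignment = [random.random() < 0.5 for _ in range(n)]
        if satisfies(assignment, formula):
            hits += 1
    return (hits / m) * 2 ** n

# Example: (x0 and not x1) or (x2), over n = 3 variables; exact count is 5.
f = [[(0, True), (1, False)], [(2, True)]]
print(dnf_count_naive(f, n=3, m=100_000))
```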
Analysis
For k = 1, ..., m, let
X_k = 1 if the k-th iteration of the loop generated a satisfying assignment, and X_k = 0 otherwise.
Then Pr[X_k = 1] = c(F)/2^n, and hence
E[Y] = E[(2^n/m) Σ_{k=1}^m X_k] = c(F).
Analysis (cont'd)
By Theorem 1, X/m gives an (ε, δ)-approximation of c(F)/2^n, and hence Y gives an (ε, δ)-approximation of c(F), provided
m ≥ (3 · 2^n ln(2/δ)) / (ε² c(F)).
If c(F) is much smaller than 2^n, say polynomial in n, this number of samples is exponential in n. Oops!
Analysis (cont'd)
What is the problem?
The problem with this sampling approach is that the set of satisfying assignments might not be sufficiently dense in the set of all assignments.
Revising Algorithm I
We now revise the naïve algorithm to obtain an FPRAS. Let a DNF formula F = C_1 ∨ C_2 ∨ ... ∨ C_t. Assume WLOG that no clause includes both a variable and its negation. If clause C_i has l_i literals, then there are exactly 2^{n − l_i} satisfying assignments for C_i.
Revising Algorithm I (cont'd)
Let SC_i denote the set of assignments that satisfy clause C_i, and let U = {(i, a) : 1 ≤ i ≤ t and a ∈ SC_i}. Notice that |U| = Σ_{i=1}^t |SC_i|. The value we want to estimate is c(F) = |∪_{i=1}^t SC_i|. Since an assignment may satisfy several clauses, c(F) ≤ |U|, and it is easy to sample uniformly from U.
Revising Algorithm I (cont'd)
To estimate c(F), we define a subset S of U with size c(F). We construct this set by selecting, for each satisfying assignment of F, exactly one pair of U that has this assignment, namely the pair with the smallest clause index. Specifically, we consider the following set S:
S = {(i, a) : 1 ≤ i ≤ t, a ∈ SC_i, and a ∉ SC_j for all j < i}.
Then |S| = c(F).
Then let us consider the second DNF counting algorithm.
DNF counting algorithm II
Input: A DNF formula F with n variables.
Output: Y = an approximation of c(F).
1. X ← 0.
2. For k = 1 to m, do:
   (a) With probability |SC_i| / Σ_{j=1}^t |SC_j|, choose clause i, and then choose an assignment a uniformly at random from SC_i. (This chooses a pair (i, a) uniformly from U.)
   (b) If a is not in any SC_j with j < i (i.e., (i, a) ∈ S), then X ← X + 1.
3. Return Y ← (X/m) · |U|.
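A matching Python sketch of Algorithm II, under the same assumed clause representation as before (again our illustration, not the authors' code). Sampling uniformly from SC_i is easy: fix the literals of C_i and choose the remaining variables uniformly at random.

```python
import random

Formula = list[list[tuple[int, bool]]]  # clause = list of (variable_index, is_positive)

def dnf_count_fpras(formula: Formula, n: int, m: int) -> float:
    """Algorithm II: sample pairs (i, a) uniformly from U and count those in S."""
    sizes = [2 ** (n - len(clause)) for clause in formula]   # |SC_i| = 2^(n - l_i)
    u_size = sum(sizes)                                       # |U| = sum_i |SC_i|
    hits = 0
    for _ in range(m):
        # Step 2(a): choose clause i with probability |SC_i|/|U|, then a uniform a in SC_i.
        i = random.choices(range(len(formula)), weights=sizes)[0]
        a = [random.random() < 0.5 for _ in range(n)]
        for v, pos in formula[i]:
            a[v] = pos
        # Step 2(b): count (i, a) only if no earlier clause is satisfied by a.
        if not any(all(a[v] == pos for v, pos in formula[j]) for j in range(i)):
            hits += 1
    return (hits / m) * u_size                                # Y = (X/m) * |U|

f = [[(0, True), (1, False)], [(2, True)]]
print(dnf_count_fpras(f, n=3, m=100_000))   # should be close to 5
```

Unlike Algorithm I, the number of samples needed here depends only on t, ε, and δ (Theorem 2 below), not on the ratio 2^n / c(F).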
Analysis
Remark: |U| = Σ_{i=1}^t |SC_i| and |S| = c(F).
Note that |S| / |U| ≥ 1/t, since each assignment can satisfy at most t different clauses (alternatively, think of it as |U| ≤ t|S|). So our S is relatively dense in U. Because the i-th clause has |SC_i| satisfying assignments, we have
Pr[i is chosen in step 2(a)] = |SC_i| / |U|.
Analysis (cont'd)
Thus the probability that we choose the pair (i, a) is
Pr[(i, a) is chosen] = (|SC_i| / |U|) · (1 / |SC_i|) = 1 / |U|.
Theorem 2
DNF counting algorithm II is an FPRAS for the DNF counting problem when m = ⌈(3t/ε²) ln(2/δ)⌉.
Proof of Theorem 2
Step 2(a) chooses an element of U uniformly at random. The probability that this element belongs to S is at least 1/t (by the previous analysis). Fix any ε, δ > 0 and let m = ⌈(3t/ε²) ln(2/δ)⌉. Then m is polynomial in t, 1/ε, and ln(1/δ).
Proof of Theorem 2 (cont'd)
Besides, the processing time of each sample is polynomial in t; you can check this by inspecting steps 2(a) and 2(b). By Theorem 1, with m samples, X/m gives an (ε, δ)-approximation of c(F)/|U|, and hence Y gives an (ε, δ)-approximation of c(F).
Approximate uniform sampling
Now we are going to present the outline of a general reduction. This general reduction shows that, if we can sample almost uniformly a solution to a self-reducible combinatorial problem, then we can construct a randomized algorithm that approximately counts the number of solutions to the problem.
Approximate uniform sampling (cont'd)
We will demonstrate this technique for the problem of counting the number of independent sets in a graph. We first need to formulate the concept of approximate uniform sampling.
Approximate uniform sampling (cont'd)
In this setting, we are given a problem instance in the form of an input x, and there is an underlying finite sample space Ω(x) associated with the input. Let us see the following two definitions to make clear the concept of approximate uniform sampling.
ε-uniform sample
Let w be the (random) output of a sampling algorithm for a finite sample space Ω. The sampling algorithm generates an ε-uniform sample of Ω if, for any subset S of Ω,
| Pr[w ∈ S] − |S|/|Ω| | ≤ ε.
FPAUS
A sampling algorithm is a fully polynomial almost uniform sampler (FPAUS) for a problem if, given an input x and a parameter ε > 0, it generates an ε-uniform sample of Ω(x) and runs in time polynomial in ln(1/ε) and the size of x.
FPRAS through FPAUS
Consider an FPAUS for independent sets, which would take as input a graph G(V, E) and a parameter ε. The sample space Ω: the set of all independent sets in G.
FPRAS through FPAUS (cont'd)
Goal: Given an FPAUS for independent sets, we construct an FPRAS for counting the number of independent sets. Assume G has m edges, and let e_1, ..., e_m be an arbitrary ordering of the edges.
FPRAS through FPAUS (cont'd)
Let E_i be the set of the first i edges in E and let G_i = (V, E_i). Note that G = G_m, and G_{i−1} is obtained from G_i by removing the single edge e_i. Let Ω(G_i) denote the set of independent sets in G_i.
FPRAS through FPAUS (cont'd)
The number of independent sets in G can be expressed as
|Ω(G)| = (|Ω(G_m)| / |Ω(G_{m−1})|) × (|Ω(G_{m−1})| / |Ω(G_{m−2})|) × ... × (|Ω(G_1)| / |Ω(G_0)|) × |Ω(G_0)|.
|Ω(G_0)| = 2^n. Why? Since G_0 has no edges, every subset of V is an independent set, so |Ω(G_0)| = 2^n.
To estimate |Ω(G)|, we need good estimates for the ratios
r_i = |Ω(G_i)| / |Ω(G_{i−1})|, for i = 1, ..., m.
(The r_i are exactly the self-reducibility idea mentioned earlier.)
FPRAS through FPAUS (cont'd)
Let r̃_i be an estimate for r_i; our estimate for |Ω(G)| is then 2^n ∏_{i=1}^m r̃_i. To evaluate its quality, let R = ∏_{i=1}^m (r̃_i / r_i); we would like an (ε, δ)-approximation, i.e.,
Pr[|R − 1| ≤ ε] ≥ 1 − δ.
Let us see the following lemma.
Lemma 1
Suppose that for all i, 1 ≤ i ≤ m, r̃_i is an (ε/2m, δ/m)-approximation for r_i. Then
Pr[|R − 1| ≤ ε] ≥ 1 − δ.
(By the definition of (ε, δ)-approximation randomized algorithms.)
Proof of Lemma 1
By the assumption of Lemma 1, for each 1 ≤ i ≤ m, we have
Pr[ |r̃_i − r_i| ≤ (ε/2m) r_i ] ≥ 1 − δ/m.
Equivalently, for each 1 ≤ i ≤ m,
Pr[ |r̃_i − r_i| > (ε/2m) r_i ] < δ/m.
Proof of Lemma 1 (cont'd)
By the union bound, we have
Pr[ ∪_{i=1}^m { |r̃_i − r_i| > (ε/2m) r_i } ] < δ.
Then we obtain
Pr[ ∩_{i=1}^m { |r̃_i − r_i| ≤ (ε/2m) r_i } ] ≥ 1 − δ.
Equivalently,
Pr[ ∩_{i=1}^m { 1 − ε/2m ≤ r̃_i/r_i ≤ 1 + ε/2m } ] ≥ 1 − δ.
Proof of Lemma 1 (cont'd)
Thus, with probability at least 1 − δ, we have
(1 − ε/2m)^m ≤ ∏_{i=1}^m (r̃_i / r_i) ≤ (1 + ε/2m)^m.
Therefore,
Pr[ 1 − ε ≤ R ≤ 1 + ε ] ≥ 1 − δ,
since (1 − ε/2m)^m ≥ 1 − ε and (1 + ε/2m)^m ≤ 1 + ε.
Lemma 1 (recalled)
Suppose that for all i, 1 ≤ i ≤ m, r̃_i is an (ε/2m, δ/m)-approximation for r_i. Then Pr[|R − 1| ≤ ε] ≥ 1 − δ.
Estimating r_i
Hence all we need is a method for obtaining an (ε/2m, δ/m)-approximation algorithm for the r_i. An algorithm for estimating r_i is given as follows.
Estimating r_i
Input: Graphs G_{i−1} = (V, E_{i−1}) and G_i = (V, E_i).
Output: r̃_i = an approximation of r_i.
1. X ← 0.
2. Repeat for M = ⌈1296 m² ε^{−2} ln(2m/δ)⌉ independent trials:
   (a) Generate an (ε/6m)-uniform sample from Ω(G_{i−1}).
   (b) If the sample is an independent set in G_i, then X ← X + 1.
3. Return r̃_i ← X/M.
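A Python sketch of this procedure (our illustration). The sampler `fpaus_sample(edges, eps)` is a placeholder for the assumed FPAUS; nothing on these slides specifies how it works (see the MCMC discussion later).

```python
import math

def is_independent_in(sample: set[int], edges: list[tuple[int, int]]) -> bool:
    """True if the vertex set `sample` contains no edge from `edges`."""
    return all(not (u in sample and v in sample) for u, v in edges)

def estimate_ratio(edges_prev, edge_i, m, eps, delta, fpaus_sample):
    """Estimate r_i = |Omega(G_i)| / |Omega(G_{i-1})| by sampling from Omega(G_{i-1})."""
    M = math.ceil(1296 * m * m * eps ** -2 * math.log(2 * m / delta))
    hits = 0
    for _ in range(M):
        # (eps/6m)-uniform sample from the independent sets of G_{i-1} (assumed sampler).
        sample = fpaus_sample(edges_prev, eps / (6 * m))
        # The sample lies in Omega(G_i) iff it also avoids the extra edge e_i.
        if is_independent_in(sample, [edge_i]):
            hits += 1
    return hits / M
```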
Estimating r_i (cont'd)
The constants in the procedure were chosen to facilitate the proof of the following lemma, which establishes the algorithm's approximation guarantee.
Lemma 2
When m ≥ 1 and 0 < ε ≤ 1, the procedure for estimating r_i yields an (ε/2m, δ/m)-approximation for r_i.
Proof of Lemma 2
First we will show that r_i is not too small, avoiding the problem we encountered with the naïve algorithm. Suppose G_{i−1} and G_i differ in that the edge {u, v} is in G_i but not in G_{i−1}.
Ω(G_i) ⊆ Ω(G_{i−1}), since an independent set in G_i is also an independent set in G_{i−1}.
Proof of Lemma 2 (cont'd)
Each independent set in Ω(G_{i−1}) \ Ω(G_i) contains both u and v. Why?
Associate each I ∈ Ω(G_{i−1}) \ Ω(G_i) with the independent set I \ {v} ∈ Ω(G_i).
In this mapping, note that each I' ∈ Ω(G_i) is associated with no more than one independent set I' ∪ {v} ∈ Ω(G_{i−1}) \ Ω(G_i); thus |Ω(G_{i−1}) \ Ω(G_i)| ≤ |Ω(G_i)|.
(First point: if some independent set I_k in Ω(G_{i−1}) \ Ω(G_i) did not contain both u and v, then after adding the edge {u, v} to G_{i−1} to form G_i, I_k would still belong to Ω(G_i), a contradiction.
Third point: each I' is associated with at most one of I' ∪ {v} and I' ∪ {u}; if both were in Ω(G_{i−1}) \ Ω(G_i), the first point would be violated.)
Proof of Lemma 2 (cont'd)
It follows that
r_i = |Ω(G_i)| / |Ω(G_{i−1})| ≥ |Ω(G_i)| / (2 |Ω(G_i)|) = 1/2.
Now consider our M samples. Let a random variable X_k = 1 if the k-th sample is in Ω(G_i), and X_k = 0 otherwise.
Proof of Lemma 2 (cont'd)
Because our samples are generated by an (ε/6m)-uniform sampler, by definition,
| Pr[X_k = 1] − |Ω(G_i)| / |Ω(G_{i−1})| | ≤ ε/6m.
Proof of Lemma 2 (cont'd)
By linearity of expectation,
| E[(1/M) Σ_{k=1}^M X_k] − |Ω(G_i)| / |Ω(G_{i−1})| | ≤ ε/6m.
Therefore, since r̃_i = (1/M) Σ_{k=1}^M X_k, we have
| E[r̃_i] − r_i | ≤ ε/6m.
(The expectation of the average lies between the smallest and the largest E[X_k], so its distance from |Ω(G_i)| / |Ω(G_{i−1})| obeys the same bound as each E[X_k] does.)
Proof of Lemma 2 (cont'd)
Since r_i ≥ 1/2, we have
E[r̃_i] ≥ r_i − ε/6m ≥ 1/2 − 1/6 ≥ 1/3.
Proof of Lemma 2 (cont'd)
If M = ⌈1296 m² ε^{−2} ln(2m/δ)⌉ ≥ (3 ln(2m/δ)) / ((ε/12m)² · (1/3)), then by Theorem 1 the M samples obtained yield
Pr[ |r̃_i − E[r̃_i]| ≤ (ε/12m) E[r̃_i] ] ≥ 1 − δ/m.
Proof of Lemma 2 (cont'd)
Equivalently, with probability at least 1 − δ/m,
1 − ε/12m ≤ r̃_i / E[r̃_i] ≤ 1 + ε/12m.  (1)
As |E[r̃_i] − r_i| ≤ ε/6m, we have
1 − ε/(6m r_i) ≤ E[r̃_i] / r_i ≤ 1 + ε/(6m r_i).
Using the fact that r_i ≥ 1/2 yields
1 − ε/3m ≤ E[r̃_i] / r_i ≤ 1 + ε/3m.  (2)
Proof of Lemma 2 (cont'd)
Combining (1) and (2), with probability at least 1 − δ/m, we have
1 − ε/2m ≤ (1 − ε/12m)(1 − ε/3m) ≤ r̃_i / r_i ≤ (1 + ε/12m)(1 + ε/3m) ≤ 1 + ε/2m.
Thus this gives the desired (ε/2m, δ/m)-approximation.
Lemma 2 (recalled)
When m ≥ 1 and 0 < ε ≤ 1, the procedure for estimating r_i yields an (ε/2m, δ/m)-approximation for r_i.
Remark
The number of samples M is polynomial in m, 1/ε, and ln(1/δ), and the time to generate each sample is polynomial in the size of the graph and ln(1/ε); we therefore have the following theorem.
Theorem 3
Given an FPAUS for independent sets in any graph, we can construct an FPRAS for counting the number of independent sets in a graph G with m edges.
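To show how the pieces fit together, here is a sketch of the reduction behind Theorem 3 (our illustration; `estimate_ratio` is the procedure sketched earlier and `fpaus_sample` is the assumed FPAUS, both passed in as parameters):

```python
def count_independent_sets(n, edges, eps, delta, estimate_ratio, fpaus_sample):
    """FPRAS sketch: |Omega(G)| ~= 2^n * prod_i r~_i, per the telescoping product."""
    m = len(edges)
    estimate = float(2 ** n)        # |Omega(G_0)| = 2^n, since G_0 has no edges
    for i in range(1, m + 1):
        e_i = edges[i - 1]          # G_i is G_{i-1} plus the edge e_i
        estimate *= estimate_ratio(edges[:i - 1], e_i, m, eps, delta, fpaus_sample)
    return estimate
```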
However, ...
How do we obtain an FPAUS for independent sets in a graph?
See Chapter 11, Coupling of Markov Chains, in [MU05], or consider the Markov chain Monte Carlo (MCMC) method.
The Markov Chain Monte Carlo Method
The Markov chain Monte Carlo method provides a very general approach to sampling from a desired probability distribution.
Basic idea: define an ergodic Markov chain whose set of states is the sample space and whose stationary distribution is the required sampling distribution.
The Markov Chain Monte Carlo Method (cont'd)
Let X_0, X_1, ..., X_n be a run of the chain. The Markov chain converges to the stationary distribution from any starting state X_0, so after a sufficiently large number of steps r, the distribution of the state X_r will be close to the stationary distribution and X_r can be used as a sample.
The Markov Chain Monte Carlo Method (cont'd)
Similarly, repeating this argument with X_r as the starting point, we can use X_{2r} as another sample, and so on. We can therefore use the sequence of states X_r, X_{2r}, ... as almost independent samples from the stationary distribution of the Markov chain.
The Markov Chain Monte Carlo Method (cont'd)
The efficiency of MCMC depends on
how large r must be to ensure a suitably good sample ("coupling"; see Ch. 11 of [MU05]), and
how much computation is required for each step of the Markov chain.
Here we focus on finding efficient Markov chains with the appropriate stationary distribution. For simplicity, we consider constructing a Markov chain with a uniform stationary distribution over the state space Ω.
Revisiting the Independent Sets in a Graph
Given a graph G(V, E), let the state space Ω be the set of all independent sets of G. Two states x and y (both independent sets) are neighbors if they differ in exactly one vertex.
Revisiting the Independent Sets in a Graph (cont'd)
The neighbor relationship guarantees that the state space is irreducible, since every independent set can reach (resp., be reached from) the empty independent set by a sequence of vertex deletions (resp., vertex additions).
Revisiting the Independent Sets in a Graph (cont'd)
Next we need to establish the transition probabilities.
A naïve approach: a random walk on the graph of the state space. Yet the stationary probability of a state is then proportional to its degree, so this may not lead to a uniform distribution.
Consider the following lemma.
Lemma 3
For a finite state space Ω and neighborhood structure {N(x) | x ∈ Ω}, let N = max_{x∈Ω} |N(x)|, and let M be any number such that M ≥ N. Consider a Markov chain where
P_{x,y} = 1/M if x ≠ y and y ∈ N(x);
P_{x,y} = 0 if x ≠ y and y ∉ N(x);
P_{x,x} = 1 − |N(x)|/M.
If this chain is irreducible and aperiodic, then the stationary distribution is the uniform distribution.
82
T h a t i s , f w e m o d y r n l k b g v c x p - u . L 2018/11/21
Computation Theory Lab, CSIE, CCU, Taiwan
Proof of Lemma 3
For any x ≠ y with y ∈ N(x), since π_x = π_y under the uniform distribution and P_{x,y} = P_{y,x} (= 1/M), we have
π_x P_{x,y} = π_y P_{y,x}.
Then applying the following theorem (Theorem 7.10 in [MU05]), it follows that the uniform distribution is the stationary distribution.
Theorem 4 (for the proof)
Consider a finite, irreducible, and ergodic Markov chain with transition matrix P. If there are nonnegative numbers π̄ = (π_0, ..., π_n) such that Σ_i π_i = 1 and, for any pair of states x, y,
π_x P_{x,y} = π_y P_{y,x},
then π̄ is the stationary distribution corresponding to P.
Proof: Please refer to page 172 in [MU05].
Example: Independent Sets in a Graph
Consider the following simple Markov chain whose states are independent sets in G(V, E).
1. X_0 is an arbitrary independent set in G.
2. To compute X_{i+1}:
   (a) Choose a vertex v uniformly at random from V.
   (b) If v ∈ X_i, then X_{i+1} = X_i \ {v}.
   (c) If v ∉ X_i and adding v to X_i still gives an independent set, then X_{i+1} = X_i ∪ {v}.
   (d) Otherwise, X_{i+1} = X_i.
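One transition of this chain, as a minimal Python sketch (our illustration):

```python
import random

def chain_step(current: set[int], vertices: list[int],
               edges: list[tuple[int, int]]) -> set[int]:
    """One step of the uniform independent-set chain: toggle a random vertex if legal."""
    v = random.choice(vertices)                 # 2(a): pick v uniformly from V
    if v in current:
        return current - {v}                    # 2(b): delete v
    candidate = current | {v}
    if all(not (a in candidate and b in candidate) for a, b in edges):
        return candidate                        # 2(c): add v if it stays independent
    return current                              # 2(d): otherwise stay put
```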
Example: Independent Sets in a Graph (cont'd)
The neighbors of a state X_i are independent sets that differ from X_i in just one vertex. Since every state is reachable from the empty set, the chain is irreducible. Assume G has at least one edge (u, v); then the state {v} has a self-loop (P_{{v},{v}} > 0), so the chain is aperiodic. When X_i ≠ X_j, we have P_{X_i, X_j} = 1/|V| or 0, so by the previous lemma the stationary distribution is the uniform distribution.
How about the non-uniform cases?
However, in some other cases we may want to sample from a chain with a non-uniform stationary distribution. What should we do?
Solution: the Metropolis algorithm.
The Metropolis Algorithm
Let us again assume that we have designed an irreducible state space for our Markov chain. Now we want to construct a Markov chain on this state space with a stationary distribution π_x = b(x)/B, where b(x) > 0 for all x and B = Σ_x b(x) is finite.
Lemma 4
For a finite state space Ω and neighborhood structure {N(x) | x ∈ Ω}, let N = max_{x∈Ω} |N(x)|, and let M be any number such that M ≥ N. For all x ∈ Ω, let π_x > 0 be the desired probability of state x in the stationary distribution. Consider a Markov chain where
P_{x,y} = (1/M) min(1, π_y/π_x) if x ≠ y and y ∈ N(x);
P_{x,y} = 0 if x ≠ y and y ∉ N(x);
P_{x,x} = 1 − Σ_{y≠x} P_{x,y}.
Then, if this chain is irreducible and aperiodic, the stationary distribution is given by the probabilities π_x.
Proof of Lemma 4
The proof is similar to that of Lemma 3. For any x ≠ y with y ∈ N(x), if π_x ≤ π_y, then P_{x,y} = 1/M and P_{y,x} = (1/M)(π_x/π_y). It follows that
π_x P_{x,y} = π_x/M = π_y P_{y,x}.
The case π_x > π_y is similar. Again, by the previous theorem, the π_x form the stationary distribution.
Example: Independent Sets in a Graph
Create a Markov chain in whose stationary distribution each independent set I has probability proportional to λ^{|I|}, for some λ > 0. That is, π_x = λ^{|I_x|}/B, where I_x is the independent set corresponding to state x and B = Σ_x λ^{|I_x|}. Note that when λ = 1 this is the uniform distribution.
Example: Independent Sets in a Graph (cont'd)
Consider the following variation on the previous Markov chain for independent sets in a graph G(V, E).
1. X_0 is an arbitrary independent set in G.
2. To compute X_{i+1}:
   (a) Choose a vertex v uniformly at random from V.
   (b) If v ∈ X_i, set X_{i+1} = X_i \ {v} with probability min(1, 1/λ); otherwise X_{i+1} = X_i.
   (c) If v ∉ X_i and X_i ∪ {v} is an independent set, set X_{i+1} = X_i ∪ {v} with probability min(1, λ); otherwise X_{i+1} = X_i.
   (d) Otherwise, X_{i+1} = X_i.
Example: Independent Sets in a Graph (cont'd)
First, we propose a move by choosing a vertex v to add or delete; each vertex is chosen with probability 1/M, where here M = |V|. Second, this proposal is accepted with probability min(1, π_y/π_x), where x is the current state and y is the proposed state to which the chain would move.
Example: Independent Sets in a Graph (cont'd)
π_y/π_x equals λ if the chain attempts to add a vertex, and 1/λ if the chain attempts to delete a vertex. The transition probability is therefore
P_{x,y} = (1/M) min(1, π_y/π_x).
Thus Lemma 4 applies.
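One Metropolis transition for this λ-weighted distribution, as a Python sketch (our illustration); note that it only ever uses the ratio π_y/π_x, never the normalizing constant B:

```python
import random

def metropolis_step(current, vertices, edges, lam):
    """One Metropolis step targeting pi(I) proportional to lam^|I| over independent sets."""
    v = random.choice(vertices)                      # propose toggling a uniform vertex
    if v in current:
        # Proposed move deletes v: pi_y / pi_x = 1 / lam.
        if random.random() < min(1.0, 1.0 / lam):
            return current - {v}
        return current
    candidate = current | {v}
    if all(not (a in candidate and b in candidate) for a, b in edges):
        # Proposed move adds v: pi_y / pi_x = lam.
        if random.random() < min(1.0, lam):
            return candidate
    return current
```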
Example: Independent Sets in a Graph (cont'd)
Comments: We never need to know B = Σ_x λ^{|I_x|}; calculating this sum would be expensive. Our Markov chain yields the correct stationary distribution by using only the ratios π_y/π_x, which are much easier to compute.
Thank you.