# The Monte Carlo method

## Presentation transcript:

1 The Monte Carlo method

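2 Let $(X,Y)$ be a point chosen uniformly at random in the $2 \times 2$ square centered at the origin $(0,0)$, with corners $(1,1)$, $(-1,1)$, $(-1,-1)$ and $(1,-1)$, and consider the inscribed circle of radius 1. Define $Z = 1$ if $\sqrt{X^2 + Y^2} \le 1$ and $Z = 0$ otherwise. Then $\Pr(Z=1) = \pi/4$.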

3 Assume we run this experiment $m$ times, with $Z_i$ being the value of $Z$ at the $i$th run. If $W = \sum_{i=1}^{m} Z_i$, then $E[W] = m\pi/4$, and $W' = 4W/m$ is an estimate of $\pi$.
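The experiment on slides 2 and 3 can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_pi` and the optional `rng` argument are assumptions made for the example, not part of the slides.

```python
import math
import random

def estimate_pi(m, rng=None):
    """Estimate pi from m uniform points in the square [-1, 1] x [-1, 1]."""
    rng = rng if rng is not None else random.Random()
    w = 0  # W = sum of the indicators Z_i
    for _ in range(m):
        x = rng.uniform(-1.0, 1.0)
        y = rng.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:  # Z_i = 1: the point lies inside the unit circle
            w += 1
    return 4.0 * w / m  # W' = 4W/m

if __name__ == "__main__":
    approx = estimate_pi(100_000, random.Random(0))
    print(approx, abs(approx - math.pi))
```

Passing a seeded `random.Random` makes a run reproducible; the estimate itself still fluctuates from seed to seed, which is what the Chernoff-bound analysis quantifies.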

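4 By the Chernoff bound, $\Pr[|W' - \pi| \ge \epsilon\pi] = \Pr[|W - E[W]| \ge \epsilon E[W]] \le 2e^{-m\pi\epsilon^2/12}$. Def: A randomized algorithm gives an $(\epsilon,\delta)$-approximation for the value $V$ if the output $X$ of the algorithm satisfies $\Pr[|X - V| \le \epsilon V] \ge 1 - \delta$.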

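5 The above method for estimating $\pi$ gives an $(\epsilon,\delta)$-approximation, as long as $\epsilon < 1$ and $m$ is large enough that $2e^{-m\pi\epsilon^2/12} \le \delta$, i.e. $m \ge \frac{12 \ln(2/\delta)}{\pi\epsilon^2}$.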

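6 Thm1: Let $X_1, \ldots, X_m$ be independent and identically distributed indicator random variables, with $\mu = E[X_i]$. If $m \ge \frac{3 \ln(2/\delta)}{\epsilon^2 \mu}$, then $\Pr\left[\left|\frac{1}{m}\sum_{i=1}^{m} X_i - \mu\right| \ge \epsilon\mu\right] \le \delta$. I.e., $m$ samples provide an $(\epsilon,\delta)$-approximation for $\mu$. Pf: Exercise!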
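To get a feel for the sample bound in Thm1, the following sketch evaluates $m \ge 3\ln(2/\delta)/(\epsilon^2\mu)$ for given parameters. The helper name `samples_needed` is hypothetical and chosen only for this example.

```python
import math

def samples_needed(eps, delta, mu):
    """Smallest integer m with m >= 3 ln(2/delta) / (eps^2 * mu), as in Thm1."""
    if not (eps > 0 and 0 < delta < 1 and 0 < mu <= 1):
        raise ValueError("need eps > 0, 0 < delta < 1 and 0 < mu <= 1")
    return math.ceil(3.0 * math.log(2.0 / delta) / (eps ** 2 * mu))

if __name__ == "__main__":
    # The bound grows like 1/mu: rare events need many more samples.
    for mu in (0.5, 0.05, 0.005):
        print(mu, samples_needed(0.1, 0.05, mu))
```

The $1/\mu$ dependence is the reason the naive sampling approach becomes impractical when the target fraction is tiny.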

7 Def: FPRAS: fully polynomial randomized approximation scheme. An FPRAS for a problem is a randomized algorithm for which, given an input $x$ and any parameters $\epsilon$ and $\delta$ with $0 < \epsilon, \delta < 1$, the algorithm outputs an $(\epsilon,\delta)$-approximation to $V(x)$ in time $\mathrm{poly}(1/\epsilon, \ln \delta^{-1}, |x|)$.

8 Def: DNF counting problem: counting the number of satisfying assignments of a Boolean formula in disjunctive normal form (DNF). Def: a DNF formula is a disjunction of clauses $C_1 \vee C_2 \vee \cdots \vee C_t$, where each clause is a conjunction of literals. E.g. $(x_1 \wedge x_2 \wedge x_3) \vee (x_2 \wedge x_4) \vee (x_1 \wedge x_3 \wedge x_4)$.

9 Counting the number of satisfying assignments of a DNF formula is actually #P-complete. Counting the number of Hamiltonian cycles in a graph and counting the number of perfect matchings in a bipartite graph are other examples of #P-complete problems.

10 A naïve algorithm for the DNF counting problem:

Input: A DNF formula $F$ with $n$ variables.

Output: $Y$ = an approximation of $c(F)$, the number of satisfying assignments of $F$.

1. $X \leftarrow 0$.
2. For $k = 1$ to $m$, do:
   (a) Generate a random assignment for the $n$ variables, chosen uniformly at random.
   (b) If the random assignment satisfies $F$, then $X \leftarrow X + 1$.
3. Return $Y \leftarrow (X/m)\,2^n$.
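A minimal sketch of the naïve algorithm, under an assumed representation that is not in the slides: a formula is a list of clauses, and each clause is a dict mapping a 0-based variable index to the truth value its literal requires. The names `satisfies` and `naive_dnf_count` are illustrative.

```python
import random

def satisfies(formula, assignment):
    """True if some clause has all of its literals satisfied by the assignment."""
    return any(
        all(assignment[var] == val for var, val in clause.items())
        for clause in formula
    )

def naive_dnf_count(formula, n, m, rng=None):
    """Estimate c(F) by sampling m uniform assignments of the n variables."""
    rng = rng if rng is not None else random.Random()
    x = 0
    for _ in range(m):
        # Step 2(a): a uniformly random assignment
        assignment = [rng.random() < 0.5 for _ in range(n)]
        # Step 2(b): count it if it satisfies F
        if satisfies(formula, assignment):
            x += 1
    return (x / m) * 2 ** n  # Step 3: Y = (X/m) 2^n

if __name__ == "__main__":
    # (x1 and x2 and x3) or (x2 and x4) or (x1 and x3 and x4), variables 0-indexed
    f = [{0: True, 1: True, 2: True}, {1: True, 3: True}, {0: True, 2: True, 3: True}]
    print(naive_dnf_count(f, 4, 20_000, random.Random(0)))
```

For this small formula a direct enumeration gives 6 satisfying assignments out of 16, so the estimate should land close to 6. The approach degrades when $c(F)/2^n$ is very small, because almost no sample is ever counted.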

11 Analysis: Let $X_k = 1$ if the $k$-th iteration in the algorithm generated a satisfying assignment, and $X_k = 0$ otherwise. Then $\Pr[X_k = 1] = c(F)/2^n$. Let $X = \sum_{k=1}^{m} X_k$; then $E[X] = m\,c(F)/2^n$.

12 Analysis: By Theorem 1, $X/m$ gives an $(\epsilon,\delta)$-approximation of $c(F)/2^n$, and hence $Y$ gives an $(\epsilon,\delta)$-approximation of $c(F)$, when $m \ge \frac{3 \cdot 2^n \ln(2/\delta)}{\epsilon^2 c(F)}$. If $c(F) \ge 2^n/\mathrm{poly}(n)$, then this is not too bad: $m$ is polynomial in $n$. But if $c(F) = \mathrm{poly}(n)$, then $m = O(2^n/c(F))$, which is exponential in $n$!

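13 Analysis: Note that if $C_i$ has $l_i$ literals then there are exactly $2^{n - l_i}$ satisfying assignments for $C_i$. Let $SC_i$ denote the set of assignments that satisfy clause $i$. Let $U = \{(i,a) : 1 \le i \le t \text{ and } a \in SC_i\}$, so $|U| = \sum_{i=1}^{t} |SC_i|$. We want to estimate $c(F) = \left|\bigcup_{i=1}^{t} SC_i\right|$. Define $S = \{(i,a) : 1 \le i \le t,\ a \in SC_i,\ a \notin SC_j \text{ for } j < i\}$, so that each satisfying assignment appears in $S$ exactly once and $|S| = c(F)$.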

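14 DNF counting algorithm II:

Input: A DNF formula $F$ with $n$ variables.

Output: $Y$ = an approximation of $c(F)$.

1. $X \leftarrow 0$.
2. For $k = 1$ to $m$, do:
   (a) With probability $|SC_i| / \sum_{j=1}^{t} |SC_j|$ choose clause $i$, and then choose, uniformly at random, an assignment $a \in SC_i$.
   (b) If $a$ is not in any $SC_j$, $j < i$, then $X \leftarrow X + 1$.
3. Return $Y \leftarrow (X/m) \sum_{i=1}^{t} |SC_i|$.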

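15 DNF counting algorithm II: Note that $|U| \le t|S|$. Why? (Each satisfying assignment appears in at most $t$ pairs of $U$ and in exactly one pair of $S$.) Let $\Pr[i \text{ is chosen}] = \frac{|SC_i|}{\sum_{j=1}^{t} |SC_j|} = \frac{|SC_i|}{|U|}$. Then $\Pr[(i,a) \text{ is chosen}] = \Pr[i \text{ is chosen}] \cdot \Pr[a \text{ is chosen} \mid i \text{ is chosen}] = \frac{|SC_i|}{|U|} \cdot \frac{1}{|SC_i|} = \frac{1}{|U|}$.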

16 DNF counting algorithm II: Thm: DNF counting algorithm II is an FPRAS for the DNF counting problem when $m = \lceil (3t/\epsilon^2) \ln(2/\delta) \rceil$. Pf: Step 2(a) chooses an element of $U$ uniformly at random. The probability that this element belongs to $S$ is at least $1/t$. Fix any $\epsilon, \delta > 0$, and let $m = \lceil (3t/\epsilon^2) \ln(2/\delta) \rceil$, which is $\mathrm{poly}(t, 1/\epsilon, \ln(1/\delta))$.

17 DNF counting algorithm II: The processing time of each sample is $\mathrm{poly}(t)$. By Thm1, with $m$ samples, $X/m$ gives an $(\epsilon,\delta)$-approximation of $c(F)/|U|$, and hence $Y$ gives an $(\epsilon,\delta)$-approximation of $c(F)$.
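DNF counting algorithm II can be sketched as follows. The clause representation (a dict from 0-based variable index to required truth value) and the names `dnf_count_algorithm_ii`, `clause_satisfied` and `num_samples` are assumptions made for this illustration; a dict cannot hold both a variable and its negation, so every clause is assumed satisfiable.

```python
import math
import random

def clause_satisfied(clause, assignment):
    """True if every literal of the clause agrees with the assignment."""
    return all(assignment[var] == val for var, val in clause.items())

def num_samples(t, eps, delta):
    """m = ceil((3t / eps^2) ln(2 / delta)), the sample count from the theorem."""
    return math.ceil((3.0 * t / eps ** 2) * math.log(2.0 / delta))

def dnf_count_algorithm_ii(formula, n, m, rng=None):
    """Estimate c(F) by sampling pairs (i, a) uniformly from U."""
    rng = rng if rng is not None else random.Random()
    sizes = [2 ** (n - len(clause)) for clause in formula]  # |SC_i| = 2^(n - l_i)
    total = sum(sizes)                                      # |U|
    x = 0
    for _ in range(m):
        # Step 2(a): clause i with probability |SC_i| / |U| ...
        i = rng.choices(range(len(formula)), weights=sizes)[0]
        # ... then a uniform assignment in SC_i: fix the literals of clause i,
        # and pick the remaining variables at random.
        a = [rng.random() < 0.5 for _ in range(n)]
        for var, val in formula[i].items():
            a[var] = val
        # Step 2(b): count (i, a) only if a satisfies no earlier clause.
        if not any(clause_satisfied(formula[j], a) for j in range(i)):
            x += 1
    return (x / m) * total  # Step 3: Y = (X/m) |U|

if __name__ == "__main__":
    f = [{0: True, 1: True, 2: True}, {1: True, 3: True}, {0: True, 2: True, 3: True}]
    m = num_samples(len(f), 0.1, 0.05)
    print(m, dnf_count_algorithm_ii(f, 4, m, random.Random(0)))
```

In contrast to the naïve algorithm, every sample here is a satisfying assignment, and the counted fraction $|S|/|U|$ is at least $1/t$, so the number of samples no longer depends on $2^n$.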

18 Counting with Approximate Sampling. Def: Let $w$ be the (random) output of a sampling algorithm for a finite sample space $\Omega$. The sampling algorithm generates an $\epsilon$-uniform sample of $\Omega$ if, for any subset $S$ of $\Omega$, $\left|\Pr[w \in S] - \frac{|S|}{|\Omega|}\right| \le \epsilon$.
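For a tiny sample space, the definition of an $\epsilon$-uniform sample can be checked directly. The sketch below assumes the output distribution is given explicitly as a dict `probs` that lists every element of $\Omega$ (including those with probability 0); both function names are hypothetical. The first function applies the definition literally over all subsets; the second computes the same value by summing only the excess probability above $1/|\Omega|$, since the worst subset is the set of over-represented elements.

```python
from itertools import combinations

def uniformity_gap_bruteforce(probs):
    """Max over all subsets S of |Pr[w in S] - |S|/|Omega||. Exponential time."""
    items = list(probs)
    n = len(items)
    worst = 0.0
    for r in range(n + 1):
        for subset in combinations(items, r):
            p = sum(probs[w] for w in subset)
            worst = max(worst, abs(p - r / n))
    return worst

def uniformity_gap(probs):
    """Same quantity in linear time: sum of the excess probability above 1/|Omega|."""
    u = 1.0 / len(probs)
    return sum(max(p - u, 0.0) for p in probs.values())

if __name__ == "__main__":
    dist = {"a": 0.5, "b": 0.25, "c": 0.25}
    print(uniformity_gap_bruteforce(dist), uniformity_gap(dist))
```

A sampler with this output distribution is $\epsilon$-uniform exactly when the returned gap is at most $\epsilon$.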

19 Def: A sampling algorithm is a fully polynomial almost uniform sampler (FPAUS) for a problem if, given an input $x$ and $\epsilon > 0$, it generates an $\epsilon$-uniform sample of $\Omega(x)$ in time $\mathrm{poly}(|x|, \ln(1/\epsilon))$. For example, an FPAUS for independent sets would take as input a graph $G = (V,E)$ and a parameter $\epsilon$. The sample space is the set of all independent sets in $G$.

20 Goal: Given an FPAUS for independent sets, we construct an FPRAS for counting the number of independent sets. Assume $G$ has $m$ edges, and let $e_1, \ldots, e_m$ be an arbitrary ordering of the edges. Let $E_i$ be the set of the first $i$ edges in $E$, let $G_i = (V, E_i)$, and let $\Omega(G_i)$ denote the set of independent sets in $G_i$.

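21 $|\Omega(G_0)| = 2^n$, where $n = |V|$. Why? ($G_0$ has no edges, so every subset of the $n$ vertices is independent.) Since $G = G_m$, $|\Omega(G)| = \frac{|\Omega(G_m)|}{|\Omega(G_{m-1})|} \times \frac{|\Omega(G_{m-1})|}{|\Omega(G_{m-2})|} \times \cdots \times \frac{|\Omega(G_1)|}{|\Omega(G_0)|} \times |\Omega(G_0)|$. To estimate $|\Omega(G)|$, we need good estimates for $r_i = \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}$, $i = 1, \ldots, m$.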

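22 Let $\tilde{r}_i$ be an estimate for $r_i$; then our estimate for $|\Omega(G)|$ is $2^n \prod_{i=1}^{m} \tilde{r}_i$. To evaluate the error, we need to bound the ratio $R = \prod_{i=1}^{m} \frac{\tilde{r}_i}{r_i}$. To have an $(\epsilon,\delta)$-approximation, we want $\Pr[|R - 1| \le \epsilon] \ge 1 - \delta$.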

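23 Lemma: Suppose that for all $i$, $1 \le i \le m$, $\tilde{r}_i$ is an $(\epsilon/2m, \delta/m)$-approximation for $r_i$. Then $\Pr[|R - 1| \le \epsilon] \ge 1 - \delta$. Pf: For each $1 \le i \le m$, we have $\Pr\left[|\tilde{r}_i - r_i| \le \frac{\epsilon}{2m} r_i\right] \ge 1 - \frac{\delta}{m}$.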

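24 Equivalently, $\Pr\left[|\tilde{r}_i - r_i| > \frac{\epsilon}{2m} r_i\right] < \frac{\delta}{m}$. By the union bound, with probability at least $1 - \delta$ we have $1 - \frac{\epsilon}{2m} \le \frac{\tilde{r}_i}{r_i} \le 1 + \frac{\epsilon}{2m}$ for all $i$ simultaneously, and then $1 - \epsilon \le \left(1 - \frac{\epsilon}{2m}\right)^m \le \prod_{i=1}^{m} \frac{\tilde{r}_i}{r_i} \le \left(1 + \frac{\epsilon}{2m}\right)^m \le 1 + \epsilon$.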

25 Estimating $r_i$:

Input: Graphs $G_{i-1} = (V, E_{i-1})$ and $G_i = (V, E_i)$.

Output: $\tilde{r}_i$ = an approximation of $r_i$.

1. $X \leftarrow 0$.
2. Repeat for $M = \lceil (1296 m^2/\epsilon^2) \ln(2m/\delta) \rceil$ independent trials:
   (a) Generate an $(\epsilon/6m)$-uniform sample from $\Omega(G_{i-1})$.
   (b) If the sample is an independent set in $G_i$, then $X \leftarrow X + 1$.
3. Return $\tilde{r}_i \leftarrow X/M$.

26 Lemma: When $m \ge 1$ and $0 < \epsilon \le 1$, the procedure for estimating $r_i$ yields an $(\epsilon/2m, \delta/m)$-approximation for $r_i$. Pf: Suppose $G_{i-1}$ and $G_i$ differ in that edge $\{u,v\}$ is in $G_i$ but not in $G_{i-1}$. Then $\Omega(G_i) \subseteq \Omega(G_{i-1})$. An independent set in $\Omega(G_{i-1}) \setminus \Omega(G_i)$ contains both $u$ and $v$.

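27 Associate each $I \in \Omega(G_{i-1}) \setminus \Omega(G_i)$ with the independent set $I \setminus \{v\} \in \Omega(G_i)$. Note that each $I' \in \Omega(G_i)$ is associated with no more than one independent set $I' \cup \{v\} \in \Omega(G_{i-1}) \setminus \Omega(G_i)$, thus $|\Omega(G_{i-1}) \setminus \Omega(G_i)| \le |\Omega(G_i)|$. It follows that $r_i = \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|} = \frac{|\Omega(G_i)|}{|\Omega(G_i)| + |\Omega(G_{i-1}) \setminus \Omega(G_i)|} \ge \frac{1}{2}$.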

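28 Let $X_k = 1$ if the $k$-th sample is in $\Omega(G_i)$ and $X_k = 0$ otherwise. Because our samples are generated by an $(\epsilon/6m)$-uniform sampler, by definition, $\left|\Pr[X_k = 1] - \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}\right| \le \frac{\epsilon}{6m}$.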

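29 By linearity of expectations, $\left|E\left[\frac{\sum_{k=1}^{M} X_k}{M}\right] - \frac{|\Omega(G_i)|}{|\Omega(G_{i-1})|}\right| \le \frac{\epsilon}{6m}$, i.e. $|E[\tilde{r}_i] - r_i| \le \frac{\epsilon}{6m}$. Since $r_i \ge 1/2$, we have $E[\tilde{r}_i] \ge r_i - \frac{\epsilon}{6m} \ge \frac{1}{2} - \frac{1}{6} = \frac{1}{3}$.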

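30 If $M \ge \frac{3 \ln(2m/\delta)}{(\epsilon/12m)^2 (1/3)} = 1296\, m^2 \epsilon^{-2} \ln(2m/\delta)$, then by Thm1, $\Pr\left[\left|\frac{\tilde{r}_i}{E[\tilde{r}_i]} - 1\right| \ge \frac{\epsilon}{12m}\right] = \Pr\left[|\tilde{r}_i - E[\tilde{r}_i]| \ge \frac{\epsilon}{12m} E[\tilde{r}_i]\right] \le \frac{\delta}{m}$. Equivalently, with probability $1 - \delta/m$, $1 - \frac{\epsilon}{12m} \le \frac{\tilde{r}_i}{E[\tilde{r}_i]} \le 1 + \frac{\epsilon}{12m}$. -----(1)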

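31 As $|E[\tilde{r}_i] - r_i| \le \frac{\epsilon}{6m}$, we have $1 - \frac{\epsilon}{6m r_i} \le \frac{E[\tilde{r}_i]}{r_i} \le 1 + \frac{\epsilon}{6m r_i}$. Using $r_i \ge 1/2$, then $1 - \frac{\epsilon}{3m} \le \frac{E[\tilde{r}_i]}{r_i} \le 1 + \frac{\epsilon}{3m}$. -----(2)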

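32 Combining (1) and (2), with probability $1 - \delta/m$, $1 - \frac{\epsilon}{2m} \le \left(1 - \frac{\epsilon}{3m}\right)\left(1 - \frac{\epsilon}{12m}\right) \le \frac{\tilde{r}_i}{r_i} \le \left(1 + \frac{\epsilon}{3m}\right)\left(1 + \frac{\epsilon}{12m}\right) \le 1 + \frac{\epsilon}{2m}$. This gives the desired $(\epsilon/2m, \delta/m)$-approximation. Thm: Given an FPAUS for independent sets in any graph, we can construct an FPRAS for the number of independent sets in a graph $G$.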
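The reduction from counting to sampling can be illustrated on a very small graph. The sketch below makes a strong simplifying assumption: in place of an FPAUS it uses an exact uniform sampler obtained by brute-force enumeration of $\Omega(G_{i-1})$, which only works for a handful of vertices. Independent sets are encoded as bitmasks, the number of trials per ratio is a free parameter rather than the bound $M$ from slide 25, and all function names are hypothetical.

```python
import random

def independent_sets(n, edges):
    """Brute-force list (as bitmasks) of all independent sets on vertices 0..n-1."""
    return [
        mask for mask in range(2 ** n)
        if all(not ((mask >> u) & 1 and (mask >> v) & 1) for u, v in edges)
    ]

def estimate_ratio(n, edges_prev, new_edge, trials, rng):
    """Estimate r_i = |Omega(G_i)| / |Omega(G_{i-1})| from samples of Omega(G_{i-1})."""
    omega_prev = independent_sets(n, edges_prev)  # stand-in for an FPAUS (assumption)
    u, v = new_edge
    hits = 0
    for _ in range(trials):
        s = rng.choice(omega_prev)               # uniform sample from Omega(G_{i-1})
        if not ((s >> u) & 1 and (s >> v) & 1):  # still independent once e_i is added
            hits += 1
    return hits / trials

def estimate_independent_sets(n, edges, trials, rng=None):
    """Estimate |Omega(G)| as 2^n times the product of the estimated ratios."""
    rng = rng if rng is not None else random.Random()
    estimate = 2.0 ** n  # |Omega(G_0)| = 2^n
    for i, edge in enumerate(edges):
        estimate *= estimate_ratio(n, edges[:i], edge, trials, rng)
    return estimate

if __name__ == "__main__":
    cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
    print(len(independent_sets(4, cycle)),
          estimate_independent_sets(4, cycle, 20_000, random.Random(0)))
```

Because each $r_i \ge 1/2$, every ratio is estimated from a constant fraction of hits, which is what keeps the number of trials polynomial in the analysis.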

33 The Markov Chain Monte Carlo Method. The Markov chain Monte Carlo method provides a very general approach to sampling from a desired probability distribution. Basic idea: Define an ergodic Markov chain whose set of states is the sample space and whose stationary distribution is the required sampling distribution.

34 Lemma: For a finite state space $\Omega$ and neighborhood structure $\{N(x) \mid x \in \Omega\}$, let $N = \max_{x \in \Omega} |N(x)|$. Let $M$ be any number such that $M \ge N$. Consider a Markov chain where

$$P_{x,y} = \begin{cases} 1/M & \text{if } x \ne y \text{ and } y \in N(x), \\ 0 & \text{if } x \ne y \text{ and } y \notin N(x), \\ 1 - |N(x)|/M & \text{if } x = y. \end{cases}$$

If this chain is irreducible and aperiodic, then the stationary distribution is the uniform distribution.

35 Pf: For $x \ne y$, if $\pi_x = \pi_y$, then $\pi_x P_{x,y} = \pi_y P_{y,x}$, since $P_{x,y} = P_{y,x} = 1/M$. It follows that the uniform distribution $\pi_x = 1/|\Omega|$ is the stationary distribution by the following theorem.

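36 Thm: Let $P$ be the transition matrix of a finite, irreducible and ergodic Markov chain. If there are nonnegative numbers $\bar{\pi} = (\pi_0, \ldots, \pi_n)$ such that $\sum_{i=0}^{n} \pi_i = 1$ and if, for any pair of states $i, j$, $\pi_i P_{i,j} = \pi_j P_{j,i}$, then $\bar{\pi}$ is the stationary distribution corresponding to $P$. Pf: $\sum_{i=0}^{n} \pi_i P_{i,j} = \sum_{i=0}^{n} \pi_j P_{j,i} = \pi_j$. Since $\bar{\pi} = \bar{\pi} P$ and $\sum_{i=0}^{n} \pi_i = 1$, it follows that $\bar{\pi}$ is the unique stationary distribution of the Markov chain.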

37 E.g. Markov chain with states from independent sets in $G = (V,E)$.

1. $X_0$ is an arbitrary independent set in $G$.
2. To compute $X_{i+1}$:
   (a) choose a vertex $v$ uniformly at random from $V$;
   (b) if $v \in X_i$ then $X_{i+1} = X_i \setminus \{v\}$;
   (c) if $v \notin X_i$ and if adding $v$ to $X_i$ still gives an independent set, then $X_{i+1} = X_i \cup \{v\}$;
   (d) otherwise, $X_{i+1} = X_i$.

38 The neighbors of a state $X_i$ are the independent sets that differ from $X_i$ in just one vertex. Since every state is reachable from the empty set, and the empty set is reachable from every state, the chain is irreducible. Assume $G$ has at least one edge $(u,v)$; then the state $\{v\}$ has a self-loop ($P_{v,v} > 0$), thus the chain is aperiodic. When $x \ne y$, $P_{x,y} = 1/|V|$ or $0$, so by the previous lemma the stationary distribution is the uniform distribution.
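The chain on independent sets can be simulated directly. In this sketch the states are `frozenset`s of vertices, $X_0$ is taken to be the empty set (one valid choice of an arbitrary independent set), and the function names are illustrative rather than taken from the slides.

```python
import random

def build_adjacency(vertices, edges):
    """Map each vertex to the set of its neighbors."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def chain_step(state, vertices, adj, rng):
    """One transition X_i -> X_{i+1}; state is a frozenset of vertices."""
    v = rng.choice(vertices)          # (a) uniform vertex
    if v in state:                    # (b) remove v
        return state - {v}
    if adj[v].isdisjoint(state):      # (c) add v if the set stays independent
        return state | {v}
    return state                      # (d) otherwise stay put

def sample_independent_set(vertices, edges, steps, rng=None):
    """Run the chain for a fixed number of steps and return the final state."""
    rng = rng if rng is not None else random.Random()
    adj = build_adjacency(vertices, edges)
    state = frozenset()               # X_0: the empty set is always independent
    for _ in range(steps):
        state = chain_step(state, vertices, adj, rng)
    return state

if __name__ == "__main__":
    print(sample_independent_set([0, 1, 2], [(0, 1), (1, 2)], 1000, random.Random(0)))
```

How many steps are needed before the state is close to uniform depends on the mixing time of the chain, which this sketch does not address; the `steps` argument is left to the caller.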

39 The Metropolis Algorithm (when the stationary distribution is nonuniform). Lemma: For a finite state space $\Omega$ and neighborhood structure $\{N(x) \mid x \in \Omega\}$, let $N = \max_{x \in \Omega} |N(x)|$. Let $M$ be any number such that $M \ge N$. For all $x \in \Omega$, let $\pi_x > 0$ be the desired probability of state $x$ in the stationary distribution. Consider a Markov chain where

$$P_{x,y} = \begin{cases} (1/M)\min(1, \pi_y/\pi_x) & \text{if } x \ne y \text{ and } y \in N(x), \\ 0 & \text{if } x \ne y \text{ and } y \notin N(x), \\ 1 - \sum_{y \ne x} P_{x,y} & \text{if } x = y. \end{cases}$$

Then, if this chain is irreducible and aperiodic, the stationary distribution is given by the $\pi_x$.

40 Pf: For any $x \ne y$, if $\pi_x \le \pi_y$, then $P_{x,y} = 1/M$ and $P_{y,x} = (1/M)(\pi_x/\pi_y)$. It follows that $P_{x,y} = 1/M = (\pi_y/\pi_x) P_{y,x}$, so $\pi_x P_{x,y} = \pi_y P_{y,x}$. The case $\pi_x > \pi_y$ is similar. Again, by the previous theorem, the $\pi_x$'s form the stationary distribution.

41 E.g. Create a Markov chain such that, in the stationary distribution, each independent set $I$ has probability proportional to $\lambda^{|I|}$, for some $\lambda > 0$. I.e. $\pi_x = \lambda^{|I_x|}/B$, where $I_x$ is the independent set corresponding to state $x$ and $B = \sum_x \lambda^{|I_x|}$. Note that, when $\lambda = 1$, this is the uniform distribution.

42
1. $X_0$ is an arbitrary independent set in $G$.
2. To compute $X_{i+1}$:
   (a) choose $v$ uniformly at random from $V$;
   (b) if $v \in X_i$ then $X_{i+1} = X_i \setminus \{v\}$ with probability $\min(1, 1/\lambda)$;
   (c) if $v \notin X_i$ and $X_i \cup \{v\}$ still gives an independent set, then put $X_{i+1} = X_i \cup \{v\}$ with probability $\min(1, \lambda)$;
   (d) otherwise, set $X_{i+1} = X_i$.
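A short sketch of this Metropolis chain, again with states stored as `frozenset`s and with hypothetical function names. The parameter `lam` plays the role of $\lambda$; the example at the bottom tallies visit frequencies on a three-vertex path so they can be compared with $\lambda^{|I|}/B$.

```python
import random

def build_adjacency(vertices, edges):
    """Map each vertex to the set of its neighbors."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def metropolis_step(state, vertices, adj, lam, rng):
    """One step of the chain whose stationary weights are proportional to lam^|I|."""
    v = rng.choice(vertices)                       # (a) uniform vertex
    if v in state:
        if rng.random() < min(1.0, 1.0 / lam):     # (b) remove v w.p. min(1, 1/lam)
            return state - {v}
    elif adj[v].isdisjoint(state):
        if rng.random() < min(1.0, lam):           # (c) add v w.p. min(1, lam)
            return state | {v}
    return state                                   # (d) otherwise stay put

def visit_frequencies(vertices, edges, lam, steps, rng=None):
    """Fraction of steps spent in each independent set, starting from the empty set."""
    rng = rng if rng is not None else random.Random()
    adj = build_adjacency(vertices, edges)
    state = frozenset()
    counts = {}
    for _ in range(steps):
        state = metropolis_step(state, vertices, adj, lam, rng)
        counts[state] = counts.get(state, 0) + 1
    return {s: c / steps for s, c in counts.items()}

if __name__ == "__main__":
    freqs = visit_frequencies([0, 1, 2], [(0, 1), (1, 2)], 2.0, 200_000, random.Random(0))
    for s, f in sorted(freqs.items(), key=lambda kv: sorted(kv[0])):
        print(sorted(s), round(f, 3))
```

On the path with $\lambda = 2$ the weights are 1 for the empty set, 2 for each singleton and 4 for $\{0, 2\}$, so $B = 11$ and the long-run frequencies should approach $1/11$, $2/11$ and $4/11$ respectively. Setting `lam = 1.0` recovers the uniform chain from slide 37.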
