# Approximation Algorithms Chapter 28: Counting Problems 2003/06/17.

## Presentation on theme: "Approximation Algorithms Chapter 28: Counting Problems 2003/06/17."— Presentation transcript:

Approximation Algorithms Chapter 28: Counting Problems 2003/06/17

Issues in this chapter n Definitions for counting # solutions –#P, #P-complete, fully polynomial randomized approximation scheme (FPRAS). n Counting DNF solutions n Network reliability

Summary n This chapter explains methods of approximately counting the number of solutions of #P-complete problems. Instance f=(x 1 ∧￢ x 2 ) ∨ ( ￢ x 1 ∧￢ x 2 ) ∨ (x 2 ∧￢ x 2 ). finding a solution (problems in previous chapters) x 1 =1 (true), x 2 =0 (false). x1x1 x2x2 f 001 010 101 110 counting # solutions (this chapter) # truth assignments satisfying f is 2.

Two major approaches methods for counting the number of solutions Markov Chain Monte Carlo Combinatorial Algorithms (not using MCMC) Counting # truth assignments satisfying a DNF formula Estimating failure prob. of an undirected network Topics of this chapter

Definition of #P n #P denotes a class of counting problems. n We use the following notations for the definition. –L: a language in NP all instances satisfying constraints of an NP problem L 3SAT ={(x 1 ∨ x 1 ∨ x 1 ), (x 1 ∨ x 2 ∨￢ x 3 ) ∧ ( ￢ x 1 ∨ x 2 ∨ x 3 ), …} –M: a verifier for L M((x 1 ∨ x 1 ∨ x 2 ),((x 1, x 2 )=(1, 1))): –p: polynomial bounding the length of M’s certificates (y). p 3SAT (|x|) ≦ c 1 n ≦ c 2 |x| (x: instance, n: # variables, c 1, c 2 : constants). –f(x): the number of strings y s.t. |y| { "@context": "http://schema.org", "@type": "ImageObject", "contentUrl": "http://images.slideplayer.com/14/4193294/slides/slide_5.jpg", "name": "Definition of #P n #P denotes a class of counting problems.", "description": "n We use the following notations for the definition. –L: a language in NP all instances satisfying constraints of an NP problem L 3SAT ={(x 1 ∨ x 1 ∨ x 1 ), (x 1 ∨ x 2 ∨￢ x 3 ) ∧ ( ￢ x 1 ∨ x 2 ∨ x 3 ), …} –M: a verifier for L M((x 1 ∨ x 1 ∨ x 2 ),((x 1, x 2 )=(1, 1))): –p: polynomial bounding the length of M’s certificates (y). p 3SAT (|x|) ≦ c 1 n ≦ c 2 |x| (x: instance, n: # variables, c 1, c 2 : constants). –f(x): the number of strings y s.t. |y|

Definition of #P-complete n #P-complete intuitively means one of the most intractable counting problems in NP. n f is #P-complete if –f is in #P. –For any g in #P, g is reducible to f. There are a transducer R and a function S that are polynomial time computable. –R(x)∈Lf ⇔ x∈Lg.–R(x)∈Lf ⇔ x∈Lg. –g(x)=S(x, f(R(x)). #P f g1g1 g2g2 Rg1Rg1 Sg1Sg1 Rg2Rg2 Sg2Sg2

n The solution counting versions of all known NP- complete problems are #P-complete. n #P-complete problems can be divided into two. –An algorithm A is an FPRAS if, for any instance x, A runs in poly. time in |x| and 1/ε, and #P Approximable one (fully polynomial randomized approximation scheme; FPRAS) Inapproximable one FPRAS instance x A(x) is close (not necessarily equal) to f(x). A(x) can be far from f(x).

Issues in this chapter n Definitions for counting # solutions –#P, #P-complete, fully polynomial randomized approximation scheme (FPRAS). n Counting DNF solutions n Network reliability

Counting DNF solutions n Input: –a formula f in disjunctive normal form (DNF) on n Boolean variables. E.g., f EX =(x 1 ∧￢ x 2 ) ∨ (x 2 ∧ ￢ x 3 ) ∨ ( ￢ x 1 ). n Output: –The number of satisfying truth assignments of f. Let #f be the number (#f EX is 7). x1x1 x2x2 x3x3 f 0001 0011 0101 0111 x1x1 x2x2 x3x3 f 1001 1011 1101 1110

Approximation by random sampling n Task: To efficiently approximate #f. –The main idea Estimating #f by sampling a random variable X. –X is an unbiased estimator, i.e., E[X]=#f. –The standard deviation of X is within a poly. of E[X]. –An FPRAS Sample X a poly. number of times (in n and 1/ε). Output the mean. –In the following, we first consider efficient samplings of X.

Constructing an unbiased estimator n Y and Y(τ) are defined as follows: –Y(τ): 2 n (τ satisfies f), 0 (otherwise). –Pr(Y): uniform distribution on all 2 n truth assignments. n E[Y(τ)] is then an unbiased estimator. x1x1 x2x2 x3x3 fY 00018 00118 01018 01118 x1x1 x2x2 x3x3 fY 10018 10118 11018 11100

Standard deviation of Y(τ) n σ 2 (Y(τ)): variance –Square of the standard deviation. Not bounded by a polynomial of n. Not useful for constructing an FPRAS.

Constructing a new random variable n X: a random variable –X(τ)>0 only if τsatisfies f. n S i : a set of truth assignments that satisfy clause C i. –|S i |=2 n-ri where r i is the number of literals in clause C i. –#f=| ∪ S i |. –c(τ): # clauses that τ satisfies. –M: multiset union of the sets S i. |M|=Σ|S i |=Σ2 n-ri is easy to compute. n X(τ): |M|/c(τ).

Constructing a new random variable n Example: –f EX =(x 1 ∧￢ x 2 ) ∨ (x 2 ∧ ￢ x 3 ) ∨ ( ￢ x 1 ). –S i : a set of truth assignments that satisfy clause C i. S 1 ={(1,0,0), (1,0,1)}, S 2 ={(0,1,0), (1,1,0)}, S 3 ={(0,0,0), (0,0,1), (0,1,0), (0,1,1)}, |S 1 |=2 3-2 =2, |S 2 |=2 3-2 =2,|S 3 |=2 3-1 =4. #f=| ∪ S i |, c(τ): # clauses that τ satisfies. M: multiset union of the sets S i. –M EX =. X(τ)=|M|/c(τ). x1x1 x2x2 x3x3 cX 00008/1 0010 01028/2 01118/1 x1x1 x2x2 x3x3 cX 1001 1011 1101 11100

An FPRAS n Example –f EX =(x 1 ∧￢ x 2 ) ∨ (x 2 ∧ ￢ x 3 ) ∨ ( ￢ x 1 ). n for i=1 to k –Pick one clause C j from f with prob. |S j |/|M|. C 1 with prob. 2/8. –Pick a truth assignment τ i satisfying C i at random. τ i = (1,0,1). –Find c(τ i ) and X(τ i )= |M|/c(τ i ). c(τ i )=1, X(τ i )=8/1=8. n end-for n output X k =(X (τ 1 )+…+ X(τ k ))/k –X k =(8+4+8+8)/4=7. n x1x1 x2x2 x3x3 cX 00008/1 0010 01028/2 01118/1 x1x1 x2x2 x3x3 cX 1001 1011 1101 11100 From lemma 28.2, τ is picked with prob. c(τ)/|M|.

Overview Lemma 28.5, Theorem 28.6 There is an FPRAS for counting DNF solutions. Lemma 28.2, 28.3 X is an unbiased estimator. Lemma 28.4 The variance of X is sufficiently small.

Lemma 28.2 n Random variable X can be efficiently sampled. –Sampling X is done with picking a random element from the multiset M. 1. pick a clause so that the probability of picking clause C i is | S i |/|M|. 2. among the truth assignments satisfying the picked clause, pick one at random. –The probability with which truth assignment τ is picked is

Lemma 28.3 n X is an unbiased estimator for #f.

Lemma 28.4 n α=|M|/m. n If m denotes the number of clauses in f, then #clauses in f. # clauses satisfied with τ the average number of truth assignments satisfying one clause the number of truth assignments satisfying at least one clause

Lemma 28.5 and Theorem 28.6 n Let k=4(m － 1) 2 /ε 2. For any ε>0, Chebyshev’s inequality There is an FPRAS for the problem of counting DNF solutions.

Issues in this chapter n Definitions for counting # solutions –#P, #P-complete, fully polynomial randomized approximation scheme (FPRAS). n Counting DNF solutions n Network reliability

Network reliability n Input: –a connected undirected graph G=(V, E), with failure prob. for each edge e. Parallel edges between two nodes are allowed. n Output: –The prob. that the graph becomes disconnected. Denote the prob. by FAIL(p). v1v1 v2v2 v3v3 v4v4 v5v5 v6v6 v7v7 v1v1 v2v2 v3v3 v4v4 v5v5 v6v6 v7v7 v1v1 v2v2 v3v3 v4v4 v5v5 v6v6 v7v7 disconnectedstill connected

Tractability of FAIL(p) n Tractable if FAIL(p) is not small. –“Small” means at least inverse polynomial. –FAIL(p) can be estimated by sampling. We will explain it later (in the proof of Theorem 28.11). n Intractable if FAIL(p) is small. –Sampling approaches do not work. Many samplings are required for the estimation. –In the following, we assume that FAIL(p) ≦ n － 4. n Pr(cut (C,C) gets disconnected)=p c. –where c is the number of edges crossing the cut. –p c decreases exponentially with capacity (# edges, c).

Ideas of the algorithm n For any ε>0, we will show that only polynomially many “small’’ cuts (in n and 1/ε) are responsible for 1 － ε fraction of the total failure probability FAIL(p). Moreover, these cuts, say E 1,…,E k, E i ⊆ E, can be enumerated in polynomial time. n We will construct a polynomial sized DNF formula f whose probability of being satisfied is precisely the probability that at least one of these cuts fails.

Illustration of the idea (1/2) v1v1 v3v3 v4v4 v5v5 v6v6 v7v7 v1v1 v3v3 v4v4 v5v5 v6v6 v7v7 v1v1 v3v3 v4v4 v5v5 v6v6 v7v7 Prob. p2p2 p3p3 p5p5 Ratio of prob. 1－ε1－ε ε Enumerable in polynomial time (Exercise 28.11-13)

Illustration of the idea (2/2) e1e1 e2e2 e2e2 e3e3 e4e4 cut E 1 cut E k e1e1 e2e2 cut E 2 e3e3 One-to-one correspondence x ei is true with probability p ei.

Lemma 28.8 (1/3) n The number of minimum cuts in G=(V,E) is bounded by n(n-1)/2. –Contractions of an edge. v1v1 v2v2 v7v7 v1v1 v3v3 v1v1 v 1 v 7 v 2,….,v 6 {v 1,….,v 6 }{v7}{v7} Cut ({v 1,….,v 6 }, {v 7 }) survives.

Lemma 28.8 (2/3) n Let M be the number of minimum cuts in G. –M is bounded by n(n-1)/2 if

Lemma 28.8 (3/3) n H: a graph at the beginning of contraction process. –Contractions never decrease the capacity of the minimum cut. The degree of each node in H is at least c. m is the number of nodes in H. Hence, H must have at least cm/2 edges. –The minimum cut survives with the probability (1- c/#edges).

Lemma 28.9 (1/3) n For any α ≧ 1, the number of α-min cuts in G is at most n 2α. –A cut is an α-min cut if its capacity is at most αc. –We assume α is a half-integer. Let k=2α. –Consider the two-phase process. 1. Contract edges at random until there remain k nodes in the graph. 2. Pick up a cut from all 2 k － 1 － 1 at random.

Lemma 28.9 (2/3) n Example –k=4. –Phase 1. –Phase 2. v1v1 v2v2 v7v7 v1v1 v3v3 v1v1 v1v1 v1v1

Lemma 28.9 (3/3)

FAIL(p) n In case that FAIL(p) ≦ n － 4. n The failure probability of a minimum cut is p c ≦ FAIL(p) ≦ n － 4. n p c = n － (2+δ). n From lemma 28.9, for any α ≧ 1, the total failure probability of all cuts of capacity αc is at most p cα n 2α = n － αδ.

Lemma 28.10 (1/3) n For any α, – –For bounding the total failure prob. of “large” capacity cuts. –Number all cuts in G by increasing capacity. c k : the capacity of the k-th cut in this numbering. p k : the failure probability of the k-th cut. a: the number of the first cut of capacity greater than αc. –It suffices to show that

Lemma 28.10 (2/3) n Illustration of the idea of lemma 28.10 –Number all cuts in G by increasing capacity.

Lemma 28.10 (3/3) n For c k (a ≦ k ≦ a+n 2α ), –c k >αc → p k

{ "@context": "http://schema.org", "@type": "ImageObject", "contentUrl": "http://images.slideplayer.com/14/4193294/slides/slide_36.jpg", "name": "Lemma 28.10 (3/3) n For c k (a ≦ k ≦ a+n 2α ), –c k >αc → p k

Theorem 28.11 (1/4) n There is an FPRAS for estimating network reliability. –In case that FAIL(p)>n － 4. –The network is connected/disconnected: binomial distribution. Sampling and Chernoff bound are used to estimate FAIL(p)=μ. The light blue areas are less than 1/4 if n>2.

Theorem 28.11 (2/4) n In case that FAIL(p) ≦ n － 4. n α must be determined for enumerating graphs with high probabilities such that This inequality is given in the textbook, but this seems to contradict with in p. 300. total failureone failure By lemma 28.10

Theorem 28.11 (3/4) n By lemma 28.9, n Pr[one of the first n 2α fails] ≧ (1 － ε)FAIL(p). n The first n 2α =O(n 4 /ε) cuts are enumerable in polynomial time (Exercise 28.11-13). p2p2 p3p3 p5p5 1－ε1－ε ε

Theorem 28.11 (4/4) n To reduce the case of arbitrary edge failure probabilities, parallel edges are used. failure probability p e parallel –(ln p e )/θ edges failure probability θ all edges are disconnected with prob.