Download presentation

Presentation is loading. Please wait.

Published byJordan Norland Modified over 4 years ago

1
A threshold of ln(n) for approximating set cover By Uriel Feige Lecturer: Ariel Procaccia

2
Set Cover We are given an (unweighted!) collection F of subsets of U={1,…,n}. We must find as few as possible subsets from F such that their union covers U. Hypergraph vertex cover (lecture of 10/3) is a special case of set cover.

3
Approximation algorithms A greedy algorithm: Recall algorithms 1… At each stage, add to the cover the subset which maximizes the number of new elements. This algorithm gives an approximation of ln(n)- lnln(n)+O(1). An approximation of ln(n) can also be achieved using a linear programming relaxation.

4
A bad example for the greedy algorithm U={(x,y): 1 ≤ x,y ≤ n} S 1 = {(x,y): x ≤ n/2} S 2 = {(x,y): x ≥ n/2} T 1 = {(x,y): 1 ≤ y ≤ n/2} T 2 = {(x,y): n/2 ≤ y ≤ 3n/4} … 32 16 8 8 Ratio: logn/2

5
Overview (1) We wish to prove Theorem 8: If a poly time algorithm can approximate set cover within (1- ε)ln(n), then NP is a subset of DTIME(n O(loglog(n)) ). The strategy: a reduction from a k-prover proof system for an NP-complete problem. Familiarwith IP?

6
Overview (2) Parts of the lecture: MAX 3SAT-5. The k-prover proof system. Partition systems. The reduction to set cover. Thundering applause. May the force be with us. Similar to the lecture of 17/3.

7
MAX 3SAT-5 Input: A CNF formula with n variables and 5n/3 clauses, in which every clause contains exactly three literals, every variable appears in exactly 5 clauses, and a variable does not appear in a clause more than once. Theorem 1: For some >0, it is NP-hard to distinguish between 3CNF-5 formulas where OPT=1 or OPT≤(1- ). Reduction from MAX- 3SAT-B

8
Two prover proof system for 3SAT- 5 Presented as to provide intuition and help in the analysis of the k-prover system. One round, two prover system for 3SAT-5. Protocol: V selects an index of a clause, sends it to the first prover, selects a random var in the clause, and sends it to the second prover. First prover returns 3 bits, second returns 1 bit. V accepts if the following conditions hold: Clause check: the assignment sent by the first prover satisfies the clause. Consistency check: the assignment sent by the second prover is identical to the assignment for the same variable sent by the first prover. Assgn. To clause and var

9
Proposition 2: If OPT=1- ε, then under the optimal strategy of the provers, V accepts with probability (1-ε/3). Proof: The strategy of the second prover defines an assignment A to the variables. If V selects a clause that is not satisfied by A (prob. ε), the first prover must set one of the variables differently from A. The consistency check fails with prob. ≥ 1/3. ■ There is a strategy with acceptance prob ε/3

10
Parallel repetition We would like to lower the error. Modified proof system: V sends to the first prover l clauses, from each chooses a var, and sends these l vars to the second prover. Theorem 3 (Ran Raz): If a one round two prover system is repeated l times independently in parallel, then the error is 2 -cl, where c>0 is a constant that depends only on the original proof system. The error of the modified two prover system is ≤ 2 -cl, for some universal constant c. C for subtle

11
The k-prover proof system Binary code with k codewords, each of length l and weight l/2, with Hamming distance at least l/3. Each prover is associated with a codeword. The Protocol: The verifier selects l clauses C 1,…,C l, then selects a var from each clause: the distinguished variables x 1,…,x l. Prover P i receives C j for those coordinates in its codeword that are 1, and x j for the coordinates that are 0, and replies with 2l bits. The answer of the prover induces an assignment to the distinguished variables. Acceptance predicate: Weak: at least one pair of provers is consistent. Strong: every pair of provers is consistent. P 1 : 0011P 2 : 0101P 3 : 1100 V c 1 c 2 v 3 v 4 100,010,1,0

12
Lemma 4: If OPT=1 then the provers have a strategy that causes V to always strongly accept. If OPT≤(1- ε), then the verifier weakly accepts with probability at most k 2 2 -cl, where c>0 is a constant that depends only on ε. Proof: If OPT=1, the provers can base their answers on a satisfying assignment. Assume OPT=(1- ε), and that V weakly accepts with prob. ≥ δ. Then with respect to P i and P j, V accepts with probability δ/k 2. There are ≥ l/6 coordinates on which P i receives a clause, and P j a var in this clause. Other coordinates: for free. The provers have a strategy that succeeds with prob. ≥ δ/k 2 on l/6 parallel repetitions of the original proof system. δ/k 2 <2 -cl. ■

13
Partition systems A partition system B(m,L,k,d) has the following properties. 1. There exists a ground set B of m distinct points. 2. There is a collection of L distinct partitions p 1,…,p L. 3. For 1≤i≤L, partition p i is a collection of k disjoint subsets of B whose union is B. 4. Any cover of the m points by subsets that appear in pairwise different partitions requires at least d subsets. Lemma 6: For every c≥0 and m sufficiently large, there is a partition system B(m,L,k,d) whose parameters satisfy the following inequalities: 1. L ≈ (log(m)) c. 2. K can be chosen arbitrarily as long as k < ln(m)/3ln(ln(m)). 3. d = (1-f(k))k∙ln(m), where f(k)→∞ as k→∞. Proof: a randomized construction usually works. Looks familiar? L=4 k = 4

14
The reduction R=(5n) l : num of r. r↔B r (m,L,k,d): m=n Θ(l), L=2 l, d=(1-f(k))k∙ln(m). Partition↔dist. vars, subset↔prover. B(r,j,i) = i’th subset, partition j, partition system r. Subsets are S(q,a,i): all r s.t. (q,i)@r, extract from a an assignment a r to dist. vars. S(q,a,i) is the union of all subsets B(r,a r,i), for all r s.t. (q,i)@r. Q=n l/2 (5n/3) l/2 (questions to P i ). r=1 r=2 r=R L=2 l m points, k subsets 00 01 10 11 B(2,4,3) S(q,a,1) |U|?

15
Lemma 5: 1. Completeness: OPT=1 there is a set cover of size kQ. 2. Soundness: OPT=(1- ε) more than (1- 2f(k))kQln(m) subsets are required. Proof of Completeness: If OPT=1, the provers answer consistently with the satisfying assignment. For any r, consider S(q 1,a 1,r),…,S(q k,a k,r) s.t. (q i,i)@r, and a i is the appropriate answer. B r (m,L,k,d) is covered by these k sets. Similar for every r. The number of subsets is kQ. ■ a r =01

16
Proof of soundness: the saga begins Assume OPT≤1-ε, and that there exists C that covers U, such that |C|=(1-δ)kQln(m), δ=2f(k). q to prover P i ↔ weight w(q,i)=number of answers a s.t. S(q,a,i) is in C. Σ q,i w(q,i) = |C|. r↔w(r)= Σ (q,i)@r w(q,i). This weight = number of subsets that participate in covering B r (m,L,k,d). Call r good if w(r)<(1-δ/2)k∙ln(m).

17
The good, the bad, and the random Proposition 6: The fraction of good r is at least δ/2. Proof: Assume otherwise, then: On the other hand, Hence |C|>(1- δ)kQln(m) – contradiction. ■ r is good if: w(r)= Σ (q,i)@r w(q,i)<(1-δ/2)k∙ln(m)

18
Proposition 7: C covers U, |C|=(1-δ)kQln(m) for some strategy for the k provers, V weakly accepts with prob. ≥ 2δ/(k∙ln(m)) 2. This proves soundness, since 2δ/(k∙ln(m)) 2 >k 2 2 -cl, for l=Θ(loglogn). Proof of proposition 7: Randomized strategy: on q to P i, select a from the set of a s.t. S(q,a,i) is in C. For a fixed r: sets B(r,p,i) in the cover of B r (m,L,k,d) ↔ sets S(q,a,i) in C. For a good r: C used two subsets from p in the cover of B r (m,L,k,d). Denote B(r,p,i) and B(r,p,j) ↔ S(q i,a i,i) and S(q j,a j,j). Denote: a is in A r,i iff S(q i,a,i) is in C. r is good |A r,i |+|A r,j |<k∙ln(m). The prob. that the provers answer a i and a j ≥ 4/(k∙ln(m)) 2. Answers are consistent with p V weakly accepts. The prob. that V chooses a good r ≥ δ/2. ■ Fix coin tosses for provers

19
Theorem 8: If there is some ε>0 such that a polynomial time algorithm can approximate set cover within (1-ε)ln(n), then NP is a subset of DTIME(n O(loglog(n)) ). Proof: Assume there is a poly time algorithm A that approximates set cover within (1-ε)ln(m). Reduce GAP-3SAT-5 to set cover as described, with k s.t. f(k)<ε/4, and m=(5n) 2l/ε. m,R,Q=n O(loglogn) time to perform the reduction is n O(loglogn). ln(m)>(1-ε/2)ln(N), where N=mR. By lemma 5, for a YES instance all points can be covered by kQ subsets, and for a NO instance all points cannot be covered by (1-2f(k))kQln(m). The ratio is (1-2f(k))ln(m)>(1-ε)lnN. ■

20
Closing remarks (1) Max k-cover: Given U and F as before, select k subsets that such that their union has maximum cardinality. The obvious greedy algorithm approximates max k- cover within a ratio of at least 1-1/e ≈ 0.632. Theorem 8 can be directly used to prove: if max k- cover can be approximated in a polynomial time within a ration of (1 - 1/e + ε) for some ε>0, then NP is a subset of DTIME(n O(loglogn) ).

21
Closing remarks (2) Refinements: In our hardness of approximation result for set cover, ε is a constant. We may strengthen our assumption so as to improve our result. ZTIME=class of languages that have a probabilistic algorithm that runs in expected time t (with zero error). If for some η>0, NP is not a subset of ZTIME(2 n^η), then for some constant c’>0 there is no polynomial time algorithm that approximates set cover within ln(n) –c’(lnln(n)) 2.

22
Questions and cool animations ?

Similar presentations

OK

Linear Program Set Cover. Given a universe U of n elements, a collection of subsets of U, S = {S 1,…, S k }, and a cost function c: S → Q +. Find a minimum.

Linear Program Set Cover. Given a universe U of n elements, a collection of subsets of U, S = {S 1,…, S k }, and a cost function c: S → Q +. Find a minimum.

© 2018 SlidePlayer.com Inc.

All rights reserved.

To make this website work, we log user data and share it with processors. To use this website, you must agree to our Privacy Policy, including cookie policy.

Ads by Google