Hardness of Approximation Complexity ©D.Moshkovits
Introduction Objectives: To show that several approximation problems are NP-hard. Overview: Reminder: how to show inapproximability? Probabilistically Checkable Proofs. Hardness of approximation for CLIQUE.
Promise Problems Sometimes you can promise something about the input. It doesn’t matter what you say for inputs that do not satisfy the promise. Example: “I know my graph has a clique of size n/4! Does it have a clique of size n/2?”
Promise Problems & Approximation We’ll see that promise problems of a certain type, called gap problems, can be used to prove hardness of approximation.
Optimization Problems Consider an optimization problem P. It consists of: instances x1, x2, x3, … (example: all graphs); feasible solutions for each instance (example: all cliques in that graph); an optimization measure (example: the clique’s size, to be maximized).
Each Instance Has an Optimal Solution [Figure: instances x1, x2, x3, x4 and their optimal values OPT.]
Approximation (Max Version) [Figure: the optimum OPT of an instance.]
How To Show Hardness of Approximation? Hardness of distinguishing far-off instances ⇒ hardness of approximation. [Figure: the value OPT of an instance xi lies either below A or above B; the range between A and B is the gap.]
Gap Problems (Max Version) Instance: … Problem: to distinguish between the following two cases: YES: the maximal solution > B. NO: the maximal solution ≤ A.
Formally: Claim: If the [A,B]-gap version of a problem is NP-hard, then that problem is NP-hard to approximate to within factor B/A.
Formally: Proof (for maximization): Suppose there is an approximation algorithm that, on an instance with optimum C*, outputs C ≤ C* so that C*/C ≤ B/A. A proper distinguisher: if C > A, return ‘YES’; otherwise return ‘NO’.
Proof Since C ≥ C*·A/B, (1) If C* > B (the correct answer is ‘YES’), then necessarily C ≥ C*·A/B > B·A/B = A (we answer ‘YES’) (2) If C*≤A (the correct answer is ‘NO’), then necessarily C≤C*≤A (we answer ‘NO’).
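The distinguisher from the proof can be sketched in a few lines. This is a minimal illustration, not part of the original slides: `worst_case_approx` is a hypothetical stand-in for an approximation algorithm that returns the smallest integer value its guarantee C*·A/B ≤ C ≤ C* allows, and the thresholds A = 25, B = 50 are arbitrary.

```python
from fractions import Fraction
from math import ceil

def worst_case_approx(c_star, A, B):
    """Hypothetical approximator: the smallest integer C that still
    satisfies the guarantee C* * A / B <= C <= C*."""
    return ceil(Fraction(c_star * A, B))

def distinguish(C, A):
    """The distinguisher from the proof: answer YES iff C > A."""
    return "YES" if C > A else "NO"

# Assumed example thresholds for the [A, B]-gap problem.
A, B = 25, 50
yes_answers = [distinguish(worst_case_approx(c, A, B), A) for c in range(B + 1, 101)]
no_answers = [distinguish(worst_case_approx(c, A, B), A) for c in range(0, A + 1)]
```

Even with the weakest output the guarantee permits, every instance with C* > B is answered YES and every instance with C* ≤ A is answered NO.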
Translating To Decision Problems To Prove Hardness Optimization problems → threshold problems (example: is the size of the max clique > ½n?). Approximation problems → gap problems (example: is the size of the max clique > ¾n or < ¼n?).
Idea We’ve shown that “standard” problems are NP-hard by reductions from 3SAT. We want to prove that gap problems are NP-hard. Why not prove that some canonical gap problem is NP-hard and reduce from it? If a reduction reduces one gap problem to another, we refer to it as gap-preserving.
Gap-3SAT[ε] Instance: a set of 3-clauses {c1,…,cm} over variables v1,…,vn. Problem: to distinguish between the following two cases: YES: there exists an assignment that satisfies all clauses. NO: no assignment can satisfy more than 7/8+ε of the clauses.
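Gap-3SAT: Example [Formula not recoverable from the extracted text; the surviving fragments are ( x1 x2 x3 ) ( x1 x2 x2 ), with connectives and negations lost.] The assignment { x1 ← F ; x2 ← T ; x3 ← F } satisfies 5/6 of the clauses.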
Why 7/8? Claim: For any set of clauses, each with exactly three independent literals (literals over three distinct variables), there always exists an assignment that satisfies at least 7/8 of the clauses.
The Probabilistic Method Proof: Consider a random assignment to x1, x2, x3, …, xn, each variable set to T or F independently with probability 1/2.
1. Find the Expectation Let Yi be the random variable indicating the outcome of the i-th clause (1 if it is satisfied, 0 otherwise). For any 1 ≤ i ≤ m, E[Yi] = 0·(1/8) + 1·(7/8) = 7/8, because exactly one of the 8 assignments to the clause’s three variables falsifies it.
1. Find the Expectation The number of clauses satisfied is the random variable Y = ∑Yi. By the linearity of expectation: E[Y] = E[∑Yi] = ∑E[Yi] = (7/8)m.
2. Conclude Existence Thus, there exists an assignment which satisfies at least the expected number of clauses.
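The two steps of the argument can be checked exhaustively on a small formula. The sketch below is illustrative only: the clause encoding (a literal +i stands for xi, and -i for its negation) and the eight example clauses are assumptions, chosen so that every clause uses three distinct variables.

```python
from fractions import Fraction
from itertools import product

def satisfied(clauses, assignment):
    """Number of clauses satisfied by assignment (dict: variable -> bool)."""
    return sum(
        any(assignment[abs(l)] == (l > 0) for l in clause)
        for clause in clauses
    )

def expectation_and_best(clauses, n):
    """Average and maximum number of satisfied clauses over all 2^n assignments."""
    counts = []
    for bits in product([False, True], repeat=n):
        assignment = {i + 1: bits[i] for i in range(n)}
        counts.append(satisfied(clauses, assignment))
    return Fraction(sum(counts), len(counts)), max(counts)

# Assumed example: 8 clauses over x1..x4, three distinct variables each.
clauses = [(1, 2, 3), (-1, 2, 4), (1, -3, -4), (-2, 3, 4),
           (-1, -2, -3), (1, -2, 4), (-1, 3, -4), (2, -3, 4)]
mean, best = expectation_and_best(clauses, 4)
```

Here `mean` is the exact expectation E[Y] under a uniform assignment, and `best` can never fall below it, which is the existence claim.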
PCP (Without Proof) Theorem (PCP): For any ε>0, Gap-3SAT[ε] is NP-hard. This is tight! Gap-3SAT[0] is polynomial-time decidable.
Why Is It Called PCP? (Probabilistically Checkable Proofs) 3SAT has a polynomial-size membership proof checkable in polynomial time. Prover: “My formula is satisfiable!” Verifier: “Prove it!” Prover: “This assignment satisfies it!” [Figure: an assignment to x1, x2, …, xn.]
Why Is It Called PCP? (Probabilistically Checkable Proofs) …Now our verifier has to check that the assignment satisfies all clauses…
Why Is It Called PCP? (Probabilistically Checkable Proofs) But gap-3SAT also has a polynomial-size membership proof checkable in polynomial time. Prover: “My formula is satisfiable!” Verifier: “Prove it!” Prover: “This assignment satisfies it!” [Figure: an assignment to x1, x2, …, xn.]
Why Is It Called PCP? (Probabilistically Checkable Proofs) In a NO instance of gap-3SAT, every assignment leaves about 1/8 of the clauses (at least a 1/8−ε fraction) unsatisfied! So for gap-3SAT the verifier would be right with high probability even if it picks a constant number of clauses at random and checks only them.
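A sampling verifier of this kind can be sketched as follows. This is an illustration under assumptions, not the construction behind the PCP theorem: the clause encoding (+i for xi, -i for its negation) and the function names are hypothetical. In a NO instance each sampled clause is satisfied with probability at most 7/8+ε, so t independent samples all pass with probability at most (7/8+ε)^t.

```python
import random

def clause_ok(clause, assignment):
    """True iff the clause has at least one literal made true."""
    return any(assignment[abs(l)] == (l > 0) for l in clause)

def sampling_verifier(clauses, assignment, t, rng):
    """Accept iff t clauses sampled uniformly (with replacement) are all satisfied."""
    return all(clause_ok(rng.choice(clauses), assignment) for _ in range(t))

def acceptance_bound(eps, t):
    """Upper bound on the probability of accepting a NO instance."""
    return (7 / 8 + eps) ** t

# A satisfying assignment is always accepted (assumed toy formula).
accepted = sampling_verifier(
    [(1, 2, 3), (-1, 2, 3)], {1: True, 2: True, 3: False}, 10, random.Random(0)
)
```

The number of samples t depends only on ε and the desired error, not on the size of the formula.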
Why Is It Called PCP? (Probabilistically Checkable Proofs) Since gap-3SAT is NP-hard, all NP problems have probabilistically checkable proofs.
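Gap Preserving Reductions [Figure: two gap problems, each split into YES, “don’t care” and NO instances; the reduction maps YES instances to YES instances and NO instances to NO instances.]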
Hardness of Approximation Do the reductions we’ve seen also work for the gap versions (i.e., are they approximation preserving)? We’ll revisit the CLIQUE example.
CLIQUE Construction A vertex for each literal occurrence; a part for each clause; an edge between two vertices of different parts indicates consistency: neither literal is the negation of the other.
Cliques & Truth-Assignments A clique C ⊆ V corresponds to an assignment A to the variables s.t. ∀ℓ ∈ C, A(ℓ) = T. An edge between two vertices implies that the corresponding literals can both be assigned T. Thus each clique corresponds to a truth assignment that satisfies every clause whose part the clique meets.
Gap Preservation If there is an assignment that satisfies all clauses, there is a clique of size m. If there is a clique of size δm (for some 0<δ<1), there is an assignment that satisfies at least a δ fraction of the clauses.
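Both directions can be checked by brute force on small formulas: the maximum clique of the constructed graph equals the maximum number of simultaneously satisfiable clauses. The sketch below is illustrative; the clause encoding (+i for xi, -i for its negation) and the function names are assumptions, and the clique search relies on each part being an independent set, so a clique takes at most one vertex per part.

```python
from itertools import combinations, product

def build_clique_edges(clauses):
    """Vertices are (clause index, literal); an edge joins two literals in
    different clauses when neither is the negation of the other."""
    vertices = [(i, l) for i, c in enumerate(clauses) for l in c]
    return {
        frozenset((u, v))
        for u, v in combinations(vertices, 2)
        if u[0] != v[0] and u[1] != -v[1]
    }

def max_clique(clauses, edges):
    """Largest clique, trying at most one vertex (or none) per part."""
    best = 0
    options = [[None] + [(i, l) for l in c] for i, c in enumerate(clauses)]
    for choice in product(*options):
        picked = [v for v in choice if v is not None]
        if len(picked) > best and all(
            frozenset(p) in edges for p in combinations(picked, 2)
        ):
            best = len(picked)
    return best

def max_sat(clauses, n):
    """Maximum number of clauses satisfied by any assignment to x1..xn."""
    return max(
        sum(any(bits[abs(l) - 1] == (l > 0) for l in c) for c in clauses)
        for bits in product([False, True], repeat=n)
    )

# Assumed example: a satisfiable formula with m = 4 clauses.
clauses = [(1, 2, 3), (-1, -2, 3), (1, -2, -3), (-1, 2, -3)]
clique_size = max_clique(clauses, build_clique_edges(clauses))
sat_count = max_sat(clauses, 3)
```

Both searches are exponential and only meant for tiny instances.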
Gap-CLIQUE (Ver1) The following problem is NP-hard for any ε>0: Instance: a graph G=(V,E) composed of m independent sets of size 3. Problem: to distinguish between: YES: there’s a clique of size m=|V|/3. NO: every clique is of size at most (7/8+ε)m.
Corollary Theorem: for any ε>0, CLIQUE is NP-hard to approximate to within a factor of 1/(7/8+ε).
Can We Do Better? The bigger the gap is, the better the hardness result. We’ll see an improved result for CLIQUE.
Amplification Given an instance of the Gap-3SAT problem and a constant k (to be determined later), build a graph with: a part for every k clauses; in each part, a vertex for each satisfying assignment to the k clauses; an edge between two vertices indicates consistency.
Boolean Assignments Each clause has at most 7 satisfying assignments. Thus k clauses have at most 7^k satisfying assignments.
Consistency Two assignments are inconsistent if they give the same variable different truth values. Example: (x, y, z) = (F, F, T) and (x, w, y) = (F, T, T) are inconsistent, since they disagree on y.
The Graph G=<V, E> Given φ = {C1, …, Cm} over variables y1, …, yn, denote by Y(C1, …, Ck) the set of variables which appear in C1, …, Ck. Vertices: for every k clauses, all assignments to their variables that satisfy all k of them. Edges: between every two consistent assignments.
Cliques & Assignments Observation: A clique on some of the parts corresponds to an assignment which satisfies all relevant clauses.
Correctness (1) If there is a satisfying assignment, then picking the corresponding assignment in each of the parts yields a clique of size (m choose k), i.e. m!/(k!(m−k)!).
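The amplified graph and the clique obtained from a satisfying assignment can be sketched on a toy instance. Everything below is illustrative: the clause encoding (+i for yi, -i for its negation), the function names, and the example formula with k = 2 are assumptions.

```python
from itertools import combinations, product
from math import comb

def part_vertices(clauses, idxs):
    """All assignments to the variables of the chosen clauses that satisfy
    every one of them; each vertex is (idxs, ((variable, value), ...))."""
    variables = sorted({abs(l) for i in idxs for l in clauses[i]})
    verts = []
    for bits in product([False, True], repeat=len(variables)):
        a = dict(zip(variables, bits))
        if all(any(a[abs(l)] == (l > 0) for l in clauses[i]) for i in idxs):
            verts.append((idxs, tuple(a.items())))
    return verts

def consistent(u, v):
    """Two vertices are consistent iff they agree on every shared variable."""
    a, b = dict(u[1]), dict(v[1])
    return all(a[x] == b[x] for x in a.keys() & b.keys())

def restrict(assignment, clauses, idxs):
    """The vertex of part idxs induced by a full assignment."""
    variables = sorted({abs(l) for i in idxs for l in clauses[i]})
    return (idxs, tuple((x, assignment[x]) for x in variables))

# Assumed example: m = 4 clauses, k = 2, satisfied by the all-True assignment.
clauses = [(1, 2, 3), (-1, 2, 4), (1, -3, -4), (-2, 3, 4)]
k = 2
parts = {idxs: part_vertices(clauses, idxs)
         for idxs in combinations(range(len(clauses)), k)}
full = {1: True, 2: True, 3: True, 4: True}
clique = [restrict(full, clauses, idxs) for idxs in parts]
```

Two distinct vertices of the same part always disagree on some variable, so every part is an independent set, as in the definition of the gap problem.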
Observation Fix an assignment. If 1/8 of the clauses are false, then only about a (7/8)^k fraction of the sets of k clauses are satisfied by it.
Correctness (2) For any 0<δ<1, set k so that (7/8+ε)^k < δ. If there is a clique with representatives in ≥ δ of the parts ⇒ there is an assignment satisfying ≥ δ fraction of the k-tuples of clauses ⇒ this rules out the NO case, in which no assignment satisfies more than 7/8+ε of the clauses.
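The choice of k and the counting behind the observation can be made concrete. The sketch below is illustrative; the function names and the sample values of ε, δ, m are assumptions. If an assignment satisfies `sat` out of m clauses, the fraction of k-subsets it satisfies entirely is C(sat, k)/C(m, k), which is at most (sat/m)^k.

```python
from math import comb

def smallest_k(eps, delta):
    """Smallest k with (7/8 + eps)^k < delta."""
    k = 1
    while (7 / 8 + eps) ** k >= delta:
        k += 1
    return k

def satisfied_tuple_fraction(m, sat, k):
    """Fraction of k-subsets of the m clauses that lie entirely among
    the `sat` clauses an assignment satisfies."""
    return comb(sat, k) / comb(m, k)

# Assumed sample parameters.
eps, delta = 0.01, 0.1
k = smallest_k(eps, delta)
fraction = satisfied_tuple_fraction(80, 70, 5)  # 7/8 of 80 clauses satisfied
```

Since ε and δ are constants, so is k, and the amplified graph stays polynomial in size.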
Gap-CLIQUE (Ver2) The following problem is NP-hard for any 0<δ<1: Instance: a graph G=(V,E) composed of m independent sets of size r. Problem: to distinguish between: YES: there’s a clique of size m = |V|/r. NO: every clique is of size at most δm.
Corollary Theorem: MAX-CLIQUE is NP-hard to approximate to within any constant factor.
Chromatic Number Instance: a graph G=(V,E). Problem: to minimize k, so that there exists a function f: V → {1,…,k} for which (u,v) ∈ E ⇒ f(u) ≠ f(v).
Chromatic Number Observation: Each color class is an independent set.
Clique Cover Number (CCN) Instance: a graph G=(V,E). Problem: to minimize k, so that there exists a function f: V → {1,…,k} for which, for every u ≠ v, (u,v) ∈ E ⇐ f(u) = f(v).
Observation Claim: The CCN problem on graph G is the CHROMATIC-NUMBER problem of the complement graph Gc.
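The claim can be checked by brute force on small graphs. The sketch below is illustrative only: vertices are assumed to be 0,…,n−1, the function names are hypothetical, and the search is exponential.

```python
from itertools import combinations, product

def min_labels(n, ok_same):
    """Smallest k for which some f: V -> {0..k-1} gives equal labels only
    to pairs (u, v) allowed by ok_same."""
    for k in range(1, n + 1):
        for f in product(range(k), repeat=n):
            if all(ok_same(u, v)
                   for u, v in combinations(range(n), 2) if f[u] == f[v]):
                return k
    return 0  # only reached for the empty graph

def chromatic_number(n, edges):
    E = {frozenset(e) for e in edges}
    # Same color is allowed only for non-adjacent vertices.
    return min_labels(n, lambda u, v: frozenset((u, v)) not in E)

def clique_cover_number(n, edges):
    E = {frozenset(e) for e in edges}
    # Same class is allowed only for adjacent vertices.
    return min_labels(n, lambda u, v: frozenset((u, v)) in E)

def complement(n, edges):
    E = {frozenset(e) for e in edges}
    return [p for p in combinations(range(n), 2) if frozenset(p) not in E]

# Assumed example: the 5-cycle.
c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
ccn = clique_cover_number(5, c5)
chi_of_complement = chromatic_number(5, complement(5, c5))
```

The two functions differ only in the predicate passed to `min_labels`, which mirrors the one-symbol difference between the two problem definitions.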
Reduction Idea [Figure: the CLIQUE graph G, with m parts, is transformed into the CCN graph G’, with m layers of q vertices each. The transformation is clique preserving, and G’ is the same under cyclic shift.]
Correctness Given such a transformation: MAX-CLIQUE(G) = m ⇒ CCN(G’) = q (q to be determined later).
T is unique for triplets Transformation T: V → [q] such that for any v1,v2,v3,v4,v5,v6: T(v1)+T(v2)+T(v3) ≡ T(v4)+T(v5)+T(v6) (mod q) ⇒ {v1,v2,v3} = {v4,v5,v6}.
Observations Such T is unique for pairs and for single vertices as well: if T(x)+T(u) ≡ T(v)+T(w) (mod q), then {x,u} = {v,w}; if T(x) ≡ T(y) (mod q), then x = y. So T is unique for triplets, pairs and single vertices.
Vertices of CCN Each vertex v ∈ G_CLIQUE in layer i is mapped to the vertex T(v) ∈ G_CCN in layer i. [Figure: G (CLIQUE), with m layers, mapped to G’ (CCN), with m layers of q vertices each.]
Using the Transformation G_CCN has m layers, each with q vertices labeled 0,…,q−1. [Figure: vertices vi and vj of the CLIQUE graph are mapped to T(vi) = 1 and T(vj) = 4 among the labels 0, 1, 2, 3, 4, …, q−1 of the CCN graph.]
Edges of CCN (1) (s,t) ∈ E_CLIQUE ⇒ (T(s),T(t)) ∈ E_CCN.
Edges of CCN (2) Closing edges under cyclic shift: for every (x,y) ∈ E_CCN s.t. x ∈ layer_i and y ∈ layer_j, if x’−y’ ≡ x−y (mod q), x’ ∈ layer_i and y’ ∈ layer_j, then (x’,y’) ∈ E_CCN.
Completing the CCN Graph Construction [Figure: x = T(s) = 3 and y = T(t) = 4 in G’; every pair x’, y’ in the same two layers with x’−y’ ≡ x−y (mod q) is also joined by an edge.]
Edge Origin Unique First Observation: an edge of G’ comes only from one edge (s,t) of G. (x,y) ∈ E’ iff ∃(s,t) ∈ E, ∃r ∈ {0,…,q−1} s.t. x ≡ T(s)+r and y ≡ T(t)+r (mod q).
Triangle Consistency Second Observation: A triangle only comes from a triangle. [Figure: a triangle a, b, c in the CLIQUE graph and the triangle x = T(a), y = T(b), z = T(c) in the CCN graph.]
Proof - Triangle Consistency Claim: <x,y,z> is a triangle in G_CCN iff there are a triangle <a,b,c> in G_CLIQUE and r ∈ [q] s.t. x ≡ T(a)+r, y ≡ T(b)+r, z ≡ T(c)+r (mod q).
Proof (⇐): trivial, by the definition of the edges of G_CCN.
Proof (⇒): (x,y) ∈ E’ iff ∃(a1,b1) ∈ E, ∃r1 ∈ [q] s.t. x ≡ T(a1)+r1, y ≡ T(b1)+r1. (y,z) ∈ E’ iff ∃(b2,c2) ∈ E, ∃r2 ∈ [q] s.t. y ≡ T(b2)+r2, z ≡ T(c2)+r2. (z,x) ∈ E’ iff ∃(c3,a3) ∈ E, ∃r3 ∈ [q] s.t. z ≡ T(c3)+r3, x ≡ T(a3)+r3.
Hence x+y+z ≡ T(a1)+T(b2)+T(c3)+r1+r2+r3 ≡ T(a3)+T(b1)+T(c2)+r3+r1+r2 (mod q), so T(a1)+T(b2)+T(c3) ≡ T(a3)+T(b1)+T(c2) (mod q). Since T is unique for triplets, {a1,b2,c3} = {a3,b1,c2}, and because the ai, the bi and the ci come from 3 different layers, a1 = a3, b1 = b2 and c2 = c3. Comparing the two expressions for each of x, y and z then gives r1 = r2 = r3.
Clique Preservation Corollary: {c1,…,ck} is a clique in the CLIQUE graph iff {T(c1)+r,…,T(ck)+r}, for r ∈ {0,…,q−1}, are q different cliques in the CCN graph.
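The construction of the CCN graph and the two observations can be tried out on a toy instance. The sketch below is illustrative only: the vertex names, the layer assignment, the values of T and the modulus q = 40 are all assumptions. The values 0, 1, 4, 13 have pairwise distinct triple sums, all below 40, so T is unique for triplets modulo q.

```python
from itertools import combinations

def build_ccn_edges(layer, edges, T, q):
    """Edge set of G': every edge (s, t) of G, shifted cyclically by each r."""
    e_prime = set()
    for s, t in edges:
        for r in range(q):
            x = (layer[s], (T[s] + r) % q)  # copy of s, shifted by r
            y = (layer[t], (T[t] + r) % q)  # copy of t, shifted by r
            e_prime.add(frozenset((x, y)))
    return e_prime

def is_clique(vertices, edge_set):
    return all(frozenset(p) in edge_set for p in combinations(vertices, 2))

def shifted(clique, layer, T, q, r):
    """The copy of a clique of G in G', shifted by r."""
    return frozenset((layer[c], (T[c] + r) % q) for c in clique)

# Toy instance (assumed): three layers and a single triangle {a0, b0, c0}.
layer = {"a0": 0, "a1": 0, "b0": 1, "c0": 2}
edges = [("a0", "b0"), ("b0", "c0"), ("a0", "c0"), ("a1", "b0")]
T = {"a0": 0, "a1": 1, "b0": 4, "c0": 13}
q = 40
ccn_edges = build_ccn_edges(layer, edges, T, q)
copies = {shifted(["a0", "b0", "c0"], layer, T, q, r) for r in range(q)}
triangles = sum(
    1
    for x in range(q) for y in range(q) for z in range(q)
    if is_clique([(0, x), (1, y), (2, z)], ccn_edges)
)
```

In this instance the single triangle of G yields exactly q triangles in G’, one per cyclic shift, and the extra edge (a1, b0) creates none.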
What did we get by now? MAX-CLIQUE(G) = m ⇒ (by the last corollary) G’ has q distinct cliques of size m, which together cover its mq vertices ⇒ CCN(G’) = q. MAX-CLIQUE(G) < δm ⇒ every clique of G’ has fewer than δm vertices ⇒ CCN(G’) > mq/(δm) = q/δ.
What Remains? It remains to show how to construct the transformation T in polynomial time.
Greedy Construction [Figure: the vertices determined so far (v1, …, v5), the next vertex v6, and its feasible and forbidden values.]
Greedy Construction - Analysis Fewer than n^5 values are ruled out in total (one for each choice of a triple and a couple), so for q = n^5 the greedy construction works. Corollary: There exists a polynomial-time algorithm which constructs a triplet-unique transformation with q = n^5.
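A greedy construction along these lines can be sketched as follows. This is a simplified illustration, not necessarily the procedure on the slides: instead of listing the forbidden values explicitly, it tests each candidate value directly, and it reads "unique for triplets" as a statement about multisets (repeated vertices allowed), which is what the pair and single-vertex observations rely on. The function names are hypothetical.

```python
from itertools import combinations_with_replacement

def is_triplet_unique(values, q):
    """True iff all multisets of three values have distinct sums mod q."""
    seen = set()
    for triple in combinations_with_replacement(values, 3):
        s = sum(triple) % q
        if s in seen:
            return False
        seen.add(s)
    return True

def greedy_transformation(vertices, q):
    """Assign values one vertex at a time, taking the smallest value that
    keeps the transformation unique for triplets."""
    T = {}
    for v in vertices:
        chosen = list(T.values())
        for t in range(q):
            if is_triplet_unique(chosen + [t], q):
                T[v] = t
                break
        else:
            raise ValueError("q too small for the greedy construction")
    return T

# Assumed example: n = 5 vertices and q = n^5.
n = 5
T = greedy_transformation([f"v{i}" for i in range(1, n + 1)], n ** 5)
```

Each step tests at most q candidates against polynomially many triples, so the whole procedure runs in time polynomial in n when q = n^5.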
Corollaries Theorem: CCN is NP-hard to approximate within any constant factor. Theorem: CHROMATIC-NUMBER is NP-hard to approximate within any constant factor.
Summary We saw how to show hardness of approximation and explained the concept of gap problems. We presented the PCP theorem, stating that 3SAT is hard to approximate within some constant factor.
Summary We saw that some of the reductions we know are approximation preserving. That was the case for the 3SAT ≤p CLIQUE reduction.
Summary However, that reduction gave us a weak result for CLIQUE, so we showed how to amplify it.
Summary Then we introduced a new problem, called CHROMATIC-NUMBER. We reduced gap-CLIQUE to its gap version, showing it was in fact NP-hard to approximate.