# Exploring the border between P and NP Uriel Feige Weizmann Institute 1.

Exploring the border between P and NP Uriel Feige Weizmann Institute 1

Computational Intractability Convention: tractable = polynomial time. When a computational problem appears to be intractable, can we pinpoint what makes it intractable? 2

Distant History Many natural problems had no known polynomial time algorithm: traveling salesperson TSP, satisfiability of Boolean formulas SAT, maximum independent set MIS, etc. At that time: no unifying theory explaining why. 3

NP-hardness [Cook, Levin, Karp] 4

What makes a problem intractable? Almost a dichotomy: Either a problem is in P, or it encodes 3SAT. A computational problem is intractable if (and only if) it encodes 3SAT. 5

What next? What about problems not known to be NP-hard (factoring, graph isomorphism, Nash equilibrium)? Not addressed in this talk. Look more closely at 3SAT, trying to understand what exactly is computationally difficult about it. 6

Approximation 7

Hardness of approximation Approximating 3SAT within a ratio better than 7/8 is NP-hard [Hastad]. Given a satisfiable 3CNF, on the road to advancing from 7/8 to 1, we get stuck already at 7/8 – we do not know how to get started. If we could start, we could also finish. 8

What makes 3SAT intractable? 3SAT is intractable because we do not have a handle on how to start looking for a satisfying assignment. We cannot even find an assignment better than random (with respect to fraction of clauses satisfied). 9
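
The "no better than random" baseline can be checked by brute force on a small formula. The sketch below is only an illustration: the signed-integer clause encoding (`v` for a variable, `-v` for its negation) and the function names are assumptions, not part of the talk. A clause on three distinct variables is falsified by exactly one of the 8 settings of those variables, so the average fraction of satisfied clauses over all assignments is 7/8.

```python
from fractions import Fraction
from itertools import product

def satisfied(clause, assignment):
    """A clause is a tuple of nonzero ints: v means variable v, -v its negation."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)

def average_fraction_satisfied(clauses, n):
    """Exact average, over all 2^n assignments, of the fraction of satisfied clauses."""
    total = Fraction(0)
    for bits in product([False, True], repeat=n):
        assignment = {v: bits[v - 1] for v in range(1, n + 1)}
        num_sat = sum(satisfied(c, assignment) for c in clauses)
        total += Fraction(num_sat, len(clauses))
    return total / 2 ** n

# Small hypothetical formula on 4 variables, each clause on 3 distinct variables.
example = [(1, 2, 3), (-1, 3, 4), (2, -3, 4)]
print(average_fraction_satisfied(example, 4))  # 7/8
```

The enumeration is exponential in `n` and is meant only to confirm the arithmetic on toy inputs.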

What more can we hope to know? 10

The 7/8 boundary NP-hardness concerns worst case instances. What do 3CNF formulas that are hard to approximate look like? Does the worst case only manifest itself in rare cases? 11

Random 3CNF n variables, m clauses chosen independently at random. Density d = m/n. If m is large (say, m = n log n), every assignment satisfies roughly a 7/8 fraction of the clauses. A better than 7/8 approximation algorithm would at least be able to tell that such formulas are not satisfiable. 12
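
A minimal sketch of this random model, under assumptions the slide does not spell out: each clause picks 3 distinct variables uniformly and negates each independently with probability 1/2, and the logarithm is natural. The helper names are hypothetical. Sampling a handful of assignments is only a sanity check; it does not verify the claim for every assignment.

```python
import math
import random

def random_3cnf(n, m, rng):
    """m clauses, each on 3 distinct variables from 1..n with independent random signs."""
    clauses = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), 3)
        clauses.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return clauses

def fraction_satisfied(clauses, assignment):
    """Fraction of clauses with at least one true literal."""
    sat = sum(any(assignment[abs(l)] == (l > 0) for l in c) for c in clauses)
    return sat / len(clauses)

rng = random.Random(0)          # fixed seed so the run is repeatable
n = 400
m = int(n * math.log(n))        # the "m = n log n" regime from the slide
formula = random_3cnf(n, m, rng)

observed = []
for _ in range(5):
    assignment = {v: rng.random() < 0.5 for v in range(1, n + 1)}
    observed.append(fraction_satisfied(formula, assignment))
print([round(f, 3) for f in observed])  # each value should be close to 0.875
```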

Framework for study Randomized algorithms that for every 3CNF formula correctly decide satisfiability. On some inputs, likely to be exponential. Desirable goal: polynomial on most inputs within a certain range of densities. High density – refutation heuristics. How do we know when to stop? What does a witness for non-satisfiability look like? 13

Refutation of random 3SAT - density Approach of [Feige, Kim and Ofek 2006]. 14

Algebra of refutation Given an assignment, denote: n0 – number of clauses with no satisfied literal n1 – number of clauses with one satisfied literal n2 – number of clauses with two satisfied literals n3 – number of clauses with three satisfied literals n0 + n1 + n2 + n3 = m If the assignment is satisfying, then n0 = 0. 15

The 3LIN principle Given a random 3CNF, one can certify in polynomial time that if there is a satisfying assignment, it must satisfy the inequality: Replace the 3CNF by a system of linear equations: a clause on literals x1, x2, x3 is replaced by x1+x2+x3=1 modulo 2. Under a satisfying assignment, only the n2 clauses (those with exactly two satisfied literals) are not satisfied as 3LIN. 16
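
The bookkeeping behind this step can be sketched as follows. The clause encoding (signed integers) and the function names are assumptions made for illustration. A clause is satisfied as a 3LIN equation exactly when an odd number of its literals are true, so the clauses violated as 3LIN number n0 + n2, which equals n2 when the assignment satisfies the 3CNF.

```python
from collections import Counter

def true_literals(clause, assignment):
    """Number of literals of the clause made true by the assignment."""
    return sum(assignment[abs(l)] == (l > 0) for l in clause)

def satisfied_literal_counts(clauses, assignment):
    """Return [n0, n1, n2, n3]: clauses with 0, 1, 2, 3 satisfied literals."""
    counts = Counter(true_literals(c, assignment) for c in clauses)
    return [counts[k] for k in range(4)]

def violated_as_3lin(clauses, assignment):
    """Clauses whose literals do not sum to 1 modulo 2."""
    return sum(true_literals(c, assignment) % 2 != 1 for c in clauses)

# Hypothetical toy instance and a satisfying assignment for it.
clauses = [(1, 2, 3), (-1, 3, 4), (2, -3, 4)]
assignment = {1: True, 2: True, 3: False, 4: True}

n0, n1, n2, n3 = satisfied_literal_counts(clauses, assignment)
print(n0, n1, n2, n3)                         # 0 1 1 1
print(violated_as_3lin(clauses, assignment))  # 1, which is n0 + n2
```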

Certifying that there is no good 3LIN assignment Refuting 3LIN is easy - Gaussian elimination. In random 3CNF, there are small subformulas of size that are not satisfiable as 3LIN. Moreover, there are roughly disjoint subformulas, each not 3LIN-satisfiable. Refute each of them using Gaussian elimination. Implies. Refutes 3SAT when, or. 17
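
Refuting a 3LIN system by Gaussian elimination can be sketched as below. This is a generic elimination over GF(2) using integer bitmasks, not the specific procedure of the cited paper; the encoding and the function names are assumptions. A negated literal is treated as 1 + x, so each negation flips the right-hand side of the clause's equation.

```python
def clause_to_equation(clause):
    """Clause l1 v l2 v l3 becomes: XOR of its variables = 1 + (#negations) mod 2."""
    negations = sum(l < 0 for l in clause)
    return [abs(l) for l in clause], (1 + negations) % 2

def xor_system_is_consistent(equations):
    """equations: list of (variables, rhs) meaning XOR of the variables equals rhs."""
    pivots = {}  # highest set bit -> reduced row (mask, rhs) with that leading bit
    for variables, rhs in equations:
        mask = 0
        for v in variables:
            mask ^= 1 << v          # XOR, so a repeated variable cancels
        while mask:
            lead = mask.bit_length() - 1
            if lead not in pivots:
                pivots[lead] = (mask, rhs)
                break
            pivot_mask, pivot_rhs = pivots[lead]
            mask ^= pivot_mask      # eliminate the leading variable
            rhs ^= pivot_rhs
        else:
            # The row reduced to 0 = rhs; rhs == 1 is a contradiction.
            if rhs == 1:
                return False
    return True

# Hypothetical subformula: every variable occurs twice, with one negated literal.
subformula = [(-1, 2, 3), (1, 2, 4), (3, 5, 6), (4, 5, 6)]
print(xor_system_is_consistent([clause_to_equation(c) for c in subformula]))  # False
```

Removing the single negation makes the same system consistent, which matches the parity argument used for even covers.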

Factor graphs (X1 v X2 v X3), (¬X1 v X3 v X4), (X2 v ¬X3 v X4) 18
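
One way to represent the factor graph of this example is sketched below. The representation (adjacency lists plus a separate polarity table) and the function name are assumptions chosen for illustration; the point is that the graph records only which variables occur in which clauses, while the signs are stored separately.

```python
from collections import defaultdict

def factor_graph(clauses):
    """Split a formula into its bipartite factor graph and its literal polarities."""
    clause_to_vars = [tuple(abs(l) for l in c) for c in clauses]
    var_to_clauses = defaultdict(list)
    for index, variables in enumerate(clause_to_vars):
        for v in variables:
            var_to_clauses[v].append(index)
    polarities = [tuple(l > 0 for l in c) for c in clauses]  # True = unnegated
    return clause_to_vars, dict(var_to_clauses), polarities

# (X1 v X2 v X3), (not X1 v X3 v X4), (X2 v not X3 v X4)
formula = [(1, 2, 3), (-1, 3, 4), (2, -3, 4)]
clause_to_vars, var_to_clauses, polarities = factor_graph(formula)
print(var_to_clauses)  # {1: [0, 1], 2: [0, 2], 3: [0, 1, 2], 4: [1, 2]}
```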

More on the 3LIN partition Consider the bipartite factor graph of the formula. A set S of left hand side vertices (clauses) is an even cover if each right hand side vertex (variable) has even degree into S (every variable appears in an even number of clauses of S, possibly 0). If the clauses of S have an odd number of negated literals (which happens with probability 1/2) they are not satisfiable as 3LIN (add up all clauses mod 2). 19
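
The even-cover test and the parity argument can be sketched as follows, assuming clauses are tuples of signed integers with no variable repeated inside a clause; the function names are hypothetical. In an even cover the total degree 3|S| is even, so |S| is even and the right-hand sides sum to 0 modulo 2, while the left-hand sides sum to the number of negations modulo 2.

```python
from collections import Counter

def is_even_cover(subset):
    """True if every variable occurs in an even number of clauses of the subset."""
    degree = Counter(abs(l) for clause in subset for l in clause)
    return len(subset) > 0 and all(d % 2 == 0 for d in degree.values())

def refuted_as_3lin(subset):
    """An even cover with an odd number of negated literals has no 3LIN solution."""
    negations = sum(l < 0 for clause in subset for l in clause)
    return is_even_cover(subset) and negations % 2 == 1

# Hypothetical even cover: each of the variables 1..6 occurs exactly twice.
cover = [(-1, 2, 3), (1, 2, 4), (3, 5, 6), (4, 5, 6)]
print(is_even_cover(cover))    # True
print(refuted_as_3lin(cover))  # True, since there is exactly one negation
```

The test is purely a property of the factor graph plus one parity bit, which is why finding small even covers is the bottleneck rather than checking them.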

Remarks on running time When one can find the even covers in polynomial time (because each one is of constant size). The whole refutation algorithm is polynomial. At lower densities (with ), we do not know how to find the even covers in polynomial time. Still interesting that at these densities witnesses for 3SAT non-satisfiability exist. 20

Is dense random 3SAT easy to refute? Current obstacle: given a random bipartite graph (with left hand side degree 3), find a small even cover. This is a computational problem regarding the structure of the factor graph – it does not refer to Boolean assignments to variables. 21

A general phenomenon? Could it be that algorithms for 3SAT spend most of their time making sense of the factor graph, and then once they succeed, assigning truth values to the variables (or refuting the formula) becomes easy (or easier)? Anecdotal evidence: some common heuristics try to first decompose the factor graph, or to find a favorable order on the variables. 22

Preprocessing Reveal the 3SAT instance in two stages. First, reveal only the factor graph. Allow the algorithm to preprocess the factor graph for arbitrary time and record arbitrary polynomial size advice (e.g., an optimal tree decomposition). Then reveal the polarities. Now the algorithm may use the advice, and needs to decide satisfiability in polynomial time. 23

Informal interpretation If preprocessing helps, then the difficulty of 3SAT is due to the complexity of analyzing the factor graph. If preprocessing does not help, then the difficulty of 3SAT is due to the combinatorial richness of the polarities. Intermediate possibilities: after preprocessing, the second stage gives a good approximation, or takes sub-exponential time. In that case the difficulty of 3SAT is partly due to the complexity of analyzing the factor graph. 24

Research challenge Find at least some aspect in which preprocessing the factor graph would help. Essentially nothing is known (to us). Alternatively, provide evidence that preprocessing cannot help. What would this evidence look like? 25

Universal Factor Graphs A factor graph is polytime-universal for 3SAT if a polytime algorithm for 3SAT instances with this factor graph (and arbitrary polarities of variables) implies a polytime algorithm for all factor graphs. Preprocessing would not help on a universal factor graph, unless NP is in P/poly. (Proof: give the advice without doing the preprocessing.) 26

Some technicalities Family of universal factor graphs, parameterized by n and m. The universal graph for given (n,m) may have somewhat more than n variables and somewhat more than m clauses (but polynomially related to m and n). 27

Proving that a factor graph is universal A sufficient condition for a factor graph G with N variables and M clauses to be polytime-universal for (n,m): Design a polytime algorithm that reduces any arbitrary 3CNF instance with n variables and m clauses to a 3CNF instance with factor graph G, while preserving satisfiability. 28

Some results (with Shlomo Jozeph) Theorem: there are subexp-universal factor graphs for 3SAT. Theorem: there are 77/80-universal factor graphs for max-3SAT. 29

77/80-universal factor graphs Our proof has three parts: NP-universal factor graphs (easy). APX-universal factor graphs (via gap amplification [Dinur]). Amplification to 77/80-universal factor graphs (via long code tests [Bellare, Goldreich, Sudan]). 30

The technical content of the second part Given two 3CNF formulas with the same factor graph G (but different variable polarities), a modified version of Dinur's gap amplification technique (proof of the PCP theorem) produces two new CNF formulas with the same factor graph G. 31

77/80-universal factor graphs Tight hardness of approximation results are based on the long code of [Bellare, Goldreich and Sudan]. They use the idea that only properties of the long code need to be tested explicitly. The requirement that the underlying predicate (e.g., a collection of 3CNF clauses) is satisfied is implicitly enforced on the long code via a mechanism called folding [in BGS], and extended to conditioning [in Hastad]. 32

A problem Folding and conditioning depend on the polarities of variables in the underlying clauses. Changing the polarities changes the locations in which the long code is queried. As a consequence, the resulting factor graph changes. 33

So is folding hopeless? Not quite. We identify variants of folding that can be performed while still maintaining the pattern of queries despite change of polarities. We call these oblivious foldings. 34

77/80-universal factor graphs - proof By modification of the proof of [Bellare, Goldreich and Sudan], which also had the same ratio. The difference is that the folding in [BGS] was not oblivious, and we show how to replace it by oblivious folding. In Hastad's tight proof, conditioning rather than folding is used, and we do not know how to make it oblivious. 35

Open question Must (exact or approximate) algorithms for 3SAT spend most of their time processing the factor graph? Can preprocessing of the factor graph lead to a substantial saving in the running time? 36

What makes 3SAT intractable? 3SAT is intractable because we do not have a handle on how to start looking for a satisfying assignment. We cannot even find an assignment better than random (with respect to fraction of clauses satisfied). 37

Returning to worst case versus average case 38

Per instance insights If in the worst case we have a nontrivial approximation, then in the worst case we can solve exactly. If in the average case we have a nontrivial approximation, then can we solve exactly in the average case? If per instance (from a random distribution) we have a nontrivial approximation, then can we solve that same instance exactly? Is the problem just to get started? 39

A test case – the planted model 40

Known results [Alon and Kahale; Flaxman] Majority assignment (setting each variable to the polarity that agrees with most of its literals) is highly correlated with planted assignment (because all dropped literals were false). Gives a very good start. Finish off in two steps: Identify sure variables (core). Residual formula – simple enough. 41
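
The correlation between the majority assignment and the planted assignment can be observed on a small experiment. The sketch below assumes one common form of the planted model (random clauses, rejecting those falsified by the planted assignment); the density of 20 clauses per variable, the tie-breaking rule and the function names are arbitrary choices for illustration, and the steps that finish off the solution are not shown.

```python
import random

def planted_3cnf(n, m, rng):
    """Random 3CNF in which every clause is satisfied by a hidden planted assignment."""
    planted = {v: rng.random() < 0.5 for v in range(1, n + 1)}
    clauses = []
    while len(clauses) < m:
        variables = rng.sample(range(1, n + 1), 3)
        clause = tuple(v if rng.random() < 0.5 else -v for v in variables)
        # Keep the clause only if the planted assignment satisfies it.
        if any(planted[abs(l)] == (l > 0) for l in clause):
            clauses.append(clause)
    return planted, clauses

def majority_assignment(clauses, n):
    """Set each variable to the polarity of most of its literals (ties go to True)."""
    score = {v: 0 for v in range(1, n + 1)}
    for clause in clauses:
        for l in clause:
            score[abs(l)] += 1 if l > 0 else -1
    return {v: score[v] >= 0 for v in score}

rng = random.Random(1)   # fixed seed so the run is repeatable
n = 300
planted, clauses = planted_3cnf(n, 20 * n, rng)
majority = majority_assignment(clauses, n)
agreement = sum(majority[v] == planted[v] for v in planted) / n
print(round(agreement, 3))  # well above 0.5, the level expected without correlation
```

In this model each literal of a kept clause agrees with the planted assignment with probability 4/7, which is the bias the majority vote picks up.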

Alternatives to majority assignment May modify the planted model to drop additional clauses so as to destroy correlation of majority assignment with planted assignment. Drop 3-clauses (~3NAE). Spectral techniques give a good start, which can be amplified and followed by the final two steps. Drop most 2-clauses (~3LIN). No known good start. 42

Different variant of the planted model (joint work with Alina Arbitman) Drop most of the 1-clauses. Majority assignment even more strongly correlated with planted assignment. An excellent start. What about the final two steps? For some range of parameters, they still work (with some modifications). In some other range of parameters, still open. 43

Open question 44

Summary Some open questions for 3SAT Are dense random instances also worst case instances? Is it true that for most instances, finding an approximate solution is a sufficient step for finding an exact solution? Can preprocessing of the factor graph lead to faster (exact or approximate) satisfiability algorithms? 45

Homework 46
