Instructor: Aaron Roth aaroth@cis.upenn.edu CIS 262 Automata, Computability, and Complexity Spring 2019 http://www.seas.upenn.edu/~cse262/ Instructor: Aaron Roth aaroth@cis.upenn.edu Lecture: April 15, 2019
Recap: The class NP A language L is in NP if there exists a polynomial-time nondeterministic Turing machine that accepts L Equivalently, a language L is in NP if there exists a polynomial-time deterministic TM V such that w is in L iff V accepts <w,c> for some c Examples: Clique, Hamiltonian Path, Graph coloring Every problem in P is in NP Every problem in NP is in ExpTime (deterministic exponential time)
The class EXPTIME A language L belongs to EXPTIME if there exists a deterministic TM M for L with time complexity of M is O ( exp ( nk )) Claim: If a language L is in NP, then it is in EXPTIME Proof: Suppose L is in NP. Consider a verifier V for L: Time complexity of V is f(n) and V accepts <w,c> for some c if and only if w is in L Deterministic TM M for L: Given input w of length n, For each string c of length f(n), Run V on <w,c>, If V accepts, stop and accept, else continue Claim: M accepts L and time complexity of M is O( exp (f(n))) Exercise: if Complement of L is in NP then L is in EXPTIME
Problem Classification DECIDABLE NP P REGULAR EXPTIME
Boolean Formulas Boolean formula (also called propositional formula) is an expression constructed from Boolean variables: x, y, z, … Logical connectives: And (&), Or (|), Not (~), Implies () Example formulas : ( x & y ) | (~x & ~y) [ x & y ] x ( x & ~y & ~z) | ( ~x & y & ~z ) | ( ~x & ~y & z ) Boolean formulas appear in many contexts: formalizing puzzles, structure of logical arguments, … Digital circuits constructed out of AND, OR, and NOT gates are really another representation of Boolean formulas
Boolean Formulas: Semantics Truth assignment r : Assignment of 0/1 values to variables r(j) : the value, either 0 or 1, of formula j under truth assignment r Rules for evaluating logical connectives: 1. if r(j) = 0 then r(~j) = 1 else r(~j) = 0 2. if both r(j) = 1 and r(y) = 1 then r( j & y ) = 1 else r( j & y ) = 0 3. if either r(j) = 1 or r(y) = 1 then r( j | y ) = 1 else r( j | y ) = 0 4. if either r(j) = 0 or r(y) = 1 then r( j y ) = 1 else r( j y ) = 0 Note: j y is same as ~j | y Note: j | y is same as ~(~j & ~y )
Example Evaluation Example formula j : ( x & y ) | (~x & ~y) Truth assignment r : Assignment of 0/1 values to variables r(j) : the value, either 0 or 1, of formula j under truth assignment r x y j 1
Boolean Formulas: Evaluation Evaluation problem: Given a formula j and a truth assignment r, find the value of j under this assignment EVAL = { <j, r> | r(j) = 1 } Claim: EVAL is in P Poly-time algorithm for EVAL : obvious
Boolean Satisfiability: SAT A formula is satisfiable if there exists a truth assignment r such that r(j) = 1 SAT = { < j > | j is a satisfiable Boolean formula } Examples: (x & y) | (~x & ~y) Satisfiable ( x y) & ( y z ) & x & ~z Unsatisfiable
SAT is in NP EVAL = { < j, r > | r(j) = 1 } Check if the given truth assignment makes the given formula true SAT = { < j > | j is a satisfiable Boolean formula } Check if there exists a truth assignment that makes given formula true Claim: SAT is in NP Proof: Algorithm for EVAL is a poly-time verifier for SAT Given a formula j with k variables, there are 2k truth assignments, and satisfiability solver needs to search through this exponential space Finding a solution to any NP problem can be encoded as finding a satisfying truth assignment for a Boolean formula
Example: Encoding Graph Coloring as SAT 1 Is this graph 3-colorable ? Goal: Construct a Boolean formula whose satisfying assignments encode assignment of 3 colors to nodes Variable xij stands for node i is assigned color j i = 1, 2, …, 5 and j = 1, 2, 3 for 3 colors 2 3 4 5 Desired formula is conjunction (AND) of many formulas: j1 = (x11 & ~x12 & ~x13 ) | (~x11 & x12 & ~x13 ) | (~x11 & ~x12 & x13 ) To make j1 true, exactly one of x11, x12, x13 must be set to 1 j1 thus says that node 1 should be assigned exactly one color Similarly, for i = 1, 2, 3, 4, 5, let ji = (xi1 & ~xi2 & ~xi3 ) | (~xi1 & xi2 & ~xi3 ) | (~xi1 & ~xi2 & xi3 )
Example: Encoding Graph Coloring as SAT For each vertex i, consider ji =(xi1 &~xi2 &~xi3)| (~xi1 & xi2 & ~xi3 )| (~xi1& ~xi2& xi3) Let j = j1 & j2 & j3 & j4 & j5 A satisfying assignment for j encodes a coloring 1 2 3 4 5 Need constraints to ensure that nodes with an edge don’t share color y12 = ~(x11 & x21) & ~(x12 & x22 ) & ~(x13 & x23) To satisfy this, for no color j, both x1j and x2j can be set to 1 More generally, let yij = ~(xi1 & xj1) & ~(xi2 & xj2 ) & ~(xi3 & xj3) This says that node i and node j cannot have same color
Example: Encoding Graph Coloring as SAT For each vertex i, consider ji =(xi1 &~xi2 &~xi3)| (~xi1 & xi2 & ~xi3 )| (~xi1& ~xi2& xi3) Let j = j1 & j2 & j3 & j4 & j5 Let yij = ~(xi1 & xj1) & ~(xi2 & xj2 ) & ~(xi3 & xj3) And y = y12 & y13 & y23 & y24 & y35 & y45 1 2 3 4 5 Formula j says that each vertex has exactly one color and formula y says that vertices connected by an edge don’t have same color Claim: Graph is 3-colorable if and only if j & y is satisfiable More generally, to check if a graph G with n vertices is k-colorable, we can construct a boolean formula with n.k variables, and check its satisfiability
Complement of class NP: CoNP UNSAT = { < j > | j is an unsatisfiable Boolean formula } To check that a formula is unsatisfiable, doesn’t suffice to find one truth assignment that makes it False, rather need to show that each of the exponentially many truth assignments makes it False CoNP: Complement of class NP: A language L is in CoNP if the complement language ~L is in NP UNSAT is in CoNP, and so are { < G, k > | graph G does not have a clique of size k } { < G, s, t > | there is no Hamiltonian path from s to t in graph G } { < G, k > | graph G cannot be colored using k colors }
Validity of Boolean Formulas A formula j is called valid (also called a tautology) if for every truth assignment r, r(j) = 1 (i.e. formula is always true) Examples (dating back to Socrates for valid forms of logical arguments): (x & y ) x Valid [ ( x y ) & x ] y Valid ( x y ) ( y x ) Invalid VALID = { < j > | j is a valid Boolean formula } What class contains VALID ? Complement: INVALID = {< j > | r(j) = 0 for some truth assignment r } INVALID is in NP and VALID is in CoNP
Relationship of CoNP with other Complexity Classes EXPTIME Both P and EXPTIME are closed under complement: because defined using deterministic Turing machines for complement toggle accept/reject state, doesn’t change complexity It follows that P is contained in CoNP and CoNP contained in EXPTIME Also: P = NP implies P = NP = CoNP
Theory of NP-Completeness Cook’s Theorem (1972): SAT is NP-complete If SAT is in P then P = NP If someone can find a (deterministic) algorithm of polynomial-time complexity to solve SAT, then every problem in NP can be solved in polynomial-time Open question: to find a poly-time algorithm for SAT or to prove that no such algorithm exists Turns out that many problems in NP are “equivalent” to SAT, and this is the class of NP-complete problems Key to developing this theory: Polynomial-time problem reduction
Problem/Language Reduction A reduces to B, A < B: possible to construct a machine/algorithm for A using that for B To establish A < B, find a (computable) transformation f so that w is in A if and only if f(w) is in B If A reduces to B, then: 1. If B is decidable then so is A 2. If A is undecidable, so is B Machine for A Accept (f(w) in B) Accept (w in A) w Transformation f f(w) Machine for B Reject (otherwise) Reject (otherwise)
Polynomial-time Problem/Language Reduction A reduces to B, A < B: possible to construct a machine/algorithm for A using that for B This reduction is a polynomial-time reduction, written A <P B, if the transformation is computable by a polynomial-time TM/program If A <P B, then: If B is in P, then A is also in P Machine for A Accept (f(w) in B) Accept (w in A) w Transformation f f(w) Machine for B Reject (otherwise) Reject (otherwise)
Poly-time Reduction: Graph Coloring to SAT Show how to construct poly-time algorithm for graph coloring using that for SAT Given graph G and k, show how to construct in polynomial-time a Boolean formula j=f(G,k) which is satisfiable if and only if G is k-colorable Machine for COLOR f(G,k) is satisfiable G has k-coloring G Transformation f f(G,k) Machine for SAT k No No
COLOR to SAT Transformation Given graph G with n nodes and k, consider variables: xij for i = 1, …, n and j = 1, …, k For each vertex i, let ji = (xi1 & … & ~xik) | … | (~xi1 & … & xik) Let j = j1 & ... & jn For each edge (i,j) in G, let yij = ~(xi1 & xj1) & … & ~(xik & xjk), and y is AND of such formulas corresponding to all edges Formula j says that each vertex has exactly one color and formula y says that vertices connected by an edge don’t have same color, so set f(G,k) = j & y Claim: Graph G is k-colorable if and only if f(G,k) is satisfiable Size of formula is O(n.k), and it can be constructed easily (in poly-time)
Independent Set 1 A set U of nodes is an independent set if no two nodes in U are connected by an edge Example: {1, 4} is an independent set IndepSet = {<G,k> | G has an indep set of size k } Which complexity class does it belong to ? 2 3 4 5 Goal: Show that Clique <P IndepSet Given a graph G and k, construct in poly-time an input (G’,k’) for IndepSet such that G has a clique of size k iff G’ has an indep set of size k’
Reduction from Clique to IndepSet G 1 G’ G’ = Complement graph of G One-to-one correspondence between clique in G and indep set in G’ 1 2 3 2 3 4 5 4 5 Algorithm for Clique assuming algorithm R for IndepSet: Given undirected graph G = (V, E) and k, 1. Construct graph G’ = (V, E’) such that for two distinct vertices u and v, (u,v) is in E’ iff (u,v) is not in E 2. Check if G’ has independent set of size k using R Correctness: Argue that (G,k) is in Clique iff (G’,k) is in IndepSet Time-complexity: Poly-time assuming R is poly-time
Invalidity INVALID = {<j> | there exists a truth assignment r s.t. r(j) = 0 } We know that INVALID is in NP Show that SAT <P INVALID Reduction: Algorithm for SAT using algorithm R for INVALID Given a Boolean formula j, Check if R accepts ~ j : if so, accept else reject
Solving SAT in Practice All NP problems (which includes lots of optimization problems arising in practice) can be reduced to SAT How do we actually solve SAT ? Obvious algorithm: try out all possible truth assignments (exponential) Good news: search can terminate once one satisfying assignment is found Search heuristics: Which path to explore first (choose a variable x, and set it to 0 or 1) ? If a variable x is assigned 0, simplify the formula as much as possible
Modern SAT Solvers Annual competition of SAT solvers: see satcompetition.org Highly optimized implementations Example: Z3 from Microsoft Research Modern SAT solver can check satisfiability for formulas with over 100K variables Now used routinely for real-world problems Example: Generating test inputs for programs