Provably hard problems below the satisfiability threshold

1 Provably hard problems below the satisfiability threshold
A sharp threshold in proof complexity yields lower bounds for satisfiability search
Paul Beame, Univ. of Washington
Dimitris Achlioptas, Microsoft Research
Michael Molloy, Univ. of Toronto

2 CNF Satisfiability
(x1 ∨ x2 ∨ x4) ∧ (x1 ∨ x3) ∧ (x3 ∨ x2) ∧ (x4 ∨ x3)
NP-complete, but many heuristics exist because of its practical importance; presumably exponential in the worst case.
If you know the formula is satisfiable, how hard is it to find an assignment?
No lower bounds are known for interesting heuristics.

3 Satisfiability Algorithms
Local search (incomplete): GSAT [Selman, Levesque, Mitchell 92], Walksat [Kautz, Selman 96]
Backtracking search (complete): DPLL [Davis, Putnam 60] [Davis, Logemann, Loveland 62], DPLL + "clause learning"

4 Backtracking search/DPLL
Free step: select* a literal l (some x or ¬x).
Remove all clauses containing l; shrink all clauses containing ¬l. This yields the "residual formula".
While there are 1-clauses: pick some (arbitrary) 1-clause, satisfy it and simplify.
If there is a 0-clause (contradiction): backtrack to the last free step.
*Many options for select.
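The backtracking procedure on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' code: the integer encoding of literals (k for a variable, -k for its negation), the helper names `assign` and `first_literal`, and the default select rule are all assumptions made for the example.

```python
from typing import Callable, Dict, FrozenSet, List, Optional

Clause = FrozenSet[int]   # assumed encoding: k means x_k, -k means NOT x_k
Formula = List[Clause]


def assign(formula: Formula, lit: int) -> Formula:
    """Set lit true: remove clauses containing lit, shrink clauses containing -lit."""
    return [c - {-lit} for c in formula if lit not in c]


def first_literal(formula: Formula) -> int:
    """Hypothetical select rule: lowest-numbered literal of the first clause."""
    return min(formula[0], key=abs)


def dpll(formula: Formula,
         select: Callable[[Formula], int] = first_literal) -> Optional[Dict[int, bool]]:
    """Return a (possibly partial) satisfying assignment, or None if unsatisfiable."""
    model: Dict[int, bool] = {}
    formula = list(formula)
    # Forced steps: while there are 1-clauses, satisfy one and simplify.
    while True:
        if any(len(c) == 0 for c in formula):
            return None                      # 0-clause: the caller backtracks
        unit = next((c for c in formula if len(c) == 1), None)
        if unit is None:
            break
        (lit,) = unit
        model[abs(lit)] = lit > 0
        formula = assign(formula, lit)
    if not formula:
        return model                         # every clause is satisfied
    # Free step: pick a literal, try it, then try its negation.
    lit = select(formula)
    for choice in (lit, -lit):
        sub = dpll(assign(formula, choice), select)
        if sub is not None:
            sub.update(model)
            sub[abs(choice)] = choice > 0
            return sub
    return None
```

Any function that maps a residual formula to a literal can be passed as `select`, which is where the different heuristics discussed later would plug in.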

5 Resolution
Start with the clauses of CNF formula F.
Resolution rule: given (A ∨ x) and (B ∨ ¬x), can derive (A ∨ B).
F is unsatisfiable ⇔ the 0-clause is derivable.
Proof size = # of clauses.
Running DPLL (with any select) on an unsatisfiable formula F results in a tree-resolution refutation of F.
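As a small illustration of the resolution rule, the sketch below resolves two clauses on a variable and derives the 0-clause from a three-clause unsatisfiable formula. The integer encoding of literals and the function name `resolve` are assumptions made for this example.

```python
from typing import FrozenSet, Optional

Clause = FrozenSet[int]   # assumed encoding: k means x_k, -k means NOT x_k


def resolve(c1: Clause, c2: Clause, var: int) -> Optional[Clause]:
    """Resolve (A or x) with (B or NOT x) on x = var; None if the rule does not apply."""
    if var in c1 and -var in c2:
        return (c1 - {var}) | (c2 - {-var})
    if -var in c1 and var in c2:
        return (c1 - {-var}) | (c2 - {var})
    return None


# Refutation of (x1) AND (NOT x1 OR x2) AND (NOT x2), using two resolution steps.
c_a = frozenset({1})
c_b = frozenset({-1, 2})
c_c = frozenset({-2})
step1 = resolve(c_a, c_b, 1)      # derives (x2)
step2 = resolve(step1, c_c, 2)    # derives the 0-clause
```

Counting the three input clauses and the two derived ones, this refutation has size 5 under the "proof size = # of clauses" measure.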

6 Random CNF formulas
A random 2-CNF formula with sn clauses
is satisfiable w.h.p. for s < 1, and simple DPLL will find a satisfying assignment in linear time w.h.p.;
is unsatisfiable w.h.p. for s > 1, and simple DPLL will finish and yield a resolution proof of unsatisfiability in linear time w.h.p.

7 DPLL on random 3-CNF*
[Figure: # of DPLL backtracks and probability of satisfiability, each plotted against D, the ratio of clauses to variables; 4.26 is marked on the D axis.]
Can prove that 2^Ω(n/D^(1+e)) time is required for unsatisfiable formulas above the threshold.
What about satisfiable formulas below the threshold?
* n = 50 variables

8 Phase transitions and algorithmic complexity
Easy connection: the hardest random problems will always be at a monotone sharp threshold bn, if it exists.
Can randomly reduce satisfiable problems of lower density to those at the threshold: given a formula with Dn clauses, D < b, can always add (b-D-e)n random clauses to make it a random problem nearly at the threshold, and use that solution.
Can reduce unsatisfiable problems of larger density to those at the threshold: given a formula with Dn clauses, D > b, ignore all but the first (b+e)n of them.
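The two reductions can be sketched as follows. This is a minimal illustration under assumed conventions: clauses are tuples of signed integers, random clauses pick k distinct variables with independent random signs, and the function names (`random_kcnf`, `pad_to_threshold`, `truncate_to_threshold`) are hypothetical.

```python
import random
from typing import List, Tuple

Clause = Tuple[int, ...]   # assumed encoding: k means x_k, -k means NOT x_k


def random_clause(n: int, k: int, rng: random.Random) -> Clause:
    """One random k-clause over variables 1..n: distinct variables, random signs."""
    variables = rng.sample(range(1, n + 1), k)
    return tuple(v if rng.random() < 0.5 else -v for v in variables)


def random_kcnf(n: int, m: int, k: int, rng: random.Random) -> List[Clause]:
    return [random_clause(n, k, rng) for _ in range(m)]


def pad_to_threshold(formula: List[Clause], n: int, density: float,
                     b: float, e: float, k: int, rng: random.Random) -> List[Clause]:
    """Lower density D < b: add (b - D - e) n random clauses."""
    extra = round((b - density - e) * n)
    if extra < 0:
        raise ValueError("formula is already at or above density b - e")
    return formula + random_kcnf(n, extra, k, rng)


def truncate_to_threshold(formula: List[Clause], n: int, b: float, e: float) -> List[Clause]:
    """Higher density D > b: ignore all but the first (b + e) n clauses."""
    return formula[: round((b + e) * n)]
```

Since padding only adds clauses, any assignment satisfying the padded formula also satisfies the original one; since truncation only removes clauses, a refutation of the truncated formula also refutes the original.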

9 Hard satisfiable formulas?
With non-deterministic select we could simply guess n correct value assignments. How can a satisfiable formula possibly be hard?
Any implementation of select must run in polynomial time. Very simple heuristics are used in practice.

10 Some standard select rules for DPLL algorithms
UC: pick variables in a fixed order; always set True first.
UCwm: apply a majority vote among 3-clauses for assigning each value.
GUC: pick a variable v in a shortest clause C; set v to satisfy C.
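One possible reading of these three select rules is sketched below. Each function takes a residual formula and returns the literal to set true at a free step. The deterministic tie-breaking (lowest-numbered variable, first shortest clause, ties in the vote go to True) is an assumption made for this example; the rules as analyzed in the literature may choose randomly.

```python
from typing import FrozenSet, List

Clause = FrozenSet[int]   # assumed encoding: k means x_k, -k means NOT x_k
Formula = List[Clause]    # assumed non-empty, with no empty clause


def uc(formula: Formula) -> int:
    """UC: next variable in a fixed order (here: lowest index), set True first."""
    return min(abs(lit) for clause in formula for lit in clause)


def ucwm(formula: Formula) -> int:
    """UCwm: same variable as UC, value chosen by majority vote among 3-clauses."""
    v = uc(formula)
    positive = sum(1 for c in formula if len(c) == 3 and v in c)
    negative = sum(1 for c in formula if len(c) == 3 and -v in c)
    return v if positive >= negative else -v


def guc(formula: Formula) -> int:
    """GUC: take a shortest clause C and return one of its literals, satisfying C."""
    shortest = min(formula, key=len)
    return min(shortest, key=abs)


example = [frozenset({-1, 2, 3}), frozenset({-1, -2, 4}),
           frozenset({1, 3, 4}), frozenset({3, -4})]
choices = (uc(example), ucwm(example), guc(example))
```

On `example`, UC sets x1 true, UCwm sets x1 false because x1 occurs negated in two 3-clauses and positively in one, and GUC satisfies the only 2-clause through x3.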

11 Contributions
These natural DPLL algorithms take exponential time on satisfiable formulas.
There exists a family of unsatisfiable random formulas, parametrized by s, such that w.h.p.
s > 1 ⇒ linear-size resolution proofs;
s < 1 ⇒ only exponential-size resolution proofs are possible.

12 Key property of each of the select rules we’ve seen
On random 3-CNF, before the first backtrack occurs, the residual formula is a uniformly random mix of 2-clauses and 3-clauses: if it has m2 2-clauses and m3 3-clauses, then it is equally likely to be any formula with these properties.
This key property is what yields the proofs of the algorithms' success without backtracking.
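To make the quantities m2 and m3 concrete, the sketch below follows the first branch of a UC-style run on a random 3-CNF and records (m2, m3, number of unset variables) at every free step, stopping when a 0-clause appears (the point where the first backtrack would happen). It is an illustrative simulation only: the clause model, the function names and the use of a fixed seed are assumptions, and nothing here verifies the uniformity claim itself.

```python
import random
from typing import FrozenSet, List, Tuple

Clause = FrozenSet[int]   # assumed encoding: k means x_k, -k means NOT x_k


def random_3cnf(n: int, m: int, rng: random.Random) -> List[Clause]:
    """m random 3-clauses over variables 1..n (distinct variables, random signs)."""
    formula = []
    for _ in range(m):
        variables = rng.sample(range(1, n + 1), 3)
        formula.append(frozenset(v if rng.random() < 0.5 else -v for v in variables))
    return formula


def assign(formula: List[Clause], lit: int) -> List[Clause]:
    """Set lit true: remove satisfied clauses, shrink clauses containing -lit."""
    return [c - {-lit} for c in formula if lit not in c]


def uc_first_branch(n: int, density: float, seed: int = 0) -> List[Tuple[int, int, int]]:
    """Record (m2, m3, unset variables) at each free step before the first backtrack."""
    rng = random.Random(seed)
    formula = random_3cnf(n, round(density * n), rng)
    unset = set(range(1, n + 1))
    history = []
    while formula:
        if any(len(c) == 0 for c in formula):
            break                                  # first backtrack would occur here
        unit = next((c for c in formula if len(c) == 1), None)
        if unit is None:
            # Free step: the residual formula holds only 2- and 3-clauses.
            m2 = sum(1 for c in formula if len(c) == 2)
            m3 = sum(1 for c in formula if len(c) == 3)
            history.append((m2, m3, len(unset)))
            lit = min(unset)                       # fixed order, set True first
        else:
            (lit,) = unit                          # forced step
        unset.discard(abs(lit))
        formula = assign(formula, lit)
    return history
```

Dividing m2 and m3 by the number of unset variables gives the 2-clause and 3-clause ratios of the residual formula at each free step.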

13 What do long runs look like?
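Residual formula at each node is a mix of 2- and 3-clauses.
At some node the residual formula is unsatisfiable, although its 2-clause part must be satisfiable.
Every resolution proof of that residual formula's unsatisfiability has size at least 2^(rn), so the algorithm's proof of unsatisfiability is exponentially long.
Notes: all DPLL algorithms produce tree resolution; the bound covers every resolution-based method, including DPLL + clause learning.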

14 Proof Complexity
Theorem. A random CNF formula with Dn 3-clauses and sn 2-clauses, where s < 1, has no resolution refutation of size 2^(rn) w.h.p. [Chvátal-Szemerédi 88] [Achlioptas, B., Molloy 2001]
The formula is unsatisfiable w.h.p. for D ≥ 4.57; for s ≥ 1-e and D ≥ ????

15 Non-rigorous results [Kirkpatrick, Monasson, Selman, Zecchina 97]
[Figure: SAT/UNSAT phase diagram; horizontal axis is the 3-clause ratio D (marks at 2/3, 4.26 and 4.57), vertical axis is the 2-clause ratio s (mark at 1).]
We can add 2/3 n 3-clauses but not n 2-clauses.

16 Rigorous results [Achlioptas, Kirousis, Kranakis, Krizanc 97]
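[Figure: SAT/UNSAT phase diagram with the undetermined regions marked "?"; horizontal axis is the 3-clause ratio D (marks at 2/3, 2.28, 8/3 and 4.57), vertical axis is the 2-clause ratio s (mark at 1).]
We can add 2/3 n 3-clauses but not n 2-clauses.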

17 Proof Complexity
Theorem. A random CNF formula with Dn 3-clauses and sn 2-clauses, where s < 1, has no resolution refutation of size 2^(rn) w.h.p. [Achlioptas, B., Molloy 2001]
The formula is unsatisfiable w.h.p. for D ≥ 4.57, and for D ≥ [value missing] and s ≥ 1-e for e ≤ .0001.
This is a sharp threshold, since resolution is linear for s ≥ 1+e.

18 These DPLL algorithms follow trajectories
[Figure: trajectories of UC and GUC in the plane of 3-clause ratio D (marks at 2/3, 8/3 and 3.26) and 2-clause ratio s (mark at 1).]
[Chao, Franco 88] [Frieze, Suen 95] [Achlioptas 00] [Achlioptas, Sorkin 00]

19 DPLL crossing into the bad zone
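[Figure: an algorithm trajectory in the plane of 3-clause ratio D (marks at 3.26, 4.26 and 4.57) and 2-clause ratio s (mark at 1), with one region labeled "Provably SAT & Easy" and another labeled "Provably UNSAT & Hard".]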

20 Exponential lower bounds far below the threshold.
Theorem. Let A ∈ {UC, UCwm, GUC}, and let D_UC, D_UCwm and D_GUC denote the corresponding densities (their numerical values are missing from this transcript). W.h.p. algorithm A takes more than 2^(rn) steps on a random 3-CNF with D_A n clauses.
The lower bound also applies to any resolution-based algorithm that extends the 'first' branch of the execution of A.

21 Related Work
Experiments suggested that DPLL algorithms may not be polynomial all the way to the threshold.
[Cocco, Monasson 01] applied non-rigorous methods to suggest exponential GUC behavior below the threshold. They assumed every branch of the GUC tree operates like an independent version of the first branch. Independent of our work.

22 Implications for phase transitions and algorithmic complexity
The difference between polynomial and exponential hardness is not necessarily a function of the phase transition.
It applies in both phases, not just the over-constrained phase.
It is algorithmically dependent: a good algorithm will have a transition in a different place from a bad algorithm.
Can't study the hardness transition in the absence of the study of algorithms.

23 Proof Ideas
Connection between pure literals and resolution proof size [Chvátal, Szemerédi 88] [Ben-Sasson, Wigderson 99]: pure literals are those that occur only positively or only negatively in a formula.
Digraph structure of the random 2-CNF subformula.
New graph-theoretic notion, the "clan": a generalization of connected component.
Sharp concentration properties for clan size: moment generating function argument.
Amortization of pure literals across clans.

24 Resolution proof size and pure literals [Ben-Sasson,Wigderson 99]
If the formula has an a such that
every subformula with ≤ a n clauses has at least one pure literal, and
every subformula with between [fraction missing] a n and a n clauses has a linear # of pure literals,
then all resolution proofs of the formula require size 2^(rn).
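To make the first condition concrete, the sketch below computes the pure literals of a formula and checks, by brute force, whether every subformula of at most a given number of clauses has a pure literal. The brute-force check is exponential and only meant for tiny examples; the literal encoding and both function names are assumptions made for this illustration.

```python
from itertools import combinations
from typing import Iterable, Sequence, Set, Tuple

Clause = Tuple[int, ...]   # assumed encoding: k means x_k, -k means NOT x_k


def pure_literals(formula: Iterable[Clause]) -> Set[int]:
    """Literals whose complement never occurs in the formula."""
    literals = {lit for clause in formula for lit in clause}
    return {lit for lit in literals if -lit not in literals}


def every_small_subformula_has_pure_literal(formula: Sequence[Clause],
                                            max_clauses: int) -> bool:
    """Brute-force check over all non-empty subformulas of at most max_clauses clauses."""
    for size in range(1, max_clauses + 1):
        for sub in combinations(formula, size):
            if not pure_literals(sub):
                return False
    return True


small = [(1, 2), (-1, 3), (-3, 2)]
```

In `small`, only x2 is pure in the whole formula, yet every subformula of up to three clauses still contains at least one pure literal.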

25 Basic idea of argument
By sparsity of the 2-clause part of the formula, any subset of the 2-clauses will have lots of pure literals (clan size analysis & amortization).
In a subformula involving both 2-clauses and 3-clauses, either there are so many 3-clauses that they create lots of new pure literals on their own, or so few 3-clauses that they can't cover all the pure literals in the 2-clauses.
Slide annotations: "analysis of clans", "easy case".

26 2-CNF Digraph on literals
[Figure: digraph on the literals x, c, y, w, z, d and their complements.]
2-clauses shown: (d ∨ y) (y ∨ x) (z ∨ y) (c ∨ w) (x ∨ w) (w ∨ z)
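A common way to turn a 2-CNF into a digraph on literals is the implication graph, where the clause (a ∨ b) contributes the edges ¬a → b and ¬b → a. The sketch below builds that graph. Whether the slide uses exactly this edge convention cannot be recovered from the transcript, so treat the construction, the integer encoding and the function name as assumptions.

```python
from typing import Dict, Iterable, Set, Tuple


def implication_digraph(two_clauses: Iterable[Tuple[int, int]]) -> Dict[int, Set[int]]:
    """Adjacency sets on literals; clause (a OR b) adds edges -a -> b and -b -> a."""
    graph: Dict[int, Set[int]] = {}
    for a, b in two_clauses:
        # Make sure both literals and their complements appear as vertices.
        for lit in (a, -a, b, -b):
            graph.setdefault(lit, set())
        graph[-a].add(b)
        graph[-b].add(a)
    return graph


g = implication_digraph([(1, 2), (-2, 3)])
```

Here `g[-1]` contains 2 because falsifying x1 forces x2, and `g[2]` contains 3 because the clause (¬x2 ∨ x3) forces x3 once x2 is true.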

27 Hyper/Digraph on literals
[Figure: hyper/digraph on the literals x, y, z, w, a, b, c, d, f, g.]
3-clauses shown: (a ∨ b ∨ z) (f ∨ g ∨ w)

28 Pure literals
[Figure: the hyper/digraph on literals x, c, y, w, z, d, f, g, a, b, illustrating pure literals.]

29 Pure cycle
[Figure: the hyper/digraph on literals x, c, y, w, z, d, f, g, a, b, illustrating a pure cycle.]

30 Pure Items & Clans of G
Clans: small subgraphs of G; one clan per vertex; they cover G; the analog of connected components in sparse random graphs.
Pure items: typically two per clan; analogous to leaves in acyclic connected components in an ordinary graph.
Clans are mostly constant size, never more than log^3 n vertices.
If x ∈ clan(y) then y ∈ clan(x).

31 What are clans?
Simpler notion first: in(y) for vertex y in an ordinary digraph.

32 in(y) in ordinary digraph
[Figure: digraph on vertices t, u, v, w, x, y, z.]
in(y) is the subgraph of vertices that can reach y = ancestors of y.

33 clan(y) in ordinary digraph
[Figure: digraph on vertices t, u, v, w, x, y, z.]
clan(y) = descendants of ancestors of y.
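For an ordinary digraph, the two definitions translate directly into graph searches. The sketch below is an illustration under stated assumptions: the graph is a dictionary of adjacency sets, y counts as its own ancestor and descendant, and the small example graph is made up for the demonstration (it is not the graph drawn on the slide).

```python
from collections import deque
from typing import Dict, Hashable, Set

Graph = Dict[Hashable, Set[Hashable]]


def reachable(graph: Graph, start: Hashable) -> Set[Hashable]:
    """All vertices reachable from start (including start), by breadth-first search."""
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in graph.get(u, ()):
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen


def reverse(graph: Graph) -> Graph:
    """The same digraph with every edge reversed."""
    rev: Graph = {u: set() for u in graph}
    for u, outs in graph.items():
        for w in outs:
            rev.setdefault(w, set()).add(u)
    return rev


def in_set(graph: Graph, y: Hashable) -> Set[Hashable]:
    """in(y): the vertices that can reach y, i.e. the ancestors of y."""
    return reachable(reverse(graph), y)


def clan(graph: Graph, y: Hashable) -> Set[Hashable]:
    """clan(y): the descendants of the ancestors of y."""
    result: Set[Hashable] = set()
    for ancestor in in_set(graph, y):
        result |= reachable(graph, ancestor)
    return result


# Hypothetical example graph.
example = {"v": {"x"}, "x": {"y", "w"}, "t": {"y"}, "w": {"z"},
           "u": {"z"}, "y": set(), "z": set()}
```

With these definitions, x ∈ clan(y) exactly when x and y share an ancestor, which is why membership is symmetric.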

34 clan(y) in 2-CNF digraph

35 A complication - bad events
[Figure: digraph on the literals x, c, w, z, y, d and their complements.]
2-clauses shown: (d ∨ y) (z ∨ y) (c ∨ w) (x ∨ w) (w ∨ z) (w ∨ d)

36 in(y) in a bad case

37 clan(y) in a bad case
This can cascade and get even worse!

38 Analysis
If we ignore bad edges, |in(y)| is dominated by a component process in a sub-critical random undirected graph, like trimmed out-trees [Bollobás, Borgs, Chayes, Kim, Wilson].
Ignoring bad edges, |clan(y)| is dominated by a 2-level process: run a component process to get in(y), then take the union of |in(y)| independent component processes added to in(y).

39 Analysis
W.h.p. no more than one bad event happens per clan; |in(y)| is always dominated by the 2-level component process.
W.h.p. no more than C log n bad events occur in the whole digraph; fewer than polylog n literals interact with bad clans; the rest of the clans are dominated by the 2-level process.

40 Analysis
Ordinary sub-critical component process on 2n vertices: w.h.p. the # of vertices with component size ≥ i is at most 2n(1-s)^i for some fixed s > 0.
We show, for a sub-critical 2-level component process on 2n vertices: w.h.p. for i ≥ i0, the # of vertices with 2-level size ≥ i is at most 2n(1-t)^i for some fixed t > 0.
This is false for a 3-level component process!

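41 Open problem
Conjecture. For every D > 2/3 there exists an s < 1 such that a random (2,3)-CNF with Dn 3-clauses and sn 2-clauses is w.h.p. unsatisfiable.
[Figure: SAT/UNSAT phase diagram with the undetermined region marked "?"; marks at 2/3, 3.26 and 4.57 on the D axis and at 1 on the s axis.]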

42 Open problem
Conjecture. For every D > 2/3 there exists an s < 1 such that a random (2,3)-CNF with Dn 3-clauses and sn 2-clauses is w.h.p. unsatisfiable.
Implies. For every card-game algorithm A there exists a critical density D_A such that, for random 3-CNF formulas with Dn clauses:
for D < D_A, w.h.p. A takes linear time;
for D > D_A, w.h.p. A takes exponential time.

