1 Backdoor Sets in SAT Instances. Ryan Williams, Carnegie Mellon University. Joint work in IJCAI03 with Carla Gomes and Bart Selman, Cornell University.
2 Significant progress in complete search methods! Software and hardware verification: complete methods are critical, e.g. for verifying the correctness of chip design using SAT encodings. Current methods can automatically verify the correctness of a large fraction of a Pentium IV. (Complete = always returns SAT or unSAT.)
3 A “real world” example (Thanks to: Oliver Kullmann)
4 Bounded Model Checking instance, i.e. ((not x_1) or x_7) and ((not x_1) or x_6) and … etc.
5 Ten pages later: … (x_177 or x_169 or x_161 or x_153 … or x_17 or x_9 or x_1 or (not x_185)). Clauses / constraints are getting more interesting…
pages later: … ?!! a 59-cnf clause…
7 Finally, 15,000 pages later: … The MiniSat solver (Eén & Sörensson) solves this instance in 2 seconds. Note that:… !!!
8 Gap between Theory and Practice. The good scaling behavior of state-of-the-art SAT solvers seems to defy the complexity-theoretic intuition that SAT, being NP-complete, is intractable! How can we explain this gap between theory and practice? What makes this possible? Our answer: hidden tractable substructure in real-world problems. Can we make this more precise? Proposal: we consider structures we call backdoor sets. The idea came out of the study of heavy-tailed phenomena in runtime distributions for some SAT solvers.
9 Backdoor Sets – Initial Motivation. Heavy-tailed distributions and randomization: certain problems, when solved by randomized backtracking, yield a runtime distribution that is heavy-tailed, i.e. Pr[no solution found by time t] ~ 1/t^c, 0 < c < 2. This explains why restarting a solver is often an effective strategy, and it implies a wide range of possible solution times, often including short runs. How to explain short runs?
10 Explaining short runs: Backdoors to tractability. Informally: a backdoor set to a given problem instance is a subset of its variables such that, once they are assigned values, the remaining instance simplifies to a tractable class. Formally: we define the notions of a “sub-solver” (which handles the tractable substructure of a problem instance), a backdoor set, and a strong backdoor set.
11 Defining a sub-solver Definition is general enough to encompass many polynomial time propagation methods. (Also those for which we do not know a clean characterization of the tractable subclass.) Valid for other encoding languages besides SAT: e.g., Mixed Integer Programming and Constraint Satisfaction Problems
12 Defining backdoors. Two notions: backdoor set (for satisfiable instances) and strong backdoor set (applies to satisfiable or inconsistent instances).
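The two notions can be made concrete with a small sketch. It assumes unit propagation as the sub-solver (one possible choice, not necessarily the one used in the talk) and a DIMACS-style clause list; the names `unit_propagate`, `is_backdoor`, `is_strong_backdoor` and `example` are hypothetical. A set of variables is treated as a backdoor if some assignment to it lets the sub-solver find a solution, and as a strong backdoor if every assignment to it lets the sub-solver decide the simplified instance.

```python
from itertools import product


def unit_propagate(clauses, assignment):
    """Assumed sub-solver: unit propagation only.

    Returns ("SAT", model), ("UNSAT", None), or ("REJECT", None) when
    propagation alone cannot decide the simplified instance.
    """
    assignment = dict(assignment)
    while True:
        unit, undecided = None, False
        for clause in clauses:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue  # clause already satisfied
            free = [lit for lit in clause if abs(lit) not in assignment]
            if not free:
                return "UNSAT", None  # every literal in the clause is false
            undecided = True
            if len(free) == 1:
                unit = free[0]
                break
        if unit is None:
            return ("REJECT", None) if undecided else ("SAT", assignment)
        assignment[abs(unit)] = unit > 0  # forced value


def is_backdoor(clauses, subset):
    """Some assignment to `subset` makes the sub-solver return a solution."""
    subset = sorted(subset)
    return any(
        unit_propagate(clauses, dict(zip(subset, values)))[0] == "SAT"
        for values in product([False, True], repeat=len(subset))
    )


def is_strong_backdoor(clauses, subset):
    """Every assignment to `subset` is decided (SAT or UNSAT) by the sub-solver."""
    subset = sorted(subset)
    return all(
        unit_propagate(clauses, dict(zip(subset, values)))[0] != "REJECT"
        for values in product([False, True], repeat=len(subset))
    )


# (x1 or x2) and (not x1 or x3) and (not x3 or x4) and (not x2 or not x4)
example = [[1, 2], [-1, 3], [-3, 4], [-2, -4]]
```

On `example`, fixing x1 alone is enough for propagation to settle every other variable, while the empty set is rejected because no clause is unit at the start.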
13 Backdoors can be surprisingly small. Backdoors help explain how a solver can get “lucky” on certain runs: backdoor sets are identified early on in backtracking search. Most recent: other combinatorial domains, e.g. Graphplan planning, with near-constant-size backdoors (2 or 3 variables) in certain domains (Hoffmann, Gomes, Selman ’03). Backdoors capture critical problem resources (bottlenecks).
14 Constraint Satisfaction Problem. The Constraint Satisfaction Problem (CSP): a finite set of n variables is given, and with each variable is associated a non-empty finite domain. A constraint on k variables X_1,…,X_k is a relation R(X_1,…,X_k) ⊆ D_1 × … × D_k. A solution to a CSP is an assignment of values to all the variables, satisfying all the constraints. (Satisfaction of a constraint = the relation holds.) (Dechter 86, Freuder 82, Mackworth 77, Tsang 93, van Beek and Dechter 97)
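A minimal sketch of this definition, assuming a dictionary of domains and constraints stored as (scope, relation) pairs; the variable names, the two constraints, and the helper `is_solution` are made up for illustration.

```python
# Each variable has a non-empty finite domain.
domains = {"X1": {0, 1, 2}, "X2": {0, 1, 2}, "X3": {0, 1}}

# A constraint on k variables is a scope plus a relation: the set of
# allowed k-tuples, a subset of D_1 x ... x D_k.
constraints = [
    (("X1", "X2"), {(a, b) for a in domains["X1"] for b in domains["X2"] if a != b}),
    (("X2", "X3"), {(a, b) for a in domains["X2"] for b in domains["X3"] if a > b}),
]


def is_solution(domains, constraints, assignment):
    """True if `assignment` gives every variable a value from its domain
    and every constraint's relation holds on its scope."""
    if set(assignment) != set(domains):
        return False  # a solution assigns all the variables
    if any(assignment[var] not in domains[var] for var in domains):
        return False
    return all(
        tuple(assignment[var] for var in scope) in relation
        for scope, relation in constraints
    )
```

For instance, `{"X1": 0, "X2": 1, "X3": 0}` satisfies both constraints, whereas `{"X1": 1, "X2": 1, "X3": 0}` violates the first.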
15 Explicit Algorithms for Finding/Exploiting Backdoor Sets. We cover three kinds of strategies for dealing with instances with small backdoor sets: (1) a deterministic algorithm; (2) a randomized algorithm, with provably better worst-case performance than the deterministic one; (3) a heuristic randomized algorithm, which assumes the existence of a good heuristic for choosing variables to branch on. We believe the last is close to what happens in practice.
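A rough sketch of what a deterministic strategy could look like: try candidate variable sets in order of increasing size and, for each set, every assignment to it, handing the simplified instance to a sub-solver. This is an illustration under assumptions, not the algorithm from the slides: the sub-solver is plain unit propagation, the instance is a DIMACS-style clause list, and the names `unit_propagate`, `deterministic_search` and `example` are hypothetical.

```python
from itertools import combinations, product


def unit_propagate(clauses, assignment):
    """Assumed sub-solver: returns ("SAT", model), ("UNSAT", None),
    or ("REJECT", None) if propagation alone cannot decide."""
    assignment = dict(assignment)
    while True:
        unit, undecided = None, False
        for clause in clauses:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue  # clause already satisfied
            free = [lit for lit in clause if abs(lit) not in assignment]
            if not free:
                return "UNSAT", None
            undecided = True
            if len(free) == 1:
                unit = free[0]
                break
        if unit is None:
            return ("REJECT", None) if undecided else ("SAT", assignment)
        assignment[abs(unit)] = unit > 0


def deterministic_search(clauses, n_vars):
    """Enumerate subsets by increasing size; stop at the first subset for
    which the sub-solver finds a solution or refutes every assignment."""
    variables = range(1, n_vars + 1)
    for size in range(n_vars + 1):
        for subset in combinations(variables, size):
            statuses = []
            for values in product([False, True], repeat=size):
                status, model = unit_propagate(clauses, dict(zip(subset, values)))
                if status == "SAT":
                    return "SAT", subset, model
                statuses.append(status)
            if all(s == "UNSAT" for s in statuses):
                return "UNSAT", subset, None  # subset acts as a strong backdoor
    # Only reached if the clauses mention variables beyond n_vars.
    return "REJECT", None, None


# (x1 or x2) and (not x1 or x3) and (not x3 or x4) and (not x2 or not x4)
example = [[1, 2], [-1, 3], [-3, 4], [-2, -4]]
```

The cost is dominated by the number of subsets tried before a backdoor is reached, which is why a small backdoor matters so much for this kind of enumeration.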
17 Randomized Generalized Iterative Deepening. Assumption: there exists a backdoor whose size is bounded by a function of n (call it B(n)). Idea: repeatedly choose random subsets of variables that are slightly larger than B(n), searching these subsets for the backdoor.
18 Randomized Generalized Iterative Deepening
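The idea can be sketched as follows. This is a simplified illustration rather than the algorithm from the slide: it assumes unit propagation as the sub-solver, a DIMACS-style clause list, a fixed number of tries, and a hypothetical `slack` parameter for how much larger than the bound each random pool is. All function and parameter names are invented for the example.

```python
import random
from itertools import combinations, product


def unit_propagate(clauses, assignment):
    """Assumed sub-solver: returns ("SAT", model), ("UNSAT", None),
    or ("REJECT", None) if propagation alone cannot decide."""
    assignment = dict(assignment)
    while True:
        unit, undecided = None, False
        for clause in clauses:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue  # clause already satisfied
            free = [lit for lit in clause if abs(lit) not in assignment]
            if not free:
                return "UNSAT", None
            undecided = True
            if len(free) == 1:
                unit = free[0]
                break
        if unit is None:
            return ("REJECT", None) if undecided else ("SAT", assignment)
        assignment[abs(unit)] = unit > 0


def randomized_search(clauses, n_vars, bound, slack=1, tries=100, seed=0):
    """Sample random pools of bound + slack variables and search each pool
    for a backdoor of size at most `bound`. Returns (subset, model) or None."""
    rng = random.Random(seed)
    pool_size = min(n_vars, bound + slack)
    for _ in range(tries):
        pool = rng.sample(range(1, n_vars + 1), pool_size)
        for size in range(1, bound + 1):
            for subset in combinations(pool, size):
                for values in product([False, True], repeat=size):
                    status, model = unit_propagate(clauses, dict(zip(subset, values)))
                    if status == "SAT":
                        return subset, model
    return None  # no backdoor found within the given number of tries


# (x1 or x2) and (not x1 or x3) and (not x3 or x4) and (not x2 or not x4)
example = [[1, 2], [-1, 3], [-3, 4], [-2, -4]]
```

Each try only pays for the subsets inside one small pool; the randomization is in which pool gets examined, so the expected number of tries depends on how likely a random pool is to contain a backdoor.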
19 Deterministic Versus Randomized. Suppose variables have 2 possible values (e.g. SAT). For B(n) = n/k, algorithm runtime is c^n. [Plot: c as a function of k, for the deterministic algorithm and the randomized algorithm.]
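To get a feel for how a base c depends on k, here is a back-of-the-envelope count for plain enumeration over all variable subsets of size n/k and all 2^(n/k) assignments to each. It is a counting sketch only, ignoring polynomial factors and the sub-solver's cost, and it is not claimed to reproduce either curve from the plot. Using the standard estimate that the number of subsets of size n/k grows like 2^(H(1/k)·n), with H the binary entropy, the base comes out as 2^(H(1/k) + 1/k). The function names are hypothetical.

```python
import math


def binary_entropy(p):
    """Binary entropy H(p) in bits, with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def enumeration_base(k):
    """Base c such that enumerating all subsets of size n/k, times all
    assignments to each subset, costs roughly c**n (k >= 1)."""
    p = 1.0 / k
    return 2 ** (binary_entropy(p) + p)


if __name__ == "__main__":
    for k in (2, 4, 10, 50):
        print(k, round(enumeration_base(k), 3))
```

Under this count, plain enumeration only beats the trivial 2^n bound once the backdoor is a sufficiently small fraction of the variables, which is one way to see why better search strategies are worth having.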
20 Complete Randomized Depth First Search with Heuristic. Assume we have the following: DFS, a generic randomized depth-first backtrack search solver, with a (polytime) sub-solver A and a heuristic H that (randomly) chooses variables to branch on, in polynomial time. H has probability 1/h of choosing a backdoor variable (h is a fixed constant). Call this ensemble (DFS, H, A).
21 Polytime Restart Strategy for (DFS, H, A). Essentially: if there is a small backdoor, then (DFS, H, A) has a restart strategy that runs in polynomial time.
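A minimal sketch of a restart loop around a randomized depth-first search, to illustrate the shape of such a strategy. Everything here is an assumption made for the example: the sub-solver A is plain unit propagation, the heuristic H is stood in for by a uniformly random choice among unassigned variables, the cutoff counts branching nodes, and the names and default values are hypothetical. It does not reproduce the cutoff schedule behind the polytime result.

```python
import random


def unit_propagate(clauses, assignment):
    """Assumed sub-solver A: returns ("SAT", model), ("UNSAT", None),
    or ("REJECT", None) if propagation alone cannot decide."""
    assignment = dict(assignment)
    while True:
        unit, undecided = None, False
        for clause in clauses:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue  # clause already satisfied
            free = [lit for lit in clause if abs(lit) not in assignment]
            if not free:
                return "UNSAT", None
            undecided = True
            if len(free) == 1:
                unit = free[0]
                break
        if unit is None:
            return ("REJECT", None) if undecided else ("SAT", assignment)
        assignment[abs(unit)] = unit > 0


def dfs(clauses, assignment, variables, rng, budget):
    """Randomized backtracking; `budget` is a one-element list holding the
    number of branching nodes still allowed in this run."""
    status, model = unit_propagate(clauses, assignment)
    if status != "REJECT":
        return status, model
    if budget[0] <= 0:
        return "CUTOFF", None
    budget[0] -= 1
    # Stand-in for heuristic H: a uniformly random unassigned variable.
    var = rng.choice([v for v in variables if v not in assignment])
    for value in rng.sample([False, True], 2):
        status, model = dfs(clauses, {**assignment, var: value}, variables, rng, budget)
        if status != "UNSAT":
            return status, model  # SAT, or CUTOFF propagated upward
    return "UNSAT", None


def solve_with_restarts(clauses, n_vars, cutoff=50, max_restarts=100, seed=0):
    """Run DFS with a node cutoff; on cutoff, restart with fresh random choices."""
    rng = random.Random(seed)
    variables = list(range(1, n_vars + 1))
    for _ in range(max_restarts):
        status, model = dfs(clauses, {}, variables, rng, [cutoff])
        if status != "CUTOFF":
            return status, model
    return "CUTOFF", None


# (x1 or x2) and (not x1 or x3) and (not x3 or x4) and (not x2 or not x4)
example = [[1, 2], [-1, 3], [-3, 4], [-2, -4]]
```

A run that happens to branch on backdoor variables early finishes within the cutoff; a run that does not is abandoned cheaply, which is the intuition for why restarts help when a small backdoor exists.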
Runtime Table for Algorithms (including (DFS, H, A)). B(n) = upper bound on the size of a backdoor, given n variables. When the backdoor is a constant fraction of n, the randomized algorithm gives an exponential improvement over the deterministic algorithm.
Summary. Introduced the notion of a “backdoor set” of variables. 1) More closely captures the combinatorics of a problem instance, as dealt with in practice. 2) Provides insight into restart strategies. 3) Backdoors can be surprisingly small in practice. 4) Search heuristics + randomization can be used to find them, provably efficiently.