Minimizing Unsatisfiable Formulas
SAT/SMT seminar, 07/05/2017
Alexander Ivrii, IBM
Disclaimers
- All the experts in the room are welcome to speak at any time!
- This is NOT a comprehensive overview
- This is probably highly subjective
- Many slides are borrowed from other presentations
Motivation
- Given an unsatisfiable CNF formula, understand why it is unsatisfiable
Many variants of the problem
- Given an unsatisfiable CNF formula, we can look for:
  - A smaller unsatisfiable core
  - A minimal unsatisfiable core (MUC or MUS)
  - A minimum-sized core
  - A lexicographically/anti-lexicographically preferred minimal unsatisfiable core
    - Find a MUS which contains/avoids clauses (clauses are ordered, subsets of clauses are ordered lexicographically)
  - …
- And we can also look for more than one core
  - Enumerate all MUSes
Many useful generalizations
- MUSes – minimal unsatisfiable subformulas
  - Goal: find a subset-minimal set of clauses that is unsatisfiable
- GMUS / HLMUC – group/high-level MUSes
  - Clauses are partitioned into disjoint groups (sets) of clauses
  - There is also a special “don’t-care” group-0 consisting of “non-interesting” clauses
  - Goal: find a subset-minimal set of interesting groups that (together with group-0) is unsatisfiable
- VMUS – variable MUSes
  - Goal: find a subset-minimal set of variables so that the set of all clauses containing these variables is unsatisfiable
- LMUS – labeled MUSes
  - Each clause has an associated set of labels (possibly empty)
  - Goal: find a subset-minimal set of labels so that the set of all clauses with at least one label from this set is unsatisfiable
  - Similar to GMUSes, but the groups do not need to be disjoint
- MUS: simple and useful, drove a lot of research
- LMUS: contains all the others as special cases, allows preprocessing
Example
Example
- MUSes:
  - {C1, C2, C3}
  - {C1, C2, C4, C6}
- GMUSes – assume G0 = {C1, C2}, G1 = {C3, C4}, G2 = {C5, C6}
  - {G1}
- VMUSes:
  - {p, q}
- LMUSes – assume L(C1) = {}, L(C2) = {}, L(C3) = {1}, L(C4) = {1}, L(C5) = {2}, L(C6) = {2}
  - {1}
- LMUSes – assume L(C1) = {1}, L(C2) = {2}, L(C3) = {1,2}, L(C4) = {1,3}, L(C5) = {1,2}, L(C6) = {2,3}
  - {1,2}
Application: Abstraction Refinement
- Given a model checking problem (Init, Tr, P) and a bound k, a bounded model checking problem (with bound k) checks if Init(X0) ∧ Tr(X0, X1) ∧ … ∧ Tr(Xk-1, Xk) ∧ (¬P0 ∨ … ∨ ¬Pk) is satisfiable
- When it is unsatisfiable, we may want to construct a latch-level or gate-level abstraction
- Can be naturally formulated as a GMUS (or a VMUS) problem
- References: Alexander Nadel: “Boosting minimal unsatisfiable core extraction”, FMCAD 2010
Application: Minimal Equivalent Subformulas
- Problem Statement: Given a satisfiable CNF formula F, find a subset-minimal set of clauses with the same set of satisfying assignments
- Idea: Given S ⊆ F, S has the same set of satisfying assignments as F iff S ∧ ¬F is unsatisfiable
- Reduction to GMUS: Given F = {c1, …, cn}, define:
  - Don’t-care group G0 = CNF(¬F)
  - For each clause ci ∈ F, a group Gi = {ci}
- Claim: E is a MES of F if and only if { Gi | ci ∈ E } is a GMUS of G0 ∪ G1 ∪ … ∪ Gn
- References: Anton Belov, Mikolás Janota, Inês Lynce, João Marques-Silva: “On Computing Minimal Equivalent Subformulas”, CP 2012
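The equivalence test on this slide can be tried out on a toy formula. The following is a minimal sketch under stated assumptions, not the algorithm from the cited paper: literals are DIMACS-style integers, `is_sat` is a hypothetical brute-force helper standing in for a real SAT solver, and each clause c is tested by checking whether the remaining clauses together with ¬c are unsatisfiable (the per-clause form of the equivalence check, used here instead of building the don't-care group explicitly).

```python
from itertools import product

def is_sat(clauses):
    """Brute-force satisfiability check (illustration only)."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def implies(clauses, c):
    """True iff every model of `clauses` satisfies clause c."""
    return not is_sat(list(clauses) + [(-l,) for l in c])

def deletion_mes(formula):
    """Drop every clause that is implied by the clauses still kept."""
    kept = list(formula)
    for c in list(formula):
        rest = [d for d in kept if d != c]
        if implies(rest, c):
            kept = rest
    return kept

# Toy formula: (x1 v x2), (-x1), (x2), (x2 v x3)
F = [(1, 2), (-1,), (2,), (2, 3)]
E = deletion_mes(F)
print(E)
```

On this input the two clauses (x1 ∨ x2) and (x2 ∨ x3) are implied by the unit clause (x2) and are dropped, while the two unit clauses remain.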
Application: Minimal Independent Support
- An independent support of a Boolean formula F is a subset of variables whose values uniquely determine the values of the remaining variables in any satisfying assignment to the formula
- Example: F = (¬c ∨ a) ∧ (¬c ∨ b) ∧ (c ∨ ¬a ∨ ¬b). Then {a, b} is an independent support of F
- Problem Statement: Given a CNF formula F, find a minimal independent support of F
- References: Alexander Ivrii, Sharad Malik, Kuldeep Meel, and Moshe Vardi: “On computing Minimal Independent Support and its applications to sampling and counting”, Constraints 21(1) 2016
Application: Minimal Independent Support
- Equivalently, S ⊆ Vars(F) is an independent support of F if F(x1, …, xn) ∧ F(y1, …, yn) ∧ ⋀_{xi ∈ S} (xi = yi) → ⋀_{xi ∈ Vars(F)} (xi = yi)
- Reduction to GMUS: Given F(x1, …, xn), define:
  - Don’t-care group G0 = F(x1, …, xn) ∧ F(y1, …, yn) ∧ (⋁_{xi ∈ Vars(F)} (xi ≠ yi))
  - For each variable xi ∈ Vars(F), a group Gi = {xi = yi}
- Claim: S is a MIS of F if and only if { Gi | xi ∈ S } is a GMUS of G0 ∪ G1 ∪ … ∪ Gn
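The two-copy condition says that two models agreeing on S must agree everywhere. A minimal sketch that checks this directly by enumerating models is given below. It is not the GMUS-based procedure from the cited paper; the helper names and the toy formula (an assumed encoding of c ↔ a ∧ b with a = 1, b = 2, c = 3) are illustrative only.

```python
from itertools import product

def models(clauses, variables):
    """Yield every satisfying assignment as a dict variable -> bool."""
    for bits in product([False, True], repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in clauses):
            yield val

def is_independent_support(clauses, variables, support):
    """True iff no two models agree on `support` but differ elsewhere."""
    seen = {}
    for m in models(clauses, variables):
        key = tuple(m[v] for v in support)
        if key in seen and seen[key] != m:
            return False
        seen[key] = m
    return True

# Assumed toy formula: c <-> (a AND b), with a = 1, b = 2, c = 3
F = [(-3, 1), (-3, 2), (3, -1, -2)]
V = [1, 2, 3]
ab_ok = is_independent_support(F, V, [1, 2])  # {a, b}
a_ok = is_independent_support(F, V, [1])      # {a} alone
b_ok = is_independent_support(F, V, [2])      # {b} alone
print(ab_ok, a_ok, b_ok)
```

Since {a, b} passes and neither {a} nor {b} does, {a, b} is a minimal independent support of this toy formula.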
Algorithms for computing MUSes
- Basic deletion-based algorithm
- Basic insertion-based algorithm
- Many optimizations exist
- Hybrid algorithm
  - Some features of both insertion-based and deletion-based algorithms
- Follow-up research
  - Various algorithms based on the MUS–MCS duality
- Most of the algorithms and optimizations extend to compute LMUSes
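The basic deletion-based algorithm can be sketched as follows: try to drop each clause in turn, and keep it only if the formula becomes satisfiable without it. This is a minimal sketch with none of the optimizations mentioned on the slide; `is_sat` is a hypothetical brute-force helper standing in for a real SAT solver, and literals are assumed to be DIMACS-style integers.

```python
from itertools import product

def is_sat(clauses):
    """Brute-force satisfiability check (illustration only)."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def deletion_mus(clauses):
    """Return one MUS of an unsatisfiable clause list."""
    assert not is_sat(clauses), "input must be unsatisfiable"
    mus = list(clauses)
    for c in list(clauses):
        trial = [d for d in mus if d != c]
        if not is_sat(trial):
            mus = trial  # c is not needed for unsatisfiability
        # otherwise c is necessary and stays in mus
    return mus

# Toy formula: (x1), (-x1 v x2), (-x2), (x3), (-x3 v -x1)
F = [(1,), (-1, 2), (-2,), (3,), (-3, -1)]
M = deletion_mus(F)
print(M)
```

The result depends on the order in which clauses are tried: this formula has two MUSes, and a different clause order may return the other one.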
Backup slide: MUS-MCS duality
- A correction subset S of an unsatisfiable CNF formula F is a subset of clauses such that F \ S is satisfiable
- A minimal correction subset (MCS) is a subset-minimal correction subset
- Easy to see that every MUS of F and every MCS of F must intersect
- The MUS-MCS duality states that M is a MUS of F iff M is a minimal hitting set for AllMCSes(F)
- Dually, M is an MCS of F iff M is a minimal hitting set for AllMUSes(F)
- Mostly used to compute multiple MUSes
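The duality can be checked by exhaustive enumeration on a small formula. The sketch below is for illustration only and is exponential in the number of clauses; all helper names are hypothetical, and `is_sat` is a brute-force stand-in for a SAT solver.

```python
from itertools import combinations, product

def is_sat(clauses):
    """Brute-force satisfiability check (illustration only)."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def subsets(items):
    """All subsets of `items`, as frozensets."""
    for r in range(len(items) + 1):
        for s in combinations(items, r):
            yield frozenset(s)

def minimal(sets):
    """Keep only the subset-minimal members of a family of sets."""
    return {s for s in sets if not any(t < s for t in sets)}

def all_muses(formula):
    return minimal({s for s in subsets(formula) if not is_sat(s)})

def all_mcses(formula):
    return minimal({s for s in subsets(formula) if is_sat(set(formula) - s)})

def minimal_hitting_sets(formula, family):
    return minimal({s for s in subsets(formula) if all(s & m for m in family)})

# Toy formula: (x1), (-x1 v x2), (-x2), (x3), (-x3 v -x1)
F = [(1,), (-1, 2), (-2,), (3,), (-3, -1)]
muses = all_muses(F)
mcses = all_mcses(F)
print(len(muses), len(mcses))
print(minimal_hitting_sets(F, mcses) == muses)
print(minimal_hitting_sets(F, muses) == mcses)
```

This toy formula has two MUSes, both containing the unit clause (x1), and five MCSes.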
A note on MUSers and competitions
- Three state-of-the-art approaches to extract a MUS:
  - Resolution-based using resolution graphs (HaifaMUC)
  - Assumption-based using selector variables (Muser2)
  - Assumption-based with assumptions organized in the form of a partial resolution graph (MinisatAbb)
- The only competition in this area was associated with SAT’11
  - Featured both a MUS track and a GMUS track
  - Problems were selected from easy unsatisfiable instances used in SAT competitions
Clause-Set Trimming
- Motivation
  - Reduce the formula before running the MUS extraction algorithm
- Iteratively compute smaller unsatisfiable clause-sets
  - Run SAT-solver on F0 = F
    - The query should be unsatisfiable
  - Extract core F1
  - Run SAT-solver on F1
  - Extract core F2
  - …
- Stop based on some criteria
  - Number of iterations, number/percentage of clauses removed, etc.
Model Rotation for MUS problem
- Motivation
  - Cheaply produce more necessary clauses on each SAT outcome
- Idea
  - Suppose that F is an unsatisfiable formula
  - Further suppose that some assignment satisfies all clauses of F except for c (c ∈ F)
  - Then c is necessary (belongs to every MUS of F)
  - Flip values of certain variables in the hope of producing satisfying assignments to F \ {cj} for other clauses cj
- Variants:
  - Model Rotation
  - Recursive Model Rotation
  - Extended Recursive Model Rotation
- Can be extended to rotation for LMUS formulas
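A minimal sketch of the recursive variant is given below, assuming DIMACS-style integer literals, an assignment stored as a dict from variable to bool, and no duplicate clauses. Starting from an assignment that falsifies only c, it flips each variable of c in turn; whenever the flipped assignment falsifies exactly one clause, that clause is necessary too, and rotation continues from it. The function names are illustrative, not taken from any particular tool.

```python
def falsified(clauses, assignment):
    """Clauses with no true literal under `assignment`."""
    return [c for c in clauses
            if not any(assignment[abs(l)] == (l > 0) for l in c)]

def recursive_model_rotation(clauses, c, assignment, necessary=None):
    """Collect clauses proven necessary by rotating the witness for c."""
    if necessary is None:
        assert falsified(clauses, assignment) == [c]
        necessary = {c}
    for lit in c:
        rotated = dict(assignment)
        rotated[abs(lit)] = not rotated[abs(lit)]  # flipping satisfies c
        broken = falsified(clauses, rotated)
        if len(broken) == 1 and broken[0] not in necessary:
            necessary.add(broken[0])
            recursive_model_rotation(clauses, broken[0], rotated, necessary)
    return necessary

# Toy formula: (x1), (-x1 v x2), (-x2); x1 = x2 = False falsifies only (x1)
F = [(1,), (-1, 2), (-2,)]
found = recursive_model_rotation(F, (1,), {1: False, 2: False})
print(found)
```

Here a single witness assignment is enough to mark all three clauses as necessary without any further SAT call.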
Redundancy Removal
- Motivation
  - Make SAT queries easier
- Idea
  - Suppose that F is an unsatisfiable formula
  - Checking whether F \ {c} is satisfiable is equivalent to checking whether (F \ {c}) ∧ ¬c is satisfiable
  - As ¬c is a conjunction of unit clauses, the problem usually becomes simpler to solve
- Extensions:
  - Path Strengthening (Alexander Nadel, Vadim Ryvchin, Ofer Strichman: “Efficient MUS extraction with resolution”, FMCAD 2013)
  - Backbone Literals (Alexander Ivrii, Vadim Ryvchin, Ofer Strichman: “Mining Backbone Literals in Incremental SAT - A New Kind of Incremental Data”, SAT 2015)
- Drawback: clause-set refinement (removing clauses after an UNSAT result) becomes tricky
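The equivalence holds because F is unsatisfiable, so any model of F \ {c} must falsify c. A minimal sketch that compares the two queries on a toy formula is shown below; `is_sat` is a hypothetical brute-force helper standing in for a SAT solver, and the other names are illustrative.

```python
from itertools import product

def is_sat(clauses):
    """Brute-force satisfiability check (illustration only)."""
    variables = sorted({abs(l) for c in clauses for l in c})
    for bits in product([False, True], repeat=len(variables)):
        val = dict(zip(variables, bits))
        if all(any(val[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

def negation_units(c):
    """The negation of clause c as a list of unit clauses."""
    return [(-l,) for l in c]

def is_necessary(formula, c):
    """Query with redundancy removal: (F \\ {c}) AND NOT c."""
    rest = [d for d in formula if d != c]
    return is_sat(rest + negation_units(c))

def is_necessary_plain(formula, c):
    """Plain query: F \\ {c}."""
    return is_sat([d for d in formula if d != c])

# Toy unsatisfiable formula: (x1), (-x1 v x2), (-x2), (x3), (-x3 v -x1)
F = [(1,), (-1, 2), (-2,), (3,), (-3, -1)]
necessary = [c for c in F if is_necessary(F, c)]
print(necessary)
```

Both queries give the same answer for every clause of this formula; only the unit clause (x1) is necessary.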
Preprocessing
- Motivation
  - Use standard preprocessing techniques (BCP, subsumption, self-subsumption, variable elimination, blocked clause elimination) to make MUS extraction easier
- Problem: Simply preprocessing the original formula does not work!
- Example:
  - F = (x ∨ p) ∧ (¬x ∨ p ∨ q) ∧ (¬p) ∧ (x ∨ ¬q) ∧ (¬x)
  - F’ = (x ∨ p) ∧ (p ∨ q) ∧ (¬p) ∧ (x ∨ ¬q) ∧ (¬x)
  - Here (p ∨ q) was obtained by self-subsumption of (¬x ∨ p ∨ q) with (x ∨ p)
  - M’ = {(p ∨ q), (¬p), (x ∨ ¬q), (¬x)} is a MUS of F’
  - As (p ∨ q) is derived from (¬x ∨ p ∨ q) with (x ∨ p), we are tempted to say that M = {(x ∨ p), (¬x ∨ p ∨ q), (¬p), (x ∨ ¬q), (¬x)} is a MUS of F
  - But M is NOT a MUS of F!
- The situation is even worse for GMUS computations!
Preprocessing and LCNF
- All standard preprocessing techniques can be easily extended to work with labeled clauses
  - Example: self-subsumption reduces (x ∨ p){1} (¬x ∨ p ∨ q){2} to (x ∨ p){1} (p ∨ q){1,2}
- And they become sound for LMUS extraction!
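A minimal sketch of self-subsumption on labeled clauses follows: the strengthened clause inherits the union of the labels of both clauses involved. The representation (a pair of frozensets, one for literals and one for labels) and the function name are assumptions made for illustration. The literal polarities in the example, with x = 1, p = 2, q = 3, are also an assumption.

```python
def self_subsume(c, d):
    """Strengthen labeled clause d using labeled clause c, if possible.

    c = (l v A) and d = (-l v A v B) yield (A v B) with labels L(c) | L(d).
    Returns the strengthened labeled clause, or None if the rule does not apply.
    """
    c_lits, c_labels = c
    d_lits, d_labels = d
    for lit in c_lits:
        if -lit in d_lits and (c_lits - {lit}) <= (d_lits - {-lit}):
            return (d_lits - {-lit}, c_labels | d_labels)
    return None

# Assumed polarities: (x v p){1} and (-x v p v q){2}, with x = 1, p = 2, q = 3
c = (frozenset({1, 2}), frozenset({1}))
d = (frozenset({-1, 2, 3}), frozenset({2}))
strengthened = self_subsume(c, d)
print(strengthened)
```

The result is the clause (p ∨ q) carrying labels {1, 2}, which records that it depends on both original clauses.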
Cool new idea!
- In my experiments with Valeriy Balabanov:
  - Variable elimination (VE) on LCNF formulas coming from GMUS problems had a highly beneficial effect
    - Further improved by allowing VE to eliminate a variable even if this increases the CNF size (cl-lim=200, grow=40)
    - Reason: aggressive VE significantly helps rotation and does not hurt solving
  - But no improvement on formulas coming from MUS problems
    - Reason: aggressive VE significantly helps rotation but hurts solving
Cool new idea!
- Idea: Certain preprocessing techniques make rotation simpler, while certain other preprocessing techniques make solving simpler
- Let solving and rotation operate on different (yet synchronized) formulas
- For example:
  - The CNF formula for rotation can be obtained from the original formula by variable elimination, subsumption, self-subsumption, and blocked clause elimination
  - The CNF formula for solving can be obtained from the original formula by blocked clause addition
Thank you!