# Heuristics for Efficient SAT Solving As implemented in GRASP, Chaff and GSAT.

## Presentation on theme: "Heuristics for Efficient SAT Solving As implemented in GRASP, Chaff and GSAT."— Presentation transcript:

Heuristics for Efficient SAT Solving As implemented in GRASP, Chaff and GSAT.

Why SAT? Fundamental problem from theoretical point of view –Cook theorem, 1971: the first NP-complete problem. Numerous applications: –Solving any NP problem... –Verification: Model Checking, theorem-proving,... –AI: Planning, automated deduction,... –Design and analysis: CAD, VLSI –Physics: statistical mechanics (models for spin-glass material)

SAT made some progress…

The SAT competitions

Agenda Modeling problems in Propositional Logic SAT basics Decision heuristics Non-chronological Backtracking Learning with Conflict Clauses SAT and resolution More techniques: decision heuristics, deduction.More techniques: decision heuristics, deduction Stochastic SAT solvers: the GSAT approach

The K-Coloring problem: Given an undirected graph G(V,E) and a natural number k, is there an assignment color: Formulation of famous problems as SAT: k- Coloring (1/2)

Formulation of famous problems as SAT: k- Coloring (2/2) x i,j = node i is assigned the ‘color’ j (1  i  n, 1  j  k) Constraints: i) At least one color to each node: (x 1,1  x 1,2  … x 1,k  …) ii) At most one color to each node: iii) Coloring constraints: for each i, j such that ( i, j ) 2 E:

Given a property p : (e.g. “always signal_a = signal_b”) Is there a state reachable within k cycles, which satisfies  p ?... s0s0 s1s1 s2s2 s k-1 sksk pp p pp p Formulation of famous problems as SAT: Bounded Model Checking

The reachable states in k steps are captured by: The property p fails in one of the cycles 1.. k : Formulation of famous problems as SAT: Bounded Model Checking

The safety property p is valid up to cycle k iff  k  is unsatisfiable:... s0s0 s1s1 s2s2 s k-1 sksk pp p pp p Formulation of famous problems as SAT: Bounded Model Checking

Example: a two bit counter Property: G (  l   r ). 00 01 10 11 For k = 2,  k  is unsatisfiable. For k = 4  k  is satisfiable Initial state: I : : l Æ : r Transition: R : l ’ = ( l  r ) Æ r ’ = : r Formulation of famous problems as SAT: Bounded Model Checking

Bounded Model Checking k = 0 BMC(M, ,k) yes k++ k ¸ ?k ¸ ? no

Agenda Modeling problems in Propositional Logic SAT basics Decision heuristics Non-chronological Backtracking Learning with Conflict Clauses SAT and resolution More techniques: decision heuristics, deduction. Stochastic SAT solvers: the GSAT approach

What is SAT? SATisfying assignment! Given a propositional formula in CNF, find an assignment to Boolean variables that makes the formula true:  1 = (x 2  x 3 )  2 = (  x 1   x 4 )  3 = (  x 2  x 4 ) A = {x 1 =0, x 2 =1, x 3 =0, x 4 =1}  1 = (x 2  x 3 )  2 = (  x 1   x 4 )  3 = (  x 2  x 4 ) A = {x 1 =0, x 2 =1, x 3 =0, x 4 =1}

CNF-SAT Conjunctive Normal Form: Conjunction of disjunction of literals. Example: ( : x 1 Ç : x 2 ) Æ (x 2 Ç x 4 Ç : x 1 ) Æ... Experience shows that CNF-SAT solving is faster than solving a general propositional formula. There exists a polynomial transformation due to Tseitin (1970) of a general propositional formula  to CNF, with addition of |  | variables.

Tseitin’s encoding by example  = x 1 Ç : ( x 2 Æ x 3 )  ’ = a 0 Æ (a 0 \$ x 1 Ç a 1 ) Æ (a 1 \$ : a 2 ) Æ (a 2 \$ x 2 Æ x 3 ) It is left to transform  ’ to CNF. x1x1 x2x2 x3x3 Æ : Ç a0a0 a1a1 a2a2

Tseitin’s encoding: CNF encodings of gates And gate. e.g., for a i \$ x 1 Æ x 2 add to S (a i Ç : x 1 Ç : x 2 ), ( : a i Ç x 1 ), ( : a i Ç x 2 ) Or gate. e.g. for a i \$ x 1 Ç x 2 add to S ( : a i Ç x 1 Ç x 2 ), (a i Ç : x 1 ), (a i Ç : x 2 ) Not gate. e.g. for a i \$ : x 1 add to S ( : a i Ç : x 1 ), (a i Ç x 1 )

Tseitin’s encoding For each Boolean gate instance g i in , add a new auxiliary variable a i, and add to a stack S the CNF clauses encoding a i \$ g i. Let a 0 denote the auxiliary variable encoding the main operator of . Let Theorem (Tseitin):  is satisfiable iff  ’ is satisfiable.

(CNF) SAT basic definitions: literals A literal is a variable or its negation. Var( l ) is the variable associated with a literal l. A literal is called negative if it is a negated variable, and positive otherwise.

SAT basic definitions: literals If var( l ) is unassigned, then l is unresolved. Otherwise, l is satisfied by an assignment  if  (var( l )) = 1 and l is positive, or  (var( l )) = 0 and l is negative, and unsatisfied otherwise.

SAT basic definitions: clauses The state of an n -long clause C under a partial assignment  is: Satisfied if at least one of C’s literals is satisfied, Conflicting if all of C’s literals are unsatisfied, Unit if n -1 literals in C are unsatisfied and 1 literal is unresolved, and Unresolved otherwise.

SAT basic definitions: clauses Example

SAT basic definitions: the unit clause rule The unit clause rule: in a unit clause the unresolved literal must be satisfied.

Given  in CNF: (x,y,z),(-x,y),(-y,z),(-x,-y,-z) Decide() Deduce() Resolve_Conflict()  X XX XX  A Basic SAT algorithm

Basic Backtracking Search Organize the search in the form of a decision tree Each node corresponds to a decision Depth of the node in the decision tree is called the decision level Notation: x = v @ d x is assigned v 2 {0,1} at decision level d

Backtracking Search in Action  1 = (x 2  x 3 )  2 = (  x 1   x 4 )  3 = (  x 2  x 4 )  1 = (x 2  x 3 )  2 = (  x 1   x 4 )  3 = (  x 2  x 4 ) x 1 x 1 = 0@1 {(x 1,0), (x 2,0), (x 3,1)} x 2 x 2 = 0@2 {(x 1,1), (x 2,0), (x 3,1), (x 4,0)} x 1 = 1@1  x 3 = 1@2  x 4 = 0@1  x 2 = 0@1  x 3 = 1@1 No backtrack in this example, regardless of the decision!

Backtracking Search in Action  1 = (x 2  x 3 )  2 = (  x 1   x 4 )  3 = (  x 2  x 4 )  4 = (  x 1  x 2   x 3 )  1 = (x 2  x 3 )  2 = (  x 1   x 4 )  3 = (  x 2  x 4 )  4 = (  x 1  x 2   x 3 ) Add a clause  x 4 = 0@1  x 2 = 0@1  x 3 = 1@1 conflict {(x 1,0), (x 2,0), (x 3,1)} x 2 x 2 = 0@2  x 3 = 1@2 x 1 = 0@1 x 1 x 1 = 1@1

While (true) { if (!Decide()) return (SAT); while (!Deduce()) if (!Resolve_Conflict()) return (UNSAT); } Choose the next variable and value. Return False if all variables are assigned Apply unit clause rule. Return False if reached a conflict Backtrack until no conflict. Return False if impossible A Basic SAT algorithm ( DPLL-based)

Agenda Modeling problems in Propositional Logic SAT basics Decision heuristics Non-chronological Backtracking Learning with Conflict Clauses SAT and resolution More techniques: decision heuristics, deduction. Stochastic SAT solvers: the GSAT approach

Maintain a counter for each literal: in how many unresolved clauses it appears ? Decide on the literal with the largest counter. Requires O(#literals) queries for each decision. Decision heuristics DLIS (Dynamic Largest Individual Sum)

Compute for every clause  and every literal l : J ( l ) := Choose a variable l that maximizes J ( l ). This gives an exponentially higher weight to literals in shorter clauses. Decision heuristics Jeroslow-Wang method

Let f *( x ) be the # of unresolved smallest clauses containing x. Choose x that maximizes: (( f *( x ) + f *(! x )) * 2 k + f *( x ) * f *(! x ) k is chosen heuristically. The idea: Give preference to satisfying small clauses. Among those, give preference to balanced variables (e.g. f *( x ) = 3, f *(! x ) = 3 is better than f *( x ) = 1, f *(! x ) = 5). Decision heuristics MOM (Maximum Occurrence of clauses of Minimum size).

Pause... We will see other (more advanced) decision Heuristics soon. These heuristics are integrated with a mechanism called Learning with Conflict- Clauses, which we will learns next.

Agenda Modeling problems in Propositional Logic SAT basics Decision heuristics Non-chronological Backtracking Learning with Conflict Clauses SAT and resolution More techniques: decision heuristics, deduction. Stochastic SAT solvers: the GSAT approach

Implication graphs and learning  1 = (  x 1  x 2 )  2 = (  x 1  x 3  x 9 )  3 = (  x 2   x 3  x 4 )  4 = (  x 4  x 5  x 10 )  5 = (  x 4  x 6  x 11 )  6 = (  x 5   x 6 )  7 = (x 1  x 7   x 12 )  8 = (x 1  x 8 )  9 = (  x 7   x 8   x 13 )  1 = (  x 1  x 2 )  2 = (  x 1  x 3  x 9 )  3 = (  x 2   x 3  x 4 )  4 = (  x 4  x 5  x 10 )  5 = (  x 4  x 6  x 11 )  6 = (  x 5   x 6 )  7 = (x 1  x 7   x 12 )  8 = (x 1  x 8 )  9 = (  x 7   x 8   x 13 ) Current truth assignment: {x 9 =0@1,x 10 =0@3, x 11 =0@3, x 12 =1@2, x 13 =1@2} Current decision assignment: {x 1 =1@6} 66 66  conflict x 9 =0@1 x 1 =1@6 x 10 =0@3 x 11 =0@3 x 5 =1@6 44 44 55 55 x 6 =1@6 22 22 x 3 =1@6 11 x 2 =1@6 33 33 x 4 =1@6 We learn the conflict clause  10 : ( : x 1 Ç x 9 Ç x 11 Ç x 10 ) and backtrack to the highest (deepest) dec. level in this clause (6).

Implication graph, flipped assignment x 1 =0@6 x 11 =0@3 x 10 =0@3 x 9 =0@1 x 7 =1@6 x 12 =1@2 77 77 x 8 =1@6 88  10 99 99 ’’ x 13 =1@2 99 Due to the conflict clause  1 = (  x 1  x 2 )  2 = (  x 1  x 3  x 9 )  3 = (  x 2   x 3  x 4 )  4 = (  x 4  x 5  x 10 )  5 = (  x 4  x 6  x 11 )  6 = (  x 5  x 6 )  7 = (x 1  x 7   x 12 )  8 = (x 1  x 8 )  9 = (  x 7   x 8   x 13 )  10 : ( : x 1 Ç x 9 Ç x 11 Ç x 10 )  1 = (  x 1  x 2 )  2 = (  x 1  x 3  x 9 )  3 = (  x 2   x 3  x 4 )  4 = (  x 4  x 5  x 10 )  5 = (  x 4  x 6  x 11 )  6 = (  x 5  x 6 )  7 = (x 1  x 7   x 12 )  8 = (x 1  x 8 )  9 = (  x 7   x 8   x 13 )  10 : ( : x 1 Ç x 9 Ç x 11 Ç x 10 ) We learn the conflict clause  11 : ( : x 13 Ç x 9 Ç x 10 Ç x 11 Ç : x 12 ) and backtrack to the highest (deepest) dec. level in this clause (3).

Non-chronological backtracking Non- chronological backtracking x 1 4 5 6 ’’ Decision level Which assignments caused the conflicts ? x 9 = 0@1 x 10 = 0@3 x 11 = 0@3 x 12 = 1@2 x 13 = 1@2 Backtrack to decision level 3 3 These assignments Are sufficient for Causing a conflict.

Non-chronological backtracking (option #1) So the rule is: backtrack to the largest decision level in the conflict clause. Q: What if the flipped assignment works ? A: continue to the next decision level, leaving the current one without a decision variable. Backtracking back to this level will lead to another conflict and further backtracking.

Non-chronological Backtracking x 1 = 0 x 2 = 0 x 3 = 1 x 4 = 0 x 5 = 0 x 7 = 1 x 9 = 0 x 6 = 0... x 5 = 1 x 9 = 1 x 3 = 0

More Conflict Clauses Def: A Conflict Clause is any clause implied by the formula Let L be a set of literals labeling nodes that form a cut in the implication graph, separating the conflict node from the roots. Claim: Ç l 2 L : l is a Conflict Clause. 55 55 x 6 =1@6 66 66  conflict x 9 =0@1 x 1 =1@6 x 10 =0@3 x 11 =0@3 x 5 =1@6 44 44 22 22 x 3 =1@6 11 x 2 =1@6 33 33 x 4 =1@6 1. (x 10 Ç : x 1 Ç x 9 Ç x 11 ) 2. (x 10 Ç : x 4 Ç x 11 ) 3. (x 10 Ç : x 2 Ç : x 3 Ç x 11 )  1 2 3 Skip alternative learning

Conflict clauses How many clauses should we add ? If not all, then which ones ? Shorter ones ? Check their influence on the backtracking level ? The most “influential” ? The answer requires two definitions: Asserting clauses Unique Implication points (UIP’s)

Asserting clauses Def: An Asserting Clause is a Conflict Clause with a single literal from the current decision level. Backtracking (to the right level) makes it a Unit clause. Modern solvers only consider Asserting Clauses.

Unique Implication Points (UIP’s) Def: A Unique Implication Point (UIP) is an internal node in the Implication Graph that all paths from the decision to the conflict node go through it. The First-UIP is the closest UIP to the conflict. The method of choice: an asserting clause that includes the first UIP. In this case ( x 10 Ç : x 4 Ç x 11 ). 55 55 66 66  conflict 44 44 22 22 11 33 33 UIP x 10 =0@3 x 11 =0@3 x 4 =1@6

Conflict-driven backtracking (option #2) Previous method: backtrack to highest decision level in conflict clause (and erase it). A better method (empirically): backtrack to the second highest decision level in the clause, without erasing it. The asserted literal is implied at that level. In our example: ( x 10 Ç : x 4 Ç x 11 ) Previously we backtracked to decision level 6. Now we backtrack to decision level 3. x 4 = 0@3 is implied. 633

Conflict-driven Backtracking x 1 = 0 x 2 = 0 x 3 = 1 x 4 = 0 x 5 = 0 x 5 = 1 x 7 = 1 x 3 = 1 x 9 = 0 x 9 = 1 x 6 = 0...

Conflict-Driven Backtracking So the rule is: backtrack to the second highest decision level dl, but do not erase it. If the conflict clause has a single literal, backtrack to decision level 0. Q: It seems to waste work, since it erases assignments in decision levels higher than dl, unrelated to the conflict. A: indeed. But allows the SAT solver to redirect itself with the new information.

Decision Conflict Decision Level Time work invested in refuting x =1 (some of it seems wasted) C x =1 Refutation of x =1 C1C1 C5C5 C4C4 C3C3 C2C2 Progress of a SAT solver BCP

Agenda Modeling problems in Propositional Logic SAT basics Decision heuristics Non-chronological Backtracking Learning with Conflict Clauses SAT and resolution More techniques: decision heuristics, deduction. Stochastic SAT solvers: the GSAT approach

Conflict clauses and Resolution The Binary-resolution is a sound inference rule: Example:

Consider the following example: Conflict clause: c 5 : ( x 2 Ç : x 4 Ç x 10 ) Conflict clauses and resolution

Conflict clause: c 5 : ( x 2 Ç : x 4 Ç x 10 ) Resolution order: x 4, x 5, x 6, x 7 T1 = Res(c 4,c 3,x 7 ) = ( : x 5 Ç : x 6 ) T2 = Res(T1, c 2, x 6 ) = ( : x 4 Ç : x 5 Ç X 10 ) T3 = Res(T2,c 1,x 5 ) = (x 2 Ç : x 4 Ç x 10 ) Conflict clauses and resolution

Applied to our example: Finding the conflict clause: cl is asserting the first UIP

The Resolution-Graph keeps track of the “inference relation” 11 22 33 44 55 66  10 77 88 99  11 55 55 66 66  conflict 44 44 22 22 11 33 33 77 77 88  10 99 99  ’ conflict 99 Resolution Graph

The resolution graph What is it good for ? Example: for computing an Unsatisfiable core [Picture Borrowed from Zhang, Malik SAT’03]

Resolution graph: example L :L : Inferred clauses Empty clause Original clauses Unsatisfiable core learning

Agenda Modeling problems in Propositional Logic SAT basics Decision heuristics Non-chronological Backtracking Learning with Conflict Clauses SAT and resolution More techniques: decision heuristics, deduction. Stochastic SAT solvers: the GSAT approach

4. Periodically, all the counters are divided by a constant. 3. The unassigned variable with the highest counter is chosen. 2. When a clause is added, the counters are updated. 1. Each variable in each polarity has a counter initialized to 0. (Implemented in Chaff) Decision heuristics VSIDS (Variable State Independent Decaying Sum)

Chaff holds a list of unassigned variables sorted by the counter value. Updates are needed only when adding conflict clauses. Thus - decision is made in constant time. Decision heuristics VSIDS (cont’d)

VSIDS is a ‘quasi-static’ strategy: - static because it doesn’t depend on current assignment - dynamic because it gradually changes. Variables that appear in recent conflicts have higher priority. This strategy is a conflict-driven decision strategy. “..employing this strategy dramatically (i.e. an order of magnitude) improved performance... “ Decision heuristics VSIDS (cont’d)

Decision Heuristics - Berkmin Keep conflict clauses in a stack Choose the first unresolved clause in the stack If there is no such clause, use VSIDS Choose from this clause a variable + value according to some scoring (e.g. VSIDS) This gives absolute priority to conflicts.

Berkmin heuristic tail- first conflict clause

Deduction() allocates new implied variables and conflicts. How can this be done efficiently ? Observation: More than 90% of the time SAT solvers perform Deduction(). More engineering aspects of SAT solvers

Hold 2 counters for each clause  : val1 (  ) - # of negative literals assigned 0 in  + # of positive literals assigned 1 in . val0 (  ) - # of negative literals assigned 1 in  + # of positive literals assigned 0 in . Grasp implements Deduction() with counters

Every assignment to a variable x results in updating the counters for all the clauses that contain x. Grasp implements Deduction() with counters  is satisfied iff val1(  ) > 0  is unsatisfied iff val0(  ) = |  |  is unit iff val1(  ) = 0  val0(  ) = |  | - 1  is unresolved iff val1(  ) = 0  val0(  ) < |  | - 1. Backtracking: Same complexity.

Observation: during Deduction(), we are only interested in newly implied variables and conflicts. These occur only when the number of literals in  with value ‘ false ’ is greater than |  | - 2 Conclusion: no need to visit a clause unless (val0(  ) > |  | - 2) How can this be implemented ? Chaff implements Deduction() with a pair of observers

Define two ‘ observers ’ : O1(  ), O2(  ). O1(  ) and O2(  ) point to two distinct  literals which are not ‘ false ’.  becomes unit if updating one observer leads to O1(  ) = O2(  ). Visit clause  only if O1(  ) or O2(  ) become ‘ false ’. Chaff implements Deduction() with a pair of observers

Both observers of an implied clause are on the highest decision level present in the clause. Therefore, backtracking will un-assign them first. Conclusion: when backtracking, observers stay in place. V[1]=0 V[5]=0, v[4]= 0 V[2]=0 O1O2 Unit clause Backtrack v[4] = v[5]= X v[1] = 1 Backtracking: No updating. Complexity = constant. Chaff implements Deduction() with a pair of observers

The choice of observing literals is important. Best strategy is - the least frequently updated variables. The observers method has a learning curve in this respect: 1. The initial observers are chosen arbitrarily. 2. The process shifts the observers away from variables that were recently updated (these variables will most probably be reassigned in a short time). In our example: the next time v[5] is updated, it will point to a significantly smaller set of clauses. Chaff implements Deduction() with a pair of observers

Agenda Modeling problems in Propositional Logic SAT basics Decision heuristics Non-chronological Backtracking Learning with Conflict Clauses SAT and resolution More techniques: decision heuristics, deduction. Stochastic SAT solvers: the GSAT approach

GSAT: stochastic SAT solving for i = 1 to max_tries { T := randomly generated truth assignment for j = 1 to max_flips { if T satisfies  return TRUE choose v s.t. flipping v’s value gives largest increase in the # of satisfied clauses (break ties randomly). T := T with v’s assignment flipped. } } Given a CNF formula , choose max_tries and max_flips Many alternative heuristics

Numerous progressing heuristics Hill-climbing Tabu-list Simulated-annealing Random-Walk Min-conflicts...

Improvement # 1: clause weights Initial weight of each clause: 1 Increase by k the weight of unsatisfied clauses. Choose v according to max increase in weight Clause weights is another example of conflict-driven decision strategy.

Improvement # 2: Averaging-in Q: Can we reuse information gathered in previous tries in order to speed up the search ? A: Yes! Rather than choosing T randomly each time, repeat ‘good assignments’ and choose randomly the rest.

Let X1, X2 and X3 be equally wide bit vectors. Define a function bit_average : X1  X2  X3 as follows: b 1 i b 1 i = b 2 i randomotherwise (where b j i is the i-th bit in Xj, j  {1,2,3}) Improvement # 2: Averaging-in (cont’d) b 3 i :=

Improvement # 2: Averaging-in (cont’d) Let T i init be the initial assignment (T) in cycle i. Let T i best be the assignment with highest # of satisfied clauses in cycle i. T 1 init := random assignment. T 2 init := random assignment.  i > 2, T i init := bit_average(T i-1 best, T i-2 best )

Download ppt "Heuristics for Efficient SAT Solving As implemented in GRASP, Chaff and GSAT."

Similar presentations