Consequence Generation, Interpolants, and Invariant Discovery Ken McMillan Cadence Berkeley Labs.



Automated abstraction
Abstraction means throwing away information about a system that is not needed to prove a given property.
Automated abstraction has become a key element in the practice of model checking.
– Verification of sequential circuits: without abstraction, up to about 100 registers; with abstraction, > 200,000 registers!
– Software model checking: without abstraction, finite-state only; with abstraction, infinite-state, > 100,000 loc
Note, we are talking about very shallow properties!

Predicate Abstraction
Terminology: A safety invariant is an inductive invariant that implies the safety condition of a program.
Given a set of atomic predicates P, construct the strongest inductive invariant of a program expressible as a Boolean combination of P (Graf & Saïdi).
– That is, we are restricted to the language of Boolean combinations of P.
Example: let P = {i=j, x=y}
    x=i; y=j; while(x!=0) {x--; y--;} if (i == j) assert y==0;
strongest inductive invariant: $i = j \Rightarrow x = y$
But where do the predicates come from?
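As a sanity check on the example, the three obligations that make $i = j \Rightarrow x = y$ a safety invariant (it holds after initialization, the loop body preserves it, and it implies the assertion on exit) can be tested by brute force over a small integer range. This is a bounded sketch rather than a proof, and it does not show that the invariant is the strongest one; the names `inv` and `check_safety_invariant` and the range bound are assumptions introduced here.

```python
from itertools import product

def inv(i, j, x, y):
    """Candidate invariant from the example: i = j implies x = y."""
    return (i != j) or (x == y)

def check_safety_invariant(bound=4):
    """Bounded check of initiation, consecution and safety for the example loop."""
    vals = range(-bound, bound + 1)
    for i, j in product(vals, repeat=2):
        # Initiation: after x=i; y=j the invariant must hold.
        if not inv(i, j, i, j):
            return False
    for i, j, x, y in product(vals, repeat=4):
        if not inv(i, j, x, y):
            continue
        # Consecution: one loop iteration (guard x != 0) preserves the invariant.
        if x != 0 and not inv(i, j, x - 1, y - 1):
            return False
        # Safety: on loop exit (x == 0) with i == j, the assertion y == 0 holds.
        if x == 0 and i == j and y != 0:
            return False
    return True
```

Calling `check_safety_invariant()` returns `True` on the default range, which is consistent with the invariant stated on the slide.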

Iterative refinement
CounterExample Guided Abstraction Refinement (CEGAR)
– Diagnostic information is an abstract counterexample.
– Refinement adds information sufficient to refute the counterexample.
– In the infinite-state case this refinement loop can diverge.
This talk is concerned with avoiding divergence of the refinement loop, thus
– guaranteeing a limited kind of completeness.
[Diagram: "Verify Abstraction" either reports "safe" or passes diagnostic information to "Refine Abstraction", which returns a new abstraction.]

Completeness of abstraction
An abstraction is a restricted language L.
– Example: predicate abstraction (without refinement). L is the language of Boolean combinations of predicates in P.
– We try to compute the strongest inductive invariant of a program in L.
An abstraction refinement heuristic chooses a sequence of sublanguages $L_0 \subseteq L_1 \subseteq \dots$ from a broader language L.
– Example: predicate abstraction with refinement. L is the set of quantifier-free FO formulas (QF); $L_i$ is characterized by a set of atomic predicates $P_i$.
Completeness relative to a language: An abstraction refinement heuristic is complete for language L iff it always eventually chooses a sublanguage $L_i \subseteq L$ containing a safety invariant whenever L contains a safety invariant.

Where do abstractions come from?
Existing methods are based on the idea of generalizing from the proof of particular cases.
Heuristic: Information that is used to prove a particular case is likely to be useful in the general case.
Examples
– Prove all executions of just k steps are safe
– Prove a particular program path is safe
– Refute a particular "abstract counterexample"

Structured Proofs
A structured proof is a sequence of formulas assigned to the states of a program path, s.t.
– Each is a postcondition of its predecessor
– Starts with true, ends with false
Example: a path that executes our program loop once.
    x=i; y=j; while(x!=0) {x--; y--;} if (i == j) assert y==0;
Path: x=i, y=j; [x!=0]; x--, y--; [x==0]; [i==j]; [y!=0]; Error!
SSA form of the path: $x_1 = i_0,\ y_1 = j_0$; $x_1 \neq 0$; $x_2 = x_1 - 1,\ y_2 = y_1 - 1$; $x_2 = 0$; $i_0 = j_0,\ y_2 \neq 0$
Proof, in path order: True; $i_0 = j_0 \Rightarrow x_1 = y_1$; $i_0 = j_0 \Rightarrow x_2 = y_2$; False
Extract predicates from proof: P = {i=j, x=y}

Good proofs and bad proofs
Bad example: refute path using "weakest precondition"
    x=i; y=j; while(x!=0) {x--; y--;} if (i == j) assert y==0;
Path: x=i, y=j; [x!=0]; x--, y--; [x==0]; [i==j]; [y!=0]; Error!
Proof, in path order: True; $i = j \wedge x = 1 \Rightarrow y = 1$; $i = j \wedge x = 0 \Rightarrow y = 0$; False
Extract predicates from proof: P = {i=j, x=0, y=0, x=1, y=1, ...}
As we unwind the loop further, these predicates diverge...
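The divergence can be made concrete with a small sketch. A formula of the shape "i=j and x=a implies y=b" is represented by the pair (a, b), and its weakest precondition across `x--, y--` is obtained by substituting x-1 for x and y-1 for y. The pair representation and the function names are assumptions introduced here for illustration.

```python
def wp_decrement(formula):
    """Weakest precondition of (i=j and x=a => y=b) across x--, y--.

    Substituting x-1 for x and y-1 for y turns x=a into x=a+1 and y=b into y=b+1.
    """
    a, b = formula
    return (a + 1, b + 1)

def predicates_after_unwinding(n):
    """Atomic predicates mentioned when refuting a path with n loop iterations."""
    formula = (0, 0)  # i=j and x=0 => y=0, the condition just before the assertion
    preds = {"i=j"}
    for _ in range(n + 1):
        a, b = formula
        preds |= {f"x={a}", f"y={b}"}
        formula = wp_decrement(formula)
    return preds
```

With one iteration the predicate set is {i=j, x=0, y=0, x=1, y=1}, and every further unwinding adds two fresh predicates, so the refinement never settles on a finite set.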

Two questions
How to generate structured proofs
– Proofs generated by decision procedures will not be structured
– Solution: we can rewrite unstructured proofs into structured ones
How to guarantee completeness
– Bad proofs lead to divergence
– Solution: a structured prover

Interpolation Lemma (Craig, 57)
Notation: $L(\phi)$ is the set of FO formulas over the symbols of $\phi$.
If $A \wedge B = \mathit{false}$, there exists an interpolant $A'$ for $(A, B)$ such that:
– $A \Rightarrow A'$
– $A' \wedge B = \mathit{false}$
– $A' \in L(A) \cap L(B)$
Example:
– $A = p \wedge q$, $B = \neg q \wedge r$, $A' = q$
Interpolants from proofs
– in certain quantifier-free theories, we can obtain an interpolant for a pair A, B from a refutation in linear time. [TACAS05]
– in particular, we can have linear arithmetic, uninterpreted functions, and arrays
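The three conditions of the lemma can be checked by truth table for the propositional example, reading it as $A = p \wedge q$, $B = \neg q \wedge r$, $A' = q$. The sketch below is only an illustration: the function names and the explicit symbol sets passed as arguments are assumptions introduced here.

```python
from itertools import product

def A(p, q, r):
    return p and q

def B(p, q, r):
    return (not q) and r

def I(p, q, r):
    return q

def is_interpolant(a, b, itp, itp_symbols, a_symbols, b_symbols):
    """Check A => A', A' and B unsatisfiable, and A' over shared symbols only."""
    envs = list(product([False, True], repeat=3))
    a_implies_itp = all(itp(*e) for e in envs if a(*e))
    itp_refutes_b = not any(itp(*e) and b(*e) for e in envs)
    shared_only = itp_symbols <= (a_symbols & b_symbols)
    return a_implies_itp and itp_refutes_b and shared_only
```

Here `is_interpolant(A, B, I, {"q"}, {"p", "q"}, {"q", "r"})` holds, while using `A` itself as the candidate fails the vocabulary condition because `p` does not occur in `B`.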

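Interpolants for sequences
Let $A_1 \dots A_n$ be a sequence of formulas.
A sequence $A'_0 \dots A'_n$ is an interpolant for $A_1 \dots A_n$ when
– $A'_0 = \mathit{True}$
– $A'_{i-1} \wedge A_i \Rightarrow A'_i$, for $i = 1..n$
– $A'_n = \mathit{False}$
– and finally, $A'_i \in L(A_1 \dots A_i) \cap L(A_{i+1} \dots A_n)$
[Diagram: $A_1, A_2, A_3, \dots, A_k$ in a row, with the chain True, $A'_1$, $A'_2$, $A'_3$, ..., $A'_{k-1}$, False running between them, each linked to the next by $\Rightarrow$.]
In other words, the interpolant is a structured refutation of $A_1 \dots A_n$.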

Structured proofs are interpolants
Path: x=i, y=j; [x!=0]; x--, y--; [x==0]; [i==j]; [y!=0]
SSA form of the path: $x_1 = i_0,\ y_1 = j_0$; $x_1 \neq 0$; $x_2 = x_1 - 1,\ y_2 = y_1 - 1$; $x_2 = 0$; $i_0 = j_0,\ y_2 \neq 0$
Proof, in path order: True $\Rightarrow$ ($i_0 = j_0 \Rightarrow x_1 = y_1$) $\Rightarrow$ ($i_0 = j_0 \Rightarrow x_2 = y_2$) $\Rightarrow$ False
1. Each formula implies the next
2. Each is over common symbols of prefix and suffix
3. Begins with true, ends with false
Abstraction refinement procedure: SSA sequence → Prover → proof → Interpolation → structured proof → Extract predicates (idea: R. Jhala)
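The three numbered properties can be checked mechanically for the running example. The sketch below enumerates a small integer range instead of calling a prover, and it assumes the SSA constraints are grouped into three steps (initialization, one loop iteration, exit plus failed assertion), which is the grouping under which the four formulas form a sequence interpolant. The data layout and all names are assumptions introduced here.

```python
from itertools import product

VARS = ["i0", "j0", "x1", "y1", "x2", "y2"]

# SSA constraints as (symbols, predicate) pairs, grouped into three steps.
A = [
    ({"i0", "j0", "x1", "y1"},
     lambda e: e["x1"] == e["i0"] and e["y1"] == e["j0"]),
    ({"x1", "y1", "x2", "y2"},
     lambda e: e["x1"] != 0 and e["x2"] == e["x1"] - 1 and e["y2"] == e["y1"] - 1),
    ({"i0", "j0", "x2", "y2"},
     lambda e: e["x2"] == 0 and e["i0"] == e["j0"] and e["y2"] != 0),
]

# Candidate interpolant sequence A'_0 .. A'_3.
ITP = [
    (set(), lambda e: True),
    ({"i0", "j0", "x1", "y1"}, lambda e: e["i0"] != e["j0"] or e["x1"] == e["y1"]),
    ({"i0", "j0", "x2", "y2"}, lambda e: e["i0"] != e["j0"] or e["x2"] == e["y2"]),
    (set(), lambda e: False),
]

def is_sequence_interpolant(a_seq, itp, bound=2):
    """Bounded check of the three properties of a sequence interpolant."""
    n = len(a_seq)
    if len(itp) != n + 1:
        return False
    envs = [dict(zip(VARS, v))
            for v in product(range(-bound, bound + 1), repeat=len(VARS))]
    # Property 3: begins with True, ends with False.
    if not all(itp[0][1](e) for e in envs) or any(itp[n][1](e) for e in envs):
        return False
    # Property 1: A'_{i-1} and A_i together imply A'_i.
    for i in range(1, n + 1):
        for e in envs:
            if itp[i - 1][1](e) and a_seq[i - 1][1](e) and not itp[i][1](e):
                return False
    # Property 2: A'_i mentions only symbols common to prefix and suffix.
    for i in range(n + 1):
        prefix = set().union(*(s for s, _ in a_seq[:i]))
        suffix = set().union(*(s for s, _ in a_seq[i:]))
        if not itp[i][0] <= (prefix & suffix):
            return False
    return True
```

`is_sequence_interpolant(A, ITP)` succeeds on this range, whereas replacing the second formula by True breaks property 1 at the loop step.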

Enforcing completeness
[Diagram: lattice of sublanguages $L_0$, $L_1$, $L_2$ inside L, with example predicates x=y, x=0, x=1, x=2.]
1. Stratify L into finite languages $L_0 \subseteq L_1 \subseteq \dots$
2. Refute counterexample at lowest possible level
If a safety invariant exists in $L_k$, then we never exit $L_k$. Since this is a finite language, abstraction refinement must converge.

Restriction Language Example
Difference-bound formulas
– Let $L_k$ be the Boolean combinations of constraints of the form $x \leq y + c$, or $x \leq c$, where $|c| \leq k$.
Restrict the interpolants to $L_0$
SSA form of the path: $x_1 = i_0,\ y_1 = j_0$; $x_1 \neq 0$; $x_2 = x_1 - 1,\ y_2 = y_1 - 1$; $x_2 = 0$; $i_0 = j_0,\ y_2 \neq 0$
$L_0$-restricted: True; $i_0 = j_0 \Rightarrow x_1 = y_1$; $i_0 = j_0 \Rightarrow x_2 = y_2$; False
not $L_0$-restricted: True; $i_0 = j_0 \wedge x_1 = 1 \Rightarrow y_1 = 1$; $i_0 = j_0 \wedge x_2 = 0 \Rightarrow y_2 = 0$; False
Restriction forces us to generalize!
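A rough way to see which level of the hierarchy a formula needs is to list its difference-bound atoms and take the largest constant magnitude. The sketch below assumes one particular encoding, chosen here for illustration: $x = y$ becomes $x \leq y + 0$ and $y \leq x + 0$, and $x = c$ becomes $x \leq c$ together with the negation of $x \leq c - 1$. The `Bound` type and helper names are likewise hypothetical.

```python
from typing import NamedTuple, Optional

class Bound(NamedTuple):
    """Difference-bound atom: x <= y + c, or x <= c when y is None."""
    x: str
    y: Optional[str]
    c: int

def in_level(atoms, k):
    """True if every atom uses a constant with |c| <= k, i.e. lies in L_k."""
    return all(abs(a.c) <= k for a in atoms)

def lowest_level(atoms):
    """Smallest k such that all the given atoms lie in L_k."""
    return max((abs(a.c) for a in atoms), default=0)

def equals(x, y):
    """Atoms of x = y: x <= y + 0 and y <= x + 0."""
    return [Bound(x, y, 0), Bound(y, x, 0)]

def equals_const(x, c):
    """Atoms of x = c: x <= c, and x <= c - 1 (which occurs negated)."""
    return [Bound(x, None, c), Bound(x, None, c - 1)]
```

Under this encoding the atoms of $i_0 = j_0 \Rightarrow x_2 = y_2$ all sit in $L_0$, while $x_1 = 1$ and $x_2 = 0$ both need $k = 1$.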

Consequence finders
A consequence finder takes as input a set of hypotheses $\Gamma$ and returns a set of consequences of $\Gamma$.
Consequence finder R is complete for L-generation iff, for any $\phi \in L$, $\Gamma \models \phi$ implies $R(\Gamma) \cap L \models \phi$.
That is, the consequence finder need not generate all consequences of $\Gamma$ in L, but the generated L-consequences must imply all others. [McIlraith & Amir, 2001]

Split prover
Divide the prover into a sequence of communicating consequence finders $R_1, R_2, R_3, \dots, R_n$.
– Each $R_i$ knows just $A_i$
– $R_i$ and $R_{i+1}$ exchange only facts in $L(A_1 \dots A_i) \cap L(A_{i+1} \dots A_n)$
– $\circ_i R_i$ is the composition of the $R_i$'s
Theorem: If each $R_i$ is complete for $L(A_{i+1} \dots A_n)$-generation, then $\circ_i R_i$ is complete for refutation [McIlraith & Amir, 2001].

L-restricted split prover
In the L-restricted composition, $\circ_L R_i$, the provers can exchange only formulas in L.
[Diagram: $R_1, R_2, R_3, \dots, R_n$ in a row, each link between neighbours labelled L.]
Theorem: If each $R_i$ is complete for $L \cap L(A_{i+1} \dots A_n)$-generation, then $A_1 \dots A_n$ has an L-interpolant exactly when it is refuted by $\circ_L R_i$.
Moreover, the refutation generated by $\circ_L R_i$ induces an L-interpolant.

L-restricted interpolants
That is, if we can build complete consequence generators for some restriction language L, we have a complete procedure to generate L-restricted interpolants.
[Diagram: SSA sequence → L-restricted split prover ($\circ_L R_i$) → split proof → Interpolation → L-restricted interpolant (structured proof).]

Complete abstraction heuristic
Given finite languages $L_0 \subseteq L_1 \subseteq \dots$ where $\bigcup_i L_i = \mathrm{QF}$...
[Flowchart: start with P = {} and k = 0. Pred Abs either reports "safe" or yields a program path not refutable with P. If the path has an $L_k$-restricted interpolant, add the AP's of the interpolant to P; if not, set k = k+1. Repeat.]
Theorem: This procedure is complete for QF invariants. That is, if a safety invariant exists in QF, we conclude "safe".
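The loop in the flowchart can be sketched as follows. Both callbacks are hypothetical stand-ins for the predicate abstraction engine and the $L_k$-restricted split prover, `k = 0` is read as the initial level, and the `max_level` cutoff and the "unknown" result are additions made here so that the sketch terminates on inputs with no invariant; none of these details come from the slides.

```python
def complete_refinement(pred_abs, restricted_interpolant, max_level=10):
    """Refinement loop that stays at the lowest language level k that works.

    pred_abs(P) returns None if the program is proved safe with predicates P,
    otherwise a program path that P cannot refute.
    restricted_interpolant(path, k) returns the set of atomic predicates of an
    L_k-restricted interpolant for the path, or None if none exists.
    """
    predicates = set()
    k = 0
    while True:
        path = pred_abs(predicates)
        if path is None:
            return "safe", predicates, k
        atoms = restricted_interpolant(path, k)
        while atoms is None:
            k += 1  # no refutation at this level: move up the hierarchy
            if k > max_level:
                return "unknown", predicates, k
            atoms = restricted_interpolant(path, k)
        if atoms <= predicates:
            # An interpolant should contribute a new predicate; stop if it does not.
            return "unknown", predicates, k
        predicates |= atoms
```

With toy callbacks, for instance one that reports safety once P contains both `"i=j"` and `"x=y"` and one that offers those predicates from level 1 upward, the loop returns `("safe", {"i=j", "x=y"}, 1)`.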

Proof idea
Let $\pi$ be a program path ending in an error location.
Let $\phi \in L_k$ be a safety invariant of the program.
Then $\phi^{n-1}$, i.e. $\phi$ repeated $n-1$ times between True and False, is an $L_k$-interpolant for $\pi$.
Thus, the split prover must find an $L_k$-interpolant.
Moreover, this interpolant must contain an AP not in P
– (else predicate abstraction would have refuted the path with P)
Thus, we must add some AP in $L_k$ to P at each iteration.
This must terminate, since $L_k$ is a finite language
– (over the program variables)

Building a split prover
First you have to choose your hierarchy $L_0, L_1, \dots$
We will consider QF formulas with
– integer difference bound constraints (e.g., $x \leq y + c$)
– equality and uninterpreted functions
– restricted use of array operations "select" and "store"
These are sufficient to prove simple properties of programs with arrays.
Our restriction language $L_k$ will be determined by
– The finite set $C_D$ of allowed constants c in $x \leq y + c$
– The finite set $C_B$ of allowed constants c in $x \leq c$
– The bound $b_f$ on the depth of nesting of function symbols
For a finite vocabulary, $L_k$ is finite and every formula is included in some $L_k$.

Lazy architecture
[Diagram: SAT solver, Ground Decision Procedure, L-restricted Split Prover and Interpolation, connected by arrows labelled "Satisfying minterms", "Constraints", "Refutations" and "Split Refutations".]
Note: the propositional part of the refutation has non-local steps, but the generated interpolant is still L-restricted, because propositional interpolation rules don't introduce new atomic predicates.

Prover architecture
Lazy approach means split prover must refute only minterms.
Convexity: a theory is convex if Horn consequences are complete.
– In the convex case, provers only exchange literals [Nelson & Oppen, 1980]
Simple proof rules for complete unit consequence finding in $L_k$
– For EUF, use a Shostak-style canonizer; order terms appropriately to generate desired consequences
– For linear arithmetic, use Fourier-Motzkin; need weakening rule: $x \leq y + c \rightarrow x \leq y + d$, if $c < d$
In case of a non-convexity, split cases in SAT solver
– integers and array store operations introduce non-convexities.
Multiple theories handled by hierarchical decomposition
[Diagram: an (=, f) prover above a row of ($\leq$, +) provers.]
These and other optimizations can result in a relatively efficient prover...
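The interaction between Fourier-Motzkin elimination and the weakening rule can be illustrated on difference bounds. In the sketch below a bound $x \leq y + c$ is a tuple `(x, y, c)`, chaining two bounds adds their constants, and the result is weakened to the least allowed constant so that it stays inside the restriction language. The tuple encoding, the function names, and the choice of the least allowed constant are assumptions made here, not details of the prover described in the talk.

```python
def weaken(c, allowed):
    """Weakening: replace x <= y + c by x <= y + d for the least allowed d >= c."""
    candidates = [d for d in allowed if d >= c]
    return min(candidates) if candidates else None

def resolve(b1, b2, allowed):
    """From x <= y + c and y <= z + d derive x <= z + (c + d), then weaken.

    Returns None if the bounds do not chain on a shared variable or if no
    allowed constant is large enough.
    """
    (x, y, c), (y2, z, d) = b1, b2
    if y != y2:
        return None
    e = weaken(c + d, allowed)
    return None if e is None else (x, z, e)
```

For example, with allowed constants {0, 5}, chaining `("x", "y", 1)` with `("y", "z", 2)` gives the exact bound 3, which is then weakened to `("x", "z", 5)`.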

Performance comparison
Refuting counterexamples for the two hardest Windows device driver examples in the Blast benchmark set.
Compare split prover against Nelson-Oppen style, same SAT solver.

Some "trivial" benchmarks
example: substring copy
    main(){
      char x[*], z[*];
      int from,to,i,j,k;
      i = from;
      j = 0;
      while(x[i] != 0 && i < to){
        z[j] = x[i];
        i++;
        j++;
      }
      /* prove strlen(z) >= j */
      assert !(k >= 0 && k < j && z[k] == 0);
    }

Results
Columns: example, SatAbs, Magic, Blast, Blast (new)
Legend: X = refine fail, TO = timeout; the symbols for bug, diverge and verified safe, and the column positions of the X marks below, are missing from this transcript.
simple loop: X X
array copy: X
two loops: X X
array fill (increment): X
array fill (fixed size): X X
zero fill: X X
scan for zero: X X
string overflow: X X
string concat (size): X
string concat (ovfl): X X
slow string copy: X X
substring (size): X X
substring (ovfl): X X

Summary
An abstraction refinement heuristic is complete for language L if it guarantees to find a safety invariant if one exists in L.
Existing PA heuristics are incomplete and diverge on trivial programs.
CEGAR can be made complete by...
– Stratifying L into a hierarchy of finite sublanguages $L_0, L_1, \dots$
– Refuting counterexamples with an $L_k$-restricted split prover
– Staying at the lowest possible level of the hierarchy
A split prover can be made efficient enough to use in practice
– (at least for some useful theories)
Theoretical completeness can actually lead to improved practical performance.

Abstraction in infinite lattices
In a lattice of infinite height, fixed point computation may not converge $\Rightarrow$ widening $\Rightarrow$ incompleteness.
[Diagram: an infinite chain $x \geq 0$, $x \geq 1$, $x \geq 2$, ... in L, next to the stratification $L_0 \subseteq L_1 \subseteq L_2 \subseteq \dots \subseteq L$.]
By stratifying L into a sequence of lattices of finite height, we avoid widening, and guarantee to find a fixed point proving a given property. (Though not the least fixed point)

Pure interpolant approach
Progressively unfold the program, computing interpolants...
[Diagram: successive unfoldings of the program built from blocks labelled I, T and F.]
If the program has a safety invariant in QF, interpolants must eventually contain a safety invariant.
– This gives an alternative to predicate abstraction [CAV06]
– Note, this is neither a fixed point iteration, nor a parameterized approach.

Quantified invariants
Even very simple properties often require quantified invariants.
This can be handled by a method called indexed predicate abstraction
– Predicates can contain index variables that are implicitly quantified
– Computes the strongest quantified inductive invariant expressible as a Boolean combination of the given atomic predicates
To obtain completeness in this case, we need to restrict the number of quantifiers in $L_k$ (else $L_k$ is not finite, and we may diverge).
Questions:
– Is there a resolution strategy that is complete for consequence generation with a restricted number of free variables?
– Can we extend to richer theories, including, e.g., transitive closure?