Interpolation and Widening
Ken McMillan, Microsoft Research


Interpolation and Widening
Widening/narrowing and Craig interpolation are two approaches to computing inductive invariants of transition systems. Both are essentially methods of generalizing from proofs about bounded executions to proofs about unbounded executions. In this talk, we'll consider the relationship between these two approaches, from both theoretical and practical points of view. We consider only property-proving applications, since interpolation applies only when there is a property to prove.

Intuitive comparison
[Figure: two panels, widening/narrowing and interpolation. Each plots the iterates on a stronger-to-weaker axis against iterations, relative to the lfp and an inductive invariant.]

Abstractions as proof systems
We will view both widening/narrowing and interpolation as proof systems
– In particular, local proof systems
A proof system (or abstraction) consists of:
– A logical language L (abstract domain)
– A set of sound deduction rules
A choice of proof system constitutes a bias, or domain knowledge
– Rich proof system = weak bias
– Impoverished proof system = strong bias
By restricting the logical language and deduction rules, the analysis designer defines the space of possible proofs in which the analysis tool should search.

Fundamental problems
Relevance
– We must avoid a combinatorial explosion of deductions
– Thus, deduction must be restricted to facts relevant to the property
Convergence
– Eventually the proofs for bounded executions must generalize to a proof of unbounded executions.

Different approaches
Widening/narrowing relies on a restricted proof system
– Relevance is enforced by strong bias
– Convergence is also enforced in this way, but proof of a property is not guaranteed
Interpolation uses a rich proof system
– Relevance is determined by Occam's razor: relevant deductions occur in simple property proofs
– Convergence is not guaranteed, but is approached heuristically, again using Occam's razor
We will see that the two methods have many aspects in common, but take different approaches to these fundamental problems. In the interpolation approach, we rely on well-developed theorem proving techniques to search large spaces for simple proofs.

Proofs
A proof is a series of deductions, from premises to conclusions.
Each deduction is an instance of an inference rule.
Usually, we represent a proof as a tree...
[Figure: a proof tree from premises $P_1, \ldots, P_5$ to conclusion $C$, and a single inference with premises $P_1$, $P_2$ and conclusion $C$.]

Inference rules
The inference rules depend on the theory we are reasoning in.
Boolean logic, resolution rule: $\dfrac{p \lor \Gamma \qquad \lnot p \lor \Delta}{\Gamma \lor \Delta}$
Linear arithmetic, sum rule: $\dfrac{x_1 \le y_1 \qquad x_2 \le y_2}{x_1 + x_2 \le y_1 + y_2}$
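
The two rules above can be exercised on tiny instances. The following is a minimal sketch, not taken from the talk: clauses are modelled as sets of literal strings with a hypothetical `~` prefix for negation, and soundness is only sampled by brute force over a small range.

```python
from itertools import product

def resolve(c1, c2, p):
    """Resolution: from (p or G) and (~p or D), derive (G or D)."""
    neg = "~" + p
    if p not in c1 or neg not in c2:
        raise ValueError("clauses do not clash on " + p)
    return (c1 - {p}) | (c2 - {neg})

def holds(clause, env):
    """A clause holds if at least one of its literals is true under env."""
    return any((not env[l[1:]]) if l.startswith("~") else env[l] for l in clause)

c1 = frozenset({"p", "a"})    # p or a
c2 = frozenset({"~p", "b"})   # not p or b
resolvent = resolve(c1, c2, "p")

# Soundness of this resolution step, checked over the full truth table.
names = ["p", "a", "b"]
envs = [dict(zip(names, vals)) for vals in product([False, True], repeat=3)]
resolution_sound = all(
    holds(resolvent, env) for env in envs if holds(c1, env) and holds(c2, env)
)

# Sum rule: x1 <= y1 and x2 <= y2 give x1 + x2 <= y1 + y2 (sampled range only).
sum_rule_sound = all(
    x1 + x2 <= y1 + y2
    for x1, y1, x2, y2 in product(range(-2, 3), repeat=4)
    if x1 <= y1 and x2 <= y2
)
```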

Invariants from unwindings
A simple way to generalize from bounded to unbounded proofs:
– Consider just one program execution path, as a straight-line program
– Construct a proof for this straight-line program
– See if this proof contains an inductive invariant proving the property
Example program:
    x = y = 0;
    while (*) { x++; y++; }
    while (x != 0) { x--; y--; }
    assert(y == 0);
Invariant: {x == y}
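
To make "inductive invariant proving the property" concrete, here is a small sketch that samples the three obligations for the candidate x == y on the example program: it holds initially, it is preserved by both loop bodies, and on exit from the second loop it implies y == 0. The finite range `R` is an assumption made for illustration, so this is a sanity check and not a proof. The weaker candidate `x == 0 implies y == 0` is included to show a formula that proves the assertion but is not inductive.

```python
R = range(-5, 6)  # assumption: a small finite sample of integer states

def inv(x, y):
    return x == y

def weak(x, y):
    # x == 0 implies y == 0: enough for the assertion, but not inductive
    return x != 0 or y == 0

def preserved_by_first_loop(cand):
    # body of the first loop: x++; y++
    return all(cand(x + 1, y + 1) for x in R for y in R if cand(x, y))

def preserved_by_second_loop(cand):
    # body of the second loop, guarded by x != 0: x--; y--
    return all(cand(x - 1, y - 1) for x in R for y in R if cand(x, y) and x != 0)

def proves_assertion(cand):
    # on exit from the second loop we know x == 0
    return all(y == 0 for x in R for y in R if cand(x, y) and x == 0)

inv_ok = (inv(0, 0) and preserved_by_first_loop(inv)
          and preserved_by_second_loop(inv) and proves_assertion(inv))
weak_is_inductive = preserved_by_first_loop(weak)
```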

Unwind the loops
Straight-line path: x = y = 0; x++; y++; [x != 0]; x--; y--; [x != 0]; x--; y--; [x == 0]; [y != 0]
Annotations of one proof: $\{\mathit{True}\}$, $\{x = 0 \land y = 0\}$, $\{x = y\}$, $\{x = 0 \Rightarrow y = 0\}$, $\{\mathit{False}\}$
Annotations of another proof: $\{\mathit{True}\}$, $\{y = 0\}$, $\{y = 1\}$, $\{y = 2\}$, $\{y = 1\}$, $\{y = 0\}$, $\{\mathit{False}\}$
– The proof of the inline program contains invariants for both loops
– Assertions may diverge as we unwind
A practical method must somehow prevent this kind of divergence!
How can we find relevant proofs of program paths?

Interpolation Lemma [Craig, 57]
Let A and B be first-order formulas, using
– some non-logical symbols (predicates, functions, constants)
– the logical symbols $\land, \lor, \lnot, \exists, \forall, (), \ldots$
If $A \land B = \mathit{false}$, there exists an interpolant $A'$ for $(A, B)$ such that:
– $A \Rightarrow A'$
– $A' \land B = \mathit{false}$
– $A'$ uses only the common vocabulary of A and B
Example: $A = p \land q$, $B = \lnot q \land r$, $A' = q$
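
For a propositional instance the three conditions of the lemma can be checked by enumerating the truth table. The sketch below does this for A = p and q, B = (not q) and r, with candidate interpolant q. The function names are illustrative only.

```python
from itertools import product

def A(p, q, r):
    return p and q

def B(p, q, r):
    return (not q) and r

def interp(p, q, r):
    return q  # candidate interpolant A'

envs = list(product([False, True], repeat=3))

a_and_b_unsat = not any(A(*e) and B(*e) for e in envs)        # A and B = false
a_implies_interp = all(interp(*e) for e in envs if A(*e))      # A implies A'
interp_and_b_unsat = not any(interp(*e) and B(*e) for e in envs)  # A' and B = false

# Common vocabulary of A and B is {q}: the value of A' must not depend on p or r.
only_common_symbols = all(
    interp(p, q, r) == interp(False, q, False) for p, q, r in envs
)
```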

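Interpolants as Floyd-Hoare proofs
Proving in-line programs: program → SSA sequence → prover → proof → interpolation → Hoare proof
Program: x = y; y++; [x == y]
SSA sequence: $x_1 = y_0$; $y_1 = y_0 + 1$; $x_1 = y_1$
Interpolant sequence: $\mathit{True} \Rightarrow (x_1 = y_0) \Rightarrow (y_1 > x_1) \Rightarrow \mathit{False}$
1. Each formula implies the next (given the SSA constraint between them)
2. Each is over common symbols of prefix and suffix
3. Begins with true, ends with false
Hoare proof: {True} x = y {x = y} y++ {y > x} [x == y] {False}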
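
Condition 1 can be read as: each interpolant, conjoined with the next SSA constraint, implies the next interpolant. Below is a minimal sketch that samples this for the sequence True, x1 = y0, y1 > x1, False over a small integer range. The range `R` is an assumption, so this is a sanity check and not a proof.

```python
from itertools import product

R = range(-3, 4)  # assumption: small sample of values for (y0, x1, y1)

# SSA constraints for: x = y; y++; [x == y]
constraints = [
    lambda y0, x1, y1: x1 == y0,
    lambda y0, x1, y1: y1 == y0 + 1,
    lambda y0, x1, y1: x1 == y1,
]

# Interpolant sequence: begins with True, ends with False.
interpolants = [
    lambda y0, x1, y1: True,
    lambda y0, x1, y1: x1 == y0,
    lambda y0, x1, y1: y1 > x1,
    lambda y0, x1, y1: False,
]

def step_ok(k):
    """interpolants[k] and constraints[k] together imply interpolants[k + 1]."""
    return all(
        interpolants[k + 1](*s)
        for s in product(R, repeat=3)
        if interpolants[k](*s) and constraints[k](*s)
    )

checks = [step_ok(k) for k in range(len(constraints))]
```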

Local proofs and interpolants
Program: x = y; y++; [y ≤ x]
SSA sequence: $x_1 = y_0$; $y_1 = y_0 + 1$; $y_1 \le x_1$
Deductions: $x_1 \le y_0$ and $y_0 + 1 \le y_1$ give $x_1 + 1 \le y_1$; with $y_1 \le x_1$ this gives $1 \le 0$, that is, FALSE
Interpolant sequence: TRUE, $x_1 \le y_0$, $x_1 + 1 \le y_1$, FALSE
This is an example of a local proof...

Definition of local proof
Frames (SSA constraints): $x_1 = y_0$; $y_1 = y_0 + 1$; $y_1 \le x_1$
– Scope of a variable = range of frames it occurs in (here for $y_0$, $y_1$, $x_1$)
– Vocabulary of a frame = set of variables "in scope": $\{x_1, y_0\}$, $\{x_1, y_0, y_1\}$, $\{x_1, y_1\}$
– Deductions $x_1 \le y_0$, $y_0 + 1 \le y_1$, $x_1 + 1 \le y_1$: each is "in scope" in some frame
Local proof: every deduction is written in the vocabulary of some frame.
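
The locality condition is easy to state as a check: every deduction's variables must fit inside the vocabulary of at least one frame. The sketch below assumes deductions are given as strings and that SSA variables look like a letter followed by digits. The second frame list and the deduction `x1 <= z1` are hypothetical, added only to show a failing case.

```python
import re

def variables(formula):
    """SSA variables in a formula string, e.g. 'x1 + 1 <= y1' gives {'x1', 'y1'}."""
    return set(re.findall(r"[a-z]\d+", formula))

def is_local(deductions, frames):
    """True if every deduction is over the vocabulary of some frame."""
    return all(any(variables(d) <= f for f in frames) for d in deductions)

# Frame vocabularies from the slide.
frames = [{"x1", "y0"}, {"x1", "y0", "y1"}, {"x1", "y1"}]
proof = ["x1 <= y0", "y0 + 1 <= y1", "x1 + 1 <= y1", "1 <= 0"]

# Hypothetical frames in which no single frame mentions both x1 and z1.
other_frames = [{"x1", "y0"}, {"y0", "y1"}, {"y1", "z1"}]
non_local_proof = ["x1 <= y0", "x1 <= z1"]
```

Here `is_local(proof, frames)` is true, while `is_local(non_local_proof, other_frames)` is false because `x1 <= z1` spans two frames.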

Forward local proof
Frames: $x_1 = y_0$; $y_1 = y_0 + 1$; $y_1 \le x_1$, with vocabularies $\{x_1, y_0\}$, $\{x_1, y_0, y_1\}$, $\{x_1, y_1\}$
Deductions: $x_1 \le y_0$, $y_0 + 1 \le y_1$, $x_1 + 1 \le y_1$, $1 \le 0$, FALSE
Forward local proof: each deduction can be assigned a frame such that all the deduction arrows go forward.
For a forward local proof, the (conjunction of) assertions crossing a frame boundary is an interpolant.
Interpolant sequence: TRUE, $x_1 \le y_0$, $x_1 + 1 \le y_1$, FALSE

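Proofs and relevance
Frames: $x_1 = y_0 + 1$; $z_1 = x_1 + 1$; $x_1 \le y_0$; $y_0 \le z_1$
Frame vocabularies: $\{x_1, y_0\}$, $\{x_1, y_0, z_1\}$
Proof with all inferences: $x_1 = y_0 + 1$ and $z_1 = y_0 + 2$ are deduced, giving $1 \le 0$ (FALSE) and the unneeded $0 \le 2$
– Interpolants: TRUE, $x_1 = y_0 + 1$, $x_1 = y_0 + 1 \land z_1 = y_0 + 2$, FALSE
Proof without the unneeded inferences:
– Interpolants: TRUE, $x_1 = y_0 + 1$, $x_1 = y_0 + 1$, FALSE
By dropping unneeded inferences, we weaken the interpolant and eliminate irrelevant predicates.
Interpolants are neither weakest preconditions nor strongest postconditions.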

Applying Occam's Razor
Define a (local) proof system
– Can contain whatever proof rules you want
Define a cost metric for proofs
– For example, number of distinct predicates after dropping subscripts
Exhaustive search for lowest cost proof
– May restrict to forward or reverse proofs
Example proof system:
– From $\phi$ and $x = e$, deduce $\phi[e/x]$
– From $\phi$, deduce FALSE when $\phi$ is unsat
– Allow simple arithmetic rewriting
Simple proofs are more likely to generalize.
Even this trivial proof system allows useful flexibility.

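Loop example
SSA sequence: $x_0 = 0$, $y_0 = 0$; $x_1 = x_0 + 1$, $y_1 = y_0 + 1$; $x_2 = x_1 + 1$, $y_2 = y_1 + 1$; ...
Proof 1 (cost: 2N)
– Deductions: $x_0 = 0$, $y_0 = 0$, $x_1 = 1$, $y_1 = 1$, $x_2 = 2$, $y_2 = 2$, ...
– Interpolants: TRUE, $x_0 = 0 \land y_0 = 0$, $x_1 = 1 \land y_1 = 1$, $x_2 = 2 \land y_2 = 2$, ...
Proof 2 (cost: 2)
– Deductions: $x_0 = y_0$, $x_1 = y_0 + 1$, $x_1 = y_1$, $x_2 = y_1 + 1$, $x_2 = y_2$, ...
– Interpolants: TRUE, $x_0 = y_0$, $x_1 = y_1$, $x_2 = y_2$, ...
Lowest cost proof is simpler, avoids divergence.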
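
The cost metric "number of distinct predicates after dropping subscripts" can be sketched in a few lines. The string encoding of predicates and the helper name `proof_cost` are assumptions made for illustration.

```python
import re

def proof_cost(predicates):
    """Count distinct predicates once SSA subscripts (x0, x1, ...) are dropped."""
    normalized = {
        re.sub(r"([a-z])\d+", r"\1", p.replace(" ", "")) for p in predicates
    }
    return len(normalized)

# Proof 1: tracks concrete values, so every unwinding adds new predicates.
constant_proof = ["x0 = 0", "y0 = 0", "x1 = 1", "y1 = 1", "x2 = 2", "y2 = 2"]

# Proof 2: tracks the relation between x and y, which repeats at every step.
relational_proof = ["x0 = y0", "x1 = y0 + 1", "x1 = y1", "x2 = y1 + 1", "x2 = y2"]

cost_constant = proof_cost(constant_proof)      # grows with the unwinding depth
cost_relational = proof_cost(relational_proof)  # stays at 2: x=y and x=y+1
```

With three frames the first proof costs 6, matching 2N for N = 3, while the second stays at 2 however far the loop is unwound.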

Interpolation
Generalize from bounded proofs to unbounded proofs
Weak bias
– Rich proof system (large space of proofs)
– Apply Occam's razor (simple proofs more likely to generalize)
Occam's razor is applied to
– Avoid combinatorial explosion of deductions (relevance)
– Eventually generalize to inductive proofs (convergence)
Apply theorem proving technology to search a large space of possible proofs for simple proofs
– DPLL, SMT solvers, etc.

Widening operators
... this chain eventually stabilizes.
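
The formulas on this slide did not survive in the transcript. For reference, the textbook definition of a widening operator $\nabla$ on an abstract domain ordered by $\sqsubseteq$ with join $\sqcup$ is given below. It is an assumption that the slide showed this form; it matches the two properties (over-approximation, stabilization) named on the next slide.

```latex
\begin{align*}
&\text{(over-approximation)} && x \sqcup y \;\sqsubseteq\; x \mathbin{\nabla} y \\
&\text{(stabilization)} && \text{for every ascending chain } x_0 \sqsubseteq x_1 \sqsubseteq \cdots, \\
& && \text{the chain } y_0 = x_0,\quad y_{i+1} = y_i \mathbin{\nabla} x_{i+1}
      \text{ eventually stabilizes}
\end{align*}
```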

Upward iteration sequence
The widening properties guarantee
– over-approximation
– stabilization
[Figure: an upward iteration sequence that over-approximates and is eventually stable.]
Narrowing is similar, but contracting.

Widening as local deduction
Since widening loses information, we can think of it as a deduction rule.
In fact, we may have several deduction rules at our disposal:
– abstract post
– join
– widen

Widening with octagons
Path: x = y = 0; x++; y++; [x != 0]; x--; y--; [x != 0]; x--; y--; [x == 0]; [y != 0]
[Figure: octagon annotations along the path, from {True} to {False}.]
Because we proved the property, we have computed an interpolant.
But note the irrelevant fact! Our proof rules are too coarse to eliminate this fact.

Over-widening (with intervals)
Path: x = y = 0; x = 1-x; y++; [x == 2]
[Figure: interval annotations along the path, with endpoints {True} and {False}.]
Note: if we had waited one step to widen, we would have a proof.
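
The effect of widening too early can be reproduced with a small interval analysis. The sketch below assumes the path comes from a loop of the form `while (*) { x = 1 - x; y++; }` and tracks only `x`. It uses the standard interval widening, in which an unstable bound jumps to infinity, and the `delay` parameter is a hypothetical knob for how many rounds to join before widening.

```python
import math

def join(a, b):
    """Least upper bound of two intervals (lo, hi)."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def widen(a, b):
    """Standard interval widening: any bound that grows jumps to infinity."""
    lo = a[0] if a[0] <= b[0] else -math.inf
    hi = a[1] if a[1] >= b[1] else math.inf
    return (lo, hi)

def post(x):
    """Abstract effect of x = 1 - x on an interval."""
    return (1 - x[1], 1 - x[0])

def analyze(delay, max_steps=20):
    """Loop-head interval for x: plain joins for `delay` rounds, then widening."""
    x = (0, 0)  # x = 0 on entry
    for step in range(max_steps):
        candidate = join(x, post(x))
        new = candidate if step < delay else widen(x, candidate)
        if new == x:  # fixed point reached
            return x
        x = new
    return x

def excludes(interval, value):
    return not (interval[0] <= value <= interval[1])

eager = analyze(delay=0)    # widen immediately
patient = analyze(delay=1)  # wait one step before widening
```

Widening immediately loses the upper bound, so `x == 2` can no longer be ruled out. Waiting one step reaches the stable interval [0, 1], which excludes 2.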

Safe widening
Let us define a safe widening sequence as one that ends in a safe state.
Suppose we apply a sequence of rules and fail...
We may postpone a widening to achieve a safety proof.

Incompleteness
Widening/narrowing:
– Incomplete proof system on purpose
– We restrict the proof system (strong bias) to enforce relevance focus and convergence
– These properties are obtained at the risk of over-widening
Interpolation:
– Incompleteness derives only from incompleteness of the underlying logic (for example, in Presburger arithmetic we have completeness)
– Relevance focus and convergence rely on general heuristics: Occam's razor (simple proofs tend to generalize) and theorem proving techniques
– Choice of logic and axioms also represents a weak bias

Consequences of strong bias
Widening requires domain knowledge, which entails a careful choice of the logical language L.
– Octagons: easy
– Unions of octagons: harder
– Presburger arithmetic formulas: ???
This entails incompleteness, as a restricted language implies loss of information.
This also means we can tailor the representation for efficiency.
– Octagons: use half-space representation, not convex hull of vertices
– Polyhedra: mixed representation

Advantages of weak bias
Boolean logic (e.g., hardware verification)
– Language L is Boolean circuits over system state variables
– There is no obvious a priori widening for this language
– Interpolation techniques are the most effective known for this problem: McMillan CAV 2003 (feasible interpolation using SAT solvers), Bradley VMCAI 2011 (interpolation by local proof)
– Note that rapid convergence is very important here
Infinite state cases requiring disjunctions
– Hard to formulate a widening a priori
– Weak bias can be used to avoid combinatorial explosion of disjuncts (example: IMPACT)
Scaling to large number of variables
– Weak bias can allow focus just on relevant variables
Weak bias can be used in cases where domain knowledge is lacking.

Simple example
    for (i = 0; i < N; i++) a[i] = i;
    for (j = 0; j < N; j++) assert(a[j] == j);
Invariant: $\{\forall x.\ 0 \le x \land x < i \Rightarrow a[x] = x\}$
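
As a concrete reading of the quantified invariant, the sketch below runs the first loop and evaluates "for all x with 0 <= x < i, a[x] = x" at every visit to the loop head, then runs the second loop's assertion. It tests the invariant on concrete runs only; the function name `run` and the list encoding of the array are assumptions.

```python
def run(n):
    """Run both loops for array size n; report invariant and assertion checks."""
    a = [None] * n
    i = 0
    invariant_checks = []
    while i < n:
        # loop head: forall x. 0 <= x < i implies a[x] == x
        invariant_checks.append(all(a[x] == x for x in range(i)))
        a[i] = i
        i += 1
    # invariant on loop exit, where i == n
    invariant_checks.append(all(a[x] == x for x in range(i)))
    # second loop: assert(a[j] == j) for every j < n
    assertions_hold = all(a[j] == j for j in range(n))
    return all(invariant_checks), assertions_hold
```

On exit `i == n`, so the invariant specializes to exactly the fact the second loop asserts.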

Partial Axiomatization
Axioms of the theory of arrays (with select and update):
– $\forall A, I, V.\ \mathit{select}(\mathit{update}(A, I, V), I) = V$
– $\forall A, I, J, V.\ I \ne J \Rightarrow \mathit{select}(\mathit{update}(A, I, V), J) = \mathit{select}(A, J)$
Axioms for arithmetic [integer axiom] etc...
We use a (local) first-order superposition prover to generate interpolants, with a simple metric for proof complexity.
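
The two array axioms can be sampled against a simple functional model. In this sketch arrays are Python dicts, `update` returns a fresh copy, and unset cells read as 0. Those modelling choices are assumptions for illustration and say nothing about how the prover represents arrays.

```python
from itertools import product

def update(a, i, v):
    """Functional update: a copy of a with index i mapped to v."""
    b = dict(a)
    b[i] = v
    return b

def select(a, i):
    return a.get(i, 0)  # assumption: unset cells read as 0

sample_arrays = [{}, {0: 7}, {0: 1, 2: 5}]
idx = range(3)
vals = range(3)

# Axiom 1: select(update(A, I, V), I) = V
read_over_write_same = all(
    select(update(a, i, v), i) == v
    for a, i, v in product(sample_arrays, idx, vals)
)

# Axiom 2: I != J implies select(update(A, I, V), J) = select(A, J)
read_over_write_other = all(
    select(update(a, i, v), j) == select(a, j)
    for a, i, j, v in product(sample_arrays, idx, idx, vals)
    if i != j
)

# Two unwound iterations of a[i] = i, written as nested updates.
a0 = {}
a1 = update(a0, 0, 0)
a2 = update(a1, 1, 1)
prefix_is_identity = all(select(a2, u) == u for u in range(2))
```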

Unwinding simple example
Unwind the loops twice:
    i = 0; [i < N]; a[i] = i; i++; [i < N]; a[i] = i; i++; [i >= N]; j = 0; [j < N]; j++; [j < N]; [a[j] != j]
SSA sequence:
    $i_0 = 0$
    $i_0 < N$; $a_1 = \mathit{update}(a_0, i_0, i_0)$; $i_1 = i_0 + 1$
    $i_1 < N$; $a_2 = \mathit{update}(a_1, i_1, i_1)$; $i_2 = i_1 + 1$
    $i_2 \ge N \land j_0 = 0$
    $j_0 < N \land j_1 = j_0 + 1$
    $j_1 < N$; $\mathit{select}(a_2, j_1) \ne j_1$
Invariants (interpolants):
    $\{i_0 = 0\}$
    $\{0 \le U \land U < i_1 \Rightarrow \mathit{select}(a_1, U) = U\}$
    $\{0 \le U \land U < i_2 \Rightarrow \mathit{select}(a_2, U) = U\}$
    $\{j \le U \land U < N \Rightarrow \mathit{select}(a_2, U) = U\}$
Weak bias prevents constants diverging as 0, succ(0), succ(succ(0)), ...

With strong bias
Path: i = 0; [i < N]; a[i] = i; i++; [i < N]; a[i] = i; i++; [i >= N]; j = 0; [j < N]; j++; [j < N]; [a[j] != j]
Something like the array segmentation functor of Cousot, Cousot and Logozzo.
Note: it so happened here that our first try at a widening was safe, but this may not always be so.

Comparison: widening/narrowing vs. interpolation
Interpolation:
– Axioms and proof bias are generic: little domain knowledge is represented
– Uses a generic theorem prover to generate local proofs: no domain-specific tuning
– Not as scalable as the strong bias approach

List deletion example
Add a few axioms about reachability.
    a = create_list();
    while (a) {
        tmp = a->next;
        free(a);
        a = tmp;
    }
Invariant synthesized with 3 unwindings (after some simplification):
    $\{\mathit{rea}(\mathit{next}, a, \mathit{nil}) \land \forall x\,(\mathit{rea}(\mathit{next}, a, x) \Rightarrow x = \mathit{nil} \lor \mathit{alloc}(x))\}$
No need to craft a new specialized domain for linked lists.
Weak bias can be used in cases where domain knowledge is lacking.
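
The synthesized invariant says that nil is reachable from `a` and that every cell reachable from `a` is either nil or still allocated. The sketch below simulates the deletion loop on a small heap and checks this at each visit to the loop head. The heap encoding (a dict of next pointers, `None` for nil, a set of allocated cells) is an assumption made for illustration.

```python
def reachable(next_of, a):
    """Cells reachable from a via next pointers, and whether nil is reached."""
    cells = []
    while a is not None and a not in cells:
        cells.append(a)
        a = next_of[a]
    return cells, a is None

def delete_list(n):
    """Simulate the deletion loop on an n-cell list, checking the invariant."""
    # create_list(): cells 0..n-1 chained together, all allocated
    next_of = {k: (k + 1 if k + 1 < n else None) for k in range(n)}
    alloc = set(range(n))
    a = 0 if n > 0 else None
    checks = []
    while True:
        # loop head: rea(next, a, nil) and every reachable cell is allocated
        cells, reaches_nil = reachable(next_of, a)
        checks.append(reaches_nil and all(c in alloc for c in cells))
        if a is None:
            break
        tmp = next_of[a]   # tmp = a->next
        alloc.discard(a)   # free(a)
        a = tmp            # a = tmp
    return all(checks), alloc
```

The invariant is what justifies reading `a->next` in the loop body: since `a` is reachable from itself and is not nil, it must still be allocated.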

Are interpolants widenings?
A safe widening sequence is an interpolant.
An interpolant is not necessarily a widening sequence, however.
– Does not satisfy the expansion property
– Does not satisfy the eventual stability property as we increase the sequence length
A consequence of giving up stabilization is that inductive invariants (post-fixed points) are typically found in the middle of the sequence, not at an eventual stabilization point.
– Early formulas tend to be too strong (influenced by initial condition)
– Late formulas tend to be too weak (influenced by final condition)

Typical interpolant sequence
Path: x = y = 0; x++; y++; [x != 0]; x--; y--; [x != 0]; x--; y--; [x == 0]; [y != 0]
[Figure: interpolant sequence from {True} to {False}. Early formulas are labelled "too strong", late formulas "too weak"; the sequence is weakened but not expansive, and does not stabilize at the invariant.]
No matter how far we unwind, we may not get stabilization.

Conclusion
Widening/narrowing and interpolation are methods of generalizing from bounded to unbounded proofs.
Formally, widening/narrowing satisfies stronger conditions:
– Widening/narrowing: soundness, expanding/contracting, stabilizing
– Interpolation: soundness
Stabilization is not obtained when proving properties, however.

Conclusion, cont.
Heuristically, the difference is weak vs. strong bias.
Strong bias:
– restricted proof system
– incompleteness
– smaller search space
– domain knowledge
– efficient representations
Weak bias:
– rich proof system
– completeness
– large search space
– Occam's razor
– generic representations
Can we combine strong and weak heuristics?
– Fall back on weak heuristics when strong fails
– Use weak heuristics to handle combinatorial complexity
– Build known widenings into theory solvers in SMT?