Logic as the lingua franca of software verification
Ken McMillan, Microsoft Research
Joint work with Andrey Rybalchenko and Nikolaj Bjorner
Copyright 2013 Kenneth L. McMillan. All rights reserved.
Integration of program analyzers
Goal: share information between analyzers. Given: source program and properties.
[Diagram labels: C, Transform A, Transform B, Analyzer A, Analyzer B, IL, source-to-source, IL-to-IL, heap modeling, Invariant ???]
To share facts, we undo the A transforms and apply the B transforms. The invariant can fail because of divergent interpretations of the source language.
The logical alternative
Goal: share information between analyzers. Given: source program and properties.
[Diagram labels: C, VC gen, SMT, translate to logical constraint solving, Solver A, Solver B, Invariant OK!]
To share facts, we pass logical constraints. The invariant can't fail because it is a logical fact.
Duality of proofs and models
Finding a proof about one system can often be cast as finding a solution of another.
Example: duality in linear programming.
[Diagram labels: primal constraints, dual constraints, solution, refutation, solution]
The solution of the dual system is a proof that the primal system is infeasible.
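The slide's formulas are not preserved in this transcript. As an illustration only, one standard way to state this duality is the following variant of Farkas' lemma; the symbols A, b, x, y are assumptions of this sketch, not taken from the slide.

```latex
\[
\text{Primal: } A x \le b
\qquad\qquad
\text{Dual: } y \ge 0,\quad y^{\mathsf T} A = 0,\quad y^{\mathsf T} b < 0
\]
% Exactly one of the two systems has a solution.
% If both had solutions x and y, we would get the contradiction
\[
0 \;=\; (y^{\mathsf T} A)\,x \;=\; y^{\mathsf T}(A x) \;\le\; y^{\mathsf T} b \;<\; 0 .
\]
```

Read this way, a dual solution y is a certificate: it names a non-negative combination of the primal constraints that sums to the absurd inequality 0 < 0.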
Program proving as SMT
In program proving, we decorate a program with auxiliary assertions, such as:
– Loop invariants
– Procedure summaries
– Environment conditions
Analysis of the program yields logical verification conditions (VC's) that we can discharge with a theorem prover. Leaving the auxiliary assertions as unknown symbolic constants, the problem of program proof becomes an SMT problem:
– Find values of the unknown relations that make the VC's valid.
We will examine the consequences of viewing program analysis as a satisfiability problem.
Example
[Program listing with a symbolic assertion at the loop head; not preserved in this transcript]
The verification conditions are:
– invariant holds on entry
– loop preserves invariant
– assertion holds on loop exit
Duality: a proof corresponds to a model of the VC's.
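The slide's program is not preserved, so the sketch below uses a hypothetical loop to show what the three VC's look like and what "a model of the VC's" means. The toy program, the function name `vcs_hold`, and the bounded range are all assumptions. The check enumerates a finite range of integers, so it is a sanity test of a candidate invariant, not a proof.

```python
from itertools import product

# Hypothetical toy program (not the one on the slide):
#   assume n >= 0; x = 0
#   while x < n: x = x + 1
#   assert x == n
# The unknown relation is the loop invariant inv(x, n).

def implies(a, b):
    return (not a) or b

def vcs_hold(inv, bound=8):
    """Check the three VC's for a candidate invariant on all values in -bound..bound."""
    values = range(-bound, bound + 1)
    for x, n in product(values, repeat=2):
        # invariant holds on entry
        if not implies(n >= 0 and x == 0, inv(x, n)):
            return False
        # loop preserves invariant
        if not implies(inv(x, n) and x < n, inv(x + 1, n)):
            return False
        # assertion holds on loop exit
        if not implies(inv(x, n) and not (x < n), x == n):
            return False
    return True

print(vcs_hold(lambda x, n: x <= n))  # candidate that models all three VC's on this range
print(vcs_hold(lambda x, n: True))    # too weak: the exit condition fails
```

A candidate that passes plays the role of the model; a solver would search for such a relation instead of being handed one.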
Is it really this simple?
Advantages of this view
[Diagram labels: C, VC gen, SMT, Solver A, Solver B]
1) Separation of concerns: interpreting program semantics, proof search
2) Simplified tools: use logic, not IL
3) Allows interoperation: common language and model
4) Amortizes effort: can be highly optimized
5) Established standards: SMTLIB
VC generators and solvers
We will consider first the VC generation problem, then the VC solving problem.
Andrey Rybalchenko's observation:
– Many common proof strategies produce VC's in the form of constrained Horn clauses.
– Many familiar verification and analysis algorithms can be cast in terms of solving constrained Horn clauses.
This allows us to generalize these algorithms over a range of verification problems.
Constrained Horn Clause
A constrained Horn clause: [formula not preserved]
Example: [formula not preserved]
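For orientation, the usual general shape of a constrained Horn clause can be written as follows. The notation is an assumption of this sketch and the instance is illustrative, not the slide's own example. Here φ is a constraint in the background theory and P, P_1, ..., P_k are the unknown relations. In a query clause the head is a constraint (or false) instead of an unknown relation.

```latex
\[
\forall \vec{x}.\;\; \varphi(\vec{x}) \,\wedge\, P_1(\vec{x}_1) \wedge \cdots \wedge P_k(\vec{x}_k) \;\rightarrow\; P(\vec{x}_0)
\]
% Illustrative instance (hypothetical):
\[
\forall x, y.\;\; x \ge 0 \,\wedge\, y = x + 1 \,\wedge\, P(x) \;\rightarrow\; P(y)
\]
```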
Procedural programs
Consider a simple procedure (without parameters): [listing with precondition and postcondition not preserved]
Procedural abstraction, Boogie style: [listing not preserved]
VC's are constrained Horn clauses: [formulas not preserved]
Solving for procedure summaries
solve...
This is an over-approximate procedure summary.
The non-linear case
Suppose we have a symbolic invariant before a procedure: [listing not preserved]
The VC's have degree 2 (two unknown relations occur in the body of one clause).
Nonlinear VC's make solving more involved:
– counterexamples are trees, not paths
Modular concurrent proofs
Consider two parallel processes: [listing not preserved]
Modular VC's
[Formulas not preserved; their labels are: initiation, consecution, non-interference, environment abstraction]
These VC's are:
– in CHC form (the constraints are the transition relations)
– non-linear (the non-interference rule)
Inference of dependent types
Using an abstraction called Liquid Types, inference of refinement types can be reduced to solving constrained Horn clauses.
CHC solvers
Many familiar analysis and model checking approaches can be adapted to CHC solving, and thus can potentially solve all these inference problems:
– Predicate abstraction
– Lazy abstraction with interpolants
– Incremental inductive invariant generation
– Other lazy techniques, such as Yogi, Lazy annotation, etc.
– Bounded model checking
We will now consider some approaches to solving CHC's based on different proof search strategies.
Predicate abstraction
PA example, cont.
Our VC's are: [formulas not preserved]
– base case of P
– recursive case of P
– Q is P twice
– property to prove (query)
We want to synthesize a solution for these VC's using these predicates: [predicates not preserved]
Strategy: start with false and use counterexamples to the VC's to weaken the relational interpretation.
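The strategy above can be sketched as a small fixed-point loop. Since the slide's clauses and predicates are not preserved, the four clauses, the two predicates, and the bounded domain below are hypothetical stand-ins that match only the slide's labels (base case of P, recursive case of P, Q is P twice, query). Counterexamples are found by enumeration over a small integer range rather than by an SMT solver, so this illustrates the idea without being a sound prover.

```python
from itertools import product

DOMAIN = range(-3, 4)  # bounded stand-in for the integers (assumption)
PREDICATES = [lambda a, b: a == b, lambda a, b: a < b]  # hypothetical predicate set

def alpha(a, b):
    """Abstract a concrete pair to its vector of predicate truth values."""
    return tuple(p(a, b) for p in PREDICATES)

def holds(interp, rel, a, b):
    """rel(a, b) holds iff the abstract state of (a, b) has been added to rel."""
    return alpha(a, b) in interp[rel]

def derived_heads(interp):
    """Yield (relation, a, b) for every clause body satisfied under interp."""
    for x, y in product(DOMAIN, repeat=2):
        if x == y:
            yield "P", x, y          # base case of P:      x = y -> P(x, y)
        if holds(interp, "P", x, y):
            yield "P", x, y + 1      # recursive case of P: P(x, y) -> P(x, y + 1)
    for x, y, z in product(DOMAIN, repeat=3):
        if holds(interp, "P", x, y) and holds(interp, "P", y, z):
            yield "Q", x, z          # Q is P twice:        P(x, y) and P(y, z) -> Q(x, z)

def solve():
    interp = {"P": set(), "Q": set()}  # start with false for every relation
    changed = True
    while changed:
        changed = False
        for rel, a, b in derived_heads(interp):
            if alpha(a, b) not in interp[rel]:   # counterexample to a VC
                interp[rel].add(alpha(a, b))     # weaken the interpretation
                changed = True
    return interp

def query_holds(interp):
    """Property to prove (hypothetical): Q(x, y) implies x <= y."""
    return all(x <= y for x, y in product(DOMAIN, repeat=2)
               if holds(interp, "Q", x, y))

solution = solve()
print(solution, query_holds(solution))
```

Each relation is interpreted as a set of predicate valuations, so the loop can only add valuations. If the query still fails at the fixed point, further weakening cannot help, and the remaining options are a real counterexample or a richer predicate set.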
PA execution
All the VC's are now solved, so our property is proved. This is the strongest solution expressible using our predicates.
But consider the query: [formula not preserved; diagram labels: failed VC, relational interpretation]
We can't repair this by weakening a relation. If it's false, we need more predicates!
Refinement using interpolants
Derivation tree
– Start with the negation of the query (we want to refute it)
– Unify each P-fact with a P-rule
The derivation tree characterizes a set of ground derivations.
Solving the derivation tree
By solving the constraints in the derivation tree, we derive a ground fact that contradicts the query. [Diagram annotation: not true!]
Note the constraint tree is just a BMC formula. BMC = solving for a proof of a ground fact!
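The idea of a ground derivation that contradicts the query can be sketched without a solver by deriving ground facts bottom-up to a bounded depth and recording how each fact was obtained. The clauses, the domain, and the names `step`, `tree`, and `bounded_refutation` are assumptions of this sketch. It replaces the symbolic constraint solving described on the slide with explicit enumeration.

```python
from itertools import product

DOMAIN = range(0, 3)  # small ground domain (assumption)

def step(facts):
    """Apply every clause once to the known ground facts.
    Returns a map from derived fact to the premises used to derive it."""
    new = {}
    for x in DOMAIN:
        new[("P", x, x)] = ()                              # base case of P
    for rel, x, y in list(facts):
        if rel == "P":
            new[("P", x, y + 1)] = (("P", x, y),)          # recursive case of P
    for (r1, x, y), (r2, y2, z) in product(list(facts), repeat=2):
        if r1 == r2 == "P" and y == y2:
            new[("Q", x, z)] = (("P", x, y), ("P", y, z))  # Q is P twice
    return new

def tree(fact, facts):
    """Rebuild the derivation tree of a ground fact from the recorded premises."""
    return (fact, [tree(p, facts) for p in facts[fact]])

def bounded_refutation(query, depth):
    """Return the derivation tree of a Q-fact violating the query, or None."""
    facts = {}
    for _ in range(depth):
        for fact, premises in step(facts).items():
            facts.setdefault(fact, premises)  # keep the first derivation found
        for fact in facts:
            if fact[0] == "Q" and not query(fact[1], fact[2]):
                return tree(fact, facts)
    return None

# Hypothetical false query: Q(x, z) implies z <= x + 1.
print(bounded_refutation(lambda x, z: z <= x + 1, depth=3))
```

The returned value is a tree because the Q clause has two P premises. With only linear clauses it would degenerate into a path.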
Interpolating the derivation tree
If the constraints are UNSAT, we can compute an interpolant. Interpolant formulas are:
– a bottom-up refutation
– only over head variables
– an upper bound on derivable facts
[Diagram annotation: predicates from interpolants]
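As a reminder of what is being computed, here is the standard binary (Craig) interpolant condition, followed by the tree-shaped version that matches the three properties listed above. The notation (A, B, I, φ_n, I_n) is an assumption of this sketch and does not come from the slide.

```latex
% Binary case: A \wedge B is unsatisfiable, and I is an interpolant when
\[
A \Rightarrow I, \qquad I \wedge B \Rightarrow \mathit{false}, \qquad
\mathrm{vars}(I) \subseteq \mathrm{vars}(A) \cap \mathrm{vars}(B).
\]
% Tree case: each node n of the derivation tree carries a constraint \varphi_n
% and receives a formula I_n over the variables of its head, such that
\[
\varphi_n \wedge \bigwedge_{c \,\in\, \mathrm{children}(n)} I_c \;\Rightarrow\; I_n,
\qquad I_{\mathit{root}} = \mathit{false}.
\]
```

The implication at each node is what makes I_n an upper bound on the facts derivable in the subtree below n, and the false at the root is what makes the whole assignment a bottom-up refutation.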
Predicate abstraction as unwinding
We can think of predicate abstraction as unwinding.
Each time inductiveness fails, we add a new instance of a clause.
Lazy predicate refinement
When the query fails, build a derivation tree for the unwinding, and compute interpolants.
[Diagram annotations: predicates from interpolants; eager propagation; unwinding solved!; solution inductive!]
What have we done?
Given a purely logical account of predicate abstraction with CEGAR.
Generalized the technique to:
– Interprocedural analysis
– Modular proofs of concurrent programs
– Inference of refinement types
– ...
A single implementation solves all these problems. All we need is a VC generator, and these already exist.
This approach is (more or less) implemented in QARMC.
Lazy abstraction with interpolants
In the IMPACT algorithm, we don't compute consequences eagerly using predicate abstraction. Instead, we simply decorate the unwinding with the interpolants from the failed derivation of a counterexample.
We can generalize IMPACT from the linear case to the non-linear:
– In IMPACT, counterexample derivations are paths.
– In Duality, they are trees.
Duality algorithm
We unwind the CHC's without any eager deduction.
Each time inductiveness fails, we add a new instance of a clause.
Fixing the proof
When the query fails, build a derivation tree for the unwinding, compute interpolants, then update the solution with the interpolants.
[Diagram annotations: unwinding solved!; solution inductive!]
Property-driven reachability
We can generalize PDR to the non-linear case [BjornerHoder2012].
In PDR, when we fail to prove a conjecture locally, we form proof subgoals and propagate them downward.
[Diagram annotations: unwinding solved!; solution inductive!]
The story so far
We've seen that program analysis can be viewed as solving the VC's.
Existing algorithms can be transferred to this context:
– (Lazy) predicate abstraction
– Lazy abstraction with interpolants
– Property-driven reachability analysis
In the process we:
– Generalize these algorithms to the nonlinear case, so they can compute procedure summaries, modular proofs, refinement types, etc.
– Abstract away from program languages and representations
– Allow re-use of VC generation tools
We also have lots of flexibility in generating VC's:
– Different granularity: blocks, loops, procedures, etc.
– Different proof rules give different proof decompositions
– By expressing the auxiliary relations in the right form, we can guide the proof
Performance
The key remaining question is how much performance we may sacrifice to gain this flexibility.
– Are there important optimizations we will miss?
– In particular, what is lost if we don't explicitly model control flow?
We'll look at two cases of comparison between generic logical tools and highly refined program-specific tools to try to answer this question.
Verifying Boolean programs
We compare two tools for inter-procedural analysis of Boolean programs [Bjorner and Hoder, 2012]:
– Bebop (a BDD-based tool used in SLAM)
– CHC solver using PDR
Full device driver verification
We compare Duality with Yogi, a software model checker extensively tuned for this application domain.
– Benchmarks: randomly selected SDV examples
– Procedure-level VC's generated by Boogie
– Solved using the Duality algorithm with interpolating Z3
Adding localization reduction
Hypothesis: large overhead due to encoding of the heap using many global maps (one per structure field).
Test: localize using bounded model checking (a standard technique).
This shows what could potentially be achieved by integrating localization incrementally, or perhaps by a different heap encoding.
Open questions
Conclusion
We've seen that program verification can be viewed as solving the VC's to infer the necessary auxiliary constructs such as loop invariants, procedure summaries, non-interference conditions and so on.
Many existing verification techniques can be applied to this problem:
– Generalizing to the non-linear case
– Allowing application to many proof systems and languages
This allows a separation of concerns between programming language interpretation and verification algorithms:
– Re-use existing VC generators (Boogie, VCC, etc.)
– Reduce the barrier to entry in the field
It also allows inter-operation of tools, since they speak a common language.
Database of program verification problems in SMTLIB format at: https://svn.sosy-lab.org/software/sv-benchmarks/trunk/clauses/