Proofs from Tests
Nels E. Beckman, Aditya V. Nori, Sriram K. Rajamani, Robert J. Simmons
Carnegie Mellon University, Microsoft Research India, Carnegie Mellon University

The Problem
Given
– a sequential program P with inputs I (say, written in C)
– an assertion “assert(e)”
Questions
– Bug finding: Does there exist an execution of the program P for some input I such that the assertion is violated?
– Verification: Does the assertion hold for all possible inputs?

Possible solution: Testing
The “old-fashioned” way: generate test cases and see if we can find an input that violates the assertion.
Possible approaches:
– Random test case generation
– Symbolic execution
– “Concolic” execution (more recent, e.g. DART/CUTE)
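The first approach in the list, random test case generation, can be sketched as follows. This is a minimal illustration, not the tooling used in the talk: `program` is a hypothetical stand-in for P that returns whether the assertion held, and the input range and seed are arbitrary assumptions.

```python
import random

def program(x: int, y: int) -> bool:
    """Hypothetical stand-in for P: returns True if assert(e) held on this run."""
    z = x * 2
    if z == y + 1000:        # narrow branch that random inputs rarely hit
        return z != 4242     # the assertion e; it fails only for x=2121, y=3242
    return True

def random_testing(trials: int, seed: int = 0):
    """Black-box search: return a violating input if one is found, else None."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.randint(-10**6, 10**6)
        y = rng.randint(-10**6, 10**6)
        if not program(x, y):
            return (x, y)
    return None
```

A `None` result only says that no violating input was sampled; it does not show that none exists.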

What’s wrong with testing?
If we view testing as a “black-box” activity, Dijkstra is right! After executing many tests, we still don’t know if there is another test that can violate the assertion.

Our hypothesis
If we view testing as a “white-box” activity, and “observe” what happens inside the program (along with symbolic execution), we can do several interesting things:
– We can generate test cases in a directed manner to find the bug
– We can prove that the assertion holds for all inputs!

Tests and Proofs
[Figures: a sequence of region graphs. × marks states reached by tests (one test uses a=true, b=false, limit=); in the last graph the nodes appear as primed copies such as 3’ and 3’’.]

DASH: Proofs from Tests
– Algorithm uses only test case generation operations
– Maintains two data structures:
  – A forest of reachable concrete states (tests): under-approximates the executions of the program
  – A region graph (an abstraction): over-approximates all executions of the program
– Our goal: bug finding and proving
  – If a test reaches an error, we have found a bug
  – If we refine the abstraction so that there is *no* path from the initial region to the error region, we have a proof
– Handles the richness of C
  – New operator WP_α uses only aliases α that are present along concrete tests that are executed
  – Algorithm uses recursive invocations to handle inter-procedural analysis

Empirical Evaluation
Current Status
– Yogi works on 904 (driver, property) pairs!
– 31 properties on which Yogi terminates and SLAM “times/spaces out”

Key Idea 1
Frontier: boundary between tested and untested regions
[Figure: region graph with × marking tested states and the frontier marked where they stop.]
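The frontier can be located mechanically once an abstract error path and the set of tested regions are known. The sketch below is an illustration under two assumptions: regions are identified by plain ids, and the path is taken from the initial region to the error region. The function name `find_frontier` is made up for this example.

```python
def find_frontier(error_path, tested_regions):
    """Return (k-1, k): error_path[k-1] holds a tested state, error_path[k] does not.

    error_path: list of region ids from the initial region to the error region.
    tested_regions: set of region ids that contain at least one concrete state.
    Returns None when every region on the path is tested, i.e. a test
    already reaches the error region.
    """
    if not error_path or error_path[0] not in tested_regions:
        raise ValueError("the initial region must contain a tested state")
    for k in range(1, len(error_path)):
        if error_path[k] not in tested_regions:
            return (k - 1, k)   # boundary between tested and untested regions
    return None
```

For a path such as (0,1,2,3,4,7,8,9) with tests reaching every region up to 8, the function returns the indices of the edge from region 8 to region 9.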

Key Idea 2
WP_α: new refinement operation that does not depend on whole-program alias information.

DASH Algorithm
– Main workhorse: test case generation
– Use counterexamples from current abstraction to “extend frontier” and generate tests
– When test case generation fails, use this information to “refine” abstraction at the frontier
– Use only aliases that happen on the tests!
Flowchart (input: program P, property ψ):
– Construct initial abstraction and construct random tests
– Test succeeded? If yes: Bug!
– If no: Abstraction succeeded? If yes: Proof!
– If no: τ = error path in abstraction, f = frontier of error path
– Can extend test beyond frontier? If yes, continue with the new test; if no, refine abstraction. Then repeat from the test check.
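The control flow of the flowchart can be written as a short loop. This is only a skeleton under stated assumptions: the four callables (`find_error_path`, `test_reaches_error`, `try_extend`, `refine`) are hypothetical stand-ins for the real components, and the iteration bound is added so that the sketch always returns.

```python
def dash_loop(find_error_path, test_reaches_error, try_extend, refine, max_iters=100):
    """Control skeleton of the test/refine loop; all four callables are stand-ins.

    find_error_path(): abstract error path, or None if the abstraction has none.
    test_reaches_error(): True if some test has reached the error region.
    try_extend(tau): try to generate a test crossing the frontier of tau;
                     returns True on success.
    refine(tau): refine the abstraction at the frontier of tau.
    """
    for _ in range(max_iters):
        if test_reaches_error():
            return "bug"
        tau = find_error_path()
        if tau is None:
            return "proof"
        if not try_extend(tau):   # test generation failed at the frontier
            refine(tau)
    return "unknown"              # bound reached; the real loop may not terminate
```

Each iteration either adds a test or refines the abstraction, which mirrors the two outcomes at the frontier described above.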

Example


Example
τ = (0,1,2,3,4,7,8,9), y = 1
Symbolic execution + theorem proving decide whether a test can be extended beyond the frontier.
[Figure: region graph with × marking tested states and the frontier marked on τ.]

Symbolic execution + Theorem Proving
τ = (0,1,2,3,4,7,8,9)
Symbolic memory:
– y ↦ y0
– lock.state ↦ L
– x ↦ y0
Constraints:
– (x = y) = (y0 = y0) = T
– (lock.state != L) = (L != L) = F
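The symbolic memory and constraints above can be reproduced with a very small symbolic executor. This is a sketch under assumptions: the statements in `PATH` are a hypothetical straight-line path chosen to yield the memory shown on the slide (the program text itself is not in the transcript), and constraints are decided by syntactic comparison of terms rather than by a theorem prover.

```python
CONSTANTS = {"L", "U"}   # assumed lock states; only L appears on the slide

# Hypothetical path: two assignments followed by the two branch conditions.
PATH = [
    ("assign", "lock.state", "L"),
    ("assign", "x", "y"),
    ("assume", "x", "==", "y"),
    ("assume", "lock.state", "!=", "L"),
]

def run_path(path, inputs):
    """Symbolically execute a straight-line path.

    Returns (memory, constraints). Each constraint is (text, truth), where
    truth is True/False when the terms are identical or distinct constants,
    and None when syntactic comparison cannot decide.
    """
    memory = {v: v + "0" for v in inputs}          # e.g. y -> y0
    constraints = []
    for stmt in path:
        if stmt[0] == "assign":
            _, target, source = stmt
            memory[target] = memory.get(source, source)   # constants map to themselves
        else:
            _, lhs, op, rhs = stmt
            l, r = memory.get(lhs, lhs), memory.get(rhs, rhs)
            if l == r:
                equal = True
            elif l in CONSTANTS and r in CONSTANTS:
                equal = False
            else:
                equal = None
            truth = None if equal is None else (equal if op == "==" else not equal)
            constraints.append((f"({l} {op} {r})", truth))
    return memory, constraints

memory, constraints = run_path(PATH, inputs=["y"])
path_feasible = all(truth is True for _, truth in constraints)
```

Here the second constraint evaluates to false, so no input drives a test along τ past the frontier, and the algorithm turns to refinement.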


Template-based refinement
ρ = (lock.state != L)
Node 8 is split into 8:ρ and 8:¬ρ.
[Figure: region graph before and after the split of node 8, with × marking tested states.]

Example
τ=(0,1,2,3,4,7,,9)
[Figure: refined region graph with node 8 split into 8:ρ and 8:¬ρ; the frontier is marked on the new error path τ.]

Proof!
[Figure: final region graph. Nodes 4 to 8 are each split by a predicate and its negation (4∧s / 4∧¬s, 5∧s / 5∧¬s, 6∧r / 6∧¬r, 7∧q / 7∧¬q, 8∧p / 8∧¬p); the abstraction check succeeds.]

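Template-based refinement
The frontier is an edge from S_{k-1} to S_k labeled with an operation op. Examples of op: IF(i>=j), ASSGN(i=i+j), CALL(foo(i,j)).
[Figure: regions S_{k-2}, S_{k-1}, S_k with test T; the tested states (×, “witness”) stop at S_{k-1}.]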

Template-based refinement
S_{k-1} is split into S_{k-1} ∧ ¬ρ and S_{k-1} ∧ ρ, where ρ is a suitable predicate.
No theorem prover calls!
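The split itself is a purely structural operation on the region graph, which is why it needs no theorem prover call. The sketch below assumes, as the figures suggest, that only the ρ half keeps its edge to S_k; the graph encoding (a set of edge pairs) and the name `split_region` are choices made for this illustration.

```python
def split_region(edges, node, target):
    """Split `node` into (node, "rho") and (node, "not rho") in an abstract graph.

    edges: set of (src, dst) pairs. Every edge touching `node` is copied to
    both halves, except that the (node, "not rho") half gets no edge to `target`.
    """
    pos, neg = (node, "rho"), (node, "not rho")
    new_edges = set()
    for src, dst in edges:
        srcs = [pos, neg] if src == node else [src]
        dsts = [pos, neg] if dst == node else [dst]
        for s in srcs:
            for d in dsts:
                if s == neg and d == target:
                    continue      # the refinement removes this edge
                new_edges.add((s, d))
    return new_edges

# Hypothetical fragment of a region graph around node 8.
refined = split_region({(7, 8), (8, 9), (8, 10)}, node=8, target=9)
```

After the call, region 9 is reachable only through the ρ half of node 8.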

Candidates for suitable predicates
A. Strongest postcondition (SP): increased number of iterations, leading to non-termination in many cases
B. Weakest precondition (WP): explodes in the presence of aliasing

What’s wrong with WP?
op = ASSGN(i=j), S_k = (*a<10)
ρ = WP(*a<10, “i = j”) = (a≠&i ∧ *a<10) ∨ (a=&i ∧ j<10)
S_{k-1} is split into S_{k-1} ∧ ¬ρ and S_{k-1} ∧ ρ.

What’s wrong with WP?
op = ASSGN(i=j), S_k = (*a+*b<10)
ρ = (a≠&i ∧ b≠&i ∧ *a+*b<10) ∨ (a=&i ∧ b≠&i ∧ j+*b<10) ∨ (a≠&i ∧ b=&i ∧ *a+j<10) ∨ (a=&i ∧ b=&i ∧ j+j<10)
S_{k-1} is split into S_{k-1} ∧ ¬ρ and S_{k-1} ∧ ρ.

What’s wrong with WP?
op = ASSGN(i=j), S_k = (*a+*b<10)
S_{k-1} ∧ ρ: (a≠&i ∧ b≠&i ∧ *a+*b<10) ∨ (a=&i ∧ b≠&i ∧ j+*b<10) ∨ (a≠&i ∧ b=&i ∧ *a+j<10) ∨ (a=&i ∧ b=&i ∧ j+j<10)
S_{k-1} ∧ ¬ρ: ¬((a≠&i ∧ b≠&i ∧ *a+*b<10) ∨ (a=&i ∧ b≠&i ∧ j+*b<10) ∨ (a≠&i ∧ b=&i ∧ *a+j<10) ∨ (a=&i ∧ b=&i ∧ j+j<10))
In practice, a global alias analysis is required to prune the formula generated by WP.
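The growth of the WP formula can be seen by enumerating its alias cases. The sketch below generates the disjuncts as strings for a target of the form *p1 + ... + *pn < 10 and the assignment i = j; the function name and the string encoding are assumptions made for illustration.

```python
from itertools import product

def wp_assign_disjuncts(pointers, target="i", source="j"):
    """List the alias cases of WP(*p1 + ... + *pn < 10, "target = source").

    Each pointer either aliases `target` (its dereference becomes `source`)
    or does not (its dereference is unchanged), giving 2**n disjuncts.
    """
    disjuncts = []
    for case in product([False, True], repeat=len(pointers)):
        alias = [f"{p}{'=' if hit else '≠'}&{target}" for p, hit in zip(pointers, case)]
        terms = [source if hit else f"*{p}" for p, hit in zip(pointers, case)]
        disjuncts.append(" ∧ ".join(alias) + " ∧ " + "+".join(terms) + "<10")
    return disjuncts
```

With one pointer the formula has two disjuncts, with two pointers four, and in general 2^n for n dereferenced pointers, unless an alias analysis rules some cases out.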

Deriving a suitable predicate
op = ASSGN(i=j), S_k = (*a+*b<10)
The region before op is partitioned by alias case and by the outcome after op:
– a≠&i ∧ b≠&i ∧ *a+*b≥10 and a≠&i ∧ b≠&i ∧ *a+*b<10
– a=&i ∧ b≠&i ∧ j+*b≥10 and a=&i ∧ b≠&i ∧ j+*b<10
– a≠&i ∧ b=&i ∧ *a+j≥10 and a≠&i ∧ b=&i ∧ *a+j<10
– a=&i ∧ b=&i ∧ j+j≥10 and a=&i ∧ b=&i ∧ j+j<10

Refining with suitable predicate WP_α
op = ASSGN(i=j), S_k = (*a+*b<10)
The region before op is split into two parts:
– a=&i ∧ b≠&i ∧ j+*b≥10
– a≠&i ∨ b=&i ∨ j+*b<10
No global alias analysis required!
WP_α stronger than WP and weaker than SP!
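One way to see why this split is safe is to check, over a small concrete state space, that no state in the region a=&i ∧ b≠&i ∧ j+*b≥10 can reach *a+*b<10 by executing i = j. The sketch below does this by brute force; the three memory cells, the value range, and all function names are assumptions made for the illustration.

```python
from itertools import product

CELLS = ["i", "j", "c"]          # hypothetical memory cells; a and b point into them

def states(values=range(0, 8)):
    """Enumerate small concrete states: pointer targets plus cell contents."""
    for a, b in product(CELLS, repeat=2):
        for vi, vj, vc in product(values, repeat=3):
            yield {"a": a, "b": b, "mem": {"i": vi, "j": vj, "c": vc}}

def step(s):
    """Execute ASSGN(i=j) on a copy of the state."""
    mem = dict(s["mem"])
    mem["i"] = mem["j"]
    return {"a": s["a"], "b": s["b"], "mem": mem}

def post(s):
    """Target region S_k: *a + *b < 10."""
    return s["mem"][s["a"]] + s["mem"][s["b"]] < 10

def in_first_part(s):
    """Region a=&i ∧ b≠&i ∧ j+*b≥10 from the slide."""
    return s["a"] == "i" and s["b"] != "i" and s["mem"]["j"] + s["mem"][s["b"]] >= 10

def edge_can_be_removed():
    """True if no state of the first part reaches S_k in one step."""
    return not any(post(step(s)) for s in states() if in_first_part(s))
```

The check only covers the enumerated states, so it is evidence for the bounded model rather than a proof; the general argument is that with a=&i and b≠&i the assignment makes *a equal to j and leaves *b unchanged.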

WP_α: Template-based refinement
Theorem: WP_α(S_k, op) is a suitable predicate for template-based refinement.
No theorem prover calls!
[Figure: S_{k-1} split into S_{k-1} ∧ ¬ρ and S_{k-1} ∧ ρ at the frontier edge op, with ρ the suitable predicate.]

Example

p = p1 p2 = malloc(); p2->lock = 0 p1 = malloc(); p1->lock = 0 Aliasing Example assume(p1->lock =1  p2->lock=1) p->lock = assume(!(p1->lock =1  p2->lock=1)) p = p2

Aliasing Example × × × × × × × frontier ρ = WP α = (p1->lock=1  p2->lock=1)

Aliasing Example : ¬ρ × × × × × × × 3: ρ frontier  = WP α = ¬((p≠p1  p≠p2)  ¬(p1->lock=1  p2->lock=1))

Aliasing Example
[Figure: region graph with node 3 split into 3:ρ and 3:¬ρ, and node 2 split by a second predicate and its negation.]

Aliasing Example - Proof
[Figure: region graph with node 1 also split, into 1:μ and 1:¬μ.]

Generalized Example

What about procedures?
Key idea: perform a recursive Dash query on the called procedure and use the result to either generate a test or compute WP_α.
[Figure: frontier edge from S_{k-1} to S_k labeled CALL(foo(i,j)).]

Interprocedural analysis
Frontier operation: CALL(foo(i,j))
Recursive query: Dash[assume(φ1), foo(i, j), assert(¬φ2)]
– pass: perform refinement
– fail: generate test
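The two outcomes of the recursive query can be sketched with an exhaustive search standing in for the recursive Dash call. Everything here is an assumption for illustration: `foo` is a hypothetical callee, `query` searches a small input domain instead of running Dash, and a "pass" from a bounded search is not a proof in the way a real Dash result would be.

```python
from itertools import product

def query(pre, proc, post_bad, domain):
    """Stand-in for Dash[assume(pre), proc, assert(not post_bad)].

    Returns ("fail", inputs) if some run starting in `pre` ends in `post_bad`,
    else ("pass", None).
    """
    for i, j in product(domain, repeat=2):
        if pre(i, j) and post_bad(proc(i, j)):
            return ("fail", (i, j))
    return ("pass", None)

def foo(i, j):
    """Hypothetical callee."""
    return i + j if i >= j else j - i

def handle_call_frontier(pre, post_bad, domain=range(-5, 6)):
    """Decide what to do when the frontier operation is CALL(foo(i,j))."""
    verdict, witness = query(pre, foo, post_bad, domain)
    if verdict == "fail":
        return ("generate test", witness)    # the witness extends the frontier
    return ("perform refinement", None)      # no run from pre reaches post_bad
```

For example, with `pre` as i < j and `post_bad` as "result ≤ 0", the callee always returns a positive value, so the caller refines; with `pre` as i ≥ j and `post_bad` as "result < 0", a witness exists and becomes a new test.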

Soundness and Complexity
Theorem. If Dash terminates on (P, φ), then one of the following is true:
– If Dash returns (“pass”, Σ_≃), then Σ_≃ is a proof that P cannot reach ¬φ
– If Dash returns (“fail”, t), then t certifies that P reaches ¬φ
Theorem. The complexity of Dash is precisely one theorem-prover call per iteration.
Proofs at the same complexity as testing!


Acknowledgments
Tom Ball, Nikolaj Bjørner, Leonardo de Moura, Patrice Godefroid, Akash Lal, Jim Larus, Rustan Leino, Kanika Nema, G. Ramalingam, Sai Tetali, Aditya Thakur

Rigorous Software Engineering Microsoft Research India