Final exam week Three things on finals week: –final exam –final project presentations –final project report.

Slides:



Advertisements
Similar presentations
1 Verification by Model Checking. 2 Part 1 : Motivation.
Advertisements

Advanced programming tools at Microsoft
Demand-driven inference of loop invariants in a theorem prover
Automated Theorem Proving Lecture 1. Program verification is undecidable! Given program P and specification S, does P satisfy S?
De necessariis pre condiciones consequentia sine machina P. Consobrinus, R. Consobrinus M. Aquilifer, F. Oratio.
Compilation 2011 Static Analysis Johnni Winther Michael I. Schwartzbach Aarhus University.
Reasoning About Code; Hoare Logic, continued
Background for “KISS: Keep It Simple and Sequential” cs264 Ras Bodik spring 2005.
Rigorous Software Development CSCI-GA Instructor: Thomas Wies Spring 2012 Lecture 13.
Introducing BLAST Software Verification John Gallagher CS4117.
A survey of techniques for precise program slicing Komondoor V. Raghavan Indian Institute of Science, Bangalore.
ISBN Chapter 3 Describing Syntax and Semantics.
Fall Semantics Juan Carlos Guzmán CS 3123 Programming Languages Concepts Southern Polytechnic State University.
Automated Soundness Proofs for Dataflow Analyses and Transformations via Local Rules Sorin Lerner* Todd Millstein** Erika Rice* Craig Chambers* * University.
The Software Model Checker BLAST by Dirk Beyer, Thomas A. Henzinger, Ranjit Jhala and Rupak Majumdar Presented by Yunho Kim Provable Software Lab, KAIST.
Common Sub-expression Elim Want to compute when an expression is available in a var Domain:
ESC Java. Static Analysis Spectrum Power Cost Type checking Data-flow analysis Model checking Program verification AutomatedManual ESC.
Proof-system search ( ` ) Interpretation search ( ² ) Main search strategy DPLL Backtracking Incremental SAT Natural deduction Sequents Resolution Main.
Automatically Proving the Correctness of Compiler Optimizations Sorin Lerner Todd Millstein Craig Chambers University of Washington.
Synergy: A New Algorithm for Property Checking
CS 536 Spring Global Optimizations Lecture 23.
Speeding Up Dataflow Analysis Using Flow- Insensitive Pointer Analysis Stephen Adams, Tom Ball, Manuvir Das Sorin Lerner, Mark Seigle Westley Weimer Microsoft.
Advanced Compilers CSE 231 Instructor: Sorin Lerner.
CS 267: Automated Verification Lectures 14: Predicate Abstraction, Counter- Example Guided Abstraction Refinement, Abstract Interpretation Instructor:
Automatically Validating Temporal Safety Properties of Interfaces Thomas Ball and Sriram K. Rajamani Software Productivity Tools, Microsoft Research Presented.
Houdini: An Annotation Assistant for ESC/Java Cormac Flanagan and K. Rustan M. Leino Compaq Systems Research Center.
Software Reliability Methods Sorin Lerner. Software reliability methods: issues What are the issues?
Advanced Compilers CSE 231 Instructor: Sorin Lerner.
Copyright © 2006 The McGraw-Hill Companies, Inc. Programming Languages 2nd edition Tucker and Noonan Chapter 18 Program Correctness To treat programming.
ESP: Program Verification Of Millions of Lines of Code Manuvir Das Researcher PPRC Reliability Team Microsoft Research.
From last time S1: l := new Cons p := l S2: t := new Cons *p := t p := t l p S1 l p tS2 l p S1 t S2 l t S1 p S2 l t S1 p S2 l t S1 p L2 l t S1 p S2 l t.
ESC Java. Static Analysis Spectrum Power Cost Type checking Data-flow analysis Model checking Program verification AutomatedManual ESC.
Prof. Fateman CS 164 Lecture 221 Global Optimization Lecture 22.
ESP [Das et al PLDI 2002] Interface usage rules in documentation –Order of operations, data access –Resource management –Incomplete, wordy, not checked.
Review: forward E { P } { P && E } TF { P && ! E } { P 1 } { P 2 } { P 1 || P 2 } x = E { P } { \exists … }
A simple approach Given call graph and CFGs of procedures, create a single CFG (control flow super-graph) by: –connecting call sites to entry nodes of.
Comparison Caller precisionCallee precisionCode bloat Inlining context-insensitive interproc Context sensitive interproc Specialization.
Provably Correct Compilers (Part 2) Nazrul Alam and Krishnaprasad Vikram April 21, 2005.
Describing Syntax and Semantics
Direction of analysis Although constraints are not directional, flow functions are All flow functions we have seen so far are in the forward direction.
Improving the Precision of Abstract Simulation using Demand-driven Analysis Olatunji Ruwase Suzanne Rivoire CS June 12, 2002.
Pointer analysis. Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis Andersen and.
Symbolic Path Simulation in Path-Sensitive Dataflow Analysis Hari Hampapuram Jason Yue Yang Manuvir Das Center for Software Excellence (CSE) Microsoft.
Cormac Flanagan University of California, Santa Cruz Hybrid Type Checking.
Prof. Bodik CS 164 Lecture 16, Fall Global Optimization Lecture 16.
Scalable Defect Detection Manuvir Das, Zhe Yang, Daniel Wang Center for Software Excellence Microsoft Corporation.
Software Engineering Prof. Dr. Bertrand Meyer March 2007 – June 2007 Chair of Software Engineering Static program checking and verification Slides: Based.
Proof Carrying Code Zhiwei Lin. Outline Proof-Carrying Code The Design and Implementation of a Certifying Compiler A Proof – Carrying Code Architecture.
Chapter 25 Formal Methods Formal methods Specify program using math Develop program using math Prove program matches specification using.
Inferring Specifications to Detect Errors in Code Mana Taghdiri Presented by: Robert Seater MIT Computer Science & AI Lab.
On Reducing the Global State Graph for Verification of Distributed Computations Vijay K. Garg, Arindam Chakraborty Parallel and Distributed Systems Laboratory.
Type Systems CS Definitions Program analysis Discovering facts about programs. Dynamic analysis Program analysis by using program executions.
ISBN Chapter 3 Describing Semantics -Attribute Grammars -Dynamic Semantics.
CS 363 Comparative Programming Languages Semantics.
Reasoning about programs March CSE 403, Winter 2011, Brun.
Convergence of Model Checking & Program Analysis Philippe Giabbanelli CMPT 894 – Spring 2008.
Semantics In Text: Chapter 3.
1 Predicate Abstraction and Refinement for Verifying Hardware Designs Himanshu Jain Joint work with Daniel Kroening, Natasha Sharygina, Edmund M. Clarke.
An Undergraduate Course on Software Bug Detection Tools and Techniques Eric Larson Seattle University March 3, 2006.
Static Techniques for V&V. Hierarchy of V&V techniques Static Analysis V&V Dynamic Techniques Model Checking Simulation Symbolic Execution Testing Informal.
CIS 540 Principles of Embedded Computation Spring Instructor: Rajeev Alur
Chapter 4 Static Analysis. Summary (1) Building a model of the program:  Lexical analysis  Parsing  Abstract syntax  Semantic Analysis  Tracking.
Proving Optimizations Correct using Parameterized Program Equivalence University of California, San Diego Sudipta Kundu Zachary Tatlock Sorin Lerner.
CS223: Software Engineering Lecture 26: Software Testing.
Presentation Title 2/4/2018 Software Verification using Predicate Abstraction and Iterative Refinement: Part Bug Catching: Automated Program Verification.
ESP: Program Verification Of Millions of Lines of Code
Advanced Compiler Design
Predicate Abstraction
BLAST: A Software Verification Tool for C programs
Presentation transcript:

Final exam week Three things on finals week: –final exam –final project presentations –final project report

Program analysis and Software Reliability

Software reliability: issues What are the issues?

Software reliability: issues What is software reliability? How to measure it? –Bug counts ? Will we ever have bug-free software? –How many 9’s ? –Service Level Agreements ? What is a bug? –Adherence to specifications –But what is a specification… –User unhappy: is that a bug? –Different levels of severity

Software reliability: issues Cost of the methods for achieving reliability –Independently develop 5 versions of the software, run them all in parallel ) less likely that they fail at the same time in the same way. But… cost… is… high –For tools, cost of development of the tools Burden on the programmer –fully automated vs. semi-automated methods –allow progressive adoption Scalability vs. precision –start with scalability and get precision later? –Or the other way around?

Software reliability: issues Level of guarantee provided by the method –Hard guarantees, statistical guarantees, no formal guarantee –What if tool is broken: trusted computing base When is the method used? –compile-time, link-time, load-time, run-time What does the tool see? –source code, assembly, the whole program or part of the program

One way of dividing the spectrum Compiler if (…) { x := …; } else { y := …; } …;

One way of dividing the spectrum if (…) { x := …; } else { y := …; } …; Static techniques Testing techniques Run-time techniques Compiler if (…) { x := …; } else { y := …; } …;

One way of dividing the spectrum if (…) { x := …; } else { y := …; } …; Static techniques Testing techniques Run-time techniques Static techniques

Static Techniques Spec: says what code should and should not do Complete spec: specifies all behaviors (hard to formalize) Incomplete spec: only defines some behaviors –e.g. “no null derefs”, “requests received are eventually processed” Many formalisms exist for specs (Pre/Post conditions, FSMs, Temporal Logic, Abstract State Machines etc.) if (…) { x := …; } else { y := …; } …; Spec   «  ¬   $  \ r t  l Code satisfies spec?

Static Techniques Language Design –Clean language design –Type Systems –Domain-specific languages –…–… if (…) { x := …; } else { y := …; } …; Spec   «  ¬   $  \ r t  l Code satisfies spec? Program Analysis –Dataflow analysis –WP/SP –Model checking –Automated Theorem Proving –… Interaction between the two CleanL TSys DSL DFA WP/SP MC ATP

ESC/Java [Leino et al PLDI 2002] object Foo { PRE (FORMULA) method bar(...) {... } POST (FORMULA) } Compute Weakest Precondition WP(POST, bar) = weakest condition Q such that Q at entry to bar establishes POST at exist ) Programmer annotates code with pre- and post- conditions, tool verifies that these hold Automated Theorem Prover CleanL TSys DSL DFA WP/SP MC ATP

How does Program Analysis fit in? First, WP computation is a kind of program analysis –in fact, can be seen as a dataflow analysis –predicates as sets of states –WP operator acts as flow function –WP-based concrete semantics Second, can use global points-to analysis to help WP computation –do better job with pointer stores *x :=... –or with field assignments o.f :=...

Parser Code Gen Compiler DSL Opt DSL Opt DSL Opt DSL Opt DSL Opt DSL Opt Checker Rhodium [Lerner et al POPL 2005] CleanL TSys DSL DFA WP/SP MC ATP

DSL Opt DSL Opt DSL Opt Checker DSL Opt Checker DSL Opt Checker DSL Opt Rhodium [Lerner et al POPL 2005] CleanL TSys DSL DFA WP/SP MC ATP

Rhodium [Lerner et al POPL 2005] VCGen Local VC Automatic Theorem Prover Rdm Opt Checker Lemma For any Rhodium opt: If Local VC is true Then opt is OK Proof   «  ¬   $  \ r t  l Opt-independent Opt- dependent CleanL TSys DSL DFA WP/SP MC ATP

How does Program Analysis fit in? First, Rhodium is a system for expressing and checking the correctness of DFAs Second, the Rhodium system is an example of program analysis of a domain-specific language

ESP [Das et al PLDI 2002] Interface usage rules in documentation –Order of operations, data access –Resource management –Incomplete, wordy, not checked Violated rules ) crashes –Failed runtime checks –Unreliable software CleanL TSys DSL DFA WP/SP MC ATP

ESP [Das et al PLDI 2002] C Program Safe Not Safe Rules ESP CleanL TSys DSL DFA WP/SP MC ATP

ESP [Das et al PLDI 2002] ESP is a program analysis that keeps track of object state at each program point –e.g.: is file handle open or closed? Challenge: scale to large programs –One of scalability issues: merge nodes –Always analyze both sides of merge node ) exponential (or non-terminating) program analyses ESP has a heuristic for handling merges that –avoids exponential blow-up and runs fast in practice –maintains enough precision to verify programs CleanL TSys DSL DFA WP/SP MC ATP

How does Program Analysis fit in? The core of ESP is a dataflow analysis that keeps track of the state of objects –we’ve seen examples of this before in class More on ESP later today

BLAST [Henzinger et al POPL 2000] Interface usage rules in documentation –Order of operations, data access –Resource management –Incomplete, wordy, not checked Violated rules ) crashes –Failed runtime checks –Unreliable software CleanL TSys DSL DFA WP/SP MC ATP

BLAST [Henzinger et al POPL 2000] C Program Safe Error Trace Rules BLAST CleanL TSys DSL DFA WP/SP MC ATP

BLAST [Henzinger et al POPL 2000] C Program Safe Error Trace Rules BLAST CleanL TSys DSL DFA WP/SP MC ATP

BLAST [Henzinger et al POPL 2000] Perform “Predicate Abstraction” C Program Rules start with a set of predicates Safe Refine set of predicates No errors found Analyze trace Trace feasible Error Trace Trace infeasible BLAST error trace found augmented set of predicates CleanL TSys DSL DFA WP/SP MC ATP

BLAST [Henzinger et al POPL 2000] Perform “Predicate Abstraction” C Program Rules start with a set of predicates Safe Refine set of predicates No errors found Analyze trace Trace feasible Error Trace Trace infeasible BLAST error trace found augmented set of predicates CleanL TSys DSL DFA WP/SP MC ATP

How does Program Analysis fit in? Predicate abstraction can be seen as a dataflow analysis that keeps track of certain predicates Flow functions are computed using a theorem prover More on predicate abstraction later today

Two examples in more detail Property simulation –algorithm behind ESP Predicate abstraction –one of the (many) algorithms used in BLAST and SLAM

Precision vs. scalability Precision Scalability

30 Prop Sim example: stdio usage in gcc void main () { if (dump) fil = fopen(dumpFile,”w”); if (p) x = 0; else x = 1; if (dump) fclose(fil); }

31 Prop Sim example: stdio usage in gcc void main () { if (dump) Open; if (p) x = 0; else x = 1; if (dump) Close; }

32 Prop Sim example: stdio usage in gcc $uninit Opened $error Open Print Open Close Print/Close * void main () { if (dump) Open; if (p) x = 0; else x = 1; if (dump) Close; }

33 Property Simulation Goal: compute set of states an object can be in –in this case, what states files are in The backdrop: context-sensitive RHS-style algorithm that propagates info about set of states objects can be in Key novelty in Prop Sim lies in path sensitivity technique

34 Dataflow property analysis Track only FSM state Ignore non-state-changing code At control flow join points: –Accumulate FSM states

35 Example entry dump p x = 0x = 1 Open Close exit dump T T T F F F

36 Example {$uninit,Opened} {$error,$uninit,Opened} {$uninit} {$uninit,Opened} entry dump p x = 0x = 1 Open Close exit dump T T T F F F

37 Path-sensitive property analysis Symbolically evaluate the program Track FSM state and execution state At branch points: –Execution state implies branch direction? Yes: process appropriate branch No: split state and process both branches

38 Example entry dump p x = 0x = 1 Open Close exit dump T T T F F F

39 Example [Opened|dump=T] [$uninit|dump=T] [Opened|dump=T,p=T] [Opened|dump=T,p=T,x=0] [$uninit|dump=T,p=T,x=0] [$uninit] entry dump p x = 0x = 1 Open Close exit dump T T T F F F

40 What happens in correct code? $uninit Opened $error Open Print Open Close Print/Close * void main () { if (dump) Open; if (p) x = 0; else x = 1; if (dump) Close; } void main () { if (dump) Open; if (p) x = 0; else x = 1; if (dump) Close; }

41 When is a branch relevant? Precise answer –When the value of the branch condition determines the property FSM state Heuristic answer –When the property FSM is driven to different states along the arms of the branch statement

42 Property simulation Same as path-sensitive analysis, except –At control flow join points: States agree on property state? –Yes: merge states –No: process states separately

43 Example entry dump p x = 0x = 1 Open Close exit dump T T T F F F

44 Example [Opened|dump=T] [$uninit] [$uninit|dump=T] entry dump p x = 0x = 1 Open Close exit dump T T T F F F [$uninit|dump=F] [Opened|dump=T,p=T,x=0] [Opened|dump=T,p=F,x=1] [Opened|dump=T] [$uninit|dump=F] [$uninit|dump=T] [$uninit|dump=F] [$uninit]

45 Reality bites $uninit Opened $error Open Print Open Close Print/Close * void main () { if (dump1) fil1 = fopen(dumpFile1,”w”); if (dump2) fil2 = fopen(dumpFile2,”w”); if (dump1) fclose(fil1); if (dump2) fclose(fil2); } TransitionSource code pattern Close fclose(e) Open e = fopen(_) void main () { if (dump1) fil1 = fopen(dumpFile1,”w”); if (dump2) fil2 = fopen(dumpFile2,”w”); if (dump1) fclose(fil1); if (dump2) fclose(fil2); } void main () { if (dump1) Open(fil1); if (dump2) Open(fil2); if (dump1) Close(fil1); if (dump2) Close(fil2); } TransitionSource code pattern Close fclose(e) Open e = fopen(_)

46 Property simulation, bit by bit void main () { if (dump1) Open; if (dump2) ID; if (dump1) Close; if (dump2) ID; } void main () { if (dump1) ID; if (dump2) Open; if (dump1) ID; if (dump2) Close; }

47 Property simulation, bit by bit One FSM at a time + Avoids exponential property state + Fewer branches are relevant + Lifetimes are often short + Embarassingly parallel − Cannot correlate FSMs

48 Reality bites, again void main () { if (dump1) fil1 = fopen(dumpFile1,”w”); if (dump2) fil2 = fopen(dumpFile2,”w”); fil3 = fil1; if (dump1) fclose( fil3 ); if (dump2) fclose( fil2 ); } –Run-time values have state –Syntactic names hold run-time values –Values come from creation patterns

49 Value flow analysis When does a pattern on name e cause a state transition for value v? –When v may flow to (be held by) e ESP uses an alias analysis to figure out if two one value flows to another –intuition: a flows to b if in the points to graph *a and *b are represented using the same node (ie: *a and *b alias) ESP uses Generalized One Level Flow (GOLF)

50 Recall: Steensgaard x x = y yxyxy Steensgaard

51 One-level Flow x x = y yxyxy xy Steensgaard One-level flow

52 One-level Flow One-level flow keeps around one level of directionality constraints, and merges the rest x x = y yxyxy xy Steensgaard One-level flow

53 One-level Flow One-level flow keeps around one level of directionality constraints, and merges the rest x x = y yxyxy xy Steensgaard One-level flow get the same for y = x don’t get the same for y = x

54 Generalized One-level flow Generalized version of one-level flow –Context-sensitive –Largely flow-insensitive –Highly scalable Need to adapt flow queries

55 Putting it all together Property simulation –Identify and track only relevant execution state Bit vector analysis for safety properties –Process one FSM at a time Identify and isolate individual FSMs –Syntactic patterns + scalable value flow

56 Case study: stdio usage in gcc cc1 from gcc version (Spec95) Does cc1 always print to opened files? cc1 is a complex program: –140K non-blank, non-comment lines of C –2149 functions, 66 files, 1086 globals –Call graph includes one 450 function SCC

57 Simplified skeleton of cc1 source FILE *f1, *f2; int p1, p2; void compileFile() { if (p1) f1 = fopen(…); if (p2) f2 = fopen(…); restOfComp(); if (p1) fclose(f1); if (p2) fclose(f2); } void restOfComp() { if (p1) printRtl(f1); if (p2) printRtl(f2); restOfComp(); } void printRtl(FILE *f) { fprintf(f); }

58 Experimental results Precision –Verification succeeds for every file handle –No transitions to $error ; no false errors Scalability –Average per handle: 72.9 seconds, 49.7 MB –Single 1GHz PIII laptop with 512 MB RAM Proved that: –Each of the 646 calls to fprintf in the source code prints to a valid, open file

59 ESP follow-up ESP has since been run on large real-world applications ESP/X: local intra-procedural version PSE: post-mortem analysis –run ESP backwards to figure out what cause a crash

60 Next lecture BLAST, SLAM and predicate abstraction