Dynamically Discovering Likely Program Invariants to Support Program Evolution Michael D. Ernst, Jake Cockrell, William G. Griswold, David Notkin Presented.

Presentation transcript:

Dynamically Discovering Likely Program Invariants to Support Program Evolution Michael D. Ernst, Jake Cockrell, William G. Griswold, David Notkin Presented by: Nick Rutar

Program Invariants
- Useful in software development
  - Protect programmers from making errant changes
  - Verify properties of a program
- Can be explicitly stated in programs
  - Programmers can annotate code with invariants
  - This can take time and effort
  - Many important invariants will be missed

Could there be a way to dynamically discover program invariants???

Daikon: An Invariant Detector
1. Pick a source program (Daikon is language independent)
2. Instrument the source program to trace variables of interest
3. Run the instrumented program over test cases
4. Infer invariants over
   - Instrumented variables (variables present in the source)
   - Derived variables (created variables that might be of interest)

Derived Variables
- From any sequence s
  - Length: size(s)
  - Extremal elements: s[0], s[1], s[-1], s[-2]
- From a numeric sequence s
  - sum(s), min(s), max(s)
- From any sequence s and numeric variable i
  - Element at index: s[i], s[i-1]
  - Subsequences: s[0…i], s[0…i-1]
- From function invocations
  - Number of calls so far
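
The derived variables listed above can be sketched in a few lines of Python. This is only an illustration, not Daikon's code: the function name `derived_variables` and the dictionary keys are hypothetical, and the bounds checks are an assumption about how out-of-range indices should be skipped.

```python
from typing import Dict, List


def derived_variables(s: List[int], i: int) -> Dict[str, object]:
    """Compute illustrative derived variables for a numeric sequence s and index i."""
    derived: Dict[str, object] = {"size(s)": len(s), "sum(s)": sum(s)}
    if s:
        # Extremal elements and numeric summaries need a non-empty sequence.
        derived["s[0]"] = s[0]
        derived["s[-1]"] = s[-1]
        derived["min(s)"] = min(s)
        derived["max(s)"] = max(s)
    if len(s) >= 2:
        derived["s[1]"] = s[1]
        derived["s[-2]"] = s[-2]
    if 0 <= i < len(s):
        derived["s[i]"] = s[i]
        derived["s[0..i]"] = s[: i + 1]   # subsequence including index i
    if 1 <= i <= len(s):
        derived["s[i-1]"] = s[i - 1]
    if 0 <= i <= len(s):
        derived["s[0..i-1]"] = s[:i]      # subsequence up to, excluding, index i
    return derived
```

Each derived value is then treated like any other traced variable when invariants are inferred.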

Example Program (taken from “The Science of Programming”)
i, s = 0, 0;
do i ≠ n → i, s = i + 1, s + b[i] od
Precondition: n ≥ 0
Postcondition: s = (Σ j : 0 ≤ j < n : b[j])
Loop Invariant: 0 ≤ i ≤ n and s = (Σ j : 0 ≤ j < i : b[j])
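
For readers unfamiliar with the guarded-command notation, the same program can be written in Python with the precondition, loop invariant, and postcondition as assertions. This is a sketch: the function name `array_sum` is made up, and the check `n <= len(b)` is an added assumption so that indexing stays in range.

```python
from typing import List


def array_sum(b: List[int], n: int) -> int:
    """Sum the first n elements of b, asserting the stated conditions."""
    # Precondition: n >= 0 (the upper bound is an assumption added for Python).
    assert 0 <= n <= len(b)
    i, s = 0, 0
    while i != n:
        # Loop invariant: 0 <= i <= n and s is the sum of b[0..i-1].
        assert 0 <= i <= n and s == sum(b[:i])
        # Simultaneous assignment: both right-hand sides use the old i.
        i, s = i + 1, s + b[i]
    # Postcondition: s is the sum of b[0..n-1].
    assert s == sum(b[:n])
    return s
```

Running it on a few arrays exercises all three assertions, which is the kind of property Daikon tries to recover without being told.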

Daikon results from the program (100 randomly generated input arrays of length 7-13)
ENTER
- N = size(B)
- N in [7 … 13]
- B: all elements ≥ -100
EXIT
- N = I = orig(N) = size(B)
- B = orig(B)
- S = sum(B)
- N in [7 … 13]
- B: all elements ≥ -100
LOOP
- N = size(B)
- S = sum(B[0 … I-1])
- N in [7 … 13]
- I in [0 … 13]
- I ≤ N
- B: all elements in [ ]
- sum(B) in [ ]
- B[0] nonzero in [-99 … 96]
- B[-1] in [-88 … 99]
- N != B[-1]
- B[0] != B[-1]
*Boxes indicate generated invariants that match expected ones

[Figure: Architecture of the Daikon tool. Original Program → Instrument → Instrumented Program → Run (with Test Suite) → Data Trace → Detect Invariants → Invariants]

Daikon has instrumenters for Java, C, and Lisp
- Source-to-source translation
- Determines which variables are in scope
- Inserts code to dump the variables into an output file
- Creates a declaration file
  - Variables being instrumented
  - Types in the original program
  - Representations in the trace file
  - Sets of variables that may be sensibly compared
- Operates only on scalar numbers and arrays of numbers
  - Scalar numbers include characters and booleans
  - Any other type is converted to one of these forms
[Figure: Original Program → Instrument → Instrumented Program]

At each program point of interest
- The instrumented program writes to a data trace file
- All variables in scope
  - Global variables
  - Procedure arguments
  - Local variables
  - Return values (at procedure exits)
- Modification bit
  - Whether a value has been set since last time
- For small programs, runtime may be I/O bound
[Figure: Instrumented Program → Run → Data Trace]
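
A rough feel for what recording values at procedure entry and exit looks like can be had from a Python decorator. This is a simplified sketch and not how Daikon's instrumenters work: it records only arguments and return values (no globals, locals, or modification bits), keeps records in an in-memory list instead of a trace file, and the names `traced`, `TRACE`, and `total` are hypothetical.

```python
import functools
import inspect
from typing import Any, Callable, Dict, List, Tuple

# Each record is (program point name, variable values at that point).
TRACE: List[Tuple[str, Dict[str, Any]]] = []


def traced(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record arguments at entry, and arguments plus return value at exit."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args: Any, **kwargs: Any) -> Any:
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        TRACE.append((f"{func.__name__}:::ENTER", dict(bound.arguments)))
        result = func(*args, **kwargs)
        exit_values = dict(bound.arguments)
        exit_values["return"] = result
        TRACE.append((f"{func.__name__}:::EXIT", exit_values))
        return result

    return wrapper


@traced
def total(b: List[int], n: int) -> int:
    """Example procedure: sum of the first n elements of b."""
    return sum(b[:n])
```

Calling `total([1, 2, 3], 2)` appends one entry record and one exit record to `TRACE`; a detector would then read those records sample by sample.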

Single variable invariants (numeric or sequence)
- Constant value: x = a (variable is a constant)
- Uninitialized: x = uninit (variable is never set)
- Modulus: x ≡ a mod b (x mod b = a always holds)
Multiple variables, up to 3 (numeric or sequence)
- Linear relationship: y = ax + b
- Reversal: x is the reverse of y
- Invariants over x - y, x + y
These are just a few
- Complete list can be found in the paper
- Domain-specific invariants can easily be coded in
[Figure: Data Trace → Detect Invariants → Invariants]

Run Time of Daikon
Informally, can be characterized as
  Time = O((vars³ × falsetime + trueinvs × testsuite) × program)
- vars is the number of variables at a program point (in scope)
  - Most invariants are falsified quickly
  - Only true invariants are checked for the entire run
  - Potentially cubic because invariants involve at most 3 variables
- falsetime is the (small constant) time to falsify a potential invariant
- trueinvs is the (small) number of true invariants at a program point
- testsuite is the size of the test suite
  - Must balance accuracy versus runtime
- program is the number of instrumented program points
  - The default is proportional to the size of the program
  - Users can control the extent of instrumentation
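
The cubic term can be made concrete by counting how many groups of one, two, or three variables exist at a program point. This is only a back-of-the-envelope sketch: the function name `variable_tuples` is hypothetical, and the real number of candidate invariants also depends on how many invariant forms are tried per group.

```python
from math import comb


def variable_tuples(num_vars: int, max_arity: int = 3) -> int:
    """Count unordered groups of 1..max_arity distinct variables."""
    return sum(comb(num_vars, k) for k in range(1, max_arity + 1))
```

For 10 variables this gives 10 + 45 + 120 = 175 groups, and for 20 variables 20 + 190 + 1140 = 1350, so doubling the variables in scope multiplies the work by nearly eight.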

Invariant Stability
- Size of test suite
  - Too small
    - Small number of invariants
    - More false invariants
  - Too large
    - Increases runtime linearly
- Interesting vs. uninteresting
  - Test suites of different sizes will yield more or fewer invariants
  - Uninteresting
    - Difference in a bound on a variable’s range
    - Different small set of possible values
  - Interesting: everything else

[Table: Invariant differences (2500-element test suite). Columns: invariant type by number of test cases. Rows: identical unary; missing unary; differing unary (interesting, uninteresting); identical binary; missing binary; differing binary (interesting, uninteresting).]

Invariants and Program Correctness
- Compare invariants detected across programs
- Correct versions of programs have more invariants than incorrect ones
- Examination of 424 intro C programs from U of Washington
  - Given # of students, amount of money, # of pizzas, calculates whether the students can afford the pizzas
- Chose eight relevant invariants
  - people in [1…50]
  - pizzas in [1…10]
  - pizza_price in {9, 11}
  - excess_money in [0…40]
  - slices = 8 * pizzas
  - slices = 0 (mod 8)
  - slices_per in {0, 1, 2, 3}
  - slices_left ≤ people - 1
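
To make a few of the goal invariants above concrete, the sketch below checks one run's variable values against them and reports which are violated. Note the difference from the study: Daikon infers invariants from many runs, whereas this hypothetical helper (`violated_goal_invariants`) only tests a single set of values against a fixed list.

```python
from typing import Dict, List


def violated_goal_invariants(v: Dict[str, int]) -> List[str]:
    """Return the goal invariants that one set of variable values violates."""
    checks = {
        "people in [1..50]": 1 <= v["people"] <= 50,
        "pizzas in [1..10]": 1 <= v["pizzas"] <= 10,
        "pizza_price in {9, 11}": v["pizza_price"] in (9, 11),
        "excess_money in [0..40]": 0 <= v["excess_money"] <= 40,
        "slices = 8 * pizzas": v["slices"] == 8 * v["pizzas"],
        "slices = 0 (mod 8)": v["slices"] % 8 == 0,
        "slices_per in {0, 1, 2, 3}": v["slices_per"] in (0, 1, 2, 3),
    }
    # Dictionaries preserve insertion order, so results follow the list above.
    return [name for name, holds in checks.items() if not holds]
```

A run with 3 pizzas and 24 slices passes the slice checks, while a buggy program reporting 20 slices for 3 pizzas fails both `slices = 8 * pizzas` and `slices = 0 (mod 8)`.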

Relationship of Grade and Goal Invariants
[Figure: grade versus number of goal invariants detected]

Other Applications of Invariants
- Inserted as assert statements for testing
- Double-check existing documentation
  - Check against existing assert statements
  - Useful when program self-checks are ineffective
- Discovering bugs
- Generate test cases or validate existing test suites
- Could possibly direct a correctness proof

Ongoing and Future Work
- Increasing relevance
  - An invariant is relevant if it assists the programmer
  - Suppress invariants logically implied by others
  - Unrelated variables don’t need to be compared
  - Ignore variables not assigned since last time
- Viewing and managing invariants
  - Overwhelming for a programmer to sort through
  - Various tools for selective reporting of invariants
    - Ordering by category
    - Retrieving invariants based on a supplied property
    - Listing invariants by program point

More Ongoing Work
- Improving performance
  - Balance between invariant quality and runtime
  - Number of derived variables used
- Richer invariants
  - Invariants over pointer-based data structures
  - Computing conditional invariants

Resources
- Daikon website
  - Contains links to papers, source code, user manual, and developers manual

Questions???