Dynamically Discovering Likely Program Invariants to Support Program Evolution Michael D. Ernst, Jake Cockrell, William G. Griswold, David Notkin Presented.

Slides:



Advertisements
Similar presentations
Symbol Table.
Advertisements

Roles of Variables with Examples in Scratch
Chapter 7 Introduction to Procedures. So far, all programs written in such way that all subtasks are integrated in one single large program. There is.
50.530: Software Engineering Sun Jun SUTD. Week 10: Invariant Generation.
Analysis of Algorithms CS 477/677 Instructor: Monica Nicolescu Lecture 6.
Java Programming, 3e Concepts and Techniques Chapter 3 Section 63 – Manipulating Data Using Methods – Day 2.
Overview Motivations Basic static and dynamic optimization methods ADAPT Dynamo.
Slides prepared by Rose Williams, Binghamton University ICS201 Exception Handling University of Hail College of Computer Science and Engineering Department.
Chapter 10 Introduction to Arrays
Chapter 5: Elementary Data Types Properties of types and objects –Data objects, variables and constants –Data types –Declarations –Type checking –Assignment.
Michael Ernst, page 1 Learning and repair tools background Michael Ernst MIT Lab for Computer Science Joint work with Jake.
Using Programmer-Written Compiler Extensions to Catch Security Holes Authors: Ken Ashcraft and Dawson Engler Presented by : Hong Chen CS590F 2/7/2007.
Dynamic Invariant Discovery Modified from Tevfik Bultan’s original presentation.
Next Section: Pointer Analysis Outline: –What is pointer analysis –Intraprocedural pointer analysis –Interprocedural pointer analysis (Wilson & Lam) –Unification.
Dynamically Discovering Likely Program Invariants to Support Program Evolution Michael D. Ernst, Jake Cockrell, William G. Griswold, David Notkin Presented.
272: Software Engineering Fall 2008 Instructor: Tevfik Bultan Lecture 16: Dynamic Invariant Discovery.
Dynamically Discovering Likely Program Invariants to Support Program Evolution Michael Ernst, Jake Cockrell, William Griswold, David Notkin Presented by.
Using Set Operations on Code Coverage Data to Discover Program Properties by Nick Rutar.
Automatically Extracting and Verifying Design Patterns in Java Code James Norris Ruchika Agrawal Computer Science Department Stanford University {jcn,
Michael Ernst, page 1 Improving Test Suites via Operational Abstraction Michael Ernst MIT Lab for Computer Science Joint.
Testing an individual module
Chair of Software Engineering Automatic Verification of Computer Programs.
Recursion Chapter 7. Chapter 7: Recursion2 Chapter Objectives To understand how to think recursively To learn how to trace a recursive method To learn.
Ernst, ICSE 99, page 1 Dynamically Detecting Likely Program Invariants Michael Ernst, Jake Cockrell, Bill Griswold (UCSD), and David Notkin University.
1CMSC 345, Version 4/04 Verification and Validation Reference: Software Engineering, Ian Sommerville, 6th edition, Chapter 19.
Dr. Pedro Mejia Alvarez Software Testing Slide 1 Software Testing: Building Test Cases.
Success status, page 1 Collaborative learning for security and repair in application communities MIT & Determina AC PI meeting July 10, 2007 Milestones.
Verification and Validation Yonsei University 2 nd Semester, 2014 Sanghyun Park.
Microsoft Visual Basic 2008 CHAPTER NINE Using Arrays and File Handling.
Operator Precedence First the contents of all parentheses are evaluated beginning with the innermost set of parenthesis. Second all multiplications, divisions,
 2004 Prentice Hall, Inc. All rights reserved. 1 Chapter 11 - JavaScript: Arrays Outline 11.1 Introduction 11.2 Arrays 11.3 Declaring and Allocating Arrays.
Using Arrays and File Handling
Dynamically Discovering Likely Program Invariants to Support Program Evolution Presented By: Wes Toland, Geoff Gerfin Michael D. Ernst, Jake Cockrell,
© SERG Dependable Software Systems (Mutation) Dependable Software Systems Topics in Mutation Testing and Program Perturbation Material drawn from [Offutt.
Recursion Chapter 7. Chapter Objectives  To understand how to think recursively  To learn how to trace a recursive method  To learn how to write recursive.
Design and Programming Chapter 7 Applied Software Project Management, Stellman & Greene See also:
Bug Localization with Machine Learning Techniques Wujie Zheng
Merge Sort. What Is Sorting? To arrange a collection of items in some specified order. Numerical order Lexicographical order Input: sequence of numbers.
6.3 List Boxes and Loops Some Properties, Methods, and Events of List Boxes List Boxes Populated with Strings List Boxes Populated with Numbers Searching.
Object-Oriented Program Development Using Java: A Class-Centered Approach, Enhanced Edition.
The Daikon system for dynamic detection of likely invariants MIT Computer Science and Artificial Intelligence Lab. 16 January 2007 Presented by Chervet.
CS 61B Data Structures and Programming Methodology July 28, 2008 David Sun.
ISV Innovation Presented by ISV Innovation Presented by Business Intelligence Fundamentals: Data Cleansing Ola Ekdahl IT Mentors 9/12/08.
02/12/2014 Presenter: Yuanhang Wang Instructor: Christoph Csallner 1 Dynamically Discovering Likely Program Invariants to Support Program Evolution Michael.
An Object-Oriented Approach to Programming Logic and Design Fourth Edition Chapter 5 Arrays.
An Object-Oriented Approach to Programming Logic and Design Fourth Edition Chapter 6 Using Methods.
Dynamically Discovering Likely Program Invariants All material in this presentation is derived from documentation online at the Daikon website,
CS Data Structures I Chapter 2 Principles of Programming & Software Engineering.
Data Structures and Algorithms Lecture 1 Instructor: Quratulain Date: 1 st Sep, 2009.
1 Test Selection for Result Inspection via Mining Predicate Rules Wujie Zheng
Programming Logic and Design Fourth Edition, Comprehensive Chapter 8 Arrays.
Design - programming Cmpe 450 Fall Dynamic Analysis Software quality Design carefully from the start Simple and clean Fewer errors Finding errors.
Using Loop Invariants to Detect Transient Faults in the Data Caches Seung Woo Son, Sri Hari Krishna Narayanan and Mahmut T. Kandemir Microsystems Design.
STL CSSE 250 Susan Reeder. What is the STL? Standard Template Library Standard C++ Library is an extensible framework which contains components for Language.
PROGRAMMING TESTING B MODULE 2: SOFTWARE SYSTEMS 22 NOVEMBER 2013.
Static Techniques for V&V. Hierarchy of V&V techniques Static Analysis V&V Dynamic Techniques Model Checking Simulation Symbolic Execution Testing Informal.
Pruning Dynamic Slices With Confidence Original by: Xiangyu Zhang Neelam Gupta Rajiv Gupta The University of Arizona Presented by: David Carrillo.
Higher Computing Science 2016 Prelim Revision. Topics to revise Computational Constructs parameter passing (value and reference, formal and actual) sub-programs/routines,
Python Programing: An Introduction to Computer Science
Mutation Testing Breaking the application to test it.
Arrays and Lists. What is an Array? Arrays are linear data structures whose elements are referenced with subscripts. Just about all programming languages.
Programming Logic and Design Fifth Edition, Comprehensive Chapter 6 Arrays.
CORRECTNESS ISSUES AND LOOP INVARIANTS Lecture 8 CS2110 – Fall 2014.
Jeremy Nimmer, page 1 Automatic Generation of Program Specifications Jeremy Nimmer MIT Lab for Computer Science Joint work with.
Introduction to Algorithms
Automated Support for Program Refactoring Using Invariants
Introduction to Algorithms
CSE 1020:Software Development
50.530: Software Engineering
Software Testing and QA Theory and Practice (Chapter 5: Data Flow Testing) © Naik & Tripathy 1 Software Testing and Quality Assurance Theory and Practice.
Presentation transcript:

Dynamically Discovering Likely Program Invariants to Support Program Evolution Michael D. Ernst, Jake Cockrell, William G. Griswold, David Notkin Presented by: Nick Rutar

Program Invariants Useful in software development  Protect programmers from making errant changes  Verify properties of a program Can be explicitly stated in programs  Programmers can annotate code with invariants  This can take time and effort  Many important invariants will be missed

Daikon - Dynamic Invariant Detector Dynamic -- From Program Executions Step 1: Instrument Source Program  Trace Variables of Interest Step 2: Run Instrumented Program Over Test Suite Step 3: Infer Invariants from  Instrumented Variables  Derived Variables

Example Program (taken from “The Science of Programming”) i = 0; s = 0; do i ≠ n  i = i + 1 s = s + b[i] Precondition: n ≥ 0 Postcondition: s = (  j : 0 ≤ j < n : b[j]) Loop Invariant: 0 ≤ i ≤ n and s = (  j : 0 ≤ j < i : b[j])

Daikon results from the program (100 randomly generated input arrays of length 7-13) ENTER  N = size(B)  N in [7 … 13]  B - All elements ≥ -100 EXIT  N = I = orig(N) = size(B)  B = orig(B)  S = sum(B)  N in [7 … 13]  B - All elements ≥ -100 LOOP  N = size(B)  S = sum(B[0 … I -1])  N in [7 … 13]  I in [0 … 13]  I ≤ N  B - all elements in [ ]  sum(B) in [ ]  B[0] nonzero in [-99.96]  B[-1] in [-88.99]  N != B[-1] (negative)  B[0] != B[-1] (negative)

Instrumentation Insert instrumentation points  Procedure Entry  Procedure Exit  Loop Heads Writes to a file values for  All variables in scope  Global Variables  Procedure arguments  Local Variables  Procedure’s return value Available for Platforms  LISP  C/C++  Java (from Daikon website) Eclipe plug-in available  Perl (from Daikon website)

Inferring invariants System checks for the following (x,y,z variables; a,b,c computed constants):  Any variable constant or small number of values  Numeric variable range (a ≤ x ≤ b) modulus & nonmodulus  Multiple numbers linear relationship (such as x = ay + bz + c) functions (all those in standard lib, e.g. x = abs(y)) comparisons (x < y, x ≥ y, x == y) invariants over x + y and x -y  Sequence: sortedness invariants over all elements (e.g., every element < 100)  Multiple sequences subsequence & lexicographic relationship  Sequence and scalar membership

Inferring invariants (continued) Each potential variant is tested When invariant doesn’t hold, not tested again Negative Invariants  Relationships that are expected but don’t occur from input  Probability limit decides if invariants are included Derived Variables  Expressions treated same as regular variables  Include: From any array: first and last elements, length From numeric array: sum, min, max From array and scalar: element at that index(a[i]), subarray up to, and subarray beyond, that index From function invocation: number of calls so far

Using Invariants Modified Siemens replace (~500 LOC) program  Takes in regular expression and replacement string as input  Copies input stream to output stream replacing matched strings  Added input pattern + to * Use invariants for glimpse on how program runs  Found occurrences where initial belief was contradicted Prevented introducing bugs based on flawed knowledge of code  Found instance of unreported array bounds error

Using invariants (continued) Everything learned from “replace” could have been learned by combination of  Reading the code  Static Analyses  Selected Program Instrumentation Invariants give benefits that other approaches do not  Inferred invariants are abstraction of larger amount of data  Flags raised with unexpected invariants or expected invariants not appearing  Queries against database build intuition about source of invariant  Inferred invariants provide basis for programmer inferences  Invariants provide beneficial degree of serendipity

Results - Time Ran tests with between test inputs for replace Inferred ~71 variables per inst point in replace  6 original, 65 derived, 52 scalars, 19 sequences  On average, 10 derived for every original 1000 test cases  Produce 10,120 samples per instrumentation point  System takes 220 seconds to infer invariants 3000 test cases  33,801 samples  Processing takes 540 seconds Invariant detection time grows quadratically with the number of variables over which invariants are checked Time grows linearly with test suite size

Invariant Stability Relationship between test size suite and invariants Across test suites  Identical - invariant same between two test suites  Missing - invariant is present in one test suite, but not other  Different - invariant is different between two test suites Interesting - Worthy of further study to determine relevance Uninteresting - Peculiarity in the data  S1 in [ 0 … 98 ] (99 values)  S1 >= 0 (96 values)

Invariant Type/Test Cases Identical Unary Missing Unary Diff Unary Interesting Uninteresting Identical binary Missing Binary Diff Binary Interesting Uninteresting Invariant differences(2500-element test suite)

Invariants and Program Correctness Compare invariants detected across programs Correct versions of programs have more invariants than incorrect ones Examination of 424 intro C programs from U of Washington  Given # of students, amount of money, # of pizzas, calculates whether the students can afford the pizzas. Chose eight relevant invariants  people – [1…50]  pizzas – [1…10]  pizza_price – {9,11}  excess_money – [0...40]  slices = 8 * pizza  slices = 0 (mod 8)  slices_per – {0,1,2,3}  slices_left  people - 1

Relationship of Grade and Goal Invariants Grade Invariants Detected

Future Work (from 2001 paper) Increasing Relevance  Invariant is relevant if it assists programmer  Repress invariants logically implied by others Viewing and Managing Invariants  Overwhelming for a programmer to sort through  Various tools for selective reporting of invariants Improving Performance  Balance between invariant quality and runtime  Number of Derived Variables used Richer Invariants  Invariants over Pointer based data structures  Computing Conditional Invariants

Resources Daikon website   Contains links to Papers Source Code User Manual Developers Manual

Questions???