University of Maryland Bug Driven Bug Finding Chadd Williams.

Presentation transcript:

University of Maryland Bug Driven Bug Finding Chadd Williams

Motivation
Finding bugs in software is important
Statically checking code has been effective
  – finds complex errors
  – no need to run the code
Many static checkers available
  – some with specific bug patterns to find
  – some allow the user to define the patterns
  – what kinds of bugs are really out there?
Lots of false positive error reports
  – can we rank the errors better?
  – can previous bug history help?
Where to start?
  – Bug reporting databases
  – CVS commit messages

Bug Database
Inspect fixed bugs
  – review bug discussions
  – tie fixed bug to source code change
  – classify the type of the bug
  – look for bugs that can be found statically
[Diagram: Users, Developers, Bug Database]

Bug Database: Practical Experience
We inspected the Apache httpd bug database
  – inspected 200 bug reports marked as fixed
  – not as helpful as we expected
Only 24% tied directly back to a source code change
  – bug reports include a discussion of the problem
  – rarely is a diff or a CVS revision noted
Most are logic errors/feature requests
  – not the type found by static checkers

Bug Database Bug Types
Most classified bugs are logic errors

Bug Database: Practical Experience
Most bug reports originate from users
  – 197 out of 200
  – does not capture bugs found by developers
Most bug reports came against a release of the software, not a CVS-HEAD
  – 198 out of 200
  – does not capture bugs between releases
What about the bugs that don't make it into the release?
  – they may be in the CVS repository…

CVS Repository
Commits may contain useful data
  – any bug fix must show up in a commit
  – will commit messages lead us to bug fixes?
Shows bugs fixed between releases
Bugs caught by developers
  – bugs that could be found by static checking
[Diagram: CVS Repository]

CVS Repository: Practical Experience
Inspected commit messages
  – looked for 'fix', 'bug' or 'crash'
  – ignored those with a bug number listed
  – looked at mature source files
Commit messages are useful
  – trivially tied to source code change
  – fewer logic errors
Common errors found
  – missing NULL pointer check
  – failing to check the return value of a function before use
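The commit message filter described above can be sketched roughly as follows. This is only an illustration, not the authors' tooling: the function names are hypothetical, and the assumption that a bug number appears as '#' followed by digits is ours, since the slides do not say how bug numbers were written in the commit messages.

```c
#include <ctype.h>
#include <string.h>

/* Case-insensitive substring search; returns 1 if word occurs in text. */
int contains_ci(const char *text, const char *word)
{
    size_t n = strlen(word);
    for (; *text != '\0'; text++) {
        size_t i = 0;
        while (i < n && text[i] != '\0' &&
               tolower((unsigned char)text[i]) ==
               tolower((unsigned char)word[i])) {
            i++;
        }
        if (i == n) {
            return 1;
        }
    }
    return 0;
}

/* Assumption: a bug number is written as '#' followed by a digit. */
int has_bug_number(const char *msg)
{
    for (; *msg != '\0'; msg++) {
        if (*msg == '#' && isdigit((unsigned char)msg[1])) {
            return 1;
        }
    }
    return 0;
}

/* Keep messages that mention fix/bug/crash but list no bug number. */
int is_candidate_fix(const char *msg)
{
    if (has_bug_number(msg)) {
        return 0;
    }
    return contains_ci(msg, "fix") ||
           contains_ci(msg, "bug") ||
           contains_ci(msg, "crash");
}
```

Plain substring matching also accepts words such as "prefix", so a filter like this only narrows the set of commits that still need manual inspection.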

CVS Repository Bug Types
NULL pointer bugs and return value bugs can be found by static analysis

Return Value Check Bug
Returning an error code and valid data from the same function is a common C idiom

    int foo() {
        …
        if (error) {
            return error_code;
        }
        …
        return data;
    }
    …
    value = foo();
    newPosition += value; // ???

  – the return value should be checked before being used
  – lint checks for this error
Error types
  – completely ignored: foo();
  – return value used directly as an argument: bar(foo());
  – others …
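A compilable version of this idiom, as a minimal sketch: read_chunk, READ_ERROR, and the two advance functions are hypothetical names chosen for illustration and do not come from the Apache source.

```c
#define READ_ERROR (-1) /* hypothetical error code */

/* Hypothetical function mixing an error code and valid data
 * in one return value, as in the idiom above. */
int read_chunk(int fail)
{
    if (fail) {
        return READ_ERROR;
    }
    return 16; /* valid data: number of bytes read */
}

/* Buggy: the return value is used without a check, so an
 * error code silently corrupts the position. */
int advance_unchecked(int position, int fail)
{
    int value = read_chunk(fail);
    position += value;
    return position;
}

/* Fixed: the return value is checked before it is used. */
int advance_checked(int position, int fail)
{
    int value = read_chunk(fail);
    if (value != READ_ERROR) {
        position += value;
    }
    return position;
}
```

With a failing read, advance_unchecked(100, 1) moves the position backwards to 99, while advance_checked(100, 1) leaves it at 100.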

Return Value Checker
Some functions don't need their return value checked
  – no error value returned
  – could lead to many false positives
Naively flagging all unchecked return values leads to many false positives
  – over 7,000 errors reported for the Apache httpd-2.0 source
Need to determine which are most likely true errors
  – use historical data
  – present this data to the user

Which return values need to be checked?
Infer from historical data
  – look for a CVS commit that adds a check of a return value
  – implies the programmer thinks it's important
Infer from current usage
  – does the return value of a function get checked in the current version of the software?
  – how often?
Example of a bug fix commit

    Before:
        …
        value = foo();
        newPosition += value; // ???

    After:
        …
        value = foo();
        if (value != Error) { // Check
            newPosition += value;
        }
        …

Our Tool
Static checker that looks for return value check bugs
  – built on ROSE by Dan Quinlan, et al.
Classify each error by category
  – ignored return value
  – return value used as argument, etc.
Produce a ranking of the errors
  – group errors by called function
  – rank most promising errors higher
      rank functions that most likely need their return value checked higher

Return Value Checker: Ranking
Rank errors in two ways
Split functions into two groups
  – functions flagged with a CVS bug fix commit
      at least one CVS commit adds a check of the function's return value
  – functions not flagged with a CVS bug fix commit
Within each group:
  – rank by how often the function's return value is checked in the current software distribution
  – checked more often means rank higher
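One way to picture this ranking is a sort comparator over per-function statistics. The sketch below is an assumption-laden illustration, not the tool's implementation: the struct layout and all names are hypothetical, and placing the CVS-flagged group ahead of the unflagged group is our reading of the slides.

```c
#include <stdlib.h>

/* Hypothetical per-function statistics. */
struct func_stats {
    const char *name;
    int flagged_by_cvs_fix; /* 1 if some commit added a check */
    int checked_calls;      /* call sites that check the result */
    int total_calls;        /* all call sites in the current tree */
};

/* Fraction of call sites that check the return value. */
double checked_ratio(const struct func_stats *f)
{
    if (f->total_calls <= 0) {
        return 0.0;
    }
    return (double)f->checked_calls / (double)f->total_calls;
}

/* Flagged group first (assumption); within a group, functions
 * checked more often rank higher. */
int compare_rank(const void *a, const void *b)
{
    const struct func_stats *x = (const struct func_stats *)a;
    const struct func_stats *y = (const struct func_stats *)b;
    double rx, ry;

    if (x->flagged_by_cvs_fix != y->flagged_by_cvs_fix) {
        return y->flagged_by_cvs_fix - x->flagged_by_cvs_fix;
    }
    rx = checked_ratio(x);
    ry = checked_ratio(y);
    if (rx > ry) {
        return -1;
    }
    if (rx < ry) {
        return 1;
    }
    return 0;
}

/* Sort functions so the most promising ones come first. */
void rank_functions(struct func_stats *funcs, size_t count)
{
    qsort(funcs, count, sizeof funcs[0], compare_rank);
}
```

Errors would then be listed per called function in the sorted order, so that call sites of functions most likely to need a check appear at the top of the report.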

Case Study
Apache httpd-2.0 on Linux
  – core system
  – modules
  – Apache Runtime Library
Checked all the CVS commits for a return value check bug fix
  – 6100 commits checked
  – 2600 commits failed to go through our tool
      wrong (too new) version of autoconf
      parser problems
      compile bugs in the CVS commits

Case Study: Results
Our checker marked over 7,000 errors
  – individual call site for a non-void function where the return value is not checked
Too many to look at!
  – expect many are false positives
Rank errors
  – inspect CVS bug fix commit flagged functions
  – inspect functions with return value checked more than 50% of the time in the current source tree

    value = foo(); // ERROR
    newPosition += value;
    …
    result = foo(); // ERROR
    zoo(result);
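The two inspection criteria can be written as a small predicate. This is a sketch with a hypothetical function name and parameters; it only restates the selection rule from the slide.

```c
/* Returns 1 if errors for a called function should be inspected:
 * either a CVS bug fix commit flagged the function, or its return
 * value is checked at more than 50% of its call sites. */
int should_inspect(int flagged_by_cvs_fix, int checked_calls, int total_calls)
{
    if (flagged_by_cvs_fix) {
        return 1;
    }
    /* Integer form of checked_calls / total_calls > 0.5. */
    return total_calls > 0 && 2 * checked_calls > total_calls;
}
```

A function checked at exactly half of its call sites does not pass the second test, since the slide asks for more than 50%.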

Case Study: Error Breakdown
Inspected 453 errors (of 7,000)
  – found 98 that may be bugs!
231 errors associated with a CVS bug fix flagged function
  – 61 of the 98 bugs found here
  – false positive rate of 74%
222 errors associated with a function that has its return value checked > 50% of the time
  – 37 of the 98 bugs found here
  – false positive rate of 83%

Case Study: A Bug
We investigated an error and found it did crash httpd
  – error reported near the top of the ranking
The called function builds a filename
  – arguments represent file and pathname
  – a char array is returned and directly used as an argument to strcmp()
  – strcmp(foo())
  – NULL return value will cause a seg fault
  – return value is NULL if the path is too long!
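The shape of this bug, and of a fix, can be sketched as follows. build_filename and same_file are hypothetical stand-ins, not the actual httpd functions, and the 32-byte buffer is an arbitrary limit chosen for illustration. Passing NULL to strcmp() is undefined behavior in C, which in practice usually shows up as the segmentation fault described above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the filename builder: joins dir and
 * file into out, or returns NULL if the result does not fit. */
char *build_filename(char *out, size_t out_size,
                     const char *dir, const char *file)
{
    size_t needed = strlen(dir) + 1 + strlen(file) + 1;
    if (needed > out_size) {
        return NULL; /* path too long */
    }
    strcpy(out, dir);
    strcat(out, "/");
    strcat(out, file);
    return out;
}

/* Checks the return value before handing it to strcmp().
 * The buggy form would be:
 *   strcmp(build_filename(buf, sizeof buf, dir, file), expected) */
int same_file(const char *dir, const char *file, const char *expected)
{
    char buf[32]; /* arbitrary illustrative limit */
    char *name = build_filename(buf, sizeof buf, dir, file);
    if (name == NULL) {
        return 0; /* treat an overlong path as "no match" */
    }
    return strcmp(name, expected) == 0;
}
```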

Analysis
False positive rate too high!
  – overall false positive rate: 78% (1-(98/453))
A false positive rate closer to 50% would be acceptable
  – the user is as likely as not to find a true error
  – cluster them near the top of the ranking
We did cull 7,000 errors down to 453
  – lint would have flagged only the 'ignored' errors and not ranked them
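The reported rates follow directly from the counts in the error breakdown. A minimal sketch of the arithmetic, with a hypothetical function name:

```c
/* False positive rate = 1 - (likely bugs / inspected errors). */
double false_positive_rate(int likely_bugs, int inspected)
{
    if (inspected <= 0) {
        return 0.0; /* nothing inspected: rate undefined, report 0 */
    }
    return 1.0 - (double)likely_bugs / (double)inspected;
}
```

Applied to the case study counts: 1 - 98/453 is about 0.78 overall, 1 - 61/231 is about 0.74 for the CVS-flagged group, and 1 - 37/222 is about 0.83 for the group checked more than 50% of the time. The two groups add up to the totals (231 + 222 = 453 inspected, 61 + 37 = 98 likely bugs).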

Conclusion
Bug databases are not useful in understanding much about low-level bugs
  – good for logic errors
  – good for misunderstood specifications
CVS commit messages give a better picture of low-level bugs
  – especially bugs that don't enter a release
CVS commits can give useful data to help classify error reports

Future Work
What other types of bugs are common?
What other checkers can benefit from CVS data?
How can we cut the false positive rate?
Can we dynamically gather data on functions called via function pointers?
  – many of the error messages involved calls through function pointers
  – Dyninst will allow us to instrument function pointer call sites and gather data