Memories of Bug Fixes Sunghun Kim, Kai Pan, and E. James Whitehead Jr., University of California, Santa Cruz Presented By Gleneesha Johnson CMSC 838P,


Memories of Bug Fixes Sunghun Kim, Kai Pan, and E. James Whitehead Jr., University of California, Santa Cruz Presented By Gleneesha Johnson CMSC 838P, Fall 2006

Source Code Repositories
Collection of code changes
Most long-developed projects have one
Typically used to store histories and make backups
Knowledge contained within hasn’t been fully leveraged
–Previous development experience
–Changes that fix bugs are particularly interesting

Horizontal Bug Finding Techniques
Applicable across all projects
Based on pre-defined bug patterns, theorem proving, and model checking
Examples: ESC/Java, FindBugs, JLint, PMD

Vertical Bug Finding Techniques
Aim for project-specific bugs
Projects differ in requirements, business logic, and semantics
BugMem
–Learns bug patterns over time

Bug Fix Memories
Project-specific bug and fix knowledge base
–Developed by analyzing the history of bug fixes
–Can be used to detect bugs and suggest fixes

Building Bug Fix Memories
Extract source code, change logs, and deltas
–Kenyon
File change
–Contains a list of region pairs that show differences between two file versions
–Regions are called hunks and consist of source code lines
Identify changes where a bug was fixed
–Keyword search
–Search for references to bug reports
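The step that identifies bug-fix changes can be sketched as a filter over change log messages. This is a minimal illustration only: the keyword list, the bug-reference pattern, and the function name `is_bug_fix` are assumptions, not the patterns used in the paper.

```python
import re

# Hypothetical keyword list; a real project would tune this to its own log style.
FIX_KEYWORDS = re.compile(r"\b(fix(e[sd])?|bugs?|patch(ed)?)\b", re.IGNORECASE)

# Hypothetical bug-report reference forms such as "#1234", "bug 42", "issue 7".
BUG_REFERENCE = re.compile(r"(#|\bbug\s*|\bissue\s*)\d+", re.IGNORECASE)


def is_bug_fix(log_message: str) -> bool:
    """Return True if the change log looks like it describes a bug fix."""
    return bool(FIX_KEYWORDS.search(log_message)
                or BUG_REFERENCE.search(log_message))


if __name__ == "__main__":
    logs = ["Fixed NPE in parser", "Resolve #1234", "Update documentation"]
    for message in logs:
        print(message, "->", is_bug_fix(message))
```

Keyword matching of this kind is only a heuristic: a message such as "no bugs found" would be flagged as well, so a real pipeline would likely need further checks.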

Hunks
A hunk pair (HP) consists of a deleted hunk (DH), which is the bug hunk (BH), and an added hunk (AH), which is the fix hunk (FH)
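A hunk pair can be illustrated by diffing two versions of a file. The sketch below uses Python's standard `difflib` as a stand-in for the extraction tooling named in the slides; the `HunkPair` type and the `hunk_pairs` function are hypothetical names introduced here.

```python
import difflib
from typing import List, NamedTuple


class HunkPair(NamedTuple):
    bug_hunk: List[str]  # deleted lines (DH) from the old version
    fix_hunk: List[str]  # added lines (AH) from the new version


def hunk_pairs(old_lines: List[str], new_lines: List[str]) -> List[HunkPair]:
    """Return one hunk pair per changed region between two file versions."""
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    pairs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # "replace", "delete", or "insert"
            pairs.append(HunkPair(old_lines[i1:i2], new_lines[j1:j2]))
    return pairs


if __name__ == "__main__":
    old = ["int n = list.size();", "if (s == null) {", "  return;", "}"]
    new = ["int n = list.size();", "if (s == null || s.isEmpty()) {",
           "  return;", "}"]
    for pair in hunk_pairs(old, new):
        print("bug hunk:", pair.bug_hunk)
        print("fix hunk:", pair.fix_hunk)
```

A pure insertion yields an empty bug hunk and a pure deletion yields an empty fix hunk; whether such pairs belong in a bug fix memory is left open here.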

Extracting Components From Hunks
Extract syntax patterns, called components, from hunks
Parsing – extracts syntax components
Normalizing – generalizes syntax to match similar code
Filtering – eliminates noise
–Information filtering
–Diff filtering

Raw Component Extraction (Parsing)
Preprocessing
Basic syntax lines
–Differentiate between composite statements (if, for, while) and simple statements (method call, assignment, etc.)
–All multi-line simple statements are placed on a single line
–Conditional predicates of if, for, while, etc. all lie on a single line
Raw components
–Based on the abstract syntax tree of a basic syntax line
–4 kinds: static Java call, Java call, user-defined call, and non-call

Normalization
If the variable type of a raw component is known, normalize the variable to its type
–e.g., foo.flag → Foo.flag
Generate two components for raw components that contain numeric, boolean, char, or string literals
–One with and one without normalization of the literal
–e.g., i = 1 → int = 1 & int = int

Normalization Continued
For a method call, actual parameters are normalized to the type of the parameter
Normalization level
–Indicates a component’s degree of normalization relative to the others extracted from the same raw component
–Level increases with degree of normalization
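The normalization rules can be sketched with a small token rewriter. This is an illustration under stated assumptions: the variable types are supplied by a hand-written table rather than a real parser, the level numbering (1 for typed variables, 2 for typed literals) is hypothetical, and `normalize` is a name introduced here.

```python
import re
from typing import Dict, List, Optional, Tuple

# Matches string literals, char literals, numbers, and identifiers, in that order.
TOKEN = re.compile(r'"[^"]*"|\'[^\']*\'|\d+(?:\.\d+)?|[A-Za-z_]\w*')


def literal_type(token: str) -> Optional[str]:
    """Return the Java type name of a literal token, or None if not a literal."""
    if token in ("true", "false"):
        return "boolean"
    if token.startswith('"'):
        return "String"
    if token.startswith("'"):
        return "char"
    if re.fullmatch(r"\d+", token):
        return "int"
    if re.fullmatch(r"\d+\.\d+", token):
        return "double"
    return None


def normalize(statement: str, var_types: Dict[str, str]) -> List[Tuple[int, str]]:
    """Return (level, component) pairs; a higher level is more normalized."""
    # Level 1 (assumed numbering): variables with a known type become that type.
    typed = TOKEN.sub(lambda m: var_types.get(m.group(), m.group()), statement)
    components = [(1, typed)]
    # Level 2: additionally replace literals by their type, if any are present.
    if any(literal_type(t) for t in TOKEN.findall(statement)):
        general = TOKEN.sub(
            lambda m: literal_type(m.group())
            or var_types.get(m.group(), m.group()),
            statement,
        )
        components.append((2, general))
    return components


if __name__ == "__main__":
    print(normalize("i = 1", {"i": "int"}))
    print(normalize("foo.flag", {"foo": "Foo"}))
    print(normalize("buf.append(name)", {"buf": "StringBuffer", "name": "String"}))
```

The first call reproduces the slide's example, yielding both `int = 1` and `int = int`; the third shows actual parameters of a method call replaced by their types.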

Examples

Information Filtering
Information value – indicates how much unique information a component carries
–Determined by summing the information values of its elements
Information value threshold – used to filter out components with little unique information
–Two is used as the threshold in the paper
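Information filtering can be sketched as a sum over per-element weights followed by a threshold test. The weights in `ELEMENT_VALUES` are invented for illustration and are not the paper's values; treating a component whose value equals the threshold as kept is also an assumption.

```python
from typing import Dict, List, Tuple

Element = Tuple[str, str]   # (kind, text), e.g. ("method", "equals")
Component = List[Element]

# Hypothetical weights: element kinds that appear everywhere carry no information.
ELEMENT_VALUES: Dict[str, int] = {
    "type": 0,
    "operator": 0,
    "keyword": 1,
    "literal": 1,
    "method": 1,
}


def information_value(component: Component) -> int:
    """Sum the information values of a component's elements."""
    return sum(ELEMENT_VALUES.get(kind, 0) for kind, _ in component)


def filter_components(components: List[Component],
                      threshold: int = 2) -> List[Component]:
    """Keep components whose information value reaches the threshold."""
    return [c for c in components if information_value(c) >= threshold]


if __name__ == "__main__":
    generic = [("type", "int"), ("operator", "="), ("type", "int")]
    specific = [("type", "String"), ("method", "equals"), ("literal", "null")]
    print(information_value(generic), information_value(specific))
    print(filter_components([generic, specific]))
```

Under these weights a fully normalized component such as `int = int` scores zero and is dropped, which matches the slide's intent of discarding components with little unique information.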

Diff Filtering
Removes code unchanged between bug and fix hunks
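Diff filtering can be sketched as removing every component that occurs in both the bug hunk and the fix hunk. Comparing components as exact strings is an assumption made for this sketch, and `diff_filter` is a hypothetical name.

```python
from typing import List, Tuple


def diff_filter(bug_components: List[str],
                fix_components: List[str]) -> Tuple[List[str], List[str]]:
    """Drop components present on both sides, keeping the original order."""
    common = set(bug_components) & set(fix_components)
    bug_only = [c for c in bug_components if c not in common]
    fix_only = [c for c in fix_components if c not in common]
    return bug_only, fix_only


if __name__ == "__main__":
    bug = ["if (String == null)", "return"]
    fix = ["if (String == null || String.isEmpty())", "return"]
    print(diff_filter(bug, fix))
```

Here the unchanged `return` component is removed from both sides, leaving only the condition that actually changed.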

Searching Memories
Bugs found by searching for matching patterns in bug hunks
Suggestions made by returning code in corresponding fix hunks
Several options for component searching
–Adjust degree of matching and omission of very common components
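Searching a memory can be sketched as a lookup of component sets. This is a simplified model, not BugMem's implementation: the `Memory` record, the `min_match` ratio (standing in for the degree of matching), and the `ignore` set (standing in for omission of very common components) are all hypothetical.

```python
from typing import FrozenSet, Iterable, List, NamedTuple


class Memory(NamedTuple):
    bug_components: FrozenSet[str]  # components extracted from a bug hunk
    fix_hunk: str                   # code of the corresponding fix hunk


def search(memories: Iterable[Memory],
           query_components: Iterable[str],
           min_match: float = 1.0,
           ignore: FrozenSet[str] = frozenset()) -> List[str]:
    """Return fix hunks whose bug components are sufficiently matched."""
    query = set(query_components) - ignore
    suggestions = []
    for memory in memories:
        pattern = memory.bug_components - ignore
        if not pattern:
            continue  # nothing informative left to match against
        matched = len(pattern & query) / len(pattern)
        if matched >= min_match:
            suggestions.append(memory.fix_hunk)
    return suggestions


if __name__ == "__main__":
    memories = [
        Memory(frozenset({"if (String == null)", "return"}),
               "if (String == null || String.isEmpty())"),
        Memory(frozenset({"int = int"}), "long = int"),
    ]
    print(search(memories, ["if (String == null)", "return", "Foo.bar()"]))
    print(search(memories, ["if (String == null)"], min_match=0.5))
```

Lowering `min_match` trades precision for recall: more stored bug hunks are reported as matches, at the cost of more spurious suggestions.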

Options

Evaluation

Evaluation Continued
Determine half and full hit rates
–Half hit – bug previously seen
–Full hit – bug and fix previously seen
Evaluated on five open source projects
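The two hit rates can be sketched by replaying a project's bug-fix history in chronological order. The sketch assumes exact matching of already normalized hunks and treats the two categories as mutually exclusive (a change counts as a full hit or a half hit, not both); neither assumption is confirmed by the slides, and `hit_rates` is a hypothetical name.

```python
from typing import List, Tuple


def hit_rates(history: List[Tuple[str, str]]) -> Tuple[float, float]:
    """Return (half_hit_rate, full_hit_rate) over ordered (bug, fix) pairs."""
    seen_bugs = set()
    seen_pairs = set()
    half = full = 0
    for bug, fix in history:
        if (bug, fix) in seen_pairs:
            full += 1   # both the bug and its fix were seen before
        elif bug in seen_bugs:
            half += 1   # only the bug was seen before
        seen_bugs.add(bug)
        seen_pairs.add((bug, fix))
    total = len(history)
    if total == 0:
        return 0.0, 0.0
    return half / total, full / total


if __name__ == "__main__":
    history = [("A", "x"), ("A", "y"), ("A", "x"), ("B", "z")]
    print(hit_rates(history))
```

In this example the second change is a half hit (bug `A` was seen, fix `y` was not) and the third is a full hit, so each rate is one in four.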

Evaluation Continued

Comparison With PMD

Limitations
Missing memories
Not applicable in the initial stage of a project
Doesn’t catch cross-file relationships

Uses
Bug finding tool
–BugMem
–Can be integrated into Eclipse
Code example repository
–Useful to developers new to a project

The End!

Questions?