A Comparative Analysis of the Efficiency of Change Metrics and Static Code Attributes for Defect Prediction. Raimund Moser, Witold Pedrycz, Giancarlo Succi.


A Comparative Analysis of the Efficiency of Change Metrics and Static Code Attributes for Defect Prediction. Raimund Moser, Witold Pedrycz, Giancarlo Succi. Free University of Bolzano-Bozen, University of Alberta.

Defect Prediction: it supports quality standards and customer satisfaction. Questions: 1. Which metrics are good defect predictors? 2. Which models should be used? 3. How accurate are those models? 4. How much does it cost, and what are the benefits?

Approaches for Defect Prediction: 1. Product-centric: measures extracted from the static/dynamic structure of the source code, design documents, and requirements. 2. Process-centric: change history of source files (number or size of modifications, age of a file), changes in the team structure, testing effort, technology, and other human factors related to software defects. 3. Combination of both.

Previous Work. Two aspects of defect prediction: 1. The relationship between software defects and code metrics: no agreed answer, and no cost-sensitive prediction. 2. The impact of the software process on the defectiveness of software.

Questions to Answer by This Work. The general questions: 1. Which metrics are good defect predictors? 2. Which models should be used? 3. How accurate are those models? 4. How much does it cost, and what are the benefits? Refined for this work: Are change metrics more useful? Which change metrics are good? How can cost-sensitive analysis be used? The goal is not to predict how many defects are present in a subsystem, but whether a source file is defective.

Outline: Experimental Set-Up; Assessing Classification Accuracy; Accuracy Classification Results; Cost-Sensitive Classification; Cost-Sensitive Defect Prediction; Experiment Using a Cost Factor of 5.

Data & Experimental Set-Up: a public data set from the Eclipse CVS repository (releases 2.0, 2.1, 3.0) by Zimmermann et al.; 18 change metrics concerning the change history of files; 31 static code attribute metrics that Zimmermann used at the file level (correlation analysis, logistic regression, and ranking analysis).
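As a rough illustration of how such a per-file data set might be assembled for the experiments (a sketch in Python; the file name, column names, and the CHG_/CODE_ prefixes are assumptions for illustration, not the layout of the published Zimmermann data set):

    import pandas as pd

    # Hypothetical CSV with one row per source file of an Eclipse release:
    # a defect label plus 18 change metrics and 31 static code attributes.
    data = pd.read_csv("eclipse_2.0_file_metrics.csv")  # assumed file name

    label = data["defective"]  # 1 = file had post-release defects (assumed column)
    change_cols = [c for c in data.columns if c.startswith("CHG_")]   # assumed prefix
    code_cols   = [c for c in data.columns if c.startswith("CODE_")]  # assumed prefix

    change_metrics = data[change_cols]  # the 18 change-history metrics
    code_metrics   = data[code_cols]    # the 31 static code attributes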

One Possible Proposal of Change Metrics: refactorings (renaming or moving software elements); the number of files that have been committed together with file x; the age of a file in weeks, counted from the release date back to its first appearance.
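A minimal sketch of how two such change metrics could be computed from a version-control log; the commit record format and dates below are invented for illustration, and real CVS transaction reconstruction is more involved:

    from datetime import datetime

    # Each record: (set of files committed together, commit date) -- assumed format.
    commits = [
        ({"A.java", "B.java", "C.java"}, datetime(2001, 6, 1)),
        ({"A.java"},                     datetime(2002, 1, 15)),
    ]
    release_date = datetime(2002, 6, 27)  # e.g. the Eclipse 2.0 release date

    def max_changeset(f):
        # Largest number of files ever committed together with file f.
        sizes = [len(files) for files, _ in commits if f in files]
        return max(sizes) if sizes else 0

    def age_in_weeks(f):
        # Weeks from the release date back to the file's first appearance.
        first = min(date for files, date in commits if f in files)
        return (release_date - first).days / 7

    print(max_changeset("A.java"), age_in_weeks("A.java"))  # -> 3 and about 55.9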

Experiments: build three models for predicting the presence or absence of defects in files. 1. Change Model: uses the proposed change metrics. 2. Code Model: uses static code metrics. 3. Combined Model: uses both types of metrics.
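A sketch of the three experiments, using scikit-learn's DecisionTreeClassifier as a stand-in for Weka's J48 and the hypothetical change_metrics, code_metrics, and label frames from the loading sketch above:

    import pandas as pd
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    models = {
        "change":   change_metrics,                                     # change metrics only
        "code":     code_metrics,                                       # static code metrics only
        "combined": pd.concat([change_metrics, code_metrics], axis=1),  # both
    }

    for name, features in models.items():
        tree = DecisionTreeClassifier()  # C4.5-style decision tree as a J48 stand-in
        acc = cross_val_score(tree, features, label, cv=10).mean()
        print(f"{name:8s} 10-fold CV accuracy: {acc:.2f}")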

Outline: Experimental Set-Up; Assessing Classification Accuracy; Accuracy Classification Results; Cost-Sensitive Classification; Cost-Sensitive Defect Prediction; Experiment Using a Cost Factor of 5.

Results Assessing Classification Accuracy
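For reference, the classification measures reported later (correctly classified files, recall, and false-positive rate) follow the standard confusion-matrix definitions:

    \text{accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
    \text{recall} = \frac{TP}{TP + FN}, \qquad
    \text{FP rate} = \frac{FP}{FP + TN}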

Accuracy Classification Results. By analyzing the decision trees: Defect-free files: large MAX_CHANGESET or low REVISIONS; smaller MAX_CHANGESET together with low REVISIONS and REFACTORINGS. Defect-prone files: high number of BUGFIXES.

Outline: Experimental Set-Up; Assessing Classification Accuracy; Accuracy Classification Results; Cost-Sensitive Classification; Cost-Sensitive Defect Prediction; Experiment Using a Cost Factor of 5.

Cost-Sensitive Classification: costs are associated with the different errors made by a model, and the model is trained to minimize expected cost. With a cost factor λ > 1, false negatives imply higher costs than false positives: it is more costly to fix an undetected defect in the post-release cycle than to inspect a defect-free file.
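A sketch of what a cost factor λ means in practice. The setting corresponds to an explicit cost matrix (as in Weka-style cost-sensitive classification); the scikit-learn class_weight re-weighting below is only an analogous approximation, reusing the hypothetical change_metrics and label from the earlier sketches:

    from sklearn.tree import DecisionTreeClassifier

    lam = 5  # cost factor: a false negative costs lam times a false positive

    # Cost matrix (rows = actual class, columns = predicted class):
    #                    predicted clean   predicted defective
    # actual clean             0                   1        (FP: needless inspection)
    # actual defective        lam                  0        (FN: undetected post-release defect)

    # Approximation: weight the defective class lam times more heavily,
    # biasing the tree toward avoiding false negatives.
    cost_tree = DecisionTreeClassifier(class_weight={0: 1, 1: lam})
    cost_tree.fit(change_metrics, label)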

Cost-Sensitive Defect Prediction: results for the J48 learner, release 2.0. Increasing λ raises recall but also the false-positive rate; a heuristic is used to stop increasing λ once the FP rate approaches 30%, which yields λ = 5.
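The stopping heuristic can be sketched as a simple search: raise λ step by step and stop before the cross-validated false-positive rate reaches 30% (illustrative only; it reuses the hypothetical change_metrics and label and the class_weight approximation from the previous sketch):

    from sklearn.metrics import confusion_matrix
    from sklearn.model_selection import cross_val_predict
    from sklearn.tree import DecisionTreeClassifier

    def fp_rate_for(lam):
        # Cross-validated FP rate of the cost-sensitive tree for a given cost factor.
        tree = DecisionTreeClassifier(class_weight={0: 1, 1: lam})
        pred = cross_val_predict(tree, change_metrics, label, cv=10)
        tn, fp, fn, tp = confusion_matrix(label, pred).ravel()
        return fp / (fp + tn)

    chosen = 1
    for lam in range(2, 21):          # increasing lam raises recall, but also the FP rate
        if fp_rate_for(lam) >= 0.30:  # heuristic: keep the FP rate below 30%
            break
        chosen = lam                  # the experiment reported here ends up at lam = 5
    print("chosen cost factor:", chosen)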

Experiment Using a Cost Factor of 5: defect predictors based on change data outperform those based on static code attributes, so the null hypothesis (that they do not) is rejected.

Limitations: dependence on a specific environment; conclusions are based on only three data miners; the particular choice of code and change metrics; reliability of the data (the mapping between defects and locations in source code, and the extraction of code or change metrics from repositories).

Conclusions: 18 change metrics, the J48 learner, and λ = 5 give accurate results for 3 releases of the Eclipse project: >75% correctly classified files, >80% recall, <30% FP rate. Hence, the change metrics contain more discriminatory and meaningful information about the defect distribution than the source code itself. Important change metrics: defect-prone files have high revision numbers and large bug-fixing activity; defect-free files are part of large CVS commits and have been refactored several times.

Future Research: Which information in change data is relevant for defect prediction? How can this data be extracted automatically?