Predicting Failures with Developer Networks and Social Network Analysis
Andrew Meneely et al.
Introduction
- Research question: can we predict failures at the file level?
- Importance: dramatically decrease fixing cost
- Research goal: examine human factors in failure prediction by applying social network analysis to code churn information
Introduction (cont.)
- Method: introduce file-based metrics based on SNA as additional predictors of software failures
- Case study: a mature Nortel product (over 3 million LOC); models built using failure data from 2 releases, validated against a subsequent release
- Result: the top 20% of files prioritized by one model contained 58% of the failures (optimal prioritization: 61%)
- A significant correlation exists between file-based developer network metrics and failures
Definitions of Network Metrics
- Node, connection, path
- Geodesic path (social distance): shortest path between 2 nodes
- Diameter: longest geodesic path in the network
- Connectivity: measures direct connections
- Degree: number of connections on a node
- Hub (a "well-known" developer): a node whose degree is above a threshold
- Disconnected: a node with no edges
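These definitions can be sketched on a toy undirected network. The developer names, graph shape, and hub threshold below are illustrative, not from the study:

```python
from collections import deque

# Toy undirected developer network as an adjacency map (illustrative).
graph = {
    "alice": {"bob", "carol", "dave"},
    "bob": {"alice", "carol"},
    "carol": {"alice", "bob"},
    "dave": {"alice"},
    "erin": set(),  # disconnected: no edges
}

def geodesic(graph, src, dst):
    """Length of the shortest (geodesic) path, or None if unreachable."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return None

def diameter(graph):
    """Longest geodesic path between any reachable pair of nodes."""
    nodes = list(graph)
    best = 0
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            d = geodesic(graph, u, v)
            if d is not None:
                best = max(best, d)
    return best

degree = {v: len(nbrs) for v, nbrs in graph.items()}
HUB_THRESHOLD = 3  # illustrative threshold, not from the paper
hubs = {v for v, d in degree.items() if d >= HUB_THRESHOLD}
disconnected = {v for v, d in degree.items() if d == 0}
```

Here `alice` is a hub (degree 3), `erin` is disconnected, and the diameter of the connected component is 2 (e.g. `dave` to `bob`).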
Network Metrics (cont.)
- Centrality: quantifies how closely a node is indirectly connected to the rest of the network
- Closeness of node v: the average distance from v to any other node in the network that can be reached from v
- Betweenness of node v: the number of geodesic paths that include v, divided by the total number of geodesic paths in the network
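The two centrality measures can be computed directly from these definitions on a small path graph. The graph is illustrative, and "include v" is read here as v being an interior node of the path (an assumption about the slide's wording):

```python
from collections import deque
from itertools import combinations

# Toy undirected path graph a - b - c - d (illustrative).
graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}

def bfs_dist(graph, src):
    """Geodesic distances from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def closeness(graph, v):
    """Average geodesic distance from v to the nodes reachable from v."""
    others = [d for node, d in bfs_dist(graph, v).items() if node != v]
    return sum(others) / len(others) if others else 0.0

def all_shortest_paths(graph, src, dst):
    """Every geodesic path src -> dst, via BFS predecessor lists."""
    dist, preds = {src: 0}, {src: []}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                preds[nbr] = [node]
                queue.append(nbr)
            elif dist[nbr] == dist[node] + 1:
                preds[nbr].append(node)
    if dst not in dist:
        return []
    paths = []
    def backtrack(node, suffix):
        if node == src:
            paths.append([node] + suffix)
            return
        for p in preds[node]:
            backtrack(p, [node] + suffix)
    backtrack(dst, [])
    return paths

def betweenness(graph, v):
    """Fraction of all geodesic paths whose interior contains v."""
    total = through = 0
    for s, t in combinations(graph, 2):
        for path in all_shortest_paths(graph, s, t):
            total += 1
            if v in path[1:-1]:
                through += 1
    return through / total
```

On the path a-b-c-d, closeness(a) = (1+2+3)/3 = 2.0, and b lies on the interior of 2 of the 6 geodesic paths, so betweenness(b) = 1/3.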
Get Developer Network Metrics
- Step 1: collect initial code churn information
- Step 2: construct the developer social network
- Step 3: compute developer-based metrics
- Step 4: compute file-based metrics
Step 1: code churn information
Step 2: developer social network
Step 3: developer-based metrics
Step 4: file-based metrics
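The four steps above can be sketched end to end. The churn records are made up, and lifting a developer metric to a file metric by taking the maximum over a file's contributors is one plausible aggregation, not necessarily the paper's:

```python
from collections import defaultdict

# Step 1: illustrative churn records as (file, developer) pairs.
churn = [
    ("net.c", "alice"), ("net.c", "bob"),
    ("ui.c", "bob"), ("ui.c", "carol"),
    ("db.c", "alice"),
]

# Step 2: connect two developers if they changed the same file.
devs_by_file = defaultdict(set)
for fname, dev in churn:
    devs_by_file[fname].add(dev)

network = defaultdict(set)
for devs in devs_by_file.values():
    for d in devs:
        network[d] |= devs - {d}

# Step 3: a developer-based metric -- here, degree.
degree = {d: len(nbrs) for d, nbrs in network.items()}

# Step 4: lift to a file-based metric; the max over a file's
# contributors is an assumed aggregation for illustration.
file_metric = {f: max(degree[d] for d in devs)
               for f, devs in devs_by_file.items()}
```

With these records, `bob` has degree 2 (he shares files with both `alice` and `carol`), so both files he touched inherit a file metric of 2.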
Independent and Dependent Variables
- Independent variables: the file-based developer network metrics and code churn metrics
- Dependent variables: the number of system test failures for a file; the number of post-release failures for a file
Model selection and validation
- Find the best combination of variables and a regression
- Split the data into a training set and a validation set
- Candidate regressions:
  - Number of failures for a given file: negative binomial regression and Poisson regression
  - Probability that a file had at least one failure: logistic regression
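As a minimal sketch of one candidate regression, a slope-only Poisson model λᵢ = exp(b·xᵢ) can be fit by Newton's method on the log-likelihood. The metric values x and failure counts y below are made up for illustration; this is not the paper's actual model or data:

```python
import math

# Illustrative data: one churn/network metric per file, and failure counts.
x = [1.0, 2.0, 3.0]
y = [3, 7, 20]

def grad(b):
    # Derivative of the Poisson log-likelihood sum(y_i*b*x_i - exp(b*x_i)).
    return sum(xi * (yi - math.exp(b * xi)) for xi, yi in zip(x, y))

def hess(b):
    # Second derivative; always negative, so the likelihood is concave.
    return -sum(xi * xi * math.exp(b * xi) for xi in x)

b = 0.0
for _ in range(100):  # Newton's method on the score equation
    b -= grad(b) / hess(b)

predicted = [math.exp(b * xi) for xi in x]  # expected failure counts
```

Because the log-likelihood is strictly concave, the iteration converges to the unique maximum-likelihood slope; in practice the paper fit its models in SAS rather than by hand.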
Step One: Initial model selection
- Determine:
  - combinations of candidate variables
  - transformations of variables
  - candidate regressions
  - weights of variables
- Evaluated by goodness-of-fit statistics (training error), calculated in SAS v9.1 using proc genmod
Step Two: Final model selection
- Cross-validation over a training partition and a validation partition, to catch over-fit models
- Models compared by Spearman rank correlation coefficient
- The two models with the highest average correlation coefficient and the lowest standard deviation become our final models to be validated
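The Spearman rank correlation used for model comparison is just the Pearson correlation computed on ranks (with ties given their average rank). A stdlib sketch on illustrative predicted/observed counts:

```python
def ranks(values):
    """1-based average ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of tied positions, 1-based
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(xs, ys):
    """Pearson correlation of the two rank vectors."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

predicted = [2.1, 0.3, 5.7, 1.0]  # illustrative model outputs
observed = [3, 0, 8, 1]           # illustrative failure counts
```

Here the predicted values rank the files in exactly the observed order, so the coefficient is 1.0 even though the raw counts differ; this order-only sensitivity is why it suits failure prioritization.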
Step Three: Model validation
- Evaluated against the validation set using two criteria:
  - Spearman rank correlation coefficient between the estimated values and the observed values
  - The difference between our predicted prioritization and an optimal prioritization
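The second criterion can be sketched as: rank files by predicted failures, take the top 20%, and compare the share of actual failures found there against the share found when ranking by the actual counts (the optimal prioritization). The file names and counts are illustrative:

```python
# Illustrative predicted scores and actual failure counts per file.
predicted = {"a": 4, "b": 1, "c": 9, "d": 0, "e": 2}
actual = {"a": 7, "b": 0, "c": 5, "d": 1, "e": 0}

def failures_in_top(scores, actual, fraction=0.2):
    """Fraction of all actual failures found in the top `fraction` of
    files when files are ranked by `scores`, highest first."""
    ordered = sorted(actual, key=lambda f: scores[f], reverse=True)
    top = ordered[: max(1, int(len(ordered) * fraction))]
    return sum(actual[f] for f in top) / sum(actual.values())

found = failures_in_top(predicted, actual)  # model's prioritization
optimal = failures_in_top(actual, actual)   # best possible prioritization
```

With these numbers the model's top 20% captures 5 of 13 failures while the optimal ordering captures 7 of 13; the paper's reported pair for the real data was 58% vs. 61%.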
Step Four: Further Analysis
- Evaluate how well the model works compared to a SLOC-only model
- Compare the model with a model containing only code churn metrics and no network metrics, and vice versa
- Assess network metrics as an early indicator
- Investigate possible latent factors
Case study
- An industrial product at Nortel Networks: 2,500 files out of 11,000 (3.17 million LOC total)
- System test model (Step 1): negative binomial regression
  - Degree was positively correlated with failures; closeness was negatively correlated
  - The actual beta-weights are not included
- Cross-validation (Step 2): the Spearman rank correlation coefficient for the system test model was 0.778; 60.5% of the variance was explained
Model validation (Step 3)
- Evaluated against the next release
Model validation (cont.)
- Rate of actual discovery of failures by the Nortel system test team
Compared with other models
- The same model selection and validation procedure was applied to each alternative model
Model as an Early Indicator
- To use the model early in development, perform ten-fold cross-validation using data from only the first half of the development time of release Rn+1
- Average Spearman rank correlation of 0.693 with a standard deviation of 0.02; all correlation coefficients significant (p < 0.01)
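The ten-fold cross-validation split itself is straightforward to sketch; the items below are just indices standing in for real file records:

```python
import random

def ten_fold(items, seed=0):
    """Yield (training, validation) partitions for 10-fold CV.
    Items are shuffled once, dealt into 10 folds, and each fold
    serves as the validation partition exactly once."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::10] for i in range(10)]
    for i in range(10):
        validation = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i
                    for x in fold]
        yield training, validation

splits = list(ten_fold(list(range(100))))
```

Each of the 10 splits trains on 90% of the files and validates on the held-out 10%; averaging the Spearman coefficient across the 10 validation partitions gives the 0.693 figure reported above for the early-phase data.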
Conclusion
- Our model performed well in prioritizing files based on predicted failures
- Developer networks are useful for failure prediction early in the development phase and provide a useful abstraction of the code churn data
Thank you!