Effort Estimation Models for Contract Cost Proposal Evaluation


1 Effort Estimation Models for Contract Cost Proposal Evaluation
Authors: Wilson Rosa, Corinne Wallshein, Nicholas Lanham. Co-authors: Barry Boehm, Ray Madachy, Brad Clark. Good morning, everyone. This presentation will introduce a set of estimating models for predicting software development effort at an early phase. October 18-20, 2016

2 Outline
Introduction
Experimental Design and Data Analysis
Descriptive Statistics
Effort Estimation Models
Conclusion
Future Work

3 Introduction Next, I will address the analytical method

4 Problem Statement
In the DoD, popular size measures are not available during contract proposal evaluation or earlier phases:
Use Case Points
Story Points
Function Points
Most proprietary cost models are based on final size (reported at contract end), which may lead to estimation error if not adjusted for growth.

5 Proposed Solution
Publicize effort estimation models to crosscheck a bidder's proposed effort using input variables often available during the contract bidding phase or earlier:
Estimated Software Requirements
Estimated Peak Staff
Super Domain
Process Maturity (PMAT)
Development Process (Agile, Waterfall, etc.)
Scope (New vs. Enhancement)

6 Research Questions: Effort Estimation Models
1) Are Estimated Software Requirements and Estimated Peak Staff, provided at contract start, valid predictors of Actual Effort?
2) Does accuracy improve when one or more of the following variables are added?
Super Domain
Process Maturity (PMAT)
Development Process
Scope (New vs. Enhancement)
3) Are the proposed estimation models applicable to Agile projects?

7 Experimental Design Next, I will address the analytical method

8 Instrumentation
Questionnaire: Software Resource Data Report (SRDR), DD Form 2630. Source: Cost Assessment Data Enterprise (CADE) website. Content: allows for the collection of project context, company information, requirements, product size, effort, schedule, and quality.
Where did the data come from? All the data used in this study were reported and collected via DCARC using the SRDR. The developer submits two SRDR forms: the initial form contains estimates and is submitted at the beginning of the project; the final form contains the actual data and is submitted at the completion/delivery of the project. Over the past few years the SRDR forms have become the primary data source for software cost estimation. They have also proven useful for root cause analysis, by examining the initial versus final cost, schedule, and size metrics.

9 Dataset Used in Study
Empirical data from recent US DoD programs:
203 paired SRDR records from the Cost Assessment Data Enterprise (CADE)
4 additional SRDR records* (Agile software projects) from CADE
4 additional paired records (Agile software projects) from a proprietary source
Each paired record includes the SRDR Initial Developer Report (estimates) and the SRDR Final Developer Report (actuals).
211 records analyzed in this study
*Fixed Price Incentive Fee (FPIF) contract type

10 Dataset by Development Process
Incremental and Spiral are the most commonly used processes in DoD

11 Dataset by Operating Environment
ERP = Enterprise Resource Planning; AIS = Automated Information System; C4I = Command, Control, Communications, Computers, and Intelligence

12 Dataset by Delivery Year

13 Data Normalization and Analysis Workflow
Dataset normalized to “account for sizing units, application complexity, and content so they are consistent for comparisons” (source: GAO).
Workflow: Estimated Software Requirements, Super Domain Grouping, Variable Selection, Regression Analysis, Model Selection.
Once the data were inspected, the next major step was to normalize them so they are consistent for comparisons. The steps you see here are consistent with the GAO Best Practice Checklist. I'll expand on these in the next few slides.

14 Estimated Software Requirements
FORMULA: Estimated Software Requirements = Functional Requirements + External Interface Requirements
MEASURE: Functional Requirements are counted as “shall” statements contained in the baseline Software Requirements Specification (SRS); External Interface Requirements are counted as “shall” statements contained in the baseline Interface Requirements Specifications (IRS).
SOURCE: Initial SRDR Report (for both counts).

15 Super Domain Grouping Approach
Dataset initially mapped into 17 Application Domains*, then into 4 complexity groups called Super Domains:
Mission Support (SUPP): Software Tools; Training
Automated Information System (AIS): Enterprise Information System; Enterprise Services; Custom AIS Software; Mission Planning
Engineering (ENG): Test, Measurement, and Diagnostic Equipment; Scientific & Simulation; Process Control; System Software
Real Time (RTE): Command & Control; Communications; Real Time Embedded; Vehicle Control/Payload; Signal Processing; Microcode & Firmware
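A minimal lookup sketch of this grouping, assuming the domain-to-group assignments as read from the table above (the dictionary keys and function name are illustrative, not an official schema):

```python
# Illustrative mapping of application domains into the four Super Domains,
# as read from the table above.
SUPER_DOMAIN = {
    "Software Tools": "SUPP", "Training": "SUPP",
    "Enterprise Information System": "AIS", "Enterprise Services": "AIS",
    "Custom AIS Software": "AIS", "Mission Planning": "AIS",
    "Test, Measurement, and Diagnostic Equipment": "ENG",
    "Scientific & Simulation": "ENG", "Process Control": "ENG",
    "System Software": "ENG",
    "Command & Control": "RTE", "Communications": "RTE",
    "Real Time Embedded": "RTE", "Vehicle Control/Payload": "RTE",
    "Signal Processing": "RTE", "Microcode & Firmware": "RTE",
}

def super_domain_code(application_domain: str) -> int:
    """Return the numeric Super Domain code used later in the models
    (1 = Mission Support, 2 = AIS, 3 = Engineering, 4 = Real Time)."""
    return {"SUPP": 1, "AIS": 2, "ENG": 3, "RTE": 4}[SUPER_DOMAIN[application_domain]]
```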

16 Super Domain Grouping Count
Record counts by Operating Environment (Aircraft, AIS/ERP, C4I, Missile, Ordnance, Satellite, UAV, Ship) and Super Domain (Support, AIS, Engineering, Real Time); for example, Aircraft contributes 45 records and C4I contributes 85, out of 211 records in total.

17 Variable Selection
Pairwise correlation is used to select independent variables; stepwise analysis is used to select categorical variables.
Correlation analysis: the dependent variable is Actual Effort; candidate independent variables are Estimated Software Requirements, Estimated Effort, Estimated Peak Staff, and Estimated Duration. The selected independent variables form the original effort equation.
Stepwise Analysis 1: candidate categorical variables are Process Maturity, Development Process, Super Domain, and Scope (New vs. Enhancement). The selected categorical variables are carried into the regression analysis.
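A sketch of how this selection step could be reproduced, assuming a paired-record data frame with hypothetical column names and power-form (log-log) regressions fit with statsmodels; the file name and columns are illustrative, not from the presentation:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical paired-record dataset; column names are made up for illustration.
df = pd.read_csv("srdr_paired_records.csv")

# Pairwise correlation: screen continuous predictors of Actual Effort.
candidates = ["est_requirements", "est_peak_staff", "est_effort", "est_duration"]
print(df[candidates + ["actual_effort"]].corr()["actual_effort"])

# Stepwise check: add one categorical variable at a time to the original
# equation (log-log form, matching the multiplicative models shown later).
base = "np.log(actual_effort) ~ np.log(est_requirements) + np.log(est_peak_staff)"
for cat in ["process_maturity", "dev_process", "super_domain", "scope"]:
    fit = smf.ols(f"{base} + C({cat})", data=df).fit()
    print(f"{cat}: R2 = {fit.rsquared:.2f}")
```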

18 Pairwise Correlation Analysis
Conclusion: effort estimation models may include Estimated Software Requirements and Estimated Peak Staff as independent variables, as these are strongly correlated with Actual Effort.
Variables correlated against Actual Effort: Estimated Software Requirements, Estimated Effort, Estimated Peak Staff, Estimated Schedule, Super Domain, Requirements Volatility, Process Maturity, Development Process, and Scope. Reported correlations range from 0.8 down to -0.2 (legend: strong, moderate, weak, not applicable).

19 Effort Estimation Equation
Stepwise Analysis 1:
Original equation: aEffort = ƒ(REQ, STAFF); R² 65%, MMRE 72%
SD (partial correlation 0.6, t-stat 10.4, p-value 0.0000): aEffort = ƒ(REQ, STAFF, SD); R² 77%, MMRE 53%
RVOL (partial correlation 0.3, t-stat 3.8, p-value 0.0002): aEffort = ƒ(REQ, STAFF, RVOL); R² 67%
SCOPE (partial correlation 0.1, t-stat 1.3, p-value 0.1902): aEffort = ƒ(REQ, STAFF, SCOPE); R² 71%
PROC (p-value 0.2064): aEffort = ƒ(REQ, STAFF, PROC); R² 64%, MMRE 73%
PMAT (t-stat 0.9, p-value 0.3723): aEffort = ƒ(REQ, STAFF, PMAT); R² 63%, MMRE 76%
Where:
aEffort = Actual Effort (in person-months) at contract completion
REQ = Estimated Software Requirements at contract start
STAFF = Estimated Peak Staff at project initiation
RVOL = Requirements Volatility Rating (VL=1, L=2, N=3, H=4, VH=5)
SD = Super Domain (1 if Support, 2 if AIS, 3 if Engineering, 4 if Real Time)
SCOPE = 0 if Enhancement, 1 if New
PROC = Development Process (0 if other, 1 if Agile, 2 if Waterfall)
Conclusion: SD and RVOL may be included to further improve the original equation. PROC, PMAT, and SCOPE are not included since the original equation's R²/MMRE did not improve with this data set.

20 Model Selection
Model selection based on t-statistics, lowest MMRE, and lowest CV.
Coefficient of Variation (CV): percentage expression of the standard error compared to the mean of the dependent variable; a relative measure allowing direct comparison among models.
P-value (α): level of statistical significance established through the coefficient alpha (p ≤ α).
Variance Inflation Factor (VIF): indicates whether multicollinearity (correlation among predictors) is present in a multiple regression analysis.
Coefficient of Determination (R²): shows how much variation in the dependent variable is explained by the regression equation.
Mean Magnitude of Relative Error (MMRE): a low MMRE indicates high accuracy. MMRE is the sample mean of the magnitude of relative error (MRE); MRE is the absolute value of the difference between actual and estimated effort divided by the actual effort.
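A small sketch of the two accuracy measures as defined above (function names are illustrative):

```python
import numpy as np

def mmre(actual, estimated):
    """Mean Magnitude of Relative Error: mean of |actual - estimated| / actual."""
    actual, estimated = np.asarray(actual, float), np.asarray(estimated, float)
    return float(np.mean(np.abs(actual - estimated) / actual))

def cv_percent(std_error, mean_of_dependent):
    """Coefficient of Variation: standard error as a percentage of the mean
    of the dependent variable, allowing direct comparison among models."""
    return 100.0 * std_error / mean_of_dependent
```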

21 Descriptive Statistics

22 Software Size vs Peak Staff
Staffing Level appears to influence Software Size

23 Productivity (Median) vs Peak Staff
Larger teams appear to be less productive.
*Productivity measured as Actual Hours per Estimated Software Requirements
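Written out, the productivity measure used in this chart is (per the footnote above):

$$\text{Productivity} = \frac{\text{Actual Hours}}{\text{Estimated Software Requirements}}$$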

24 Productivity (Median) vs Super Domain
Super Domain Type appears to influence software productivity

25 % Effort Overrun (Median) vs Super Domain
% Effort Overrun = percent change from Estimated Effort (at contract start) to Actual Effort (at contract end). Super Domain type appears to influence effort overruns.
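As a formula (a standard percent-change form, consistent with the definition above):

$$\%\,\text{Effort Overrun} = \frac{\text{Actual Effort} - \text{Estimated Effort}}{\text{Estimated Effort}} \times 100$$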

26 Effort Estimation Models
Practicality: useful for crosschecking a contractor's estimated effort during proposal evaluation; based on parameters available at an early phase.

27 Effort Model Variables
Actual Effort (dependent): actual software engineering effort (in hours) at contract completion.
Estimated Software Requirements (independent): sum of Functional Requirements and External Interface Requirements estimated at contract award; counting convention based on “shall” statements.
Estimated Peak Staff (independent): estimated peak team size at contract award, measured in full-time equivalent staff.
Super Domain (categorical): software primary application; four types: Mission Support, AIS, Engineering, or Real Time.
Requirements Volatility (categorical): change in software requirements from contract award to completion; five categories: Very Low, Low, Nominal, High, or Very High.

28 Effort Estimation Models (Entire Dataset)
Where:
aEffort = Actual Effort at contract completion
REQ = Estimated Software Requirements at contract start
STAFF = Estimated Peak Staff at project initiation
RVOL = Requirements Volatility Rating (VL=1, L=2, N=3, H=4, VH=5)
SD = Super Domain (1 if Mission Support, 2 if AIS, 3 if Engineering, 4 if Real Time)
Coefficient statistics:
Model 1a: aEffort = 1562 x REQ^0.6144 (N 211, R² 68%, CV 61%, Mean 68669, MMRE 63%, REQ range 2 to 5254)
Model 1b: aEffort = 1576 x REQ x STAFF^0.5181 (N 209, R² 78%, CV 50%, Mean 69118, MMRE 53%)
Model 1c: aEffort = 368 x REQ^0.4369 x STAFF x SD^0.9788 (R² 84%, CV 37%, MMRE 43%)
Model 1d: aEffort = 371 x REQ x STAFF^0.541 x SD^0.9219 x RVOL^0.1701 (CV 38%, MMRE 42%)
T-statistics: REQ 14.7 / 8.9 / 12.0 / 10.9 (Models 1a-1d); STAFF 10.0 / 12.1 / 12.5 (Models 1b-1d); SD 10.7 / 9.8 (Models 1c-1d); RVOL 2.3 (Model 1d); Intercept 4.1, 5.0, 4.6.
The model's accuracy improves as STAFF and SD are gradually added.
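A minimal crosscheck sketch in Python using Model 1a, the only equation whose coefficient and exponent are both fully legible in this transcript; the example proposal numbers are made up:

```python
def effort_model_1a(req: float) -> float:
    """Model 1a (entire dataset): aEffort = 1562 * REQ**0.6144.

    req is the Estimated Software Requirements at contract start (observed
    range on the slide: roughly 2 to 5254); returns the predicted actual
    effort (in hours, per the variable definitions on the previous slide).
    """
    return 1562.0 * req ** 0.6144

# Models 1b-1d follow the same multiplicative shape (a * REQ**b * STAFF**c * ...);
# they are omitted here because some of their exponents are not legible above.

# Example crosscheck of a bidder's proposed effort (illustrative numbers only):
proposed_hours = 45_000
predicted_hours = effort_model_1a(req=350)
print(f"Model 1a predicts {predicted_hours:,.0f} h vs. proposed {proposed_hours:,} h")
```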

29 Effort Estimation Models (Agile Subset)
Where:
aEffort = Actual Effort at contract completion
REQ = Estimated Software Requirements at contract start
STAFF = Estimated Peak Staff at project initiation
SD = Super Domain (1 if Mission Support, 2 if AIS, 3 if Engineering, 4 if Real Time)
Coefficient statistics:
Model 2a: aEffort = 1705 x REQ^0.5682 (N 15, R² 74%, CV 46%, Mean 69169, MMRE 47%, REQ range 50 to 4867)
Model 2b: aEffort = 1576 x REQ x STAFF^0.5181 (R² 75%, CV 36%, MMRE 40%)
Model 2c: aEffort = 311 x REQ^0.4683 x STAFF x SD^1.122 (R² 89%, CV 22%, MMRE 25%)
T-statistics: Intercept 1.4 / 1.2 / 1.7 (Models 2a-2c); REQ 4.3 / 2.6 / 4.6; STAFF 1.86 / 2.9 (Models 2b-2c); SD 5.2 (Model 2c).
Equations 2a-2c are specifically for Agile software projects. The model's accuracy improves as STAFF and SD are gradually added.
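The same crosscheck sketch for the Agile subset, again limited to the fully legible single-variable form (Model 2a); the input range comes from the table above:

```python
def agile_effort_model_2a(req: float) -> float:
    """Model 2a (Agile subset, n = 15): aEffort = 1705 * REQ**0.5682.

    Intended for Agile projects with roughly 50 to 4867 estimated software
    requirements; returns predicted effort at contract completion.
    """
    return 1705.0 * req ** 0.5682
```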

30 Conclusion

31 Primary Findings
Estimated Software Requirements and Estimated Peak Staff, provided at contract start, are good predictors of software development effort.
Effort estimation models' accuracy improves when categorical variables are gradually added, especially Super Domain.
Cost analysts should take into account software domain type and team size when assessing software productivity.

32 Model Usefulness
Proposed effort models may be used to either crosscheck or validate contract proposals, since the input parameters used in the study are typically available during the proposal evaluation phase.
Proposed effort models are also applicable to Agile projects, as the dataset includes 15 projects developed using Agile methods.

33 Study Limitations Since data was collected at the CSCI level, the estimation models may not be appropriate for projects reported at the Roll-Up Level. Do not use Proposed Models if your input parameter is outside of the model range.
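A simple guard that follows this advice; the default bounds below are the REQ minimum and maximum reported for Model 1a, and the function name is illustrative:

```python
def check_req_in_model_range(req: float, req_min: float = 2, req_max: float = 5254) -> None:
    """Refuse to apply the proposed models when the input requirement count
    falls outside the range observed in the calibration dataset."""
    if not (req_min <= req <= req_max):
        raise ValueError(
            f"Estimated requirements {req} is outside the model range "
            f"[{req_min}, {req_max}]; do not use the proposed models."
        )
```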

34 Backup

35 Effort Growth (Median) by Super Domain
Real Time and Engineering applications appear to experience more effort overruns

36 Productivity (Median) vs Super Domain
Productivity = Actual Hours per Estimated Software Requirements

37 Requirements (Median) vs Peak Staff
*Number of Estimated Software Requirements

