Menzies/Hihn - 1
STAR: Seeking New Frontiers in Cost Modeling
Tim Menzies (WVU), Jairus Hihn (JPL), Oussama Elrawas (WVU), Karen Lum (JPL), Dan Baker (WVU)
22nd International Forum on COCOMO and Systems/Software Cost Modeling (2007)

Menzies/Hihn - 2
STAR
STAR has three key advancements over traditional methods, and even over 2cee:
- Provides an integrated set of COCOMO models:
  - COCOMO II
  - COQUALMO
  - COCOMO II Risk (threats) Assessment Model
- Can be used to systematically analyze strategic and tactical policy decisions:
  - Searches for the optimal combination of inputs that jointly reduces effort, defect rates, and threats
  - Uses constraints (free, floating, fixed) to restrict the search
- Can be tuned/calibrated with constraint sets instead of traditional historical data records:
  - Seeks stable conclusions in the space of all tunings
  - Abduction: view it as an alternative to Bayesian methods
STAR is an abductive inference engine that applies simulated annealing to a treatment learner (TAR).

Menzies/Hihn - 3
Note
This talk is an extension of material presented in "The Business Case for Automated Software Engineering":
- IEEE ASE 2007
- Menzies, Elrawas, Hihn, Feather, Madachy, Boehm
- 07casease.pdf

Menzies/Hihn - 4
Method
Stagger across the space of known tunings and inputs (Monte Carlo).
For N staggers, score the N runs by an index we call energy, built from three terms, each normalized to 0 <= x <= 1:
- Ef = (effort - minEffort) / (maxEffort - minEffort)
- De = (defects - minDefects) / (maxDefects - minDefects)
- Th = (threats - minThreats) / (maxThreats - minThreats)
Save the run with the lowest energy index.
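A minimal Python sketch of the energy score (not STAR's actual code): each objective is min-max normalized over the N runs and, as an assumption since the slides do not state the combination rule, the three terms are combined as a Euclidean distance from the ideal point (0, 0, 0).

    # Hedged sketch: normalize each objective to [0, 1] and combine into one
    # "energy" score; the Euclidean combination below is an assumption.
    from math import sqrt

    def normalize(x, lo, hi):
        """Min-max normalization to 0 <= x <= 1."""
        return (x - lo) / (hi - lo) if hi > lo else 0.0

    def energy(effort, defects, threats, mins, maxs):
        ef = normalize(effort,  mins["effort"],  maxs["effort"])
        de = normalize(defects, mins["defects"], maxs["defects"])
        th = normalize(threats, mins["threats"], maxs["threats"])
        # Assumption: combine the three terms as distance from the ideal (0,0,0).
        return sqrt(ef**2 + de**2 + th**2)

    def best_run(runs, mins, maxs):
        """Keep the run (a dict of effort/defects/threats) with the lowest energy."""
        return min(runs, key=lambda r: energy(r["effort"], r["defects"], r["threats"], mins, maxs))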

Menzies/Hihn - 5
How to Stagger
Simulated annealing (Kirkpatrick et al., 1983):
- Pick input ranges and internal values at random.
- Do many runs, starting "boiling hot" (when you stagger around like a drunk) and ending "cooler" (no staggering; walk straight to your destination).
Keep track of multiple solutions:
- Current
- New
- Best
[Figure: energy of sample runs from STAR, sorted from bad to good; after 500 runs there is little improvement, and the best 10% are marked.]
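A minimal simulated-annealing sketch in Python that keeps the current, new, and best solutions; the neighbour function, energy score, linear cooling schedule, and 500-run budget are illustrative assumptions rather than STAR's implementation.

    # Hedged sketch of the simulated-annealing "stagger"; not STAR's actual code.
    # `score` is any energy function (lower is better); `neighbour` perturbs one
    # randomly chosen input or tuning within its allowed range.
    import math, random

    def anneal(initial, neighbour, score, kmax=500):
        current = best = initial
        e_current = e_best = score(initial)
        for k in range(kmax):
            temp = 1.0 - k / kmax            # "boiling hot" at the start, cool at the end
            new = neighbour(current)
            e_new = score(new)
            # Always accept improvements; accept worse moves with a probability
            # that shrinks as the temperature drops (the "drunken stagger").
            if e_new < e_current or random.random() < math.exp((e_current - e_new) / max(temp, 1e-6)):
                current, e_current = new, e_new
            if e_current < e_best:           # remember the best solution ever seen
                best, e_best = current, e_current
        return best, e_best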

Menzies/Hihn - 6
Staggering the Tunings
Range of effort multipliers (COCOMO):
- In COCOMO effort estimation, the effort multipliers are straight(ish) lines.
- When EM = 3 (nominal), an attribute multiplies effort by one (i.e. changes nothing), so the lines pass through the point {3, 1}.
- cplx, data, docu, pvol, rely, ruse, stor, time increase effort.
- acap, apex, ltex, pcap, pcon, plex, sced, site, tool decrease effort.
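A hedged sketch, in Python, of treating an effort multiplier as a straight line through the point {3, 1} and staggering its slope within a range; the slope ranges shown are placeholders, not the published COCOMO II tunings.

    # Hedged sketch: one way to represent a COCOMO effort multiplier as a line
    # through (rating = 3, multiplier = 1) and to "stagger" its tuning by
    # sampling the slope from a range. Slope ranges are placeholders.
    import random

    def effort_multiplier(rating, slope):
        """Linear effort multiplier: equals 1 at the nominal rating of 3."""
        return 1.0 + slope * (rating - 3)

    def stagger_slope(lo, hi):
        """Pick a tuning at random from its allowed range."""
        return random.uniform(lo, hi)

    # Illustrative use: cplx-like attributes get positive slopes (more effort as
    # the rating rises); acap-like attributes get negative slopes (less effort).
    cplx_em = effort_multiplier(5, stagger_slope(0.05, 0.20))   # > 1: increases effort
    acap_em = effort_multiplier(5, stagger_slope(-0.20, -0.05)) # < 1: decreases effort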

Menzies/Hihn - 7
After staggering, select the best things
Sort all ranges by their "goodness":
- Try the first-ranked range,
- then the first and second,
- then the first, second, and third,
- and so on.
Seek the "policy": the fewest ranges that most reduce threats, effort, and defects.
[Figure: ranges sorted from bad to good; roughly 22 good ideas versus 38 not-so-good ideas.]
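A hedged sketch of this forward-selection step; it assumes the ranges are already sorted by goodness and that evaluate(policy) re-runs the Monte Carlo simulation with the policy imposed and returns a score where lower is better.

    # Hedged sketch of "select the best things": grow the policy one ranked
    # range at a time and keep the smallest prefix that scores best.
    def grow_policy(sorted_ranges, evaluate):
        best_policy, best_score = [], float("inf")
        policy = []
        for rng in sorted_ranges:
            policy = policy + [rng]          # first; then first+second; then first+second+third; ...
            score = evaluate(policy)
            if score < best_score:           # strict "<" so ties favour the shorter policy
                best_policy, best_score = list(policy), score
        return best_policy, best_score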

Menzies/Hihn - 8
Staggering the inputs: 5 different ways
1. COCOMO II: stagger over the entire model input space.
- "Values" = fixed
- "Ranges" = loose (select within these ranges)
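A hedged sketch of one way to express such input constraints: each attribute is either fixed to a single value or given a loose range that is sampled on every run. The attribute names and ranges are illustrative, not a real project specification.

    # Hedged sketch of staggering the inputs: fixed values versus loose ranges.
    import random

    project_constraints = {
        "rely": 5,            # fixed value
        "cplx": (4, 6),       # loose range: select within it on each run
        "acap": (3, 5),
        "sced": 3,
    }

    def sample_inputs(constraints):
        """Draw one concrete set of model inputs from the constraint set."""
        return {
            name: random.randint(*spec) if isinstance(spec, tuple) else spec
            for name, spec in constraints.items()
        }

    one_run_inputs = sample_inputs(project_constraints)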

Menzies/Hihn - 9
Making Strategic Decisions
[Table: model inputs shown twice - once over the full range of the model, and once constrained by Jairus' guess at the JPL environment.]

Menzies/Hihn - 10
Results: OSP
One advantage of this output display:
- If you can't accept the full policy...
- ...you can see what trade-offs arise with some partial policy.
But partial policies cannot include many choices. For example, note the missing values:
- Peer reviews < 6
- Execution testing & tools < 6
- Automated analysis < 5

Menzies/Hihn - 11
Results: OSP2
OSP2 was a more constrained environment: as a follow-on from OSP, it "inherited" the
- team,
- development environment,
- design,
- etc.
Again, note the missing values:
- Peer reviews < 6
- Execution testing & tools < 6
- Automated analysis < 6

Menzies/Hihn - 12
Results: all experiments
No point in half-hearted defect removal. Never found in any policy:
- Peer reviews at 1, 2, 3, 4
- Execution testing & tools at 1, 2, 3, 4
- Automated analysis at 1, 2, 3, 4
Beware spurious generalities:
- X = one of {cocomo, osp, osp2, flight, ground}
- Y = one of {cocomo, osp, osp2, flight, ground}
- Not(X = Y)
- X's best policy is not Y's best policy.
Exception: use more automated analysis (model checking, etc.)
- Automated analysis = 5 or 6 is always in the best policy.

Menzies/Hihn - 13
Calibrating/Tuning Models
Traditional approach: current cost models are tuned to local contexts with LC (Boehm, 1981).
- Tuned to local data using LC.
- Hard to tell when old data is no longer locally relevant.
- Suffers from the "large outlier problem"; row pruning is done heuristically.
Next step: 2CEE (Menzies, Jalali, Baker, Hihn, Lum, 2007).
- Tuned to local data using LC; tunes and validates every time it runs.
- Tames outliers primarily with column pruning.
- Uses nearest neighbor for row pruning: not all flight software is equal, and old data that is no longer relevant is culled.
- Both of these approaches require you to get more data, which may be hard to obtain.
STAR:
- Current research results suggest that we may be able to estimate almost as well without local data and LC, using estimated vs. actual effort instead of energy as the evaluation metric.
- Constrain parameter ranges based on the project being estimated and knowledge of what typically varies in your environment.
- Assumes the basic COCOMO tunings are "representative", which seems reasonable.
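A hedged sketch of nearest-neighbor row pruning, the 2CEE-style idea of calibrating only on historical projects similar to the one being estimated; the distance metric, the lack of feature scaling, and k = 10 are assumptions, not 2CEE's exact settings.

    # Hedged sketch: keep only the historical rows closest to the new project,
    # then pass those rows to an LC-style local calibration.
    import math

    def distance(a, b, features):
        return math.sqrt(sum((a[f] - b[f]) ** 2 for f in features))

    def nearest_rows(history, new_project, features, k=10):
        """Keep the k historical rows most similar to the project being estimated."""
        ranked = sorted(history, key=lambda row: distance(row, new_project, features))
        return ranked[:k]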

Menzies/Hihn - 14
Comparisons
MRE = abs(predicted - actual) / actual
Diff = ∑ mre(lc) / ∑ mre(star)
"(same)" = statistically the same at the 95% or 99% confidence level (Mann-Whitney U test)

  ∑ mre(lc) / ∑ mre(star)    strategic       tactical
  ground                     66%             63%
  all                        91%             75%
  OSP2                       99%             125% (same)
  OSP                        112% (same)     111% (same)
  flight                     101% (same)     121% (same)

Very little difference:
- Half the time: insignificantly different.
- Otherwise, median diffs = +/- 25%.
Why so little difference?
- The most influential inputs are tightly constrained.
- Most of the variance comes from uncertainty in the SLOC, not from the noise of internal staggering.
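A hedged Python sketch of these comparison statistics: MRE per project, the lc-to-star ratio of summed MREs, and a Mann-Whitney U test for "statistically the same" at the 95% and 99% levels. The data handling is illustrative only.

    # Hedged sketch of the comparison metrics; not the paper's evaluation code.
    from scipy.stats import mannwhitneyu

    def mre(predicted, actual):
        return abs(predicted - actual) / actual

    def compare(lc_preds, star_preds, actuals):
        lc_mres   = [mre(p, a) for p, a in zip(lc_preds, actuals)]
        star_mres = [mre(p, a) for p, a in zip(star_preds, actuals)]
        diff = sum(lc_mres) / sum(star_mres)            # e.g. 1.12 reads as 112%
        _, p_value = mannwhitneyu(lc_mres, star_mres, alternative="two-sided")
        same_95 = p_value > 0.05                        # cannot distinguish at 95% confidence
        same_99 = p_value > 0.01                        # cannot distinguish at 99% confidence
        return diff, same_95, same_99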