Software Prediction Models: Forecasting the costs of software development


Prediction Study Outcomes Vary
- Estimation-by-analogy beats regression
  – Or not
- Classification and regression trees (CART) beats regression
  – Or not
- Artificial neural networks beat regression
  – Or not

Why Are The Results Conflicting?
- Poor data or research procedure
- Complex techniques may require expert users; hence applications may vary
- Small sample size
- Measurement process that is flawed
- Selective use of differing parameters may result in different rankings

Key Terms
- Accuracy indicator
  – Some measure of a process
  – A summary statistic based on that measure
- Leave-one-out cross-validation
- Arbitrary function approximator taxonomy
  – Many-data versus sparse-data
  – Linear versus nonlinear
  – Supervised versus unsupervised
- Reliability versus validity
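Leave-one-out cross-validation can be illustrated with a short Python sketch. Each observation is predicted by a model fitted on all the other observations. The simple least-squares line, the function names, and the project data below are assumptions made for illustration; the slides do not prescribe a particular model.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def leave_one_out_predictions(xs, ys):
    """Predict each observation from a model fitted on all the others."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]  # every observation except the i-th
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        predictions.append(a + b * xs[i])
    return predictions


# Hypothetical project data: size (e.g. function points) and effort (person-hours).
sizes = [100.0, 200.0, 300.0, 400.0, 500.0]
efforts = [210.0, 390.0, 620.0, 790.0, 1010.0]
loo_predictions = leave_one_out_predictions(sizes, efforts)
```

The resulting predictions can then be compared with the actual values using any accuracy indicator.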

Indicator 1: MMRE
Mean magnitude of relative error (MMRE) is the average of the MRE values, where MRE = |actual - prediction| / actual
Claimed advantages of MMRE
– Compare across data sets*
– Independent of units
– Compare across differing prediction models*
– Scale independence
*A hypothesis challenged by this paper
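As a minimal sketch of the MMRE definition, the Python below computes MRE for each project and averages the results. The function names and sample values are hypothetical.

```python
def mre(actual, prediction):
    """Magnitude of relative error: |actual - prediction| / actual."""
    return abs(actual - prediction) / actual


def mmre(actuals, predictions):
    """Mean of the MRE values over all observations."""
    values = [mre(a, p) for a, p in zip(actuals, predictions)]
    return sum(values) / len(values)


# Hypothetical actual and predicted efforts for three projects.
actuals = [100.0, 200.0, 400.0]
predictions = [80.0, 250.0, 400.0]

# MRE values are 0.2, 0.25 and 0.0, so the mean is about 0.15.
result = mmre(actuals, predictions)
```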

Indicator 2: MER
Magnitude of the error relative to the estimate (MER) is defined as MER = |actual - prediction| / prediction
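MER differs from MRE only in its denominator, so the two indicators penalise under- and over-estimates differently. The sketch below uses hypothetical values to show this asymmetry; the function names are illustrative.

```python
def mre(actual, prediction):
    """Error relative to the actual value."""
    return abs(actual - prediction) / actual


def mer(actual, prediction):
    """Error relative to the estimate (the prediction)."""
    return abs(actual - prediction) / prediction


actual = 100.0

# Underestimate by half: MRE is 0.5, but MER is 1.0.
under = (mre(actual, 50.0), mer(actual, 50.0))

# Overestimate by a factor of two: MRE is 1.0, but MER is 0.5.
over = (mre(actual, 200.0), mer(actual, 200.0))
```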

Indicator 3: AR
The absolute residual (AR) is defined as AR = |actual - prediction|
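Unlike a relative measure, AR is expressed in the units of the data. The sketch below illustrates this with hypothetical effort values. The 8-hours-per-day conversion factor is an assumption made only for this example.

```python
def absolute_residual(actual, prediction):
    """AR = |actual - prediction|, expressed in the units of the data."""
    return abs(actual - prediction)


def mre(actual, prediction):
    """MRE = |actual - prediction| / actual, a unitless ratio."""
    return abs(actual - prediction) / actual


HOURS_PER_DAY = 8.0  # assumed conversion factor, for illustration only

actual_hours, predicted_hours = 400.0, 320.0
actual_days = actual_hours / HOURS_PER_DAY
predicted_days = predicted_hours / HOURS_PER_DAY

ar_hours = absolute_residual(actual_hours, predicted_hours)  # in person-hours
ar_days = absolute_residual(actual_days, predicted_days)     # in person-days

# AR changes with the unit of measurement, whereas MRE does not.
mre_hours = mre(actual_hours, predicted_hours)
mre_days = mre(actual_days, predicted_days)
```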

Other Measures
- Standard deviation (SD)
- Relative standard deviation (RSD)
- Log standard deviation (LSD)
- Balanced relative error (BRE)
- Inverted balanced relative error (IBRE)

Standard Deviation of Residuals, Denoted SD
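Only the slide title survives in the transcript. As a hedged sketch, the Python below assumes one conventional form, SD = sqrt(sum((actual_i - prediction_i)^2) / (n - 1)). Both the n - 1 denominator and the choice not to centre the residuals on their mean are assumptions, and the slide may have used a different form.

```python
import math


def sd_of_residuals(actuals, predictions):
    """Square root of the summed squared residuals divided by n - 1.

    Assumed form; the residuals are not centred on their mean.
    """
    residuals = [a - p for a, p in zip(actuals, predictions)]
    n = len(residuals)
    if n < 2:
        raise ValueError("need at least two observations")
    return math.sqrt(sum(r * r for r in residuals) / (n - 1))


# Hypothetical data: residuals are 2, 0 and -2.
sd = sd_of_residuals([10.0, 20.0, 30.0], [8.0, 20.0, 32.0])
```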

Algebraic Simplification

Relative Standard Deviation (RSD)

Log Standard Deviation (LSD)

Balanced Relative Error (BRE)

Inverted Balanced Relative Error (IBRE)
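The transcript keeps only the titles of the BRE and IBRE slides. The sketch below assumes definitions commonly used in the effort-estimation literature: BRE divides the absolute residual by the smaller of the actual and predicted values, and IBRE divides it by the larger. These definitions are assumptions and may differ from the formulas on the slides.

```python
def bre(actual, prediction):
    """Balanced relative error (assumed form): |actual - prediction| / min(actual, prediction)."""
    return abs(actual - prediction) / min(actual, prediction)


def ibre(actual, prediction):
    """Inverted balanced relative error (assumed form): |actual - prediction| / max(actual, prediction)."""
    return abs(actual - prediction) / max(actual, prediction)


actual = 100.0

# Under these definitions, halving and doubling the actual value are penalised equally.
bre_under, bre_over = bre(actual, 50.0), bre(actual, 200.0)
ibre_under, ibre_over = ibre(actual, 50.0), ibre(actual, 200.0)
```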