
Using Correlation and Accuracy for Identifying Good Estimators: 4th International Predictor Models in Software Engineering (PROMISE) Workshop


1 Using Correlation and Accuracy for Identifying Good Estimators
  Gary D. Boetticher, Nazim Lokhandwala
  Univ. of Houston - Clear Lake, Houston, TX, USA
  boetticher@uhcl.edu, Lokhandwala@uhcl.edu
  The 4th International Predictor Models in Software Engineering (PROMISE) Workshop
  http://nas.cl.uh.edu/boetticher/publications.html

2 Research vs. Reality according to Jørgensen
  TSE ’07: 300+ software est. papers, 76 journals, 15+ years
                 -89   89-99   00-04   Total
    Algorithm     48     137      70     255
    ML             1      32      41      74
    Human          3      22      21      46
    Misc.          7      19      26      52
  68% Algorithm, 20% ML, 12% Human
  JSS ’04: Compendium of expert estimation studies
    Paper            Human
    Hihn 91            89%
    Heemstra 91        62%
    Paynter 96         86%
    Jørgensen 97       84%
    Hill 00           100%
    Kitchenham 02      72%
  82% Human, 18% Formal
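The summary percentages on this slide can be reproduced from the counts it lists. The sketch below is only a cross-check, with hypothetical variable and function names. It assumes the Misc. row is left out of the first set of percentages (68/20/12 only come out that way) and that the 82% figure is the plain average of the six per-paper values.

```python
# Paper counts per estimation approach, copied from the slide
# (periods: -89, 89-99, 00-04). The Misc. row is deliberately omitted:
# that is an assumption needed to reproduce the 68/20/12 split.
paper_counts = {
    "Algorithm": [48, 137, 70],
    "ML": [1, 32, 41],
    "Human": [3, 22, 21],
}

# Share of human-based estimation reported by each expert-estimation study.
human_use_percent = [89, 62, 86, 84, 100, 72]


def approach_shares(counts):
    """Return each approach's share of the listed papers as a rounded percent."""
    totals = {name: sum(per_period) for name, per_period in counts.items()}
    grand_total = sum(totals.values())
    return {name: round(100 * total / grand_total) for name, total in totals.items()}


shares = approach_shares(paper_counts)
mean_human_use = round(sum(human_use_percent) / len(human_use_percent))

print(shares)
print(mean_human_use)
```

The gap between the two numbers for "Human" (a small share of the research papers versus a large share of reported practice) is the contrast the slide title points at.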

3 Statement of Problem (Some Background 2006, http://www.starwarscrawl.com/?id=232)
  ((Log (TechGradCourses + (TechGradCourses ^ ((Log TotWShops)/(Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (Log (Log (Log SWProjEstExp))))))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (((ProcIndExp + (Log (Sin MgmtGradCourses)))/(Sin SWPMExp)) + (Sin ((Cos (TechGradCourses ^ ((ProcIndExp + (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Sin SWPMExp)))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Cos (TechGradCourses ^ ((Log SWProjEstExp) / (((Log (ProcIndExp + (Log (TechGradCourses ^ ((Log SWProjEstExp) / (Log SWProjEstExp)))))) - 3) / (ProcIndExp + (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos ((((Log SWProjEstExp) / ((ProcIndExp + (Log (TechGradCourses ^ (TechGradCourses ^ (Log SWProjEstExp))))) / (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos (Log (Log (Log SWProjEstExp)))))))))))))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / (TechGradCourses ^ (Log SWProjEstExp))))))))))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))) + ((Log SWProjEstExp) / (Log SWProjEstExp)))))) / (Log (Log (Log (TechGradCourses + (Cos (Log (Log (TechGradCourses ^ (Cos (((((Log SWProjEstExp) / (TechGradCourses ^ (Log SWProjEstExp))) / ((ProcIndExp + (Log (Sin MgmtGradCourses))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp))))))))))))))))))))))) / (TechGradCourses ^ (Log SWProjEstExp)))))) / (((Log ((((Log TotLangExp) / (Log SWProjEstExp)) / (Log SWProjEstExp)) / (Sin SWPMExp))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))) - 3) / (TechGradCourses ^ (Log SWProjEstExp)))))))))) + (((((ProcIndExp + (Log (TechGradCourses ^ (Log (TechGradCourses + ((TechGradCourses ^ (TechGradCourses ^ (Cos (TechGradCourses ^ ((ProcIndExp + (Log (Log (TechGradCourses ^ (TechGradCourses ^ (Cos (Log (Log (TechGradCourses ^ (Cos ((((Log SWProjEstExp) / ((ProcIndExp + (Log (TechGradCourses ^ (Log (TechGradCourses + (Cos (Log (Log (TechGradCourses ^ (Cos (((((Log SWProjEstExp) / (TechGradCourses ^ (Log SWProjEstExp))) / ((ProcIndExp + (Log (Sin MgmtGradCourses))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / ((Log SWProjEstExp) / (Log SWProjEstExp)))) / (Sin SWPMExp)) / (Sin SWPMExp)))))))))))) / (TechGradCourses ^ (Log SWProjEstExp))))))) / (Sin SWPMExp))))))) / (TechGradCourses ^ (Log SWProjEstExp))) / (TechGradCourses ^ (Log SWProjEstExp))) / (TechGradCourses ^ (Log SWProjEstExp))) / (Sin SWPMExp)))

4 Statement of Problem (Some Background 2007)
  How to build human-based estimation models that are accurate, intuitive, and easy to understand?
  TechUGCourses < 45.5
  | Hardware Proj Mgmt Exp < 6
  | | No Of Hardware Proj Estimated < 4.5
  | | | No Of Hardware Proj Estimated < 3
  | | | | TechUGCourses < 23
  | | | | | Hardware Proj Mgmt Exp < 0.75
  | | | | | | TechUGCourses < 18
  | | | | | | | Hardware Proj Mgmt Exp < 0.13
  | | | | | | | | TechUGCourses < 0.5
  | | | | | | | | | TechUGCourses < -1 : F (1/0)
  | | | | | | | | | TechUGCourses >= -1
  | | | | | | | | | | Degree < 3.5 : A (4/0)
  | | | | | | | | | | Degree >= 3.5 : A (5/2)
  | | | | | | | | TechUGCourses >= 0.5
  | | | | | | | | | TechUGCourses < 5.5
  | | | | | | | | | | Degree < 3.5 : F (5/0)
  | | | | | | | | | | Degree >= 3.5
  | | | | | | | | | | | TechUGCrses < 2 : A (1/0)
  | | | | | | | | | | | TechUGCrses >= 2 : F (1/0)
  | | | | | | | | | TechUGCrses >= 5.5
  | | | | | | | | | | Degree < 3.5
  | | | | | | | | | | | TechUGCrs < 10.5 : A (3/0)
  | | | | | | | | | | | TechUGCrses >= 10.5
  | | | | | | | | | | | | TechUGCrs<12.5 : F (3/0)
  | | | | | | | | | | | | TechUGCrses >= 12.5
  | | | | | | | | | | | | | TechUGCrs<16: A (2/0)
  | | | | | | | | | | | | | TechUGCrs>15 : A (2/1)
  | | | | | | | | | | Degree >= 3.5 : F (1/0)
  | | | | | | | HardProjMgmt Exp >= 0.13 : A (2/0)
  | | | | | | TechUGCourses >= 18 : A (2/0)
  | | | | | Hard Proj Mgmt Exp >= 0.75 : F (1/0)
  | | | | TechUGCourses >= 23 : F (5/0)
  | | | No Of Hardware Proj Est >= 3 : F (1/0)
  | | No Of Hardware Proj Est >= 4.5 : A (5/0)
  | Hardware Proj Mgmt Exp >= 6 : F (4/0)
  TechUGCrses >= 45.5 : A (2/0)
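To show how a tree like the 2007 one is read, the sketch below encodes only its five outermost splits as plain conditionals. The thresholds and class labels (A, F) are taken from the slide; the function and parameter names are hypothetical, and the deep subtree reached when TechUGCourses < 23 is intentionally left out.

```python
def classify_shallow(tech_ug_courses, hw_proj_mgmt_exp, hw_proj_estimated):
    """Follow the outermost branches of the 2007 decision tree.

    Returns "A" or "F" when one of the shallow leaves decides the case,
    or None when the case falls into the deeper subtree that this
    sketch does not encode.
    """
    if tech_ug_courses >= 45.5:
        return "A"
    if hw_proj_mgmt_exp >= 6:
        return "F"
    if hw_proj_estimated >= 4.5:
        return "A"
    if hw_proj_estimated >= 3:
        return "F"
    if tech_ug_courses >= 23:
        return "F"
    return None  # remaining cases need the deeper branches of the tree


print(classify_shallow(tech_ug_courses=50, hw_proj_mgmt_exp=0, hw_proj_estimated=0))
print(classify_shallow(tech_ug_courses=10, hw_proj_mgmt_exp=1, hw_proj_estimated=1))
```

Even this shallow part needs five nested tests, and the omitted subtree goes eight levels deeper, which illustrates why the slide asks for models that are easier to understand.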

5 PROMISE 2008 versus 2007
  Sample set: 178 Samples
  One learner → Accuracy and Intuitive Results
  Attribute reduction Analysis
  Relatively Simple models

6 The Approach
  Personal Demographics: Age, Gender, Nationality, etc.
  Academic Courses (Undergrad/Grad): CS, HW, SE, Proj. Mgmt, MIS
  Workshops/Conferences: CS, HW, SE, Proj. Mgmt, MIS
  Work Programming: Ada, ASP, Assembly, C, C++, COBOL, DBMS, FORTRAN, Java, PASCAL, Perl, PHP, SAP, TCL, VB, Other
  Work Experience (HW/SW)
  Project Management Exp. (HW/SW)
  # Projects Estimated (HW/SW)
  Average Project Size
  Domain Experience
  Procurement Industry Experience
  Estimate 28 Components
  Scale Factor And Correlation
  Apply Machine Learners
  [Diagram: Buyer Admin, Buyer 1 ... Buyer n (Buyer Software); Distribution Server; Supplier 1, Supplier 2, ... Supplier n (Supplier Software)]
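The two measures named on this slide, correlation and scale factor, could be computed per respondent along the following lines. This is a minimal sketch, not the authors' implementation: the slides do not define the scale factor, so `scale_factor` below assumes it is the average multiplicative gap between an estimate and the actual value (always at least 1). All function names and sample numbers are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers.

    Raises ZeroDivisionError if either list has zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def scale_factor(estimates, actuals):
    """ASSUMED definition: mean over components of max(e / a, a / e).

    A value of 1.0 means every estimate equals its actual value; 2.0 means
    the estimates are, on average, off by a factor of two in either direction.
    """
    ratios = [max(e / a, a / e) for e, a in zip(estimates, actuals)]
    return sum(ratios) / len(ratios)


# Hypothetical respondent who consistently doubles the actual effort.
actuals = [10.0, 20.0, 30.0, 40.0]
estimates = [20.0, 40.0, 60.0, 80.0]

print(pearson(estimates, actuals))
print(scale_factor(estimates, actuals))
```

In this toy case the estimates track the actual values perfectly (correlation of about 1.0) yet are off by a factor of 2, which is one reason to look at both measures together rather than at correlation alone.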

7 Feedback to Users
  How user compares to other respondents
  [Chart: User’s Estimates vs. Actual Estimates]

8 Experiments: Data
  [Figure: Correlation vs. Scale plots for the Original Data set, Experiment 1, Experiment 2, and Experiment 3; values shown: 82.8, -29.4, 0.008, 29X]

9 Experiments: Tools, Configuration
  Outliers Removed
  WEKA Toolset
  C4.5 (J48)
  1000 Trials
  10-Fold Cross Validation
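The evaluation protocol listed here (repeated trials of 10-fold cross validation) can be sketched in plain Python as follows. This is an illustration of the protocol only: it does not call WEKA or reimplement J48, and the `nearest_neighbour` learner, the function names, and the toy data are all hypothetical stand-ins.

```python
import random


def k_fold_indices(n_samples, k, rng):
    """Shuffle sample indices and deal them into k folds."""
    indices = list(range(n_samples))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_val_accuracy(rows, labels, fit_predict, k, rng):
    """One k-fold run: every sample is predicted exactly once while held out."""
    correct = 0
    for held_out in k_fold_indices(len(rows), k, rng):
        held = set(held_out)
        train_idx = [i for i in range(len(rows)) if i not in held]
        predictions = fit_predict(
            [rows[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [rows[i] for i in held_out],
        )
        correct += sum(p == labels[i] for p, i in zip(predictions, held_out))
    return correct / len(rows)


def repeated_trials(rows, labels, fit_predict, trials=1000, k=10, seed=0):
    """Average accuracy over several independently shuffled k-fold runs."""
    rng = random.Random(seed)
    scores = [cross_val_accuracy(rows, labels, fit_predict, k, rng)
              for _ in range(trials)]
    return sum(scores) / len(scores)


def nearest_neighbour(train_rows, train_labels, test_rows):
    """Stand-in learner (NOT J48): 1-nearest-neighbour on the first attribute."""
    predictions = []
    for row in test_rows:
        distances = [abs(row[0] - other[0]) for other in train_rows]
        predictions.append(train_labels[distances.index(min(distances))])
    return predictions


# Toy data: 10 "F" and 10 "A" samples that are trivially separable.
rows = [[float(v)] for v in range(10)] + [[100.0 + v] for v in range(10)]
labels = ["F"] * 10 + ["A"] * 10

print(repeated_trials(rows, labels, nearest_neighbour, trials=100))
```

Reshuffling the folds on every trial is what makes the repeated runs differ; averaging over them reduces the dependence of the reported accuracy on any single split.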

10 Results: Correlation Only
  2-Class Problem: 10 Best (A), 10 Worst (F)
  1000 Trials, Accuracy = 41.6%
  Attribute Reduction using WRAPPER: 1000 Trials, Accuracy = 78.6%
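The 2-class setup on this slide (10 best respondents labelled A, 10 worst labelled F, everyone in between dropped) can be sketched as below. The ranking criterion is an assumption: here respondents are ranked by a single score where higher is better, such as correlation. The function name and the sample scores are hypothetical.

```python
def label_extremes(scores, n_each=10, higher_is_better=True):
    """Label the n_each best respondents "A" and the n_each worst "F".

    `scores` maps a respondent id to a number. Respondents that fall
    between the two groups are left out of the returned dictionary.
    """
    if len(scores) < 2 * n_each:
        raise ValueError("need at least 2 * n_each respondents")
    ranked = sorted(scores, key=scores.get, reverse=higher_is_better)
    labels = {rid: "A" for rid in ranked[:n_each]}
    labels.update({rid: "F" for rid in ranked[-n_each:]})
    return labels


# Hypothetical scores for 30 respondents, r0 (worst) to r29 (best).
example_scores = {f"r{i}": i / 30 for i in range(30)}
example_labels = label_extremes(example_scores)

print(sorted(rid for rid, label in example_labels.items() if label == "A"))
```

With only 20 labelled samples and two balanced classes, a learner that guesses at random would be right about half the time, which is a useful baseline when reading the accuracies on the results slides.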

11 Results: Scale Factor Only
  2-Class Problem: 10 Best (A), 10 Worst (F)
  1000 Trials, Accuracy = 65.0%
  Attribute Reduction using WRAPPER: 1000 Trials, Accuracy = 78.2%
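A wrapper approach to attribute reduction scores candidate attribute subsets by running the learner itself on them and keeping the subset that scores best. The slides only say WRAPPER, so the search strategy below (greedy forward selection) is an assumption, and `evaluate` is a hypothetical callback that would, in practice, return the cross-validated accuracy of the learner on the given attributes. The attribute names come from the survey; the scoring numbers are invented purely so the example runs and are not results from the study.

```python
def forward_wrapper_selection(attributes, evaluate):
    """Greedy forward selection driven by a caller-supplied scoring function.

    Starting from the empty set, repeatedly add the single attribute that
    improves the score the most, and stop when no attribute improves it.
    """
    selected = []
    best_score = evaluate(selected)
    while True:
        best_candidate = None
        for attr in attributes:
            if attr in selected:
                continue
            score = evaluate(selected + [attr])
            if score > best_score:
                best_score = score
                best_candidate = attr
        if best_candidate is None:
            return selected, best_score
        selected.append(best_candidate)


# Invented toy scoring (percent accuracy): two attributes help, the rest hurt.
TOY_GAIN = {"TechUGCourses": 20, "Degree": 10}


def toy_evaluate(subset):
    """Hypothetical stand-in for 'cross-validated accuracy of the learner'."""
    return 50 + sum(TOY_GAIN.get(attr, -5) for attr in subset)


chosen, accuracy = forward_wrapper_selection(
    ["Age", "TechUGCourses", "Degree", "Gender"], toy_evaluate
)
print(chosen, accuracy)
```

Because every candidate subset requires training and testing the learner, a wrapper is more expensive than filter-style attribute ranking, but it selects attributes that suit the specific learner being used.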

12 Results: Correlation & Scale Factor
  2-Class Problem: 10 Best (A), 10 Worst (F)
  1000 Trials, Accuracy = 82.2%
  Attribute Reduction using WRAPPER: 1000 Trials, Accuracy = 93.3%

13 Discussion - 1
  How well does the decision tree from the third experiment apply to all the respondents minus outliers?
                            Best Estimators   Poorest Estimators
    Average Correlation              0.4173               0.3686
    Average Scale Factor             2.6198               2.7419

14 Discussion - 2
  Challenges in component effort estimation
  Scope of effort
  Amortization of effort
  Reuse can skew estimates (esp. Design for Reuse)
  Respondent’s estimates = Boetticher’s estimates

15 Conclusions
  Good accuracy rates, especially after attribute reduction
  Correlation + Scale Factor → Intuitive Model
  Bridges expert and model groups

16 Thank You!

17 References
  1) Jørgensen, M., “A Review of Studies on Expert Estimation of Software Development Effort,” Journal of Systems and Software, 2004.
  2) Jørgensen, M., Shepperd, M., “A Systematic Review of Software Development Cost Estimation Studies,” IEEE Transactions on Software Engineering, 33, 1, January 2007, pp. 33-53.

