# Proposal and Validation of a Feasibility Model for Information Mining Projects

Pablo Pytel, Paola Britos & Ramón García-Martínez


AGENDA  Problem Description  Proposed Solution  Validation o Proof Concept o Comparison with real projects  Conclusions

Problem Description: Information Mining Projects

- Software Engineering provides:
  - Methods
  - Techniques
  - Tools
- Methodologies:
  - CRISP-DM
  - P3TQ
  - SEMMA
- 85% [2000] and 60% [2005] of projects failed to achieve their goals.
- The main problems (and associated risks) are not identified in the initial stages.
- Hence the need for a Feasibility Model.

Feasibility Model for Information Mining Projects:

- 13 characteristics to be evaluated:
  - Categories:
    - Data
    - Business Problem
    - Project
    - Project Team
  - Dimensions:
    - Plausibility
    - Adequacy
    - Success
- Procedure:
  1. Determining the value of each project feature
  2. Converting feature values into fuzzy intervals
  3. Calculating the value of each dimension
  4. Calculating the overall project feasibility
  5. Interpreting the results

Validation: Proof of Concept

Project objective: Detecting evidence of causality between general satisfaction and internet.

- Step 1: Determining the value of each project feature
- Step 2: Converting feature values into fuzzy intervals (using the conversion table)

| Category | ID | Value (Step 1) | Fuzzy interval (Step 2) |
|---|---|---|---|
| Data | P1 | All | (7.8; 8.8; 10; 10) |
| Data | P2 | Regular | (3.4; 4.4; 5.6; 6.6) |
| Data | A1 | All | (7.8; 8.8; 10; 10) |
| Data | A2 | Much | (5.6; 6.6; 7.8; 8.8) |
| Data | A3 | Regular | (3.4; 4.4; 5.6; 6.6) |
| Data | E1 | Little | (1.2; 2.2; 3.4; 4.4) |
| Business Problem | P3 | All | (7.8; 8.8; 10; 10) |
| Business Problem | A4 | Much | (5.6; 6.6; 7.8; 8.8) |
| Business Problem | A5 | Regular | (3.4; 4.4; 5.6; 6.6) |
| Project | E2 | Much | (5.6; 6.6; 7.8; 8.8) |
| Project | E3 | Regular | (3.4; 4.4; 5.6; 6.6) |
| Project Team | P4 | All | (7.8; 8.8; 10; 10) |
| Project Team | E4 | Much | (5.6; 6.6; 7.8; 8.8) |
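The conversion in Step 2 can be sketched as a simple lookup. This is only an illustration, assuming Python: the names `FUZZY_INTERVALS`, `FEATURE_VALUES` and `to_fuzzy` are hypothetical, and only the four linguistic values that appear in this example are included (the full conversion table is not reproduced on the slide).

```python
# Trapezoidal fuzzy intervals for the linguistic values seen in this example.
# Any further labels used by the model are not shown on the slide and are omitted.
FUZZY_INTERVALS = {
    "All": (7.8, 8.8, 10.0, 10.0),
    "Much": (5.6, 6.6, 7.8, 8.8),
    "Regular": (3.4, 4.4, 5.6, 6.6),
    "Little": (1.2, 2.2, 3.4, 4.4),
}

# Step 1 output: the value assigned to each of the 13 features.
FEATURE_VALUES = {
    "P1": "All", "P2": "Regular", "A1": "All", "A2": "Much",
    "A3": "Regular", "E1": "Little",            # Data
    "P3": "All", "A4": "Much", "A5": "Regular",  # Business Problem
    "E2": "Much", "E3": "Regular",               # Project
    "P4": "All", "E4": "Much",                   # Project Team
}


def to_fuzzy(feature_values):
    """Step 2: map each feature's linguistic value to its fuzzy interval."""
    return {fid: FUZZY_INTERVALS[label] for fid, label in feature_values.items()}


intervals = to_fuzzy(FEATURE_VALUES)
for fid, interval in intervals.items():
    print(fid, FEATURE_VALUES[fid], interval)
```

A label missing from the lookup raises `KeyError`, which makes an incomplete conversion table visible immediately instead of silently skipping a feature.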

Validation: Proof of Concept (2)

- Step 3: Calculating the value of each dimension
- Step 4: Calculating the overall project feasibility
- Step 5: Interpreting the results

| Dimension | Value |
|---|---|
| Plausibility | 7.60 |
| Adequacy | 6.27 |
| Success | 5.25 |
| Overall Project Feasibility | 6.47 |

Result: the project is feasible and is accepted (in the limit).

Validation: Comparison with real projects

1. Apply the model to 25 real projects:
   - 22 projects finished successfully
   - 3 projects cancelled before completion
2. Request experts to appraise each project.
3. Compare the model's results with the project appraisals provided by the experts:
   - Statistical Analysis
   - Wilcoxon signed-rank test
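The rank sums used by the Wilcoxon signed-rank test can be obtained from paired appraisals roughly as follows. This is a minimal sketch, not the authors' procedure: the scores are hypothetical (the 25 projects' data are not in the slides), the function name `signed_rank_sums` is made up for illustration, and taking each difference as model minus expert is an assumed sign convention.

```python
def signed_rank_sums(model, expert):
    """Return (W_plus, W_minus, n_nonzero) for two paired samples.

    Zero differences are dropped; tied absolute differences share
    the average of the ranks they span.
    """
    # Assumed convention: difference = model value - expert value.
    diffs = [m - e for m, e in zip(model, expert) if m != e]
    ordered = sorted(diffs, key=abs)

    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = ((i + 1) + (j + 1)) / 2
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1

    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, len(ordered)


# Hypothetical integer scores for five projects (illustration only).
model_scores = [7, 6, 5, 8, 4]
expert_scores = [8, 6, 7, 5, 5]
w_plus, w_minus, n = signed_rank_sums(model_scores, expert_scores)
print(w_plus, w_minus, n)
```

The two sums always add up to n(n + 1)/2 for the n non-zero pairs, which is a quick sanity check on any reported pair of rank sums.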

Validation: Comparison with real projects (2)

Statistical Analysis. [Figures: Plausibility; Adequacy]

Validation: Comparison with real projects (3)

Statistical Analysis. [Figures: Success; Overall Project Feasibility]

Validation: Comparison with real projects (4)

Statistical Analysis. [Figure: Plausibility, Adequacy, Success and Overall Project Feasibility]

Validation: Comparison with real projects (5)

Wilcoxon signed-rank test

Hypotheses:

- H0: there are no meaningful differences between the researchers' values and the model's values (i.e. they are equivalent).
- H1: the researchers' values and the model's values are not equivalent.

Parameters:

- Level of significance = 0.01
- Number of non-zero pairs = 25
- Critical value = 68

| Dimension | Sum of positive ranks (W+) | Sum of negative ranks (W−) | Check against critical value | Result |
|---|---|---|---|---|
| Plausibility | 97 | 228 | 97 > 68 | H0 accepted |
| Adequacy | 227 | 98 | 98 > 68 | H0 accepted |
| Success | 175 | 150 | 150 > 68 | H0 accepted |
| Overall Feasibility | 181 | 144 | 144 > 68 | H0 accepted |
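The check in the table compares the smaller of the two rank sums with the critical value; H0 is retained when that statistic exceeds it. A short sketch using the reported figures (the names `REPORTED` and `decide` are illustrative only):

```python
# Rank sums (W+, W-) as reported on the slide.
REPORTED = {
    "Plausibility": (97, 228),
    "Adequacy": (227, 98),
    "Success": (175, 150),
    "Overall Feasibility": (181, 144),
}
N_PAIRS = 25         # non-zero pairs
CRITICAL_VALUE = 68  # reported for significance level 0.01


def decide(w_plus, w_minus, critical):
    """Return (statistic, h0_retained) for one dimension."""
    statistic = min(w_plus, w_minus)
    return statistic, statistic > critical


results = {
    name: decide(w_plus, w_minus, CRITICAL_VALUE)
    for name, (w_plus, w_minus) in REPORTED.items()
}

expected_total = N_PAIRS * (N_PAIRS + 1) // 2  # 325 for 25 pairs
for name, (w_plus, w_minus) in REPORTED.items():
    statistic, retained = results[name]
    consistent = (w_plus + w_minus == expected_total)
    print(name, statistic, "H0 retained" if retained else "H0 rejected", consistent)
```

Each reported pair sums to 325, which is what 25 non-zero pairs require, so the table is internally consistent.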

Conclusions:

- A model is proposed to determine, at an early stage, whether a data mining project is feasible.
- From the application of the model to real projects:
  - Statistical Analysis:
    - the model tends to be more conservative than the experts
    - standard deviation range and average values are almost the same
  - Wilcoxon signed-rank test:
    - the proposed model is equivalent to the appraisal performed by the experts

THANK YOU FOR YOUR ATTENTION ppytel@gmail.com paobritos@gmail.com rgarcia@unla.edu.ar
