1 Evaluation for Web Mining Applications Bettina Berendt Humboldt University Berlin Ernestina Menasalvas Universidad Politécnica de Madrid Myra Spiliopoulou Otto von Guericke University Magdeburg www.wiwi.hu-berlin.de/~berendt/Evaluation
2 Agenda Mining for evaluation: perspectives and measures A case study Outlook: Evaluation of mining Web mining as a project: towards a methodology Evaluation and experimentation Evaluation and Web mining
3 Evaluation of Web mining applications, or: Web mining as a project Is it worthwhile to do the mining project? Is the result valuable for the application? Are (all) the tasks performed well? Are the data appropriate for the mining project? Are the techniques appropriate for the expected results?
4 Project definition Refers to Set of interdependent activities Oriented to a specific goal With a predetermined length Set of tasks Web Site goal: stakeholder Cost and time estimation
5 Data Mining as a project Define the goal: Corresponds to the business understanding step of CRISP-DM Business and data mining experts have to define the goal collaboratively Each goal must be defined in great detail Obtain the model: Apply the data mining process model Evaluate results and redirect: Evaluation in the extended definition: the act of ascertaining the value of an object according to specified criteria, operationalised in terms of measures. Object = patterns or model Measures and criteria have to do with goals Deploy: With business goals directing each step, data mining produces results with a business impact Check that the business impact is due to the result of the project: Experiment design
6 Web Mining as a project: the 3 components of a system by Gartner Group ERP/ERM Order Manag. Supply Chain Mgmt. Order Prom. Legacy Systems Sales Automation Service Automation Marketing Automation Field Service Mobile Sales Vertical Apps. Category Mgmt. Marketing Automation Campaign Mgmt. Customer Activity Customers Products Data Warehouse Voice (IVR, ACD) Conferencing Web Conferencing E-mail Response Management Fax Letter Direct Interaction Operational CRM Analytical CRM Collaborative CRM Office Interaction Closed-Loop Processing (EAI Toolkits, Embedded/Mobile Agents)
7 Web Mining as a project: the 3 components translated ERP/ERM Order Manag. Supply Chain Mgmt. Order Prom. Legacy Systems Sales Automation Service Automation Marketing Automation Field Service Mobile Sales Data Mining. Data Mining Customer Activity Customers Products Data Warehouse Recommender Personalization E-mail Response Management Operational Analytical Decisional System Office Interaction Closed-Loop Processing (EAI Toolkits, Embedded/Mobile Agents) Web Site Front?? Web Site Back??
8 The 3 components of a Web Site Operational component: The end result of a Software Development Process Decisional component: Results of the analytical component are integrated in the operational system: Software development project Analytical component: The end result of a Data Mining process SW Development Methodologies Data Mining Methodologies ?? Business Intelligence Project BI Methodologies
9 Methodology Process Model Lifecycle + Set of tasks to be performed: Development tasks Project Management tasks Sequencing of tasks Waterfall Iterative Phases of the project
11 CRM Catalyst major phases: The five major phases are: Discovery. Establishing the business goals for CRM. Orientation. Defining necessary system and organisational (specific technical solutions) changes to meet the goals. This leads to a definition of top-level system requirements. Navigation. The CRM system requirements are defined more precisely, the system is scoped, system and vendor assessment criteria are defined and a system is selected and contracted. Implementation. Planning and managing the CRM project. It is during this phase that the system is built and put into use. Post implementation. Monitoring performance and continuous improvement, since a CRM project never ends: CRM must constantly evolve to keep pace with the changing business and its environment.
12 Software Methodologies Process Model ISO 12207 Lifecycle Iterative+= RUP
13 Web Mining Methodology? To Be Defined Can be reused? The ones in CRISP-DM
14 Web mining methodology: Process Model: CRISP-DM Is it worthwhile to do the mining project? Is the result valuable for the application? Are (all) the tasks performed well? Are the data appropriate for the mining project? Are the techniques appropriate for the expected results? Has the goal been achieved as a cause-and-effect result of the project development?
15 Web Mining Project goals Top-level goal 1: The Web exists in order to be used Goals of usage depend on stakeholder and viewpoint. Is the site a good site? Is it successful? But: What does Success mean? Starting point: Web life-cycle metrics, micro-conversion rates Extension for application-oriented success measurement: Multi-Channel Metrics Has the goal been achieved as a cause-and-effect result of the project development? Relate the results in these slides to the web mining project or to other factors
16 Agenda Mining for evaluation: perspectives and measures A case study Outlook: Evaluation of mining Web mining as a project: towards a methodology Evaluation and experimentation Evaluation and Web mining
17 Experimentation in Web Mining Applications Ernestina Menasalvas Javier Segovia Pilar Herrero Universidad Politécnica de Madrid
18 Experimentation Refers to Matching with facts Suppositions, assumptions, speculation and beliefs That abound in web mining solutions deployment Users and stakeholders satisfied Personalization helps the user to remain loyal Recommendations increase sales Evaluation: the act of ascertaining the value and the functioning of an object according to specified criteria, operationalised by measures. to assess concrete achievements to give feedback towards improvement
19 Experimentation in web mining: Is the success due to the web mining results or to external factors? Is this a good Website? Web Mining -> good website NOT Web Mining -> good website
20 What is Experimental Design? Humans can generate valid knowledge by means of trial and error The trial-and-error process is longer and more chancy than the scientific method Experimental design is used in other fields of science Zelkowitz (98): Controlled, Observational, Historical Kitchenham (96): Formal Experiments, Case Studies, Surveys Experimental design for Web Mining empirical validation: Adaptation of experimental design terminology to WM (Juristo & Moreno 02) Empirical validation can be carried out: Laboratory validation of theories Validation at the level of real projects Historical data validation
21 Experimental Design www.socialresearchmethods.net/Kb/desexper.html Most rigorous of all research designs The strongest with respect to internal validity Internal validity: Assess the proposition: If X, then Y And If not X, then not Y If the program is given, then the outcome occurs And If the program is not given, then the outcome does not occur Isolate the program from all of the other potential causes of the outcome
22 Experimental Design www.socialresearchmethods.net/Kb/desexper.html Experimental design is intrusive Difficult to carry out in most real-world contexts To some extent, you set up an artificial situation: Assess the causal relationship with high internal validity. Limiting the degree to which results can be generalized Reduce external validity in order to achieve greater internal validity
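The "if X, then Y; if not X, then not Y" proposition can be illustrated with a small randomized comparison. The following is only a minimal sketch: the function names, the 50/50 split and the 0/1 outcome coding are assumptions made for illustration and do not come from the slides. Random assignment is what isolates the program from other potential causes of the outcome.

```python
import random


def assign_groups(user_ids, seed=0):
    """Randomly split users into a treatment group (program given)
    and a control group (program not given)."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])


def outcome_rate(group, outcomes):
    """Share of users in the group for whom the outcome occurred (coded 1)."""
    return sum(outcomes[u] for u in group) / len(group)


def effect_estimate(treatment, control, outcomes):
    """Difference in outcome rates: rate(program given) - rate(program not given)."""
    return outcome_rate(treatment, outcomes) - outcome_rate(control, outcomes)
```

An estimate close to 1 would correspond to the ideal case on the slide (outcome occurs with the program and not without it); an estimate close to 0 suggests the outcome does not depend on the program.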
23 Phases of experimental design process 1. Defining the objectives of the experiment: Mathematical techniques demand that the experiment produce a quantifiable hypothesis Hypothesis expressed in terms of: –a metric of the web mining results obtained using the web mining techniques –or of the web mining process where the techniques have been applied 2. Designing the experiment: Experimental unit Parameters Response variable Factors, levels and interactions Replication: based on analogy ?? Design 3. Executing the experiment: Measure response variables at the end of each experiment 4. Analyzing results: Experimental Analysis Quantify the impact of each factor and each interaction between factors on the variation of the response variable: statistical significance
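The analysis phase ("quantify the impact of each factor and each interaction") can be sketched for the simplest case of two factors with two levels each. This is an illustrative sketch, not the presenters' procedure: the factor names A and B, the -1/+1 level coding and the response values are assumptions. It computes effect sizes only; judging statistical significance would additionally require replicated runs.

```python
def factorial_2x2_effects(y):
    """Main effects and interaction for a 2x2 factorial experiment.

    y maps (a, b), with each level coded -1 (low) or +1 (high),
    to the response variable measured for that run.
    """
    # Main effect of A: mean response at A high minus mean response at A low.
    effect_a = ((y[(1, -1)] + y[(1, 1)]) - (y[(-1, -1)] + y[(-1, 1)])) / 2
    # Main effect of B: mean response at B high minus mean response at B low.
    effect_b = ((y[(-1, 1)] + y[(1, 1)]) - (y[(-1, -1)] + y[(1, -1)])) / 2
    # Interaction AB: how much the effect of A changes with the level of B.
    effect_ab = ((y[(-1, -1)] + y[(1, 1)]) - (y[(1, -1)] + y[(-1, 1)])) / 2
    return {"A": effect_a, "B": effect_b, "AB": effect_ab}


# Hypothetical responses, e.g. a conversion metric for each of the four runs.
responses = {(-1, -1): 10, (1, -1): 14, (-1, 1): 12, (1, 1): 20}
print(factorial_2x2_effects(responses))  # {'A': 6.0, 'B': 4.0, 'AB': 2.0}
```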
24 Experimental design classification Signal-to-noise metaphor (www.socialresearchmethods.net/kb): What we see can be divided into: Signal: related to the variable of interest, the construct to measure Noise: random factors in the situation Signal enhancers: Factorial designs Noise reducers: Blocking designs
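The noise-reducing role of blocking can be sketched as follows. This is a hedged illustration under assumed names: the `(block, treated, response)` tuple layout and the visitor-segment blocks are hypothetical. The treatment effect is estimated within each block first, so that differences between blocks do not inflate the noise.

```python
from statistics import mean


def blocked_effect(observations):
    """Average within-block difference between treated and untreated responses.

    observations: iterable of (block, treated, response) tuples, where
    treated is a bool. Every block must contain at least one treated and
    one untreated observation, otherwise mean() raises StatisticsError.
    """
    by_block = {}
    for block, treated, response in observations:
        by_block.setdefault(block, {True: [], False: []})[treated].append(response)
    # One treated-minus-untreated difference per block.
    diffs = [mean(g[True]) - mean(g[False]) for g in by_block.values()]
    return mean(diffs)


# Hypothetical data: returning visitors have a much higher baseline than new ones.
data = [
    ("new", False, 2.0), ("new", True, 3.0),
    ("returning", False, 10.0), ("returning", True, 11.0),
]
print(blocked_effect(data))  # 1.0
```

In this made-up example the baseline gap between the two segments (2 versus 10) never enters the estimate, because each comparison is made inside a single block.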
25 Experimental design techniques Categorical Factors, Quantitative Experimental response Quantitative Factors and Response variable 1 Factor (2 or n levels) K Factors (2 or n levels) All other parameters fixed Some parameters cannot be fixed Regression Models One factor experiment Blocking experiment Some parameters are irrelevant All factors are relevant Blocking Factorial design n^k experiments Less than n^k experiments Factorial design Fractional Factorial design
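The difference between a full factorial design (n^k experiments) and a fractional factorial design (fewer than n^k experiments) can be made concrete by enumerating the runs. This is a minimal sketch with assumed function names; the half fraction shown uses one common construction for two-level designs (keep the runs whose coded levels multiply to +1), which is only one of several possible fractions.

```python
from itertools import product
from math import prod


def full_factorial(levels_per_factor):
    """All combinations of factor levels: n^k runs for k factors with n levels each."""
    return list(product(*levels_per_factor))


def half_fraction_two_level(k):
    """Half fraction of a two-level design: 2^(k-1) of the 2^k runs.

    Levels are coded -1/+1; a run is kept when the product of its
    coded levels is +1.
    """
    return [run for run in product((-1, 1), repeat=k) if prod(run) == 1]


# Two hypothetical factors with three levels each: 3^2 = 9 runs.
print(len(full_factorial([["low", "mid", "high"], ["A", "B", "C"]])))  # 9
# Three two-level factors: 4 runs instead of 2^3 = 8.
print(half_fraction_two_level(3))
```

The fraction halves the number of experiments at the price of confounding some effects with each other, which is why it fits the case where not every interaction needs to be estimated.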