Beyond Process Mining: Discovering Business Rules From Event Logs Marlon Dumas University of Tartu, Estonia With contributions from Luciano García-Bañuelos,

Presentation transcript:

Beyond Process Mining: Discovering Business Rules From Event Logs Marlon Dumas University of Tartu, Estonia With contributions from Luciano García-Bañuelos, Fabrizio Maggi & Massimiliano de Leoni Theory Days, Saka, 2013

Business Process Mining
[Figure: a process mining tool (ProM, Disco, IBM BPI) relating an event log to a performance analysis, a process model, an organizational model and a social network. Slide by Ana Karla Alves de Medeiros.]

Automated Process Discovery
CID   | Task                       | Time Stamp | Attribute1 (amount) | Attribute2 (salary)
13219 | Enter Loan Application     | T 11:20:10 | …                   | …
13219 | Retrieve Applicant Data    | T 11:22:15 | …                   | …
13220 | Enter Loan Application     | T 11:22:40 | …                   | …
13219 | Compute Installments       | T 11:22:45 | …                   | …
13219 | Notify Eligibility         | T 11:23:00 | …                   | …
      | Approve Simple Application | T 11:24:30 | …                   | …
13220 | Compute Installments       | T 11:24:35 | …                   | …
…     | …                          | …          | …                   | …
Issue 1: Data?
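
As a rough illustration of what a discovery tool does first with such a log, the sketch below groups events into one trace per case and counts how often one task directly follows another. The slide does not say which discovery algorithm is used; the directly-follows count is only a common starting point. The log literal is hypothetical: dates are omitted, and the case id of the "Approve Simple Application" event is an assumption.

```python
from collections import Counter, defaultdict

# Hypothetical event log modelled on the table above: (case id, task, time of day).
# Dates are omitted and the case id of the approval event is assumed.
events = [
    (13219, "Enter Loan Application", "11:20:10"),
    (13219, "Retrieve Applicant Data", "11:22:15"),
    (13220, "Enter Loan Application", "11:22:40"),
    (13219, "Compute Installments", "11:22:45"),
    (13219, "Notify Eligibility", "11:23:00"),
    (13219, "Approve Simple Application", "11:24:30"),
    (13220, "Compute Installments", "11:24:35"),
]


def build_traces(events):
    """Group events by case id, ordered by time, keeping only the task names."""
    traces = defaultdict(list)
    for case_id, task, _time in sorted(events, key=lambda e: e[2]):
        traces[case_id].append(task)
    return dict(traces)


def directly_follows(traces):
    """Count how often task a is immediately followed by task b within a case."""
    counts = Counter()
    for tasks in traces.values():
        for a, b in zip(tasks, tasks[1:]):
            counts[(a, b)] += 1
    return counts


traces = build_traces(events)
dfg = directly_follows(traces)
```

Note that the data attributes play no role here, which is the point of "Issue 1": control-flow discovery alone ignores them.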

Issue 2: Complexity

Dealing with Complexity
Question: How to cope with complexity in (information) system specifications?
Aggregate-Decompose
–Summarize by aggregating and ignoring “uninteresting” parts
Generalize-Specialize
–Summarize by specializing and ignoring “uninteresting” specialized classes
Special cases

Bottom-Line Do we want models or do we want insights?

Discovering Business Rules
Decision rules
–Why does something happen at a given point in time?
Descriptive (temporal) rules
–When and why does something happen?
Discriminative rules
–When and why does something wrong happen?

Mining Decision Rules

What’s missing?
[Figure: process model with its decision points, annotated with the data attributes salary, age, installment, amount and length]

ProM’s Decision Miner
[Figure: process model with decision points over the attributes salary, age, installment, amount and length. The event log (columns CID, Task, Data, Time Stamp) records data such as Amount=8500 and Amount=25000 on ELA events, Salary=2000 on RAP events and an Installm value on CI events, followed by NE and ASA events. Tables with columns CID, Amount, Len, Salary, Age, Installm and Task are derived from the log, with NULL for attributes that have not been recorded yet.]
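
The table construction in the figure can be sketched as follows, under stated assumptions: the log is replayed in time order, the latest attribute values of each case are remembered, and one row is emitted whenever a task that is an outcome of the decision point occurs. This is an illustration only, not ProM's implementation. The function name `decision_rows` and the length, age and installment values are hypothetical; only Amount=8500, Amount=25000 and Salary=2000 come from the slide.

```python
def decision_rows(events, outcome_tasks):
    """Build one training row per decision outcome.

    events: (case_id, task, data) tuples in time order, where data holds the
    attributes written by that event. A row contains the attribute values
    known for the case just before the outcome task, plus the task as label.
    """
    latest = {}  # case id -> attribute values seen so far
    rows = []
    for case_id, task, data in events:
        seen = latest.setdefault(case_id, {})
        if task in outcome_tasks:
            rows.append({"cid": case_id, **seen, "task": task})
        seen.update(data)
    return rows


# Hypothetical log; task abbreviations as on the slide.
events = [
    (13219, "ELA", {"amount": 8500, "length": 12}),
    (13219, "RAP", {"salary": 2000, "age": 30}),
    (13220, "ELA", {"amount": 25000, "length": 24}),
    (13219, "CI", {"installment": 750}),
    (13219, "NE", {}),
    (13219, "ASA", {}),
    (13220, "CI", {"installment": 1100}),
]

rows = decision_rows(events, outcome_tasks={"ASA", "ACA"})
```

Each row can then be handed to a standard classifier, with the task column as the class label.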

ProM’s Decision Miner / 2
Decision tree learning over the table with columns CID, Amount, Installm, Salary, Age, Len and Task (ASA, ACA, ASA, …) yields the tree:
–amount < 10000: Approve Simple Application (ASA)
–amount ≥ 10000 and age < 35: Approve Simple Application (ASA)
–amount ≥ 10000 and age ≥ 35: Approve Complex Application (ACA)
Resulting rules:
–ASA: (amount < 10000) ∨ (amount ≥ 10000 ∧ age < 35)
–ACA: amount ≥ 10000 ∧ age ≥ 35

ProM’s Decision Miner – Limitations
Decision tree learning cannot discover expressions of the form “v op v”, e.g. installment > salary

Generalized Decision Rule Mining in Business Processes
Problem
–Discover decision rules composed of atoms of the form “v op c” and “v op v”, including linear equations or inequalities involving multiple variables
Approach
–Likely invariant discovery (Daikon)
–Decision tree learning
De Leoni et al. FASE’2013

Daikon: Mining Likely Invariants
[Figure: the table with columns CID, Amount, Installm, Salary, Age, Len and Task (NR, NE, NE, ASA, ACA, ASA, …) is fed to Daikon, which outputs sets of likely invariants:]
–installment > salary, amount ≥ 5000, length < age, …
–installment ≤ salary, amount ≥ 5000, length < age, …
–installment ≤ salary, amount ≤ 9500, length < age, …
–installment ≤ salary, amount ≥ (value missing), length < age, …
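
The idea behind likely invariants can be sketched in a few lines: propose candidate atoms of the form “v op c” and “v op v”, and keep those that hold on every observation. This is a toy illustration, not Daikon's algorithm or API. The function `likely_invariants` and the sample rows are hypothetical.

```python
import operator
from itertools import combinations

OPS = {"<": operator.lt, ">": operator.gt, "<=": operator.le, ">=": operator.ge}


def likely_invariants(rows):
    """Return atoms that hold on all rows (each row maps variable -> number)."""
    names = sorted(rows[0])
    found = []
    # "v op c" atoms: the observed minimum and maximum of each variable.
    for v in names:
        values = [r[v] for r in rows]
        found.append(f"{v} >= {min(values)}")
        found.append(f"{v} <= {max(values)}")
    # "v op v" atoms: try strict comparisons first, then the weaker ones,
    # and keep only the first template that holds on every row.
    for a, b in combinations(names, 2):
        for sym in ("<", ">", "<=", ">="):
            if all(OPS[sym](r[a], r[b]) for r in rows):
                found.append(f"{a} {sym} {b}")
                break
    return found


# Hypothetical observations for one group of cases.
rows = [
    {"amount": 5000, "installment": 450, "salary": 2000},
    {"amount": 9500, "installment": 800, "salary": 800},
    {"amount": 7000, "installment": 300, "salary": 1500},
]

invariants = likely_invariants(rows)
```

Running this per outcome group gives one invariant set per group, as in the figure. Atoms such as installment ≤ salary can then serve as extra boolean features for decision tree learning.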

Mining Descriptive Temporal Rules

Problem Statement
Given a log, discover a set of temporal rules (LTL) that characterize the underlying process, e.g.
–In a lab analysis process, every leukocyte count is eventually followed by a platelet count: □(leukocyte_count → ◇platelet_count)
–Patients who undergo surgery X do not undergo surgery Y later: □(X → □¬Y)
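
To make the two example rules concrete, the sketch below checks them on a single finite trace given as a list of task names. It assumes finite-trace semantics and that the two tasks in a rule are distinct. The function names are hypothetical and are not taken from any mining tool.

```python
def holds_response(trace, a, b):
    """□(a → ◇b): every occurrence of a is followed later by some b."""
    waiting = False  # True while an a is still waiting for its b
    for event in trace:
        if event == a:
            waiting = True
        elif event == b:
            waiting = False
    return not waiting


def holds_never_after(trace, x, y):
    """□(x → □¬y): once x has occurred, y does not occur any more."""
    seen_x = False
    for event in trace:
        if event == y and seen_x:
            return False
        if event == x:
            seen_x = True
    return True


lab_trace = ["leukocyte_count", "platelet_count", "leukocyte_count"]
ok = holds_response(lab_trace, "leukocyte_count", "platelet_count")
```

Here `ok` is `False`, because the last leukocyte count is never followed by a platelet count. A rule miner would evaluate such checks over every trace in the log.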

DeclareMiner (Maggi et al. 2011)

Oh no! Not again!

What went wrong?
Not all rules are interesting
What is “interesting”?
–Not necessarily what is frequent (expected)
–But what deviates from the expected
Example:
–Every patient who is diagnosed with condition X undergoes surgery Y
–But not if they have previously been diagnosed with condition Z

Interesting Rules
Something should have “normally” happened but did not happen, why?
Something should normally not have happened but it happened, why?
Something happens only when things go “well”
Something happens only when things go “wrong”

Discovering Refined Temporal Rules
Discover temporal rules that are frequently “activated” but not always “fulfilled”, e.g.
–When A occurs, eventually B occurs in 90% of cases: □(A → ◇B) has a 90% fulfillment ratio
–Discover a rule that describes the remaining 10% of cases, e.g. using data attributes: □(A [age < 70] → ◇B) has a 100% fulfillment ratio
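
The fulfillment ratio can be sketched as the number of fulfilled activations divided by the number of activations, where an optional data condition restricts which occurrences of A count as activations. The function `fulfillment_ratio`, the event representation and the sample traces are assumptions made for illustration. The numbers differ from the 90% on the slide.

```python
def fulfillment_ratio(traces, a, b, condition=lambda data: True):
    """Share of activations of □(a [condition] → ◇b) that are fulfilled.

    traces: lists of (task, data) pairs. An activation is an occurrence of a
    whose data satisfies the condition. It is fulfilled if b occurs later in
    the same trace. Returns None when the rule is never activated.
    """
    activations = fulfilled = 0
    for trace in traces:
        for i, (task, data) in enumerate(trace):
            if task == a and condition(data):
                activations += 1
                if any(t == b for t, _ in trace[i + 1:]):
                    fulfilled += 1
    return fulfilled / activations if activations else None


# Hypothetical traces: A carries the patient's age.
traces = [
    [("A", {"age": 40}), ("B", {})],
    [("A", {"age": 65}), ("C", {}), ("B", {})],
    [("A", {"age": 82}), ("C", {})],
    [("A", {"age": 55}), ("B", {})],
]

plain = fulfillment_ratio(traces, "A", "B")
refined = fulfillment_ratio(traces, "A", "B", condition=lambda d: d["age"] < 70)
```

On this toy log the plain rule is fulfilled in 3 of 4 activations, while the refined rule with `age < 70` is fulfilled in all 3 of its activations.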

Now it’s better… Maggi et al. BPM’2013

Discriminative Rules Mining

Problem Statement
Given a log partitioned into classes
–e.g. good vs bad cases, on-time vs late cases
Discover a set of temporal rules that distinguish one class from the other, e.g.
–Claims for house damage that end up in a complaint are often those for which two or more data entry errors are made by the customer when filing the claim

Mining Anomalous Software Development Issues (Sun et al. 2013)
Extract features from traces based on which events occur in the trace
Apply a contrasting itemset mining technique to find features that occur in one class and not in the other
Decision tree to construct readable rules
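
A much simplified sketch of the first two steps: each trace is reduced to the set of events occurring in it, and itemsets whose support differs strongly between the two classes are kept. This brute-force enumeration is not the technique of Sun et al. The function names, the `min_diff` threshold and the issue traces are hypothetical.

```python
from itertools import combinations


def support(itemset, traces):
    """Fraction of traces that contain every event of the itemset."""
    return sum(itemset <= set(t) for t in traces) / len(traces)


def contrast_itemsets(class_a, class_b, max_size=2, min_diff=0.5):
    """Itemsets whose support differs by at least min_diff between classes.

    Returns (itemset, support in class_a minus support in class_b) pairs,
    largest absolute difference first.
    """
    items = sorted({e for t in class_a + class_b for e in t})
    result = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            diff = support(frozenset(combo), class_a) - support(frozenset(combo), class_b)
            if abs(diff) >= min_diff:
                result.append((combo, diff))
    return sorted(result, key=lambda r: -abs(r[1]))


# Hypothetical issue-tracker traces.
anomalous = [["open", "reopen", "close"], ["open", "reassign", "reopen", "close"]]
normal = [["open", "close"], ["open", "reassign", "close"]]

contrasts = contrast_itemsets(anomalous, normal)
```

On this toy data every reported itemset contains `reopen`. Such itemsets would then become the boolean features from which the decision tree builds readable rules.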

Where is the data?

Challenges
Scalable algorithms for discovering FO-LTL rules
–Frequent rules (descriptive)
–Discriminative rules
–Other interestingness notions
Interactive business rule mining