Process Mining: Discovering processes from event logs All truths are easy to understand once they are discovered; the point is to discover them. Galileo.

Slides:



Advertisements
Similar presentations
Jorge Muñoz-Gama Josep Carmona
Advertisements

A university for the world real R © 2009, Chapter 3 Advanced Synchronization Moe Wynn Wil van der Aalst Arthur ter Hofstede.
Han-na Yang Trace Clustering in Process Mining M. Song, C.W. Gunther, and W.M.P. van der Aalst.
Sequential Patterns & Process Mining Current State of Research Edgar de Graaf LIACS.
Process Mining in the Context of Web Services Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands.
/faculteit technologie management 1 Process Mining: Organizational and Conformance Mining Algorithms Ana Karla Alves de Medeiros Ana Karla Alves de Medeiros.
MXML A Meta model for process mining data
/faculteit technologie management 1 Process Mining: Control-Flow Mining Algorithms Ana Karla Alves de Medeiros Ana Karla Alves de Medeiros Eindhoven University.
Process discovery: Inductive Miner
Mining Declarative Models using Intervals Jan Martijn van der Werf Ronny Mans Wil van der Aalst.
Block-Structured Process Discovery: Filtering Infrequent Behaviour Sander Leemans Dirk Fahland Wil van der Aalst Eindhoven University of Technology.
Boudewijn van Dongen /t Multi-phase process mining Building instance graphs.
/faculteit technologie management Genetic Process Mining Ana Karla Medeiros Ton Weijters Wil van der Aalst Eindhoven University of Technology Department.
Process Mining from discovery to checking Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology, Department of Information Systems, P.O. Box.
/faculteit technologie management Genetic Process Mining Ana Karla Alves de Medeiros Eindhoven University of Technology Department.
1 Workflow Management Systems : Functions, architecture, and products. Wil van der Aalst Eindhoven University of Technology Faculty of Technology Management.
Process Mining in CSCW Systems All truths are easy to understand once they are discovered; the point is to discover them. Galileo Galilei ( )
Mining Social Networks Uncovering interaction patterns in business processes Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology Department.
1 Analysis of workflows a-priori and a-posteriori analysis Wil van der Aalst Eindhoven University of Technology Faculty of Technology Management Department.
Business Alignment Using Process Mining as a Tool for Delta Analysis Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology Department of Information.
/faculteit technologie management Dutch-Belgian Database Day 2007 The Challenges of Process Mining A.J.M.M. Weijters (and many others)
Process Mining: The next step in Business Process Management Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology Department of Information.
/faculteit technologie management Process Mining and Security: Detecting Anomalous Process Executions and Checking Process Conformance Wil van der Aalst.
Discovering Coordination Patterns using Process Mining Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology Department of Information and Technology.
Boudewijn van Dongen April 27, 2005 The ProM-framework A framework for integrating process mining tools.
Boudewijn van Dongen June 22, 2004 /t Process Mining, the basics.
Process Mining: Discovering processes from event logs All truths are easy to understand once they are discovered; the point is to discover them. Galileo.
/faculteit technologie management Genetic Process Mining Wil van der Aalst Ana Karla Medeiros Ton Weijters Eindhoven University of Technology Department.
Process Mining: An iterative algorithm using the Theory of Regions Kristian Bisgaard Lassen Boudewijn van Dongen Wil van.
/faculteit technologie management 1 Process Mining: Extension Mining Algorithms Ana Karla Alves de Medeiros Ana Karla Alves de Medeiros Eindhoven University.
Process Mining for Ubiquitous Mobile Systems An Overview and a Concrete Algorithm Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology Department.
A university for the world real R © 2009, Chapter 17 Process Mining and Simulation Moe Wynn Anne Rozinat Wil van der Aalst Arthur.
A university for the world real R © 2009, Chapter 23 Epilogue Wil van der Aalst Michael Adams Arthur ter Hofstede Nick Russell.
Process mining Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology, Department of Information Systems, P.O. Box 513, 5600 MB Eindhoven, The.
Scientific Workflows Within the Process Mining Domain Martina Caccavale 17 April 2014.
Process Mining Control flow process discovery Fabrizio Maria Maggi (based on Process Mining book – Springer copyright 2011 and lecture material by Marlon.
Jorge Muñoz-Gama Universitat Politècnica de Catalunya (Barcelona, Spain) Algorithms for Process Conformance and Process Refinement.
Process Mining Control flow process discovery
Jianmin Wang 1, Shaoxu Song 1, Xiaochen Zhu 1, Xuemin Lin 2 1 Tsinghua University, China 2 University of New South Wales, Australia 1/23 VLDB 2013.
Han-na Yang Rediscovering Workflow Models from Event-Based Data using Little Thumb.
Petri nets refresher Prof.dr.ir. Wil van der Aalst
Process-oriented System Analysis Process Mining. BPM Lifecycle.
Decision Mining in Prom A. Rozinat and W.M.P. van der Aalst Joosung, Ko.
/faculteit technologie management Workflow Mining: Current Status and Future Directions Ana Karla A. de Medeiros, W.M.P van der Aalst and A.J.M.M. Weijters.
Representing Relations Using Matrices A relation between finite sets can be represented using a zero-one matrix Suppose R is a relation from A = {a 1,
Maikel Leemans Wil M.P. van der Aalst. Process Mining in Software Systems 2 System under Study (SUS) Functional perspective Focus: User requests Functional.
1 CS 352 Introduction to Logic Design Lecture 1 Ahmed Ezzat Number Systems and Boolean Algebra, Ch-1 + Ch-2.
1 CS techniques for IT auditing Lecture 6. Dept of Mathematics and Computer Science 2 Transition system (1) Basic process model of CS is a transition.
1 CS 352 Introduction to Logic Design Lecture 2 Ahmed Ezzat Boolean Algebra and Its Applications Ch-3 + Ch-4.
Process Mining – Concepts and Algorithms Review of literature on process mining techniques for event log data.
Case tool Relational Database Schema Designer Cai Xinlei Tang Ning Xu Chen Zhang Yichuan CS4221 P06.
Discovering Models for State-based Processes M.L. van Eck, N. Sidorova, W.M.P. van der Aalst.
Profiling and process mining What has been done???
Multi-phase Process Mining: Building Instance Graphs
30 januari 2018 Mining Social Networks Uncovering interaction patterns in business processes Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology.
FIGURE 3.1 Two-variable K-map
7 mei 2018 Process Mining in CSCW Systems All truths are easy to understand once they are discovered; the point is to discover them. Galileo Galilei.
Profiling based unstructured process logs
David Redlich, Thomas Molka, Wasif Gilani, Awais Rashid, Gordon Blair
Decomposed Process Mining: The ILP Case
Wil van der Aalst Eindhoven University of Technology
CH7 Multilevel Gate Network
Workflow Management Systems: Functions, architecture, and products.
Wil van der Aalst Eindhoven University of Technology
Workflow Management Systems: Functions, architecture, and products.
Multi-phase process mining
3 mei 2019 Process Mining and Security: Detecting Anomalous Process Executions and Checking Process Conformance Wil van der Aalst Ana Karla A. de Medeiros.
Business Alignment Using Process Mining as a Tool for Delta Analysis
5 juli 2019 Process Mining and Security: Detecting Anomalous Process Executions and Checking Process Conformance Wil van der Aalst Ana Karla A. de Medeiros.
19 augustus 2019 Mining Social Networks Uncovering interaction patterns in business processes Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology.
Presentation transcript:

Process Mining: Discovering processes from event logs All truths are easy to understand once they are discovered; the point is to discover them. Galileo Galilei ( ) Prof.dr.ir. Wil van der Aalst Eindhoven University of Technology Department of Information and Technology P.O. Box 513, 5600 MB Eindhoven The Netherlands

Outline Process Mining –overview –alpha algorithm –genetic mining ProM –Architecture –Convertors ( , Staffware, InConcert, SAP, etc.) –Process mining plug-ins Alpha-algorithm Multi-phase mining Genetic mining –Analysis plug-ins –Conformance testing plug-in –LTL checker plug-in –Social network plug-in Conclusion

Process Mining

Motivation: Reversing the process Process mining can be used for: –Process discovery (What is the process?) –Delta analysis (Are we doing what was specified?) –Performance analysis (How can we improve?) process mining

Overview 1) basic performance metrics 2) process model3) organizational model4) social network 5) performance characteristics If …then … 6) auditing/security

Let us focus on mining process models … 1) basic performance metrics 2) process model3) organizational model4) social network 5) performance characteristics If …then … 6) auditing/security... and a very simple approach: The alpha algorithm

Alpha algorithm α

Process log Minimal information in log: case id’s and task id’s. Additional information: event type, time, resources, and data. In this log there are three possible sequences: –ABCD –ACBD –EF case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D

>, ,||,# relations Direct succession: x>y iff for some case x is directly followed by y. Causality: x  y iff x>y and not y>x. Parallel: x||y iff x>y and y>x Choice: x#y iff not x>y and not y>x. case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D A>B A>C B>C B>D C>B C>D E>F ABACBDCDEFABACBDCDEF B||C C||B

Basic idea (1) xyxy

Basic idea (2) x  y, x  z, and y||z

Basic idea (3) x  y, x  z, and y#z

Basic idea (4) x  z, y  z, and x||y

Basic idea (5) x  z, y  z, and x#y

It is not that simple: Basic alpha algorithm Let W be a workflow log over T.  (W) is defined as follows. 1.T W = { t  T     W t   }, 2.T I = { t  T     W t = first(  ) }, 3.T O = { t  T     W t = last(  ) }, 4.X W = { (A,B)  A  T W  B  T W   a  A  b  B a  W b   a1,a2  A a 1 # W a 2   b1,b2  B b 1 # W b 2 }, 5.Y W = { (A,B)  X   (A,B)  X A  A  B  B  (A,B) = (A,B) }, 6.P W = { p (A,B)  (A,B)  Y W }  {i W,o W }, 7.F W = { (a,p (A,B) )  (A,B)  Y W  a  A }  { (p (A,B),b)  (A,B)  Y W  b  B }  { (i W,t)  t  T I }  { (t,o W )  t  T O }, and  (W) = (P W,T W,F W ). The alpha algorithm has been proven to be correct for a large class of free-choice nets.

Example case 1 : task A case 2 : task A case 3 : task A case 3 : task B case 1 : task B case 1 : task C case 2 : task C case 4 : task A case 2 : task B case 2 : task D case 5 : task E case 4 : task C case 1 : task D case 3 : task C case 3 : task D case 4 : task B case 5 : task F case 4 : task D  (W) W

DEMO Alpha algorithm 48 cases 16 performers

Challenges Refining existing algorithm for (control-flow/process perspective) –Hidden tasks –Duplicate tasks –Non-free-choice constructs –Loops –Detecting concurrency (implicit or explicit) –Mining and exploiting time –Dealing with noise –Dealing with incompleteness Mining other perspectives (data, resources, roles, …) Gathering data from heterogeneous sources Visualization of results Delta analysis

Genetic mining

Approach

Genetic mining: The two main questions How to represent an individual? (Petri net?) How to define the genetic operators? (e.g., crossover)

How to represent an individual? Problems with Petri nets: –Places do not exist in log –difficulties defining mutation and crossover –problems describing subtle rules without adding transitions

Representation of the goal process trueAAADDE^FE^FBvCvG →ABCDEFGH A BvCvD B H C H D E^F E G F G G H H true

A more compact representation ACTIVITYINPUTOUTPUT A{}{{B,C,D}} B{{A}}{{H}} C{{A}}{{H}} D{{A}}{{E},{F}} E{{D}}{{G}} F{{D}}{{G}} G{{E},{F}}{{H}} H{{B,C,G}}{}

Any Petri net can be mapped onto a causal matrix: ACTIVITYINPUTOUTPUT A{...}{{C,D},...} B{...}{{C,D},...} C{{A,B},...}{...} D{{A,B},...}{...} but...

Mapping a causal matrix onto a Petri net? ACTIVITYINPUTOUTPUT A{{i 11,i 12,i 13 },{i 21,i 22,i 23 }}{{o 11,o 12,o 13 },{o 21,o 22,o 23 }}

Wiring based on input and output sets Using place fusion or silent transitions.

Example ACTIVITYINPUTOUTPUT A{}{{C,D}} B{}{{D}} C{{A}}{} D{{A,B}}{}

Wiring using silent transitions Always a solution?

Problem (the need to be lazy...) ACTIVITYINPUTOUTPUT A{}{{B},{C,D}} B{{A}}{{E,F}} C{{A}}{{E}} D{{A}}{{F}} E{{B},{C}}{{G}} F{{B},{D}}{{G}} G{{E},{F}}{}

However,... ACTIVITYINPUTOUTPUT A{}{{B},{C,D}} B{{A}}{{E,F}} C{{A}}{{E}} D{{A}}{{F}} E{{B},{C}}{{G}} F{{B},{D}}{{G}} G{{E},{F}}{}

Two approaches: CM2PN 1.A naive mapping using silent transitions. Always works Larger net Requires "lazy transition" semantics Implemented in ProM 2.A more sophisticated mapping based on "smart place fusion" Only for a subset of CMs Subclass can be characterized Not (yet) implemented

Example: Event log case idactivity idoriginatortimestamp case 1activity AJohn :15.01 case 2activity AJohn :15.12 case 3activity ASue :16.03 case 3activity DCarol :16.07 case 1activity BMike :18.25 case 1activity HJohn :9.23 case 2activity CMike :10.34 case 4activity ASue :10.35 case 2activity HJohn :12.34 case 3activity EPete :12.50 case 3activity FCarol :10.12 case 4activity DPete :10.14 case 3activity GSue :10.44 case 3activity HPete :11.03 case 4activity FSue :11.18 case 4activity EClare :12.22 case 4activity GMike :14.34 case 4activity HClare :14.38

Goal

Example: Starting point case idactivity id case 1activity A case 2activity A case 3activity A case 3activity D case 1activity B case 1activity H case 2activity C case 4activity A case 2activity H case 3activity E case 3activity F case 4activity D case 3activity G case 3activity H case 4activity F case 4activity E case 4activity G case 4activity H randomly generated initial individuals

Two individuals ACTIVITYINPUTOUTPUT A{}{{B,C,D}} B{{A}}{{H}} C{{A}}{{H}} D{{A}}{{E}} E{{D}}{{G}} F{}{{G}} G{{E},{F}}{{H}} H{{C,B,G}}{} ACTIVITYINPUTOUTPUT A{}{{B,C,D}} B{{A}}{{H}} C{{A}}{{H}} D{{A}}{{E,F}} E{{D}}{{G}} F{{D}}{{G}} G{{E},{F}}{{H}} H{{C},{B},{G}}{}

Crossover ACTIVITYINPUTOUTPUT A{}{{B,C,D}} B{{A}}{{H}} C{{A}}{{H}} D{{A}}{{E, F}} E{{D}}{{G}} F{{D}}{{G}} G{{E},{F}}{{H}} H{{C,B,G}}{} ACTIVITYINPUTOUTPUT A{}{{B,C,D}} B{{A}}{{H}} C{{A}}{{H}} D{{A}}{{E}} E{{D}}{{G}} F{}{{G}} G{{E},{F}}{{H}} H{{C},{B},{G}}{}

Resulting CM with fitness 1.0 trueAAADDE^FE^FBvCvG →ABCDEFGH A BvCvD B H C H D E^F E G F G G H H true

Mapping

ProM framework

ProM

Converter plug-in: Analyzer

XML format

ProM architecture

Mining plug-in: Alpha algorithm

Mining plug-in: Genetic Miner

Mining plug-in: Multi-phase mining

Step 1: Get instances

Step 2: Project

Step 3: Aggregate

Step 4: Map onto EPC

Step 5: Map onto Petri net (or other language)

Mining plug-in: Social network miner

Cliques

SN based on hand-over of work metric density of network is 0.225

SN based on working together (and ego network)

Analysis plug-in: LTL checker

Analysis plug-in: Conformance checker Do they agree?

Fitness is not enough

Screenshot (Also runs on Mac.)

Other analysis plug-ins

More demos?

Conclusion Process mining provides many interesting challenges for scientists, customers, users, managers, consultants, and tool developers. Involves multiple perspectives (process, data, resources, etc.) Get ProM-ed! You can contribute by applying ProM and developing plug-ins

More information