PV226: Process Mining Seminar
Martin Macák
5. 3. 2021
Faculty of Informatics, Masaryk University Brno

Outline
∙ Basic overview of Process Mining
∙ Course information

What is process mining?
∙ A discipline that aims to understand and analyze processes.
∙ It uses a structured event log.
∙ We can use it to discover process maps.
[Figure: example process map. Image taken from: https://camunda.com/bpmn/examples/]

Why use process mining?
∙ We have an assumption about how the process is performed.
∙ However, what does the real process look like?
∙ There might be differences between the real execution and the assumption:
  ∙ special situations,
  ∙ "shortcuts",
  ∙ subjectivity,
  ∙ malicious activity.

How does process mining work?
∙ It typically works with event logs, which represent processes.
∙ These logs have to contain cases (sequences of events).
[Figure: process map; activity nodes with frequencies: use webmin_backdoor (2), set RHOST (1), exploit (2), set LHOST (2), set RPORT (1), set SSL (2), set TARGET (2)]
∙ Each event has:
  ∙ [Required] caseId,
  ∙ [Required] activity,
  ∙ timestamp,
  ∙ resource,
  ∙ other data.
∙ Sometimes, the mapping is not clear.
∙ For example, the name of the worker can be used as:
  ∙ resource,
  ∙ activity,
  ∙ caseId.

What is the difference between process mining and data mining?

Where is process mining used?
∙ Healthcare
∙ Manufacturing
∙ Finance
∙ Public sector
∙ Usability
∙ Robotics, industry 4.0
∙ Utility
∙ Advisory, audits
∙ Biology
∙ Agriculture
∙ ICT
∙ Education
∙ Logistics
∙ Security
∙ Call center
∙ Entertainment
∙ Garment
∙ Retail
∙ Hotel
More details in [1].

Let’s start to mine!

Analysis of the past
∙ Process discovery techniques
  ∙ From the event log, we create a model that represents how the process was executed.
  ∙ The model can be represented as a Petri net, activity diagram, BPMN diagram, heuristic net, . . .
  ∙ For now, we focus only on control flow.

Analysis of the past: Employees’ productivity
[Figure: process map; nodes with frequencies: @@S, low (37), medium (23), high (20), @@E]

Analysis of the past: Cybersecurity training session
[Figure: process map; activity nodes with frequencies: LevelStarted (3), HintTaken-1 (3), HintTaken-2 (3), SolutionDisplayed (4), WrongFlagSubmitted (1), CorrectFlagSubmitted (3), LevelFinished (3)]

Process discovery activities
∙ We can:
  ∙ explore processes,
  ∙ discover process models,
  ∙ compare the model of desired behavior with the model of reality,
  ∙ check the deviations in historical data,
  ∙ promote the model that shows the desired behavior.

Adding additional perspectives
∙ Control flow is not the only perspective.
∙ We can enhance the existing process models with:
  ∙ social network analysis,
  ∙ organizational structures,
  ∙ resource behavior analysis,
  ∙ time perspective,
  ∙ decision point mining,
  ∙ . . .

Detecting deviations in processes
∙ We can check conformance with the model using:
  ∙ token-based replay,
  ∙ business rules,
  ∙ . . .

Token-based replay

Business rules
∙ Specific rules we want to follow.
∙ To define them, we can use Declare:
  ∙ A constraint-based workflow language that uses graphical notation and semantics based on Linear Temporal Logic.
∙ Example: [figure]

Analysis of the present
∙ We analyze running cases.
∙ We can:
  ∙ detect deviations in real-time data using the model of the desired behavior,
  ∙ make real-time predictions (probability of success, remaining time, . . . ),
  ∙ make recommendations.

Deviation detection: past vs. present

Summary
∙ Process-centric data analysis.
∙ Process discovery, enhancement, and conformance checking.
∙ Past vs. present.

PV226 Course information
∙ e-learning (recommended: weeks 2 to 7)
  ∙ https://www.coursera.org/learn/process-mining
∙ Project
  ∙ You can come up with your own topic and set your own difficulty.
  ∙ You can work in groups.
  ∙ We will have a meeting (in April) where we will discuss your topics.
  ∙ 28. 5.: presentation of your work.
  ∙ Optional project consultations on Discord throughout the semester.
∙ Examples of project types:
  ∙ process discovery in the Disco tool (https://fluxicon.com/disco/),
  ∙ process analysis in the ProM tool (http://www.promtools.org/),
  ∙ process analysis in the RapidMiner tool (https://rapidminer.com/),
  ∙ process analysis using Python (https://github.com/pm4py/pm4py-source),
  ∙ a survey research paper about a specific usage of Process Mining.

Additional sources
∙ Process Mining book [2]
  ∙ https://www.springer.com/gp/book/9783662498507
  ∙ Use the school VPN and you can download it! :)
∙ Our Discord server: https://discord.gg/CyykPVN
  ∙ Discuss anything with your colleagues and me :)

Resources
[1] C. dos Santos Garcia, A. Meincheim, E. R. F. Junior, M. R. Dallagassa, D. M. V. Sato, D. R. Carvalho, et al., Process mining techniques and applications - a systematic mapping study, Expert Systems with Applications, vol. 133, pp. 260-295, 2019. doi: https://doi.org/10.1016/j.eswa.2019.05.003.
[2] W. van der Aalst, Process Mining: Data Science in Action, 2nd Edition, Springer Publishing Company, Incorporated, 2016.
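
Example: from event log to process map (illustrative sketch)

The slides describe an event log whose events carry a required caseId and activity plus an optional timestamp, and process maps whose nodes and edges are annotated with frequencies. The sketch below shows one way this could look in plain Python. It is not the method used to produce the figures in the slides. The log entries, timestamps, case names, the ALLOWED set, and all function names are made up for illustration; only the activity names are borrowed from the cybersecurity training example. The deviation check is a plain comparison of directly-follows pairs against an allowed set, which is much simpler than token-based replay or Declare rules.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log: caseId and activity are the required attributes,
# the timestamp is used only to order events inside a case.
EVENT_LOG = [
    {"caseId": "trainee-1", "activity": "LevelStarted", "timestamp": "2021-03-05T10:00:00"},
    {"caseId": "trainee-2", "activity": "LevelStarted", "timestamp": "2021-03-05T10:02:00"},
    {"caseId": "trainee-2", "activity": "WrongFlagSubmitted", "timestamp": "2021-03-05T10:04:00"},
    {"caseId": "trainee-1", "activity": "HintTaken-1", "timestamp": "2021-03-05T10:05:00"},
    {"caseId": "trainee-2", "activity": "CorrectFlagSubmitted", "timestamp": "2021-03-05T10:09:00"},
    {"caseId": "trainee-1", "activity": "CorrectFlagSubmitted", "timestamp": "2021-03-05T10:10:00"},
    {"caseId": "trainee-2", "activity": "LevelFinished", "timestamp": "2021-03-05T10:10:30"},
    {"caseId": "trainee-1", "activity": "LevelFinished", "timestamp": "2021-03-05T10:11:00"},
]

# Hypothetical model of desired behavior, given as allowed directly-follows pairs.
ALLOWED = {
    ("LevelStarted", "HintTaken-1"),
    ("LevelStarted", "CorrectFlagSubmitted"),
    ("HintTaken-1", "CorrectFlagSubmitted"),
    ("CorrectFlagSubmitted", "LevelFinished"),
}


def build_cases(event_log):
    """Group events by caseId and return each case as a time-ordered activity list."""
    grouped = defaultdict(list)
    for event in event_log:
        grouped[event["caseId"]].append(event)
    cases = {}
    for case_id, events in grouped.items():
        events.sort(key=lambda e: datetime.fromisoformat(e["timestamp"]))
        cases[case_id] = [e["activity"] for e in events]
    return cases


def activity_frequencies(cases):
    """Count how often each activity occurs (the node labels of a process map)."""
    return Counter(activity for trace in cases.values() for activity in trace)


def directly_follows(cases):
    """Count how often activity a is directly followed by activity b (the edges)."""
    counts = Counter()
    for trace in cases.values():
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def find_deviations(cases, allowed):
    """Return, per case, the directly-follows pairs that the allowed set does not contain."""
    deviations = {}
    for case_id, trace in cases.items():
        bad = [pair for pair in zip(trace, trace[1:]) if pair not in allowed]
        if bad:
            deviations[case_id] = bad
    return deviations


if __name__ == "__main__":
    cases = build_cases(EVENT_LOG)
    print(activity_frequencies(cases))
    print(directly_follows(cases))
    print(find_deviations(cases, ALLOWED))
```

Keeping the log as a list of dictionaries makes the caseId/activity mapping explicit, so changing which attribute plays which role only means changing the keys that build_cases reads. For real projects, the tools listed in the course information are a better starting point than this sketch.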