A Native Operator for Process Discovery

Syamsiyah, Alifah; van Dongen, Boudewijn F.; Dijkman, Remco M.

doi:10.1007/978-3-319-98812-2_25

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11030)

Included in the following conference series:

International Conference on Database and Expert Systems Applications

Abstract

The goal of process mining is to gain insights into operational processes through the analysis of events recorded by information systems. Typically, this is done in three phases. Firstly, events are extracted from a data store into an event log. Secondly, an intermediate structure is built in memory. Finally, this intermediate structure is converted into a process model or other analysis results.

In this paper, we propose a native SQL operator for direct process discovery on relational databases. We merge the first two phases by defining a native operator for the simplest form of the intermediate structure, called the “directly follows relation”. We evaluate our work on big event data, and the experimental results show that it performs faster than state-of-the-art database approaches.
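
To make the “directly follows relation” concrete, the following is a minimal in-memory sketch in Python. It is not the native SQL operator proposed in the paper. The function name `directly_follows` and the event layout `(case_id, activity, timestamp)` are assumptions made for illustration. For each case, events are ordered by timestamp, and every pair of consecutive activities is counted.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count how often activity a is directly followed by activity b.

    `events` is an iterable of (case_id, activity, timestamp) tuples;
    this layout is an illustrative assumption, not the paper's schema.
    """
    # Group events by case so that only events of the same case are paired.
    traces = defaultdict(list)
    for case_id, activity, timestamp in events:
        traces[case_id].append((timestamp, activity))

    dfr = Counter()
    for trace in traces.values():
        # Order the events of one case by timestamp.
        trace.sort(key=lambda event: event[0])
        # Each pair of consecutive events contributes one (a, b) occurrence.
        for (_, a), (_, b) in zip(trace, trace[1:]):
            dfr[(a, b)] += 1
    return dfr


example_events = [
    ("c1", "a", 1), ("c1", "b", 2), ("c1", "c", 3),
    ("c2", "a", 1), ("c2", "c", 2), ("c2", "b", 3),
]
example_dfr = directly_follows(example_events)
```

In this sketch the grouping and ordering happen in application memory, which corresponds to the second phase described above. The paper's contribution is to perform this step inside the database instead.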

Notes

1. We used an H2 database server with 64 GB of RAM and 8 CPU cores at 2.40 GHz. Discovery was done on a PC with 8 GB of RAM and 2 CPU cores at 2.30 GHz.
2. Because the nested query is so time-consuming, we did not include it in some of the tests.
3. The linearithmic complexity arises because the H2 database uses a B-tree index, so finding an element takes \(\mathcal{O}(\log x)\). This lookup is performed for each of the x rows, so the overall complexity is \(\mathcal{O}(x \cdot \log x)\).
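
The cost argument in the last note can be illustrated with a small Python sketch. It is only an analogy and does not reproduce H2's internals. A sorted list searched with `bisect` stands in for the B-tree index, the function name `successors_via_index` and the row layout `(case_id, timestamp, activity)` are hypothetical, and timestamps are assumed to be unique within a case. Each of the x rows triggers one logarithmic lookup, giving x · log x work overall.

```python
import bisect


def successors_via_index(rows):
    """Pair each event with the next event of the same case.

    `rows` is a list of (case_id, timestamp, activity) tuples (assumed layout).
    """
    # The sorted list plays the role of an index on (case_id, timestamp).
    index = sorted(rows)

    pairs = []
    for row in rows:  # x iterations
        # Binary search, O(log x): position of the first entry after `row`.
        pos = bisect.bisect_right(index, row)
        # The next entry is a successor only if it belongs to the same case.
        if pos < len(index) and index[pos][0] == row[0]:
            pairs.append((row[2], index[pos][2]))
    return pairs


example_rows = [
    ("c1", 2, "b"), ("c1", 1, "a"), ("c1", 3, "c"), ("c2", 1, "a"),
]
example_pairs = successors_via_index(example_rows)
```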

Author information

Authors and Affiliations

Eindhoven University of Technology, Eindhoven, The Netherlands
Alifah Syamsiyah, Boudewijn F. van Dongen & Remco M. Dijkman

Corresponding author

Correspondence to Alifah Syamsiyah.

Editor information

Editors and Affiliations

Clausthal University of Technology, Clausthal-Zellerfeld, Germany
Sven Hartmann
Victoria University of Wellington, Wellington, New Zealand
Hui Ma
Paul Sabatier University, Toulouse, France
Abdelkader Hameurlain
University of Regensburg, Regensburg, Germany
Günther Pernul
Johannes Kepler University, Linz, Austria
Roland R. Wagner

About this paper

Cite this paper

Syamsiyah, A., van Dongen, B.F., Dijkman, R.M. (2018). A Native Operator for Process Discovery. In: Hartmann, S., Ma, H., Hameurlain, A., Pernul, G., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2018. Lecture Notes in Computer Science, vol 11030. Springer, Cham. https://doi.org/10.1007/978-3-319-98812-2_25

DOI: https://doi.org/10.1007/978-3-319-98812-2_25
Published: 09 August 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98811-5
Online ISBN: 978-3-319-98812-2
eBook Packages: Computer Science, Computer Science (R0)