A Native Operator for Process Discovery

  • Conference paper
Database and Expert Systems Applications (DEXA 2018)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11030)

Abstract

The goal of process mining is to gain insights into operational processes through the analysis of events recorded by information systems. Typically, this is done in three phases. First, events are extracted from a data store into an event log. Second, an intermediate structure is built in memory. Finally, this intermediate structure is converted into a process model or other analysis results.

In this paper, we propose a native SQL operator for direct process discovery on relational databases. We merge the first two phases by defining a native operator for the simplest form of the intermediate structure, called the “directly follows relation”. We evaluate our work on big event data, and the experimental results show that it is faster than state-of-the-art database approaches.
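
To make the “directly follows relation” concrete, the following is a minimal in-memory sketch in Python. It is not the native SQL operator proposed in the paper. It only illustrates what the relation contains, assuming the common reading that activity a is directly followed by activity b when b is the next event after a within the same case. The function name `directly_follows`, the tuple layout `(case_id, activity, timestamp)`, and the sample log are all hypothetical.

```python
from collections import Counter, defaultdict


def directly_follows(events):
    """Count how often one activity is directly followed by another.

    `events` is an iterable of (case_id, activity, timestamp) tuples.
    This layout is an assumption made for illustration only.
    """
    # Group the events of each case together.
    traces = defaultdict(list)
    for case_id, activity, timestamp in events:
        traces[case_id].append((timestamp, activity))

    # Order each case by timestamp and count consecutive activity pairs.
    dfr = Counter()
    for case_events in traces.values():
        case_events.sort(key=lambda event: event[0])
        for (_, first), (_, second) in zip(case_events, case_events[1:]):
            dfr[(first, second)] += 1
    return dfr


# Hypothetical event log with two cases.
event_log = [
    ("c1", "register", 1),
    ("c1", "check", 2),
    ("c1", "pay", 3),
    ("c2", "register", 1),
    ("c2", "pay", 2),
]

dfr = directly_follows(event_log)
for (first, second), count in sorted(dfr.items()):
    print(f"{first} -> {second}: {count}")
```

The sketch groups and sorts events in application memory, which corresponds to the second phase described above. The paper's contribution is to compute this relation inside the database instead.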

Notes

  1. We used an H2 database server with 64 GB of RAM and 8 CPU cores at 2.40 GHz. Discovery was done on a PC with 8 GB of RAM and 2 CPU cores at 2.30 GHz.

  2. Because the nested query is so time-consuming, we did not include it in some of the tests.

  3. The linearithmic complexity arises because the H2 database uses a B-tree index, so finding an element takes \(\mathcal{O}(\log x)\). This lookup is performed for each of the x rows, so the overall complexity is \(\mathcal{O}(x \cdot \log x)\).

Corresponding author

Correspondence to Alifah Syamsiyah.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Syamsiyah, A., van Dongen, B.F., Dijkman, R.M. (2018). A Native Operator for Process Discovery. In: Hartmann, S., Ma, H., Hameurlain, A., Pernul, G., Wagner, R. (eds) Database and Expert Systems Applications. DEXA 2018. Lecture Notes in Computer Science, vol 11030. Springer, Cham. https://doi.org/10.1007/978-3-319-98812-2_25

  • DOI: https://doi.org/10.1007/978-3-319-98812-2_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-98811-5

  • Online ISBN: 978-3-319-98812-2

  • eBook Packages: Computer Science, Computer Science (R0)
