Ontology-Based Data Access for Extracting Event Logs from Legacy Data: The onprom Tool and Methodology
Process mining aims at discovering, monitoring, and improving business processes by extracting knowledge from event logs. It can therefore be applied only when proper event logs are available that comply with accepted standards, such as the eXtensible Event Stream (XES). Unfortunately, in many real-world settings such event logs are not explicitly given, but are instead implicitly represented in legacy information systems. In this work, we exploit a framework and associated methodology, which we recently introduced, for extracting XES event logs from relational data sources. Our approach describes logs by means of suitable annotations of a conceptual model of the available data, and builds on the ontology-based data access (OBDA) paradigm for the actual log extraction. Using a real-world case study in the services domain, we compare our approach with a more traditional one based on extract-transform-load (ETL), and illustrate its added value. We also present a set of tools that we have developed to support the OBDA-based log extraction framework. The tools are integrated as plug-ins of the ProM process mining suite.
Keywords: Process mining; Ontology-based data access; Event log extraction; Relational database management systems
This research has been partially supported by the Euregio IPN12 “KAOS: Knowledge-Aware Operational Support” project, which is funded by the “European Region Tyrol-South Tyrol-Trentino” (EGTC) under the first call for basic research projects and by the UNIBZ internal project “OnProm”. We thank Ario Santoso for the development of the log extraction plug-in of onprom, and Wil van der Aalst for the interesting discussions and insights on the problem of extracting event logs from legacy information systems.