Abstract
Existing process mining approaches can tolerate a certain degree of noise in the process log. However, processes that contain infrequent paths or multiple (nested) parallel branches, or that have been changed in an ad-hoc manner, still pose major challenges. For such cases, process mining typically returns “spaghetti models” that are hardly usable even as a starting point for process (re-)design. In this paper, we address these challenges by introducing data transformation and pre-processing steps that improve and ensure the quality of mined models for existing process mining approaches. We propose the concept of semantic log purging: the cleaning of logs based on domain-specific constraints, utilizing the semantic knowledge that typically complements processes. Furthermore, we demonstrate the feasibility and effectiveness of the approach in a case study in the higher education domain. We believe that semantic log purging will enable process mining to yield better results, thus giving process (re-)designers a valuable tool.
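To make the idea of semantic log purging more concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes that a log maps case identifiers to ordered lists of activity names, that a domain constraint can be expressed as a predicate over a single trace, and that purging means dropping every case that violates at least one constraint. The names `precedes` and `purge_log`, as well as the example activities, are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Trace = List[str]                      # ordered activity names of one case
Constraint = Callable[[Trace], bool]   # True if the trace satisfies the rule


def precedes(first: str, second: str) -> Constraint:
    """Hypothetical rule: each `second` needs an earlier `first` in the trace."""
    def check(trace: Trace) -> bool:
        seen_first = False
        for activity in trace:
            if activity == first:
                seen_first = True
            elif activity == second and not seen_first:
                return False
        return True
    return check


def purge_log(log: Dict[str, Trace],
              constraints: List[Constraint]) -> Dict[str, Trace]:
    """Keep only the cases whose trace satisfies every constraint."""
    return {
        case_id: trace
        for case_id, trace in log.items()
        if all(rule(trace) for rule in constraints)
    }


# Invented example data; not taken from the case study.
log = {
    "student-1": ["register", "submit exercise", "take exam"],
    "student-2": ["submit exercise", "take exam"],   # exam without registration
    "student-3": ["register", "take exam"],
}
rules = [precedes("register", "take exam")]

purged = purge_log(log, rules)
print(sorted(purged))
```

In this sketch the purged log would then be handed to an existing mining algorithm in place of the raw log. Whether violating cases are dropped entirely or only the offending events are removed is a design choice that the abstract leaves open.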
The work presented in this paper has been partly conducted within the project I743-N23 funded by the Austrian Science Fund (FWF).
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ly, L.T., Indiono, C., Mangler, J., Rinderle-Ma, S. (2012). Data Transformation and Semantic Log Purging for Process Mining. In: Ralyté, J., Franch, X., Brinkkemper, S., Wrycza, S. (eds) Advanced Information Systems Engineering. CAiSE 2012. Lecture Notes in Computer Science, vol 7328. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31095-9_16
DOI: https://doi.org/10.1007/978-3-642-31095-9_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31094-2
Online ISBN: 978-3-642-31095-9
eBook Packages: Computer Science (R0)