Abstract
Process mining refers to the extraction of process models from event logs. Real-life processes tend to be loosely structured and flexible. Traditional process mining algorithms have problems dealing with such unstructured processes and generate “spaghetti-like” process models that are hard to comprehend. One approach to overcome this is to cluster process instances such that each resulting cluster corresponds to a coherent set of process instances that can be adequately represented by a single process model. In this paper, we present multiple feature sets based on conserved patterns and show that the proposed feature sets perform better than contemporary approaches. We evaluate the goodness of the formed clusters using established fitness and comprehensibility metrics defined in the context of process mining. The proposed approach is able to generate clusters such that the process models mined from the clustered traces show a high degree of fitness and comprehensibility. Further, the proposed feature sets can be discovered in linear time, making the approach amenable to real-time analysis of large data sets.
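The general idea of clustering traces by shared patterns can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not define the feature sets, so the sketch assumes, purely for illustration, that a "conserved pattern" is a contiguous activity subsequence of length 2 or 3 that occurs in at least two traces. The function names, the toy log, and the naive average-linkage clustering are all hypothetical, and this n-gram enumeration is not the linear-time discovery the abstract refers to.

```python
from collections import Counter
from itertools import combinations
import math


def ngrams(trace, n):
    """Return all contiguous length-n subsequences of a trace as tuples."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def conserved_patterns(log, min_len=2, max_len=3, min_support=2):
    """Assumed stand-in: n-grams that occur in at least `min_support` traces."""
    support = Counter()
    for trace in log:
        seen = set()
        for n in range(min_len, max_len + 1):
            seen.update(ngrams(trace, n))
        support.update(seen)  # count each pattern once per trace
    return sorted(p for p, s in support.items() if s >= min_support)


def feature_vector(trace, patterns, min_len=2, max_len=3):
    """Count how often each selected pattern occurs in one trace."""
    counts = Counter()
    for n in range(min_len, max_len + 1):
        counts.update(ngrams(trace, n))
    return [counts[p] for p in patterns]


def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cluster(vectors, k):
    """Naive agglomerative clustering (average linkage) down to k clusters."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(c1, c2):
        total = sum(euclidean(vectors[i], vectors[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        # Merge the closest pair; combinations() guarantees a < b.
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda ab: linkage(clusters[ab[0]], clusters[ab[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy event log (hypothetical): each trace is a sequence of activity names.
log = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "d"],
    ["a", "x", "y", "d"],
    ["a", "x", "y", "y", "d"],
]
patterns = conserved_patterns(log)
vectors = [feature_vector(t, patterns) for t in log]
clusters = cluster(vectors, 2)
print(clusters)
```

On this toy log the first two traces and the last two traces end up in separate clusters, so each cluster could then be passed to a process discovery algorithm on its own.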
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bose, R.P.J.C., van der Aalst, W.M.P. (2010). Trace Clustering Based on Conserved Patterns: Towards Achieving Better Process Models. In: Rinderle-Ma, S., Sadiq, S., Leymann, F. (eds) Business Process Management Workshops. BPM 2009. Lecture Notes in Business Information Processing, vol 43. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12186-9_16
DOI: https://doi.org/10.1007/978-3-642-12186-9_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12185-2
Online ISBN: 978-3-642-12186-9
eBook Packages: Computer Science (R0)