Trace Clustering Based on Conserved Patterns: Towards Achieving Better Process Models

  • Conference paper
Business Process Management Workshops (BPM 2009)

Part of the book series: Lecture Notes in Business Information Processing (LNBIP, volume 43)

Abstract

Process mining refers to the extraction of process models from event logs. Real-life processes tend to be less structured and more flexible. Traditional process mining algorithms have problems dealing with such unstructured processes and generate “spaghetti-like” process models that are hard to comprehend. One way to overcome this is to cluster process instances such that each resulting cluster corresponds to a coherent set of process instances that can be adequately represented by a process model. In this paper, we present multiple feature sets based on conserved patterns and show that the proposed feature sets perform better than contemporary approaches. We evaluate the goodness of the formed clusters using established fitness and comprehensibility metrics defined in the context of process mining. The proposed approach is able to generate clusters such that the process models mined from the clustered traces show a high degree of fitness and comprehensibility. Further, the proposed feature sets can be discovered in linear time, making them amenable to real-time analysis of large data sets.
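As a rough illustration of the general idea (representing each trace by counts of patterns it shares with other traces, then clustering the resulting vectors), here is a minimal sketch. It is not the paper's method: the abstract does not specify the feature sets or the clustering algorithm. The sketch assumes traces are encoded as strings of single-character activity labels, uses fixed-length n-grams that occur in several traces as a simple stand-in for conserved patterns, and applies a naive average-linkage agglomerative clustering. All function names and the toy log are hypothetical.

```python
from collections import Counter
from math import dist


def conserved_ngrams(log, n=3, min_support=2):
    """Return n-grams present in at least `min_support` traces.

    Assumption: shared n-grams are only a stand-in for the
    conserved patterns mentioned in the abstract.
    """
    support = Counter()
    for trace in log:
        # A set, so each trace contributes at most once per n-gram.
        support.update({trace[i:i + n] for i in range(len(trace) - n + 1)})
    return sorted(g for g, s in support.items() if s >= min_support)


def count_overlapping(trace, pattern):
    """Count (possibly overlapping) occurrences of pattern in trace."""
    last_start = len(trace) - len(pattern)
    return sum(trace.startswith(pattern, i) for i in range(last_start + 1))


def feature_vectors(log, patterns):
    """One vector per trace: occurrence count of each pattern."""
    return [[count_overlapping(t, p) for p in patterns] for t in log]


def agglomerate(vectors, k):
    """Naive average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(vectors))]

    def linkage(a, b):
        total = sum(dist(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > k:
        pairs = ((x, y) for x in range(len(clusters))
                 for y in range(x + 1, len(clusters)))
        x, y = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        merged = clusters.pop(y)  # y > x, so index x stays valid
        clusters[x].extend(merged)
    return [sorted(c) for c in clusters]


# Hypothetical toy log: two process variants, each with and without a loop.
log = ["abcde", "abcdbcde", "axyze", "axyzxyze"]
patterns = conserved_ngrams(log)
clusters = agglomerate(feature_vectors(log, patterns), k=2)
print(patterns)  # ['abc', 'axy', 'bcd', 'cde', 'xyz', 'yze']
print(clusters)  # [[0, 1], [2, 3]]
```

On this toy log the two variants end up in separate clusters, so a process model could then be mined per cluster rather than from the whole log. The quadratic pair search is for readability only and would not scale to large logs.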

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bose, R.P.J.C., van der Aalst, W.M.P. (2010). Trace Clustering Based on Conserved Patterns: Towards Achieving Better Process Models. In: Rinderle-Ma, S., Sadiq, S., Leymann, F. (eds) Business Process Management Workshops. BPM 2009. Lecture Notes in Business Information Processing, vol 43. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12186-9_16

  • DOI: https://doi.org/10.1007/978-3-642-12186-9_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12185-2

  • Online ISBN: 978-3-642-12186-9

  • eBook Packages: Computer Science, Computer Science (R0)
