Distributed and Parallel Databases, Volume 31, Issue 4, pp 471–507

Decomposing Petri nets for process mining: A generic approach

  • Wil M. P. van der Aalst
Article

Abstract

The practical relevance of process mining is increasing as more and more event data become available. Process mining techniques aim to discover, monitor and improve real processes by extracting knowledge from event logs. The two most prominent process mining tasks are: (i) process discovery: learning a process model from example behavior recorded in an event log, and (ii) conformance checking: diagnosing and quantifying discrepancies between observed behavior and modeled behavior. The increasing volume of event data provides both opportunities and challenges for process mining. Existing process mining techniques have problems dealing with large event logs referring to many different activities. Therefore, we propose a generic approach to decompose process mining problems. The decomposition approach is generic and can be combined with different existing process discovery and conformance checking techniques. It is possible to split computationally challenging process mining problems into many smaller problems that can be analyzed easily and whose results can be combined into solutions for the original problems.
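The decomposition idea for conformance checking can be illustrated with a minimal Python sketch. It is not the article's formal construction: the toy model (a, then b and c in either order, then d), the two activity subsets, and the names `project`, `fits_decomposed`, `FRAGMENTS` and `LOG` are all assumptions made for illustration. Each trace is projected onto the activities of a model fragment, each fragment checks its own projection, and the per-fragment verdicts are combined.

```python
from typing import Callable, List, Sequence, Set, Tuple

Trace = Sequence[str]
# A fragment pairs an activity subset with a checker for that sub-model.
Fragment = Tuple[Set[str], Callable[[List[str]], bool]]


def project(trace: Trace, activities: Set[str]) -> List[str]:
    """Keep only the events whose activity belongs to the given subset."""
    return [a for a in trace if a in activities]


def fits_decomposed(trace: Trace, fragments: Sequence[Fragment]) -> bool:
    """Accept a trace only if every fragment accepts its projection."""
    return all(accepts(project(trace, acts)) for acts, accepts in fragments)


# Toy model (assumed, not from the article): a, then b and c in either
# order, then d. Each fragment sees only three of the four activities.
FRAGMENTS: List[Fragment] = [
    ({"a", "b", "d"}, lambda t: t == ["a", "b", "d"]),
    ({"a", "c", "d"}, lambda t: t == ["a", "c", "d"]),
]

LOG = [
    ["a", "b", "c", "d"],
    ["a", "c", "b", "d"],
    ["a", "b", "d"],       # c is missing
    ["b", "a", "c", "d"],  # b occurs before a
]

results = [fits_decomposed(trace, FRAGMENTS) for trace in LOG]
print(results)  # [True, True, False, False]
```

Because each fragment only needs its own projected sub-log, the per-fragment checks in this sketch are independent of one another and could be run on separate machines before the verdicts are combined.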

Keywords

Process mining, Process decomposition, Distributed conformance checking, Distributed process discovery, Petri nets

Notes

Acknowledgements

This work was supported by the Basic Research Program of the National Research University Higher School of Economics (HSE) in Moscow.

Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  1. Architecture of Information Systems, Eindhoven University of Technology, Eindhoven, The Netherlands
  2. International Laboratory of Process-Aware Information Systems, National Research University Higher School of Economics (HSE), Moscow, Russia
