Journal of Intelligent Information Systems, Volume 32, Issue 2, pp 163–190

A novel approach for process mining based on event types

  • Lijie Wen (email author)
  • Jianmin Wang
  • Wil M. P. van der Aalst
  • Biqing Huang
  • Jiaguang Sun
Abstract

Despite the omnipresence of event logs in transactional information systems (cf. WFM, ERP, CRM, SCM, and B2B systems), historic information is rarely used to analyze the underlying processes. Process mining aims at improving this by providing techniques and tools for discovering process, control, data, organizational, and social structures from event logs. The basic idea of process mining is to diagnose business processes by mining event logs for knowledge. Given its potential and challenges, it is no surprise that process mining has recently become a vibrant research area. In this paper, a novel approach for process mining based on two event types, i.e., START and COMPLETE, is proposed. Information about the start and completion of tasks can be used to explicitly detect parallelism. The algorithm presented in this paper overcomes some of the limitations of existing algorithms such as the α-algorithm (e.g., short loops) and therefore enhances the applicability of process mining.
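The idea that START and COMPLETE events expose parallelism can be illustrated with a small sketch. This is not the algorithm proposed in the paper; it is a minimal illustration, assuming a trace is given as a list of hypothetical `(task, event_type)` pairs in log order and that no task runs twice at the same time within one trace. Two tasks are reported as overlapping when one starts while the other is still running. The function name `parallel_pairs` and the sample trace are made up for this example.

```python
def parallel_pairs(trace):
    """Return the set of task pairs whose executions overlap in one trace.

    `trace` is a list of (task, event_type) tuples in log order, where
    event_type is "START" or "COMPLETE". Assumption: a task is not
    started again before its previous instance has completed.
    """
    running = set()    # tasks that have started but not yet completed
    overlaps = set()   # unordered pairs of tasks observed running together
    for task, event_type in trace:
        if event_type == "START":
            # Every task still running overlaps with the one starting now.
            for other in running:
                if other != task:
                    overlaps.add(frozenset((task, other)))
            running.add(task)
        elif event_type == "COMPLETE":
            running.discard(task)
    return overlaps


# Hypothetical trace: A runs alone, then B and C overlap, then D runs alone.
trace = [
    ("A", "START"), ("A", "COMPLETE"),
    ("B", "START"), ("C", "START"),
    ("B", "COMPLETE"), ("C", "COMPLETE"),
    ("D", "START"), ("D", "COMPLETE"),
]
print(sorted(tuple(sorted(pair)) for pair in parallel_pairs(trace)))
# prints [('B', 'C')]
```

With only one event per task, the orderings B, C and C, B would both have to appear in the log before parallelism could be inferred; with two event types, a single trace in which the executions overlap already provides direct evidence.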

Keywords

Process mining, Workflow mining, Data mining, Event types, Petri nets, WF-nets, DWF-nets

Notes

Acknowledgements

This work is supported by the National 973 Planning Project (No. 2002CB312006), the National Basic Research Program of China (No. 2007CB310802), the National Natural Science Foundation of China (No. 60473077), and the Program for New Century Excellent Talents in University.

The authors would like to thank Ton Weijters, Ana Karla Alves de Medeiros, Boudewijn van Dongen, Minseok Song, Laura Maruster, Eric Verbeek, Monique Jansen-Vullers, Hajo Reijers, Michael Rosemann, and Peter van den Brand for their ongoing work on process mining techniques and tools at Eindhoven University of Technology.

Copyright information

© Springer Science+Business Media, LLC 2007

Authors and Affiliations

  • Lijie Wen (1, 2, 3), email author
  • Jianmin Wang (1, 2)
  • Wil M. P. van der Aalst (4)
  • Biqing Huang (3)
  • Jiaguang Sun (1, 2)

  1. School of Software, Tsinghua University, Beijing, China
  2. Key Laboratory for Information System Security, Ministry of Education, Tsinghua National Laboratory for Information Science and Technology (TNList), Beijing, P.R. China
  3. Department of Automation, Tsinghua University, Beijing, China
  4. Department of Technology Management, Eindhoven University of Technology, Eindhoven, The Netherlands
