Enhanced Benchmark Datasets for a Comprehensive Evaluation of Process Model Matching Techniques

  • Muhammad Ali
  • Khurram Shahzad
Conference paper
Part of the Lecture Notes in Business Information Processing book series (LNBIP, volume 332)


Process Model Matching (PMM) refers to the automatic identification of corresponding activities between a pair of process models. Recognizing the pivotal role of PMM in numerous application areas, researchers have developed a plethora of matching techniques. To evaluate the effectiveness of these techniques, researchers typically use the PMMC’15 datasets and three well-established performance measures: precision, recall, and F1 score. The scores of these measures are useful for a surface-level evaluation of a matching technique. However, these overall scores do not provide deeper insights into the capabilities of a matching technique. To that end, we enhance the PMMC’15 datasets by classifying corresponding pairs into three types and computing performance scores for each type separately. We contend that the performance scores for each type of corresponding pair, together with the surface-level performance scores, provide valuable insights into the capabilities of a matching technique. As a second contribution, we use the enhanced datasets for a comprehensive evaluation of three prominent semantic similarity measures. As a third contribution, we use the enhanced datasets for a comprehensive evaluation of the results of twelve matching systems from the PMM Contest 2015. From the results, we conclude that there is a need to develop a next generation of matching techniques that are equally effective for all three types of pairs.
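The difference between overall scores and per-type scores can be illustrated with a minimal Python sketch. It is not the authors' implementation. The type labels (`type_1`, `type_2`, `type_3`), the activity identifiers, and the function names are hypothetical, since the abstract does not name the three types. The sketch also assumes that only gold-standard pairs carry a type, so it reports recall per type; per-type precision would need a rule for assigning a type to false positives, which the abstract does not specify.

```python
from collections import defaultdict


def prf(tp, fp, fn):
    """Return (precision, recall, F1) from raw counts, using 0.0 for empty denominators."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return precision, recall, f1


def overall_scores(gold, predicted):
    """Surface-level scores: compare all predicted pairs with all gold pairs."""
    gold_pairs = set(gold)
    tp = len(gold_pairs & predicted)
    fp = len(predicted - gold_pairs)
    fn = len(gold_pairs - predicted)
    return prf(tp, fp, fn)


def recall_by_type(gold, predicted):
    """Fraction of gold pairs of each type that the matcher found."""
    found = defaultdict(int)
    total = defaultdict(int)
    for pair, pair_type in gold.items():
        total[pair_type] += 1
        if pair in predicted:
            found[pair_type] += 1
    return {t: found[t] / total[t] for t in total}


# Hypothetical gold standard: activity pair -> assumed type label.
gold = {
    ("a1", "b1"): "type_1",
    ("a2", "b2"): "type_1",
    ("a3", "b3"): "type_2",
    ("a4", "b4"): "type_3",
}
# Hypothetical matcher output: two correct pairs and one false positive.
predicted = {("a1", "b1"), ("a2", "b2"), ("a3", "b9")}

precision, recall, f1 = overall_scores(gold, predicted)
per_type = recall_by_type(gold, predicted)
print(f"overall: P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
print("recall by type:", per_type)
```

In this toy example the overall recall of 0.5 hides the fact that the matcher finds every `type_1` pair and none of the other two types, which is the kind of insight the per-type breakdown is meant to expose.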


Keywords: Business process management · Process Model Matching · PMMC’15 datasets · Enhanced datasets · Comprehensive evaluation



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Software Development and Maintenance Center, University of Gujrat, Gujrat, Pakistan
  2. Punjab University College of Information Technology, University of the Punjab, Lahore, Pakistan
