Process Mining Event Logs from FLOSS Data: State of the Art and Perspectives

  • Conference paper
Software Engineering and Formal Methods (SEFM 2014)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 8938)

Abstract

Free/Libre Open Source Software (FLOSS) has triggered extensive research. At the heart of this research is the ability to mine data from FLOSS repositories in the hope of revealing empirical evidence that answers open questions about the FLOSS development process. Despite the success of existing mining techniques, emerging questions about FLOSS data call for alternative, more appropriate ways to explore and analyse such data.

In this paper, we explore a different perspective called process mining. Process mining has proved successful at tracing and reconstructing process models from data logs (event logs). The objective of our analysis is threefold: (1) checking conformance to predefined models; (2) discovering new model patterns; and (3) extending predefined models.

Notes

  1. FLEX syntax is used by Flex (the fast lexical analyzer generator), a tool that generates programs for pattern matching in text. It takes a user-written specification as input and produces a C source file.

Author information

Corresponding author

Correspondence to Patrick Mukala.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Mukala, P., Cerone, A., Turini, F. (2015). Process Mining Event Logs from FLOSS Data: State of the Art and Perspectives. In: Canal, C., Idani, A. (eds) Software Engineering and Formal Methods. SEFM 2014. Lecture Notes in Computer Science, vol 8938. Springer, Cham. https://doi.org/10.1007/978-3-319-15201-1_12

  • DOI: https://doi.org/10.1007/978-3-319-15201-1_12

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-15200-4

  • Online ISBN: 978-3-319-15201-1

  • eBook Packages: Computer Science (R0)
