
Empirical Software Engineering, Volume 21, Issue 3, pp 1107–1142

An empirical study of software release notes

  • Surafel Lemma Abebe
  • Nasir Ali
  • Ahmed E. Hassan

Abstract

Release notes are an important source of information about a new software release. Such notes describe what is new, what has changed, and what has been fixed in a release. Despite their importance, release notes are rarely explored in the research literature, and little is known about their contents and structure. To better understand the types of information contained in release notes, we manually analyzed 85 release notes across 15 different software systems. In this manual analysis, we identified six different types of information (e.g., caveats and addressed issues) that are contained in release notes. Addressed issues refer to the new features, bugs, and improvements that were integrated in a particular release. We observe that most release notes list only a selected subset of the addressed issues (i.e., 6-26 % of all addressed issues in a release). We investigated nine different factors (e.g., issue priority and type) to better understand the likelihood of an issue being listed in release notes. The investigation was conducted on eight release notes of three software systems using four machine learning techniques. The results show that certain factors, e.g., issue type, have a higher influence on the likelihood of an issue being listed in release notes. We use machine learning techniques to automatically suggest the issues to be listed in release notes. Our results show that the issues listed in all studied release notes can be automatically determined with an average precision of 84 % and an average recall of 90 %. To train and build the classification models, we explored three scenarios: (a) having the user label some issues for a release and automatically suggesting the remaining issues for that release, (b) using the previous release notes of the same software system, and (c) using prior releases of the current software system together with the rest of the studied software systems. Our results show that the content of release notes varies between software systems and across versions of the same software system. Nevertheless, automated techniques can provide reasonable support to the writers of such notes with little training data. Our study provides developers with empirically-supported advice about release notes instead of simply relying on ad hoc advice from online inquiries.
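The classification setup described in the abstract can be illustrated with a small sketch. The following is a minimal, hypothetical example and not the authors' actual pipeline: it assumes Python with pandas and scikit-learn, uses a random forest as a stand-in for the study's four (unnamed here) machine learning techniques, and invents a handful of issue features (issue type, priority, comment count) to mimic the kind of factors investigated. It trains a classifier to predict whether an issue is listed in the release notes and reports precision and recall, the metrics quoted above.

```python
# Minimal sketch (not the authors' pipeline): train a classifier on issue
# metadata to predict whether an issue appears in the release notes.
# Feature names and data below are hypothetical stand-ins for the nine
# factors the study investigates (e.g., issue type and priority).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

# Hypothetical issue data: one row per addressed issue in a release.
issues = pd.DataFrame({
    "issue_type":    ["bug", "feature", "improvement", "bug", "feature", "bug"],
    "priority":      ["major", "critical", "minor", "trivial", "major", "critical"],
    "comment_count": [12, 30, 2, 1, 18, 25],
    "listed":        [0, 1, 0, 0, 1, 1],   # 1 = issue appears in release notes
})

# One-hot encode the categorical factors; keep numeric factors as-is.
X = pd.get_dummies(issues[["issue_type", "priority", "comment_count"]])
y = issues["listed"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

pred = clf.predict(X_test)
print("precision:", precision_score(y_test, pred, zero_division=0))
print("recall:   ", recall_score(y_test, pred, zero_division=0))
```

In practice the training data would come from issue trackers and existing release notes of prior releases, which corresponds to scenarios (b) and (c) above; scenario (a) corresponds to labeling a few issues of the current release and classifying the rest.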

Keywords

Software release notes · Machine learning techniques · Empirical study


Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  • Surafel Lemma Abebe (1)
  • Nasir Ali (1)
  • Ahmed E. Hassan (1)

  1. Software Analysis and Intelligence Lab (SAIL), School of Computing, Queen's University, Kingston, Canada
