Semantic Categorization of Software Bug Repositories for Severity Assignment Automation

Chapter in: Integrating Research and Practice in Software Engineering

Part of the book series: Studies in Computational Intelligence (SCI, volume 851)

Abstract

Bug triage is a crucial activity in the maintenance phase of large-scale software projects, where reported bugs must be fixed. In this paper, we propose an approach that automates one important triage activity: bug severity assignment. The approach mines the historical bug repositories of software projects. It uses the Hierarchical Dirichlet Process (HDP) topic modeller to extract the topics shared by the historical bug reports, then categorizes the reports according to their proportions in the extracted topics using the K-means clustering algorithm. For each newly submitted report, the top K similar reports are retrieved from its cluster using a novel weighted K-nearest neighbour algorithm that uses a similarity measure called Improved-Sqrt-Cosine similarity. The severity level of the new bug is then assigned using a Dual-weighted voting scheme. Experimental results on two popular bug repositories, Eclipse and Mozilla, show that the proposed model improves the performance of bug severity assignment compared with three baseline models.
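The retrieval and voting steps can be illustrated with a minimal sketch. It is not the chapter's implementation. It assumes each report is already represented by a non-negative topic-proportion vector, that Improved-Sqrt-Cosine similarity is the sum of sqrt(x_i * y_i) divided by sqrt(sum x_i) * sqrt(sum y_i), and that the vote uses a dual distance weight in the style of Gou et al., w_i = ((d_K - d_i) / (d_K - d_1)) * ((d_K + d_1) / (d_K + d_i)), with distance taken as 1 minus similarity. The names `isc_similarity` and `assign_severity`, the distance conversion, and the example vectors are all hypothetical; the abstract does not specify the exact weighting.

```python
import math
from collections import defaultdict


def isc_similarity(x, y):
    """Improved sqrt-cosine similarity of two non-negative vectors (assumed form)."""
    numerator = sum(math.sqrt(a * b) for a, b in zip(x, y))
    denominator = math.sqrt(sum(x)) * math.sqrt(sum(y))
    return numerator / denominator if denominator > 0 else 0.0


def assign_severity(query, reports, k=3):
    """Vote a severity label for `query` from its k most similar reports.

    `reports` is a list of (topic_vector, severity_label) pairs taken from
    the cluster the query falls into. Distance = 1 - similarity (assumption).
    """
    if not reports:
        raise ValueError("the cluster contains no historical reports")
    # Sort neighbours by distance and keep the k nearest.
    neighbours = sorted(
        ((max(0.0, 1.0 - isc_similarity(query, vec)), label) for vec, label in reports),
        key=lambda pair: pair[0],
    )[:k]
    d_near, d_far = neighbours[0][0], neighbours[-1][0]
    votes = defaultdict(float)
    for dist, label in neighbours:
        if d_far == d_near:
            weight = 1.0  # all neighbours equally distant
        else:
            weight = ((d_far - dist) / (d_far - d_near)) * ((d_far + d_near) / (d_far + dist))
        votes[label] += weight
    return max(votes, key=votes.get)


# Hypothetical topic-proportion vectors for four historical reports.
history = [
    ([0.7, 0.2, 0.1], "major"),
    ([0.6, 0.3, 0.1], "major"),
    ([0.1, 0.1, 0.8], "minor"),
    ([0.1, 0.2, 0.7], "minor"),
]
predicted = assign_severity([0.7, 0.2, 0.1], history, k=3)
```

Under this weighting the nearest neighbour always receives weight 1 and the K-th neighbour weight 0, so close reports dominate the vote even when a more distant severity class is numerically larger within the top K.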

Author information

Corresponding author

Correspondence to AbdulRahman El-Laithy.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Hamdy, A., El-Laithy, A. (2020). Semantic Categorization of Software Bug Repositories for Severity Assignment Automation. In: Jarzabek, S., Poniszewska-Marańda, A., Madeyski, L. (eds) Integrating Research and Practice in Software Engineering. Studies in Computational Intelligence, vol 851. Springer, Cham. https://doi.org/10.1007/978-3-030-26574-8_2
