Semantic Categorization of Software Bug Repositories for Severity Assignment Automation

Hamdy, Abeer; El-Laithy, AbdulRahman

doi:10.1007/978-3-030-26574-8_2

Abeer Hamdy & AbdulRahman El-Laithy

Part of the book series: Studies in Computational Intelligence (SCI, volume 851)

Abstract

Bug triage is a crucial activity in the maintenance phase of large-scale software projects, where reported bugs must be fixed. In this paper we propose an approach to automate one important bug triage activity: bug severity assignment. The proposed approach is based on mining the historical bug repositories of software projects. It uses the Hierarchical Dirichlet Process (HDP) topic modeller to extract the topics shared by the historical bug reports, and then categorizes the reports according to their proportions in the extracted topics using the K-means clustering algorithm. For each newly submitted report, the top K most similar reports are retrieved from their cluster using a novel weighted K-nearest neighbour algorithm that uses a similarity measure called Improved-Sqrt-Cosine similarity. The severity level of the new bug is then assigned using a dual-weighted voting scheme. The experimental results demonstrate that the proposed model improves the performance of bug severity assignment compared with three baseline models on two popular bug repositories, Eclipse and Mozilla.
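
The retrieval and voting steps described in the abstract can be illustrated with a minimal sketch. It assumes each report is already represented by a vector of non-negative topic proportions and that the candidate reports come from the cluster matched to the new report. The similarity function follows the commonly stated form of Improved Sqrt-Cosine similarity, the sum of sqrt(x_i * y_i) divided by sqrt(sum of x_i) times sqrt(sum of y_i). The voting rule below is a plain similarity-weighted vote, used here as a stand-in; it is not the chapter's dual-weighted scheme, whose exact weights the abstract does not give. The names `improved_sqrt_cosine` and `predict_severity` are hypothetical.

```python
import math
from collections import defaultdict


def improved_sqrt_cosine(x, y):
    """Improved Sqrt-Cosine similarity between two non-negative vectors.

    Assumed form: sum(sqrt(x_i * y_i)) / (sqrt(sum(x)) * sqrt(sum(y))).
    """
    numerator = sum(math.sqrt(a * b) for a, b in zip(x, y))
    denominator = math.sqrt(sum(x)) * math.sqrt(sum(y))
    return numerator / denominator if denominator > 0 else 0.0


def predict_severity(query, cluster_reports, k=3):
    """Pick a severity label for `query` from its k most similar reports.

    `cluster_reports` is a list of (topic_proportions, severity) pairs.
    Assumption: each neighbour votes with weight equal to its similarity;
    this is a simplification, not the chapter's dual-weighted voting.
    """
    scored = sorted(
        ((improved_sqrt_cosine(query, vec), sev) for vec, sev in cluster_reports),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    votes = defaultdict(float)
    for similarity, severity in scored:
        votes[severity] += similarity
    return max(votes, key=votes.get)


# Toy data: topic proportions over three topics with known severities.
cluster = [
    ([0.7, 0.2, 0.1], "major"),
    ([0.6, 0.3, 0.1], "major"),
    ([0.1, 0.1, 0.8], "minor"),
    ([0.2, 0.1, 0.7], "minor"),
]
new_report = [0.65, 0.25, 0.10]
predicted = predict_severity(new_report, cluster, k=3)
```

With this toy data, the two "major" reports are the closest neighbours of `new_report`, so their combined weight outvotes the single "minor" neighbour.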

Author information

Authors and Affiliations

Faculty of Informatics and Computer Science, British University in Egypt, El Shorouk, Egypt
Abeer Hamdy & AbdulRahman El-Laithy
Computers and Systems Department, Electronics Research Institute, Giza, Egypt
Abeer Hamdy

Corresponding author

Correspondence to AbdulRahman El-Laithy.

Editor information

Editors and Affiliations

Faculty of Computer Science, Bialystok University of Technology, Białystok, Poland
Stan Jarzabek
Institute of Information Technology, Lodz University of Technology, Łódź, Poland
Aneta Poniszewska-Marańda
Faculty of Computer Science and Management, Wroclaw University of Science and Technology, Wrocław, Poland
Lech Madeyski

About this chapter

Cite this chapter

Hamdy, A., El-Laithy, A. (2020). Semantic Categorization of Software Bug Repositories for Severity Assignment Automation. In: Jarzabek, S., Poniszewska-Marańda, A., Madeyski, L. (eds) Integrating Research and Practice in Software Engineering. Studies in Computational Intelligence, vol 851. Springer, Cham. https://doi.org/10.1007/978-3-030-26574-8_2

DOI: https://doi.org/10.1007/978-3-030-26574-8_2
Published: 03 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-26573-1
Online ISBN: 978-3-030-26574-8
eBook Packages: Intelligent Technologies and Robotics (R0)
