Feature Comparison for Automatic Bug Report Classification

Luaphol, Bancha; Srikudkao, Boonchoo; Kachai, Tontrakant; Srikanjanapert, Natthakit; Polpinij, Jantima; Bheganan, Poramin

doi:10.1007/978-3-030-19861-9_7

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 936)

Included in the following conference series:

International Conference on Computing and Information Technology

Abstract

Various bug tracking systems (BTS), such as Jira, Trace, and Bugzilla, have been developed to gather issues from users worldwide. These issues, called bug reports, contain significant information for software quality maintenance and improvement. However, many poor-quality bug reports may be submitted to a BTS. In general, reported bugs are first analyzed and filtered by bug triagers, but as the number of bug reports grows, manual classification becomes time-consuming. Automatically distinguishing bugs from non-bugs is therefore necessary. To the best of our knowledge, this task remains difficult because bug reports are still misclassified to date. This problem may arise from using inappropriate or confusing features. Therefore, this work aims to identify the most suitable features for binary bug report classification. The study compares seven feature sets: unigram, bigram, camel case, unigram+bigram, unigram+camel case, bigram+camel case, and all features together. The experimental results suggest that unigram+camel case is the most suitable feature set for distinguishing bug reports from non-bug reports, especially when used with the logistic regression algorithm.
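
As a rough illustration of the feature types compared in the abstract, the following Python sketch extracts unigram, bigram, and camel-case features from a report and combines them into one count vector. The tokenization rules, the regular expressions, the `kind:token` naming scheme, and the sample report are all assumptions made for this example; the abstract does not describe the authors' actual preprocessing.

```python
import re
from collections import Counter

# Hypothetical tokenization rules; the abstract does not specify the authors' own.
WORD_RE = re.compile(r"[A-Za-z]+")
# A camel-case token contains a lowercase letter directly followed by an
# uppercase one, e.g. "NullPointerException" or "getUserName".
CAMEL_RE = re.compile(r"\b[A-Za-z]*[a-z][A-Z][A-Za-z]*\b")


def unigrams(text):
    """Lowercased single-word tokens."""
    return [w.lower() for w in WORD_RE.findall(text)]


def bigrams(text):
    """Pairs of adjacent lowercased words, joined with an underscore."""
    words = unigrams(text)
    return [f"{a}_{b}" for a, b in zip(words, words[1:])]


def camel_case(text):
    """Identifiers written in camel case, kept with their original casing."""
    return CAMEL_RE.findall(text)


EXTRACTORS = {"unigram": unigrams, "bigram": bigrams, "camel": camel_case}


def extract(text, kinds):
    """Count features for the chosen kinds, prefixing each with its kind."""
    counts = Counter()
    for kind in kinds:
        counts.update(f"{kind}:{tok}" for tok in EXTRACTORS[kind](text))
    return counts


# Made-up report text, used only to exercise the functions above.
report = "App crashes with NullPointerException when calling getUserName"
features = extract(report, ["unigram", "camel"])
```

Passing a different list of kinds to `extract` yields the other combinations listed in the abstract. The resulting counts could then be vectorized and fed to a classifier such as logistic regression, which is not shown here.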

Author information

Authors and Affiliations

Intellect Laboratory, Department of Computer Science, Faculty of Informatics, Mahasarakham University, Mahasarakham province, Thailand
Bancha Luaphol, Boonchoo Srikudkao, Tontrakant Kachai, Natthakit Srikanjanapert & Jantima Polpinij
Computer Science Program, Mahidol University International College, Mahidol University, Nakhonpathom province, Thailand
Poramin Bheganan

Corresponding authors

Correspondence to Bancha Luaphol or Poramin Bheganan.

Editor information

Editors and Affiliations

Faculty of Information Technology, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
Pongsarun Boonyopakorn
Faculty of Information Technology, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
Phayung Meesad
Faculty of Information Technology, King Mongkut’s University of Technology North Bangkok, Bangkok, Thailand
Sunantha Sodsee
LG Kommunikationsnetze, FernUniversität in Hagen, Hagen, Germany
Herwig Unger

Cite this paper

Luaphol, B., Srikudkao, B., Kachai, T., Srikanjanapert, N., Polpinij, J., Bheganan, P. (2020). Feature Comparison for Automatic Bug Report Classification. In: Boonyopakorn, P., Meesad, P., Sodsee, S., Unger, H. (eds) Recent Advances in Information and Communication Technology 2019. IC2IT 2019. Advances in Intelligent Systems and Computing, vol 936. Springer, Cham. https://doi.org/10.1007/978-3-030-19861-9_7

DOI: https://doi.org/10.1007/978-3-030-19861-9_7
Published: 12 May 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-19860-2
Online ISBN: 978-3-030-19861-9
eBook Packages: Intelligent Technologies and Robotics (R0)
