A transfer cost-sensitive boosting approach for cross-project defect prediction

Abstract

Software defect prediction is regarded as a crucial task for improving software quality because it directs valuable resources toward fault-prone modules. Building a predictor requires a sufficient set of historical data. When a company lacks such data, cross-project defect prediction (CPDP) can be employed, in which data from other companies are used to build predictors. In such cases, a transfer learning technique, which extracts common knowledge from source projects and transfers it to a target project, can enhance the prediction performance. A further obstacle is the class imbalance problem, which makes it difficult for the learner to predict defects, yet the main impacts of imbalanced data under cross-project settings have not been investigated in depth. We propose a transfer cost-sensitive boosting method that considers both knowledge transfer and class imbalance for CPDP when a small amount of labeled target data is available. The proposed approach performs boosting that assigns weights to the training instances based on both distributional characteristics and class imbalance. Through comparative experiments with transfer learning and class imbalance learning techniques, we show that the proposed model provides significantly higher defect detection accuracy while retaining better overall performance. These results indicate that combining transfer learning with class imbalance learning is highly effective for improving prediction performance under cross-project settings. The proposed approach will help to design an effective prediction model for CPDP, and the improved defect prediction performance could help to direct software quality assurance activities, reduce costs, and thereby manage software quality more effectively.
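To make the idea of weighting training instances by both their origin and their class more concrete, the following is a minimal sketch of one boosting round. It is not the TCSBoost update itself, which the abstract does not specify. It assumes an AdaBoost-style step size, a TrAdaBoost-style down-weighting of misclassified source-project instances, and a hypothetical cost factor `cost_defective` that boosts misclassified defective instances from the target project. The function name `reweight` and all parameters are illustrative.

```python
import math


def reweight(weights, y_true, y_pred, is_target, error, cost_defective=2.0):
    """Return normalized instance weights after one illustrative boosting round.

    weights        current instance weights
    y_true, y_pred labels (1 = defective, 0 = clean) and weak-learner predictions
    is_target      True for labeled target-project instances, False for source data
    error          weighted error of the weak learner, assumed to lie in (0, 0.5)
    cost_defective hypothetical misclassification cost for the defective class
    """
    if not 0.0 < error < 0.5:
        raise ValueError("error must lie in (0, 0.5)")
    # AdaBoost-style step size: larger when the weak learner is more accurate.
    alpha = 0.5 * math.log((1.0 - error) / error)

    updated = []
    for w, y, p, tgt in zip(weights, y_true, y_pred, is_target):
        if y == p:
            factor = 1.0  # correctly classified: weight unchanged
        elif tgt:
            # Misclassified target instance: raise its weight, and raise it
            # more strongly when the missed instance is defective.
            cost = cost_defective if y == 1 else 1.0
            factor = math.exp(alpha * cost)
        else:
            # Misclassified source instance: lower its weight, on the
            # assumption that it fits the target distribution poorly.
            factor = math.exp(-alpha)
        updated.append(w * factor)

    total = sum(updated)
    return [w / total for w in updated]
```

Under these assumptions, a missed defective module in the target project gains the most weight, while a misclassified source instance loses weight, so later rounds focus on the target distribution and on the minority class.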

Acknowledgments

This work was partly supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (Ministry of Science, ICT and Future Planning (MSIP)) (No. NRF-2013R1A1A2006985) and the Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. R0101-15-0144, Development of Autonomous Intelligent Collaboration Framework for Knowledge Bases and Smart Devices).

Author information

Corresponding author

Correspondence to Duksan Ryu.

Appendix

Figures 10, 11, 12, and 13 show mini box plots of the median PD, PF, G-mean, and Balance values, respectively, of six models for each target project over 15 data sets using 5 % of WP data. Tables 16 through 30 compare TCSBoost with classification models for each target project using 5 % of WP data. Boldface in the tables marks results where TCSBoost is significantly better, with p value < 0.01 or A-statistic > 0.64 (for PF, A-statistic < 0.36).
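The measures named above can be sketched in a few lines of Python. This section does not restate their definitions, so the formulas below are assumptions based on the forms commonly used in defect prediction studies: PD as recall on the defective class, PF as the false alarm rate, G-mean as the geometric mean of PD and 1 - PF, and Balance as one minus the normalized distance to the ideal point (PF = 0, PD = 1). The A-statistic follows the Vargha and Delaney definition. All function names are illustrative.

```python
import math


def pd_pf(y_true, y_pred):
    """Return (PD, PF) for binary labels where 1 = defective and 0 = clean.

    Assumes both classes occur in y_true; otherwise a division by zero occurs.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp / (tp + fn), fp / (fp + tn)


def g_mean(pd, pf):
    """Geometric mean of the detection rate and the true negative rate."""
    return math.sqrt(pd * (1.0 - pf))


def balance(pd, pf):
    """One minus the normalized Euclidean distance to the point PF=0, PD=1."""
    return 1.0 - math.sqrt((0.0 - pf) ** 2 + (1.0 - pd) ** 2) / math.sqrt(2.0)


def vargha_delaney_a(xs, ys):
    """Probability that a value drawn from xs exceeds one drawn from ys.

    Ties count as one half. A value of 0.5 means no difference between groups.
    """
    wins = 0.0
    for x in xs:
        for y in ys:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(xs) * len(ys))
```

With these helpers, an A-statistic above 0.64 for PD, G-mean, or Balance (or below 0.36 for PF, where lower is better) would correspond to the thresholds used for boldface in the tables.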

Fig. 10

Mini box plots of median PD values of six models over 15 data sets using 5 % WP data

Fig. 11

Mini box plots of median PF values of six models over 15 data sets using 5 % WP data

Fig. 12

Mini box plots of median G-mean values of six models over 15 data sets using 5 % WP data

Fig. 13

Mini box plots of median Balance values of six models over 15 data sets using 5 % WP data

Table 16 Comparison of TCSBoost with classification models for the ant project using 5 % of WP data
Table 17 Comparison of TCSBoost with classification models for the arc project using 5 % of WP data
Table 18 Comparison of TCSBoost with classification models for the camel project using 5 % of WP data
Table 19 Comparison of TCSBoost with classification models for the e-learning project using 5 % of WP data
Table 20 Comparison of TCSBoost with classification models for the jedit project using 5 % of WP data
Table 21 Comparison of TCSBoost with classification models for the log4j project using 5 % of WP data
Table 22 Comparison of TCSBoost with classification models for the lucene project using 5 % of WP data
Table 23 Comparison of TCSBoost with classification models for the poi project using 5 % of WP data
Table 24 Comparison of TCSBoost with classification models for the prop-6 project using 5 % of WP data
Table 25 Comparison of TCSBoost with classification models for the redaktor project using 5 % of WP data
Table 26 Comparison of TCSBoost with classification models for the synapse project using 5 % of WP data
Table 27 Comparison of TCSBoost with classification models for the systemdata project using 5 % of WP data
Table 28 Comparison of TCSBoost with classification models for the tomcat project using 5 % of WP data
Table 29 Comparison of TCSBoost with classification models for the xalan project using 5 % of WP data
Table 30 Comparison of TCSBoost with classification models for the xerces project using 5 % of WP data

Cite this article

Ryu, D., Jang, JI. & Baik, J. A transfer cost-sensitive boosting approach for cross-project defect prediction. Software Qual J 25, 235–272 (2017). https://doi.org/10.1007/s11219-015-9287-1

Keywords

  • Boosting
  • Class imbalance
  • Cost-sensitive learning
  • Cross-project defect prediction
  • Software defect prediction
  • Transfer learning