Abstract
Software defect prediction (SDP) is one of the most influential research topics in software engineering. Early within-project defect prediction (WPDP) relied on intra-project data. However, its prediction performance is limited for new projects and for projects without adequate training data. This has motivated studies of cross-project defect prediction (CPDP), in which models are trained on the historical data of other projects. Heterogeneous defect prediction (HDP) is the special case of CPDP in which the source and target projects have different metric sets. Despite the effectiveness of existing HDP methods, they can be affected by class imbalance, which may decrease prediction performance. The proposed framework aims to exploit cost-sensitive principal component analysis (PCA) and feature matching to build a highly effective prediction model.
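The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation, and several choices are assumptions made only for illustration: defective modules are up-weighted by a hypothetical `defect_cost` factor when the covariance matrix for PCA is computed, source and target metrics are compared with the two-sample Kolmogorov-Smirnov statistic, and pairs are chosen greedily one-to-one. All function names are hypothetical.

```python
import numpy as np


def class_weighted_pca(X, y, n_components, defect_cost=5.0):
    """PCA on a weighted covariance matrix (assumed cost-sensitive scheme).

    Rows with y == 1 (defective) count defect_cost times as much as
    clean rows, so the minority class has more influence on the axes.
    """
    w = np.where(y == 1, defect_cost, 1.0)
    w = w / w.sum()                       # normalise weights to sum to 1
    mean = w @ X                          # weighted mean of each metric
    Xc = X - mean
    cov = (Xc * w[:, None]).T @ Xc        # weighted covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]        # top directions, one per column
    return Xc @ components, components, mean


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between
    the empirical CDFs of a and b."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))


def match_features(S, T):
    """Greedily pair source columns with target columns, most similar
    distributions (smallest KS statistic) first, each column used once."""
    d = np.array([[ks_statistic(S[:, i], T[:, j])
                   for j in range(T.shape[1])]
                  for i in range(S.shape[1])])
    pairs, used_s, used_t = [], set(), set()
    for flat in np.argsort(d, axis=None):
        i, j = (int(k) for k in np.unravel_index(flat, d.shape))
        if i not in used_s and j not in used_t:
            pairs.append((i, j, float(d[i, j])))
            used_s.add(i)
            used_t.add(j)
    return pairs


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-ins for a source project (5 metrics, labelled)
    # and a target project (3 different metrics, unlabelled).
    X_src = rng.normal(size=(200, 5))
    y_src = (rng.random(200) < 0.1).astype(int)   # about 10% defective
    X_tgt = rng.normal(loc=0.5, size=(150, 3))

    Z_src, _, _ = class_weighted_pca(X_src, y_src, n_components=3)
    for i, j, dist in match_features(Z_src, X_tgt):
        print(f"source feature {i} <-> target feature {j} (KS = {dist:.3f})")
```

Whether matching is applied to the raw metrics or to the PCA-transformed features, and which similarity measure is used, depends on the actual framework; the sketch simply chains the two steps to show how they could fit together.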
Copyright information
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Thae Hsu Hsu Mon, Hnin Min Oo (2020). Feature Representation and Feature Matching for Heterogeneous Defect Prediction. In: Lee, R. (eds) Computer and Information Science. ICIS 2019. Studies in Computational Intelligence, vol 849. Springer, Cham. https://doi.org/10.1007/978-3-030-25213-7_1
DOI: https://doi.org/10.1007/978-3-030-25213-7_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-25212-0
Online ISBN: 978-3-030-25213-7
eBook Packages: Intelligent Technologies and Robotics (R0)