A Cost-Sensitive Shared Hidden Layer Autoencoder for Cross-Project Defect Prediction

Li, Juanjuan; Jing, Xiao-Yuan; Wu, Fei; Sun, Ying; Yang, Yongguang

doi:10.1007/978-3-030-31726-3_42

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 11859)

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

Abstract

Cross-project defect prediction trains a classifier on the historical data of a source project and then uses it to predict whether instances of a target project are defective. Because source and target projects have different data distributions, this difference degrades the performance of the classifier. Furthermore, class imbalance in the datasets increases the difficulty of classification. Therefore, a cost-sensitive shared hidden layer autoencoder (CSSHLA) method is proposed. CSSHLA learns a common feature representation for source and target projects with a shared hidden layer autoencoder, making the different data distributions more similar. To address the class imbalance problem, CSSHLA introduces a cost-sensitive factor that assigns different importance weights to different instances. Experiments on 10 projects of the PROMISE dataset show that CSSHLA improves the performance of cross-project defect prediction compared with baselines.
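
The two ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the exact objective. The sketch assumes one encoder shared by both projects with a separate decoder per project, a logistic unit on the hidden feature, and a cost for the defective class equal to the clean/defective ratio. All function names, the `alpha` trade-off parameter, and the toy numbers are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine_sigmoid(x, weights, bias):
    """One dense layer with sigmoid activation; weights is a list of rows."""
    return [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def reconstruction_error(X, enc, dec):
    """Mean squared reconstruction error of the rows of X through enc, then dec."""
    total = 0.0
    for x in X:
        x_hat = affine_sigmoid(affine_sigmoid(x, *enc), *dec)
        total += sum((a - r) ** 2 for a, r in zip(x, x_hat))
    return total / len(X)

def class_costs(labels):
    """Assumed cost scheme: defective (1) is weighted by the clean/defective ratio."""
    n_defective = sum(labels)
    n_clean = len(labels) - n_defective
    return {0: 1.0, 1: n_clean / max(n_defective, 1)}

def cost_sensitive_log_loss(probs, labels, costs):
    """Cross-entropy in which each instance is scaled by the cost of its class."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total -= costs[y] * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)

def joint_objective(Xs, ys, Xt, enc, dec_s, dec_t, clf, alpha=1.0):
    """Reconstruction on both projects plus weighted classification on the source."""
    recon = reconstruction_error(Xs, enc, dec_s) + reconstruction_error(Xt, enc, dec_t)
    probs = [affine_sigmoid(affine_sigmoid(x, *enc), *clf)[0] for x in Xs]
    return recon + alpha * cost_sensitive_log_loss(probs, ys, class_costs(ys))

# Toy data: 2 metrics per instance, 1 hidden unit, values scaled to [0, 1].
Xs, ys = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.2], [0.9, 0.8]], [0, 0, 0, 1]
Xt = [[0.2, 0.3], [0.8, 0.9]]            # target project is unlabeled
enc = ([[0.5, -0.4]], [0.0])             # shared encoder: 2 inputs -> 1 hidden
dec_s = ([[0.3], [0.2]], [0.0, 0.0])     # source decoder: 1 hidden -> 2 outputs
dec_t = ([[0.1], [-0.2]], [0.0, 0.0])    # target decoder: 1 hidden -> 2 outputs
clf = ([[1.0]], [0.0])                   # logistic unit on the hidden feature
loss = joint_objective(Xs, ys, Xt, enc, dec_s, dec_t, clf)
print(loss)
```

Because the encoder parameters appear in both reconstruction terms, minimizing this objective pushes the hidden layer toward features that describe both projects, while the per-class costs keep the rare defective instances from being outweighed by the clean majority. A real model would minimize the objective with gradient descent; the sketch only evaluates it.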

The first author is a student.

Acknowledgements

The work described in this paper was supported by the National Natural Science Foundation of China (No. 61702280), the Natural Science Foundation of Jiangsu Province (No. BK20170900), the National Postdoctoral Program for Innovative Talents (No. BX20180146), the Scientific Research Starting Foundation for Introduced Talents in NJUPT (NUPTSF, No. NY217009), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (No. KYCX17_0794).

Author information

Authors and Affiliations

College of Automation, Nanjing University of Posts and Telecommunications, Nanjing, China
Juanjuan Li, Fei Wu, Ying Sun & Yongguang Yang
School of Computer, Wuhan University, Wuhan, China
Xiao-Yuan Jing

Corresponding author

Correspondence to Xiao-Yuan Jing.

Editor information

Editors and Affiliations

School of EECS, Peking University, Beijing, China
Zhouchen Lin
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Liang Wang
Nanjing University of Science and Technology, Nanjing, China
Jian Yang
Xidian University, Xi’an, China
Guangming Shi
Institute of Automation, Chinese Academy of Sciences, Beijing, China
Tieniu Tan
Institute of Artificial Intelligence, Xi’an Jiaotong University, Xi’an, China
Nanning Zheng
Chinese Academy of Sciences, Beijing, China
Xilin Chen
Northwestern Polytechnical University, Xi’an, China
Yanning Zhang

Cite this paper

Li, J., Jing, XY., Wu, F., Sun, Y., Yang, Y. (2019). A Cost-Sensitive Shared Hidden Layer Autoencoder for Cross-Project Defect Prediction. In: Lin, Z., et al. Pattern Recognition and Computer Vision. PRCV 2019. Lecture Notes in Computer Science, vol 11859. Springer, Cham. https://doi.org/10.1007/978-3-030-31726-3_42

DOI: https://doi.org/10.1007/978-3-030-31726-3_42
Published: 31 October 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-31725-6
Online ISBN: 978-3-030-31726-3
eBook Packages: Computer Science, Computer Science (R0)
