Soft Computing

Volume 23, Issue 24, pp 13679–13690

Unsupervised software defect prediction using signed Laplacian-based spectral classifier

  • Aris Marjuni
  • Teguh Bharata Adji
  • Ridi Ferdiana
Methodologies and Application


The lack of training data is the most common issue in software defect prediction, especially for newly started projects. Borrowing training data from other software projects is often not a good solution, because software metrics are heterogeneous across projects. Unsupervised approaches, which build the prediction model without a training dataset, have been proposed to address this issue. The spectral classifier is one such approach and has been applied successfully when training data are lacking. However, the method runs into trouble when the dataset does not satisfy the nonnegativity assumption of the Laplacian, which happens when the adjacency matrix contains negative values. The spectral classifier works with the Laplacian matrix, and the Laplacian matrix is constructed from the adjacency matrix. In this paper, a signed Laplacian-based spectral classifier is proposed to solve the problem of negative values in the adjacency matrix by converting the negative values into absolute values. The experimental results show that the proposed method can improve the performance of unsupervised classifiers compared with the unsigned Laplacian-based spectral classifier. Hence, the proposed method is strongly suggested for unsupervised software defect prediction in software projects that have no historical dataset.
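The role of absolute values can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the exact construction, so the code below assumes a commonly used definition of the signed Laplacian, in which the degree matrix sums the absolute edge weights and the adjacency matrix is otherwise left unchanged. The function names, the toy matrix W, and the rule of splitting by the sign of the eigenvector for the smallest eigenvalue are all assumptions made for illustration.

```python
import numpy as np


def signed_laplacian(W):
    """Return D_bar - W, where D_bar is diagonal with row sums of |W|.

    Assumption: this is one common definition of the signed Laplacian,
    not necessarily the exact construction used in the paper.
    """
    W = np.asarray(W, dtype=float)
    degree = np.abs(W).sum(axis=1)  # absolute values keep every degree nonnegative
    return np.diag(degree) - W


def spectral_bipartition(W):
    """Split nodes into two clusters by the sign of the eigenvector that
    belongs to the smallest eigenvalue of the signed Laplacian.

    Assumption: W is symmetric, so the Laplacian is symmetric as well.
    """
    L = signed_laplacian(W)
    eigenvalues, eigenvectors = np.linalg.eigh(L)  # eigenvalues in ascending order
    v = eigenvectors[:, 0]
    return (v > 0).astype(int)


# Hypothetical toy adjacency for four modules: positive similarity inside
# the groups {0, 1} and {2, 3}, negative similarity between the groups.
W = np.array([
    [0.0, 1.0, -1.0, -1.0],
    [1.0, 0.0, -1.0, -1.0],
    [-1.0, -1.0, 0.0, 1.0],
    [-1.0, -1.0, 1.0, 0.0],
])

labels = spectral_bipartition(W)
print(labels)
```

In this toy matrix every plain row sum is negative, so an unsigned degree matrix built from row sums would have negative entries. Using absolute values in the degree matrix keeps the Laplacian positive semidefinite, which is what lets the eigenvector be read as a cluster indicator. Which of the two clusters is labelled defective is a separate decision that this sketch does not make.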


Unsupervised software defect prediction; Spectral clustering; Absolute adjacency matrix; Signed Laplacian; Unsigned Laplacian


Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Human and animal participants

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Electrical and Information Engineering, Faculty of Engineering, Universitas Gadjah Mada, Yogyakarta, Indonesia
  2. Faculty of Computer Science, Dian Nuswantoro University, Semarang, Indonesia
