Applying Weighted Particle Swarm Optimization to Imbalanced Data in Software Defect Prediction

  • Conference paper
New Technologies, Development and Application (NT 2018)

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 42)

Abstract

Imbalanced data are datasets with a skewed class distribution, in which one or more classes are underrepresented; this imbalance degrades the performance of learning algorithms. Such data are common in real-world applications, such as behavior analysis, cancer malignancy grading, industrial systems monitoring, and software defect prediction. In this paper, we present the W-PSO method, which combines weighting of the instances in a dataset with the Particle Swarm Optimization algorithm. The method was combined with each of the classification methods C4.5 and Naive Bayes and tested experimentally on ten freely accessible software defect prediction datasets. Based on the results, W-PSO produces better classification models than C4.5 and Naive Bayes alone in the majority of cases.
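
The general idea of searching for instance weights with a particle swarm can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not give the weight encoding, the fitness function, or the PSO parameters, so everything below is an assumption made for illustration. A weighted nearest-centroid classifier stands in for C4.5 and Naive Bayes, balanced accuracy on the training data is used as the fitness, and the names `fit_centroids`, `fitness`, and `pso_instance_weights` are hypothetical.

```python
import random

def fit_centroids(X, y, w):
    """Stand-in weighted learner (assumption): one weighted mean per class."""
    centroids = {}
    for label in sorted(set(y)):
        idx = [i for i in range(len(y)) if y[i] == label]
        total = sum(w[i] for i in idx)
        centroids[label] = [sum(w[i] * X[i][d] for i in idx) / total
                            for d in range(len(X[0]))]
    return centroids

def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared Euclidean)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centroids[c])))

def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall, so the minority class counts equally."""
    recalls = []
    for label in set(y_true):
        idx = [i for i in range(len(y_true)) if y_true[i] == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)

def fitness(X, y, w):
    """Assumed fitness: balanced accuracy of the weighted model on its training data."""
    model = fit_centroids(X, y, w)
    return balanced_accuracy(y, [predict(model, x) for x in X])

def pso_instance_weights(X, y, n_particles=12, n_iters=40, inertia=0.7,
                         c1=1.5, c2=1.5, lo=0.05, hi=1.0, seed=0):
    """Global-best PSO in which each particle is one weight per training instance."""
    rng = random.Random(seed)
    n = len(X)
    pos = [[rng.uniform(lo, hi) for _ in range(n)] for _ in range(n_particles)]
    pos[0] = [hi] * n  # seed one particle with uniform weights (unweighted baseline)
    vel = [[0.0] * n for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(X, y, p) for p in pos]
    g = max(range(n_particles), key=lambda k: pbest_fit[k])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iters):
        for k in range(n_particles):
            for j in range(n):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal best + pull to global best
                vel[k][j] = (inertia * vel[k][j]
                             + c1 * r1 * (pbest[k][j] - pos[k][j])
                             + c2 * r2 * (gbest[j] - pos[k][j]))
                # Keep weights inside [lo, hi] so every class keeps positive mass
                pos[k][j] = min(hi, max(lo, pos[k][j] + vel[k][j]))
            f = fitness(X, y, pos[k])
            if f > pbest_fit[k]:
                pbest[k], pbest_fit[k] = pos[k][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[k][:], f
    return gbest, gbest_fit

if __name__ == "__main__":
    # Toy imbalanced data (synthetic, not one of the paper's datasets): 40 vs. 6 instances
    data_rng = random.Random(1)
    X = ([[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(40)]
         + [[data_rng.gauss(1.5, 1), data_rng.gauss(1.5, 1)] for _ in range(6)])
    y = [0] * 40 + [1] * 6
    weights, best = pso_instance_weights(X, y)
    print("unweighted:", fitness(X, y, [1.0] * len(X)), "PSO-weighted:", best)
```

Because one particle starts at uniform weights, the returned fitness can never fall below the unweighted baseline on the training data. A realistic evaluation would score the fitness on held-out folds rather than on the training set, and would use a learner that accepts instance weights directly.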

Author information

Corresponding author

Correspondence to Lucija Brezočnik.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Brezočnik, L., Podgorelec, V. (2019). Applying Weighted Particle Swarm Optimization to Imbalanced Data in Software Defect Prediction. In: Karabegović, I. (eds) New Technologies, Development and Application. NT 2018. Lecture Notes in Networks and Systems, vol 42. Springer, Cham. https://doi.org/10.1007/978-3-319-90893-9_35
