Using the Support Vector Machine as a Classification Method for Software Defect Prediction with Static Code Metrics

  • Conference paper
Engineering Applications of Neural Networks (EANN 2009)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 43)

Abstract

The automated detection of defective modules within software systems could lead to reduced development costs and more reliable software. In this work, the static code metrics for a collection of modules contained within eleven NASA data sets are used with a Support Vector Machine classifier. A rigorous sequence of pre-processing steps was applied to the data prior to classification, including the balancing of both classes (defective or otherwise) and the removal of a large number of repeating instances. The Support Vector Machine in this experiment yields an average accuracy of 70% on previously unseen data.
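
Two of the pre-processing steps named in the abstract, removing repeating instances and balancing the two classes, can be illustrated with a minimal sketch. The abstract does not say how balancing was performed, so random undersampling of the larger class is an assumption here; the function names, the toy metric values, and the fixed seed are likewise hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


def remove_repeats(rows):
    """Drop repeated instances, keeping the first occurrence of each
    (features, label) pair and preserving the original order."""
    seen = set()
    unique = []
    for features, label in rows:
        key = (tuple(features), label)
        if key not in seen:
            seen.add(key)
            unique.append((features, label))
    return unique


def balance_by_undersampling(rows, seed=0):
    """Randomly undersample every class down to the size of the smallest
    class. This is an assumed balancing method, not the paper's."""
    by_label = defaultdict(list)
    for features, label in rows:
        by_label[label].append((features, label))
    smallest = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], smallest))
    rng.shuffle(balanced)
    return balanced


# Toy modules: (static code metrics, defective?). Values are made up.
modules = [
    ([10, 2, 35.0], False),
    ([10, 2, 35.0], False),    # exact repeat of the first instance
    ([55, 9, 410.5], True),
    ([7, 1, 20.0], False),
    ([23, 4, 120.0], False),
    ([80, 14, 900.0], True),
    ([80, 14, 900.0], True),   # exact repeat of the previous instance
]

unique_modules = remove_repeats(modules)
balanced_modules = balance_by_undersampling(unique_modules)
```

The balanced list could then be split into training and test portions and passed to any Support Vector Machine implementation; that step is omitted because the abstract gives no kernel or parameter settings.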

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gray, D., Bowes, D., Davey, N., Sun, Y., Christianson, B. (2009). Using the Support Vector Machine as a Classification Method for Software Defect Prediction with Static Code Metrics. In: Palmer-Brown, D., Draganova, C., Pimenidis, E., Mouratidis, H. (eds) Engineering Applications of Neural Networks. EANN 2009. Communications in Computer and Information Science, vol 43. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03969-0_21

  • DOI: https://doi.org/10.1007/978-3-642-03969-0_21

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-03968-3

  • Online ISBN: 978-3-642-03969-0

  • eBook Packages: Computer Science, Computer Science (R0)
