Using the Support Vector Machine as a Classification Method for Software Defect Prediction with Static Code Metrics

Gray, David; Bowes, David; Davey, Neil; Sun, Yi; Christianson, Bruce

doi:10.1007/978-3-642-03969-0_21

Part of the book series: Communications in Computer and Information Science (CCIS, volume 43)

Included in the following conference series:

International Conference on Engineering Applications of Neural Networks

Abstract

The automated detection of defective modules within software systems could lead to reduced development costs and more reliable software. In this work, the static code metrics for a collection of modules contained within eleven NASA data sets are used with a Support Vector Machine classifier. A rigorous sequence of pre-processing steps was applied to the data prior to classification, including the balancing of both classes (defective or otherwise) and the removal of a large number of repeating instances. The Support Vector Machine in this experiment yields an average accuracy of 70% on previously unseen data.
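
The two pre-processing steps named in the abstract, removing repeated instances and balancing the classes, can be illustrated with a minimal sketch. The abstract does not say how either step was carried out, so everything below is an assumption: the function names and toy data are hypothetical, repeats are taken to mean identical metric vectors with the same label, and balancing is done by random undersampling of the larger class, which may differ from the method used in the paper.

```python
import random
from collections import defaultdict


def remove_repeated_instances(rows):
    """Keep the first occurrence of each (metrics, label) pair.

    Assumption: a "repeating instance" is an exact duplicate of both
    the metric vector and the class label.
    """
    seen = set()
    unique = []
    for metrics, label in rows:
        key = (tuple(metrics), label)
        if key not in seen:
            seen.add(key)
            unique.append((metrics, label))
    return unique


def balance_by_undersampling(rows, seed=0):
    """Randomly undersample every class to the size of the smallest one.

    Assumption: undersampling stands in for whatever balancing
    method the authors applied.
    """
    by_label = defaultdict(list)
    for metrics, label in rows:
        by_label[label].append((metrics, label))
    target = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: (static code metrics, is_defective) per module.
modules = [
    ((10, 2, 1), False),
    ((10, 2, 1), False),   # exact repeat of the first row
    ((55, 9, 4), True),
    ((12, 3, 1), False),
    ((80, 14, 6), True),
    ((7, 1, 1), False),
    ((55, 9, 4), True),    # exact repeat of the third row
    ((30, 5, 2), False),
]

unique_modules = remove_repeated_instances(modules)
balanced_modules = balance_by_undersampling(unique_modules)
```

In this sketch, duplicates are removed before balancing so that repeated rows cannot be drawn more than once into the balanced set. The resulting rows could then be split into training and held-out portions and passed to any SVM implementation.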

Author information

Authors and Affiliations

Science and Technology Research Institute, University of Hertfordshire, UK
David Gray, David Bowes, Neil Davey, Yi Sun & Bruce Christianson

Editor information

Editors and Affiliations

Faculty of Computing, London Metropolitan University, 166-220 Holloway Road, N7 8DB, London, UK
Dominic Palmer-Brown
School of Computing, IT and Engineering, University of East London, Docklands Campus, 4-6 University Way, E16 2RD, London, UK
Chrisina Draganova & Haris Mouratidis
School of Computing, IT and Engineering, University of East London, London, UK
Elias Pimenidis

About this paper

Cite this paper

Gray, D., Bowes, D., Davey, N., Sun, Y., Christianson, B. (2009). Using the Support Vector Machine as a Classification Method for Software Defect Prediction with Static Code Metrics. In: Palmer-Brown, D., Draganova, C., Pimenidis, E., Mouratidis, H. (eds) Engineering Applications of Neural Networks. EANN 2009. Communications in Computer and Information Science, vol 43. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-03969-0_21

DOI: https://doi.org/10.1007/978-3-642-03969-0_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-03968-3
Online ISBN: 978-3-642-03969-0
eBook Packages: Computer Science, Computer Science (R0)
