Combining Clustering and Classification for Software Quality Evaluation

  • Conference paper
Artificial Intelligence: Methods and Applications (SETN 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8445)

Abstract

Source code and metric mining have been used successfully to assist with software quality evaluation. This paper presents a data mining approach that clusters Java classes and then classifies the extracted clusters in order to assess internal software quality. We use Java classes as entities and static metrics as attributes for data mining. We identify outliers and apply K-means clustering to establish clusters of classes. Outliers indicate potentially fault-prone classes, whilst clusters are examined so that we can establish common characteristics. Subsequently, we apply C4.5 to build classification trees that identify the metrics which determine cluster membership. We evaluate the proposed approach with two well-known open source software systems, jEdit and Apache Geronimo. Results have consolidated key findings from previous work and indicated that combining clustering with classification produces better results than standalone clustering.
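
The pipeline described in the abstract (flag outliers, cluster the remaining classes with K-means, then find which metric determines cluster membership) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the metric names (`loc`, `wmc`, `cbo`), the toy values, the z-score outlier rule, and the farthest-point seeding are all assumptions, and the last step computes only a single split using the gain-ratio criterion that C4.5 relies on rather than building a full C4.5 tree.

```python
import math
from collections import Counter
from statistics import mean, pstdev

# Hypothetical toy data: one row per Java class, one column per static metric.
# Metric names and values are invented for illustration.
METRICS = ("loc", "wmc", "cbo")
ROWS = [
    (50, 5, 2), (60, 6, 3), (55, 5, 2), (65, 7, 3),              # small classes
    (300, 30, 12), (320, 32, 14), (310, 31, 13), (290, 29, 12),  # large classes
    (5000, 200, 60),                                             # extreme class
]

def outlier_indices(rows, threshold=2.0):
    """Flag rows whose z-score exceeds the threshold on any metric (assumed rule)."""
    flagged = set()
    for col in zip(*rows):
        mu, sigma = mean(col), pstdev(col)
        if sigma == 0:
            continue
        flagged.update(i for i, v in enumerate(col) if abs(v - mu) / sigma > threshold)
    return sorted(flagged)

def dist2(a, b):
    """Squared Euclidean distance between two metric vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iterations=20):
    """Lloyd's K-means with deterministic farthest-point seeding (an assumption)."""
    centroids = [rows[0]]
    while len(centroids) < k:
        centroids.append(max(rows, key=lambda r: min(dist2(r, c) for c in centroids)))
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid for every row.
        labels = [min(range(k), key=lambda j: dist2(r, centroids[j])) for r in rows]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels):
    """Best single (metric, threshold) split by gain ratio, the criterion used by C4.5."""
    base, n = entropy(labels), len(rows)
    best = (0.0, None, None)  # (gain ratio, metric index, threshold)
    for m in range(len(rows[0])):
        values = sorted(set(r[m] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent distinct values
            left = [lab for r, lab in zip(rows, labels) if r[m] <= t]
            right = [lab for r, lab in zip(rows, labels) if r[m] > t]
            gain = base - len(left) / n * entropy(left) - len(right) / n * entropy(right)
            split_info = entropy(["L"] * len(left) + ["R"] * len(right))
            if gain / split_info > best[0]:
                best = (gain / split_info, m, t)
    return best

outliers = outlier_indices(ROWS)                      # candidate fault-prone classes
kept = [r for i, r in enumerate(ROWS) if i not in outliers]
labels = kmeans(kept, k=2)                            # cluster id for each kept class
ratio, metric, threshold = best_split(kept, labels)   # metric that separates clusters

print("outlier rows:", outliers)
print("cluster labels:", labels)
print("deciding metric:", METRICS[metric], "threshold:", threshold, "gain ratio:", ratio)
```

In this toy setting no metric scaling is applied, so the metric with the largest range dominates the distance computation; a real study would likely normalise metrics first and would use a complete decision-tree learner to obtain multi-level rules.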

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Papas, D., Tjortjis, C. (2014). Combining Clustering and Classification for Software Quality Evaluation. In: Likas, A., Blekas, K., Kalles, D. (eds) Artificial Intelligence: Methods and Applications. SETN 2014. Lecture Notes in Computer Science, vol 8445. Springer, Cham. https://doi.org/10.1007/978-3-319-07064-3_22

  • DOI: https://doi.org/10.1007/978-3-319-07064-3_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07063-6

  • Online ISBN: 978-3-319-07064-3

  • eBook Packages: Computer Science, Computer Science (R0)
