Comparing Pre-defined Software Engineering Metrics with Free-Text for the Prediction of Code ‘Ripples’

  • Conference paper
Advances in Intelligent Data Analysis XIII (IDA 2014)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8819)

Abstract

An ongoing issue in industrial software engineering is the amount of effort required to make ‘maintenance’ changes to code. An equally relevant line of research is determining whether a maintenance change causes a ‘ripple’ effect, characterized by extra, unforeseen and wide-ranging changes in other parts of the system in response to a single, initial change. In this paper, we explore this concept using intelligent data analysis (IDA) techniques, drawing on a combination of change data and free-text developer comments from three ‘live’ industrial web-based systems. We compare the predictive power of change metrics with that of textual descriptions of the same requested changes. Interesting observations about the data and its properties emerged. In terms of predicting a ripple effect, we found that quantitative change data and qualitative text data provided approximately the same predictive power. This result was very surprising: while we might expect the relative vagueness of textual descriptions to provide less explanatory power than the categorical metric data, the text actually provided approximately the same level. Overall, the results have resonance both for IT practitioners seeking to understand dynamic system features and for empirical studies where only text data is available.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Counsell, S., Tucker, A., Swift, S., Fitzgerald, G., Peters, J. (2014). Comparing Pre-defined Software Engineering Metrics with Free-Text for the Prediction of Code ‘Ripples’. In: Blockeel, H., van Leeuwen, M., Vinciotti, V. (eds) Advances in Intelligent Data Analysis XIII. IDA 2014. Lecture Notes in Computer Science, vol 8819. Springer, Cham. https://doi.org/10.1007/978-3-319-12571-8_6

  • DOI: https://doi.org/10.1007/978-3-319-12571-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-12570-1

  • Online ISBN: 978-3-319-12571-8

  • eBook Packages: Computer Science, Computer Science (R0)
