Comparing Pre-defined Software Engineering Metrics with Free-Text for the Prediction of Code ‘Ripples’

  • Steve Counsell
  • Allan Tucker
  • Stephen Swift
  • Guy Fitzgerald
  • Jason Peters
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8819)


An ongoing issue in industrial software engineering is the amount of effort required to make ‘maintenance’ changes to code. An equally relevant line of research is determining whether a maintenance change causes a ‘ripple’ effect, characterized by extra, unforeseen and wide-ranging changes in other parts of the system in response to a single, initial change. In this paper, we use a combination of change data and free-text developer comments from three ‘live’ industrial web-based systems as a basis for exploring this concept using Intelligent Data Analysis (IDA) techniques. We compare the predictive power of change metrics with that of textual descriptions of the same requested changes. Interesting observations about the data and its properties emerged. In terms of predicting a ripple effect, we found that quantitative change data and qualitative text data provided approximately the same predictive power. This result was very surprising: while we might expect the relative vagueness of textual descriptions to provide less explanatory power than the categorical metric data, it actually provided approximately the same level. Overall, the results have resonance both for IT practitioners seeking to understand dynamic system features and for empirical studies where only text data is available.


Software maintenance, web-based systems, prediction, metrics





Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Steve Counsell (1)
  • Allan Tucker (1)
  • Stephen Swift (1)
  • Guy Fitzgerald (2)
  • Jason Peters (1)

  1. Dept. of Computer Science, Brunel University, Uxbridge, UK
  2. School of Business and Economics, Loughborough University, UK
