Comparing Pre-defined Software Engineering Metrics with Free-Text for the Prediction of Code ‘Ripples’
An ongoing issue in industrial software engineering is the amount of effort required to make ‘maintenance’ changes to code. An equally relevant line of research is determining whether a maintenance change causes a ‘ripple’ effect, characterized by extra, unforeseen and wide-ranging changes in other parts of the system in response to a single, initial change. In this paper, we use a combination of change data and free-text developer comments from three ‘live’ industrial web-based systems as a basis for exploring this concept using intelligent data analysis (IDA) techniques. We compare the predictive power of change metrics with that of textual descriptions of the same requested changes. Interesting observations about the data and its properties emerged. In terms of predicting a ripple effect, we found that quantitative change data and qualitative text data provided approximately the same predictive power. This result was surprising: while we might expect the relative vagueness of textual descriptions to provide less explanatory power than the categorical metric data, they actually provided approximately the same level. Overall, the results have resonance both for IT practitioners seeking to understand dynamic system features and for empirical studies where only text data is available.
Keywords: Software maintenance, web-based systems, prediction, metrics