Abstract
Software fault prediction is one of the critical activities to be undertaken before system testing. Prediction models are used to identify fault-prone classes and contribute considerably to reducing testing time, project risks, and resource and infrastructure costs. In developing a prediction model, accounting for the interaction between metrics improves predictive capability, because metrics are often correlated and do not have a strictly additive effect in a regression model.
Although interaction between metrics improves the model’s prediction capability, it also gives rise to a large number of predictors. This degrades the performance of Multiple Linear Regression (MLR), since a single predictive formula must cover the entire data space. The M5’ model tree has an edge over MLR in managing such interactions because it partitions the data space into smaller regions.
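To make the growth in predictors concrete, the following is a minimal sketch of how pairwise interaction terms can be generated from a set of metrics. The function name `add_pairwise_interactions`, the choice of four metrics, and the metric values are illustrative assumptions, not taken from the chapter; the chapter does not state which interaction terms were used.

```python
from itertools import combinations


def add_pairwise_interactions(row, names):
    """Return (feature_names, values): main effects plus all pairwise products.

    Hypothetical helper for illustration; not the authors' procedure.
    """
    out_names = list(names)
    out_values = list(row)
    # Every unordered pair of metrics contributes one product term.
    for (i, a), (j, b) in combinations(enumerate(names), 2):
        out_names.append(f"{a}*{b}")
        out_values.append(row[i] * row[j])
    return out_names, out_values


# Assumed subset of metrics and made-up values for a single class.
metric_names = ["WMC", "CBO", "RFC", "LCOM"]
metric_values = [12.0, 5.0, 30.0, 8.0]

names, values = add_pairwise_interactions(metric_values, metric_names)
print(len(names))  # 4 main effects + 6 pairwise products = 10
```

With n metrics, this produces n + n(n-1)/2 predictors, so the predictor count grows quadratically as metrics are added.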
The results empirically establish that the M5’ model tree, when applied to these interactions, yields a more accurate and robust model overall than MLR.
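The contrast between one global formula and region-wise formulas can be sketched on synthetic data. The example below is a deliberately simplified illustration, not the M5’ algorithm: it makes a single split on one predictor using standard deviation reduction (the splitting criterion associated with M5), fits a separate least-squares line in each region, and omits pruning and smoothing. All function names and the data are assumptions made for this sketch.

```python
from statistics import pstdev


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def sse(xs, ys, model):
    """Sum of squared errors of a fitted line on the given points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def best_split(xs, ys, min_leaf=2):
    """Threshold on x that maximises standard deviation reduction (SDR)."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    all_y = [y for _, y in pairs]
    total_sd = pstdev(all_y)
    best = None
    for k in range(min_leaf, n - min_leaf + 1):
        if pairs[k - 1][0] == pairs[k][0]:
            continue  # cannot split between identical x values
        left, right = all_y[:k], all_y[k:]
        sdr = total_sd - (k / n) * pstdev(left) - ((n - k) / n) * pstdev(right)
        if best is None or sdr > best[0]:
            best = (sdr, (pairs[k - 1][0] + pairs[k][0]) / 2)
    return best[1]


# Synthetic data with two regimes (made up for illustration).
xs = [float(x) for x in range(10)]
ys = [1 + 0.5 * x if x < 5 else 20 + 3 * x for x in xs]

# One global formula, as in plain linear regression.
global_sse = sse(xs, ys, fit_line(xs, ys))

# One split, then a separate linear model per region.
threshold = best_split(xs, ys)
left = [(x, y) for x, y in zip(xs, ys) if x <= threshold]
right = [(x, y) for x, y in zip(xs, ys) if x > threshold]
tree_sse = sum(
    sse([x for x, _ in part], [y for _, y in part],
        fit_line([x for x, _ in part], [y for _, y in part]))
    for part in (left, right)
)
print(threshold, global_sse, tree_sse)
```

On this data the single global line leaves a large residual error, while the two region-wise lines fit almost exactly. Real fault data is noisier and multi-dimensional, so the gap there would depend on how strongly the metric relationships differ across regions.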
References
N. E. Fenton and M. Neil, “Software metrics: roadmap,” in Proceedings of the Conference on the Future of Software Engineering, 2000, pp. 357–370.
C. Catal and B. Diri, “Software fault prediction with object-oriented metrics based artificial immune recognition system,” Product-Focused Software Process Improvement, pp. 300–314, 2007.
S. R. Chidamber and C. F. Kemerer, “A metrics suite for object oriented design,” Software Engineering, IEEE Transactions on, vol. 20, no. 6, pp. 476–493, 1994.
M. D’Ambros, M. Lanza, and R. Robbes, “An extensive comparison of bug prediction approaches,” in Mining Software Repositories (MSR), 2010 7th IEEE Working Conference on, 2010, pp. 31–41.
R. Goyal, P. Chandra, and Y. Singh, “Impact of interaction in the combined metrics approach for fault prediction,” Software Quality Professional (ASQ), vol. 15, no. 3, pp. 15–23, 2013.
R. Goyal, P. Chandra, and Y. Singh, “Identifying influential metrics in the combined metrics approach of fault prediction,” SpringerPlus, vol. 2, no. 1, p. 627, 2013.
Y. Wang and I. H. Witten, “Inducing model trees for continuous classes,” in Poster Papers of the 9th European Conference on Machine Learning (ECML 97), 1997, pp. 128–137.
J. R. Quinlan, “Learning with continuous classes,” in Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, vol. 92, 1992, pp. 343–348.
S. S. Gokhale and M. R. Lyu, “Regression tree modeling for the prediction of software quality,” in Proceedings of the Third ISSAT International Conference on Reliability and Quality in Design, 1997, pp. 31–36.
T. M. Khoshgoftaar, E. B. Allen, and J. Deng, “Using regression trees to classify fault-prone software modules,” Reliability, IEEE Transactions on, vol. 51, no. 4, pp. 455–462, 2002.
S. Bibi, G. Tsoumakas, I. Stamelos, and I. Vlahavas, “Regression via Classification applied on software defect estimation,” Expert Systems with Applications, vol. 34, no. 3, pp. 2091–2101, 2008.
L. Guo, Y. Ma, B. Cukic, and H. Singh, “Robust prediction of fault-proneness by random forests,” in Software Reliability Engineering, 2004. ISSRE 2004. 15th International Symposium on, 2004, pp. 417–428.
I. Chowdhury and M. Zulkernine, “Using complexity, coupling, and cohesion metrics as early indicators of vulnerabilities,” Journal of Systems Architecture, vol. 57, no. 3, pp. 294–313, 2011.
D. Rodriguez, J. Cuadrado, M. Sicilia, and R. Ruiz, “Segmentation of software engineering datasets using the M5 algorithm,” in Computational Science-ICCS 2006, Springer, 2006, pp. 789–796.
A. Etemad-Shahidi and J. Mahjoobi, “Comparison between M5′ model tree and neural networks for prediction of significant wave height in Lake Superior,” Ocean Engineering, vol. 36, no. 15, pp. 1175–1181, 2009.
B. Bhattacharya and D. P. Solomatine, “Neural networks and M5 model trees in modelling water level-discharge relationship,” Neurocomputing, vol. 63, pp. 381–396, 2005.
D. P. Solomatine and K. N. Dulal, “Model trees as an alternative to neural networks in rainfall-runoff modelling,” Hydrological Sciences Journal, vol. 48, no. 3, pp. 399–411, 2003.
T. A. Runkler, Data Analytics: Models and Algorithms for Intelligent Data Analysis. Vieweg + Teubner Verlag, 2012.
L. Breiman, Classification and regression trees. CRC press, 1993.
E. Frank, Y. Wang, S. Inglis, G. Holmes, and I. H. Witten, “Using model trees for classification,” Machine Learning, vol. 32, no. 1, pp. 63–76, 1998.
J. R. Quinlan, “Combining instance-based and model-based learning,” in Proceedings of the Tenth International Conference on Machine Learning, 1993, pp. 236–243.
D. P. Solomatine and M. Siek, “Flexible and optimal M5 model trees with applications to flow predictions,” in Proc. 6th Int. Conf. on Hydroinformatics. World Scientific, Singapore, 2004.
A. Marcus, D. Poshyvanyk, and R. Ferenc, “Using the conceptual cohesion of classes for fault prediction in object-oriented systems,” Software Engineering, IEEE Transactions on, vol. 34, no. 2, pp. 287–300, 2008.
K. P. Burnham and D. R. Anderson, “Multimodel inference: understanding AIC and BIC in model selection,” Sociological Methods & Research, vol. 33, no. 2, pp. 261–304, 2004.
G. Jekabsons, M5PrimeLab: M5’ regression tree and model tree toolbox for Matlab/Octave, 2010. Available at http://www.cs.rtu.lv/jekabsons/
Acknowledgment
The corresponding author would like to thank Mr. Tanveer Oberoi and Ms. Preeti Goyal for copyediting the manuscript.
Appendix A
CK Metric (Chidamber and Kemerer 1994) | Interpretation |
---|---|
Weighted Methods per Class (WMC) | Identifies the complexity of a class as the weighted sum of the complexities of its methods |
Coupling Between Object classes (CBO) | Identifies the coupling between classes by considering the dependency of one class on other classes in the design |
Depth of the Inheritance Tree (DIT) | Identifies the complexity of the inheritance hierarchy by calculating the maximum path length from a given class to the root class |
Lack of Cohesion metric (LCOM) | Identifies cohesion within a class by counting the number of method pairs with zero similarity |
Number of Children (NOC) | Identifies the complexity of the inheritance hierarchy by counting the number of immediate child classes that inherit from a given class |
Response for the classes (RFC) | Identifies the coupling between classes by calculating the sum of the number of local methods and the methods that can be called remotely |

OO (Object Oriented) Metric | Interpretation |
---|---|
NOM | Number of methods |
NOPM | Number of public methods |
NOPRM | Number of private methods |
NOMI | Number of methods inherited |
Fan-in | Number of other classes that reference the class |
Fan-out | Number of other classes referenced by the class |
NOAI | Number of attributes inherited |
NOA | Number of attributes |
NLOC | Number of lines of code |
NOPRA | Number of private attributes |
NOPA | Number of public attributes |
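
As an illustration of how two of the inheritance metrics in the tables above can be read off a class hierarchy, the following is a minimal sketch that computes DIT and NOC from a child-to-parent map. It assumes single inheritance, so the path to the root is unique; the class names and the helper functions `depth_of_inheritance` and `number_of_children` are hypothetical and not part of the chapter or of any metrics tool.

```python
def depth_of_inheritance(cls, parent):
    """DIT: number of inheritance edges from cls up to the root class."""
    depth = 0
    while parent.get(cls) is not None:
        cls = parent[cls]
        depth += 1
    return depth


def number_of_children(cls, parent):
    """NOC: number of classes whose immediate parent is cls."""
    return sum(1 for p in parent.values() if p == cls)


# Toy hierarchy (assumed): each class maps to its immediate parent.
parent = {
    "Shape": None,
    "Polygon": "Shape",
    "Circle": "Shape",
    "Square": "Polygon",
}

print(depth_of_inheritance("Square", parent))  # Square -> Polygon -> Shape: 2
print(number_of_children("Shape", parent))     # Polygon and Circle: 2
```

With multiple inheritance, DIT would instead take the longest of the available paths to a root, which this sketch does not handle.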
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Goyal, R., Chandra, P., Singh, Y. (2015). Comparison of M5’ Model Tree with MLR in the Development of Fault Prediction Models Involving Interaction Between Metrics. In: Elleithy, K., Sobh, T. (eds) New Trends in Networking, Computing, E-learning, Systems Sciences, and Engineering. Lecture Notes in Electrical Engineering, vol 312. Springer, Cham. https://doi.org/10.1007/978-3-319-06764-3_19
DOI: https://doi.org/10.1007/978-3-319-06764-3_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06763-6
Online ISBN: 978-3-319-06764-3
eBook Packages: Engineering, Engineering (R0)