Comparison of M5’ Model Tree with MLR in the Development of Fault Prediction Models Involving Interaction Between Metrics

Goyal, Rinkaj; Chandra, Pravin; Singh, Yogesh

doi:10.1007/978-3-319-06764-3_19

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 312)

Abstract

Software fault prediction is among the critical activities to undertake before system testing. Prediction models identify fault-prone classes and contribute considerably to reducing testing time, project risk, and resource and infrastructure costs. In developing a prediction model, accounting for the interaction of metrics improves predictive capability, because metrics are often correlated and do not have a strictly additive effect in a regression model.
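As an illustrative sketch (not the authors' implementation), the non-additive effect can be represented by extending a class's metric vector with pairwise product terms. The function name and the metric values below are hypothetical; with p metrics, adding all pairwise products contributes a further p(p-1)/2 predictors.

```python
from itertools import combinations


def add_pairwise_interactions(metrics):
    """Return a copy of `metrics` extended with one product term per metric pair."""
    expanded = dict(metrics)
    for a, b in combinations(sorted(metrics), 2):
        # The key "A*B" names the interaction term between metrics A and B.
        expanded[f"{a}*{b}"] = metrics[a] * metrics[b]
    return expanded


# Hypothetical metric values for a single class (not taken from the paper).
sample_class = {"WMC": 12, "CBO": 5, "RFC": 30}
row = add_pairwise_interactions(sample_class)
print(len(sample_class), "metrics ->", len(row), "predictors")
```

Even for three metrics the predictor count doubles, which hints at how quickly the design matrix can grow when many metrics interact.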

Although the interaction amongst metrics improves the model’s predictive capability, it also gives rise to a large number of predictors. This degrades the performance of Multiple Linear Regression (MLR), since a single predictive formula must cover the entire data space. The M5’ model tree has an edge over MLR in managing such interactions, because it partitions the data space into smaller regions.
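The contrast between a single global formula and a partitioned data space can be sketched with a one-split model tree on synthetic data. This is a simplification under stated assumptions: it uses one predictor, a single split chosen by standard deviation reduction (the splitting criterion described for Quinlan's M5), and ordinary least squares in each leaf. The pruning and smoothing steps of M5’ are omitted, and the data and function names are made up for illustration.

```python
from statistics import mean, pstdev


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - slope * mx, slope


def sse(xs, ys, model):
    """Sum of squared errors of a fitted line on the given points."""
    b0, b1 = model
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


def best_split(xs, ys, min_leaf=2):
    """Index i (xs sorted ascending) maximizing the standard deviation reduction."""
    n = len(ys)
    total_sd = pstdev(ys)

    def sdr(i):
        left, right = ys[:i], ys[i:]
        return total_sd - (i / n) * pstdev(left) - ((n - i) / n) * pstdev(right)

    return max(range(min_leaf, n - min_leaf + 1), key=sdr)


# Synthetic data with two linear regimes (purely illustrative).
xs = list(range(20))
ys = [x if x < 10 else 100 + 3 * x for x in xs]

# MLR-style baseline: one formula for the whole data space.
global_sse = sse(xs, ys, fit_line(xs, ys))

# One-split model tree: partition first, then fit a line in each region.
i = best_split(xs, ys)
threshold = (xs[i - 1] + xs[i]) / 2
tree_sse = (sse(xs[:i], ys[:i], fit_line(xs[:i], ys[:i]))
            + sse(xs[i:], ys[i:], fit_line(xs[i:], ys[i:])))

print(f"split at x < {threshold}: global SSE {global_sse:.1f}, tree SSE {tree_sse:.1f}")
```

On this toy data the two leaf models recover each regime, while the global line has to compromise between them; real metric data would be far noisier.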

The results empirically establish that the M5’ model tree, when applied to these interactions, yields more accurate and more robust models overall than MLR.

References

N. E. Fenton and M. Neil, “Software metrics: roadmap,” in Proceedings of the Conference on the Future of Software Engineering, 2000, pp. 357–370.
C. Catal and B. Diri, “Software fault prediction with object-oriented metrics based artificial immune recognition system,” Product-Focused Software Process Improvement, pp. 300–314, 2007.
S. R. Chidamber and C. F. Kemerer, “A metrics suite for object oriented design,” Software Engineering, IEEE Transactions on, vol. 20, no. 6, pp. 476–493, 1994.
M. D’Ambros, M. Lanza, and R. Robbes, “An extensive comparison of bug prediction approaches,” in Mining Software Repositories (MSR), 2010 7th IEEE Working Conference on, 2010, pp. 31–41.
R. Goyal, P. Chandra, and Y. Singh, “Impact of interaction in the combined metrics approach for fault prediction,” Software Quality Professional (ASQ), vol. 15, no. 3, pp. 15–23, 2013.
R. Goyal, P. Chandra, and Y. Singh, “Identifying influential metrics in the combined metrics approach of fault prediction,” SpringerPlus, vol. 2, no. 1, p. 627, 2013.
Y. Wang and I. H. Witten, “Inducing model trees for continuous classes,” in Poster Papers of the 9th European Conference on Machine Learning (ECML 97), 1997, pp. 128–137.
J. R. Quinlan, “Learning with continuous classes,” in Proceedings of the 5th Australian joint Conference on Artificial Intelligence, vol. 92, 1992, pp. 343–348.
S. S. Gokhale and M. R. Lyu, “Regression tree modeling for the prediction of software quality,” in Proceedings of the Third ISSAT International Conference on Reliability and Quality in Design, 1997, pp. 31–36.
T. M. Khoshgoftaar, E. B. Allen, and J. Deng, “Using regression trees to classify fault-prone software modules,” Reliability, IEEE Transactions on, vol. 51, no. 4, pp. 455–462, 2002.
S. Bibi, G. Tsoumakas, I. Stamelos, and I. Vlahavas, “Regression via Classification applied on software defect estimation,” Expert Systems with Applications, vol. 34, no. 3, pp. 2091–2101, 2008.
L. Guo, Y. Ma, B. Cukic, and H. Singh, “Robust prediction of fault-proneness by random forests,” in Software Reliability Engineering, 2004. ISSRE 2004. 15th International Symposium on, 2004, pp. 417–428.
I. Chowdhury and M. Zulkernine, “Using complexity, coupling, and cohesion metrics as early indicators of vulnerabilities,” Journal of Systems Architecture, vol. 57, no. 3, pp. 294–313, 2011.
D. Rodriguez, J. Cuadrado, M. Sicilia, and R. Ruiz, “Segmentation of software engineering datasets using the m5 algorithm,” in Computational Science-ICCS 2006, Springer, 2006, pp. 789–796.
A. Etemad-Shahidi and J. Mahjoobi, “Comparison between M5′ model tree and neural networks for prediction of significant wave height in Lake Superior,” Ocean Engineering, vol. 36, no. 15, pp. 1175–1181, 2009.
B. Bhattacharya and D. P. Solomatine, “Neural networks and M5 model trees in modelling water level-discharge relationship,” Neurocomputing, vol. 63, pp. 381–396, 2005.
D. P. Solomatine and K. N. Dulal, “Model trees as an alternative to neural networks in rainfall—Runoff modelling,” Hydrological Sciences Journal, vol. 48, no. 3, pp. 399–411, 2003.
T. A. Runkler, Data Analytics: Models and Algorithms for Intelligent Data Analysis. Vieweg + Teubner Verlag, 2012.
L. Breiman, Classification and regression trees. CRC press, 1993.
E. Frank, Y. Wang, S. Inglis, G. Holmes, and I. H. Witten, “Using model trees for classification,” Machine Learning, vol. 32, no. 1, pp. 63–76, 1998.
J. R. Quinlan, “Combining instance-based and model-based learning,” in Proceedings of the Tenth International Conference on Machine Learning, 1993, pp. 236–243.
D. P. Solomatine and M. Siek, “Flexible and optimal M5 model trees with applications to flow predictions,” in Proc. 6th Int. Conf. on Hydroinformatics. World Scientific, Singapore, 2004.
A. Marcus, D. Poshyvanyk, and R. Ferenc, “Using the conceptual cohesion of classes for fault prediction in object-oriented systems,” Software Engineering, IEEE Transactions on, vol. 34, no. 2, pp. 287–300, 2008.
K. P. Burnham and D. R. Anderson, “Multimodel inference: understanding AIC and BIC in model selection,” Sociological Methods & Research, vol. 33, no. 2, pp. 261–304, 2004.
Jekabsons G., M5PrimeLab: M5’ regression tree and model tree toolbox for Matlab/Octave, 2010, available at http://www.cs.rtu.lv/jekabsons/

Acknowledgment

The corresponding author would like to thank Mr. Tanveer Oberoi and Ms. Preeti Goyal for copyediting the manuscript.

Author information

Authors and Affiliations

USICT, Guru Gobind Singh Indraprastha University, Dwarka, Delhi-78, India
Rinkaj Goyal, Pravin Chandra & Yogesh Singh

Corresponding author

Correspondence to Rinkaj Goyal.

Editor information

Editors and Affiliations

Computer Science and Engineering, University of Bridgeport Associate Dean for Graduate Programs, Bridgeport, Connecticut, USA
Khaled Elleithy
Engineering and Computer Science, University of Bridgeport Dean of the School of Engineering, Bridgeport, Connecticut, USA
Tarek Sobh

Appendix A

CK Metric (Chidamber and Kemerer 1994)	Interpretation
Weighted Methods per Class (WMC)	Identifies the complexity of a class as the weighted sum of the complexities of its methods
Coupling Between Object classes (CBO)	Identifies the coupling between classes by considering the dependency of one class on other classes in the design
Depth of the Inheritance Tree (DIT)	Identifies the complexity of the inheritance hierarchy by calculating the maximum path length from a given class to the root class
Lack of Cohesion in Methods (LCOM)	Identifies cohesion within a class by counting the number of method pairs with zero similarity
Number of Children (NOC)	Identifies the complexity of the inheritance hierarchy by counting the number of immediate child classes that inherit from a given class
Response for a Class (RFC)	Identifies the coupling between classes by calculating the sum of the number of local methods and the methods that can be called remotely
OO (Object Oriented)	Interpretation
NOM	Number of methods
NOPM	Number of public methods
NOPRM	Number of private methods
NOMI	Number of methods inherited
Fan-in	Number of other classes that reference the class
Fan-out	Number of other classes referenced by the class
NOAI	Number of attributes inherited
NOA	Number of attributes
NLOC	Number of lines of code
NOPRA	Number of private attributes
NOPA	Number of public attributes
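As a small illustration of two of the inheritance metrics above, the sketch below computes DIT and NOC from a child-to-parent map. It assumes single inheritance and counts the root class as depth 0; the class names and helper functions are hypothetical and not part of the chapter.

```python
def dit(cls, parent):
    """Depth of Inheritance Tree: number of edges from `cls` up to the root."""
    depth = 0
    while parent.get(cls) is not None:
        cls = parent[cls]
        depth += 1
    return depth


def noc(cls, parent):
    """Number of Children: classes whose immediate parent is `cls`."""
    return sum(1 for p in parent.values() if p == cls)


# Hypothetical hierarchy: each class maps to its immediate parent (None for the root).
parent = {"Shape": None, "Polygon": "Shape", "Circle": "Shape", "Square": "Polygon"}

print("DIT(Square) =", dit("Square", parent))
print("NOC(Shape)  =", noc("Shape", parent))
```

With multiple inheritance, the "maximum path length" in the DIT definition would require exploring every parent path rather than following a single chain.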

Cite this paper

Goyal, R., Chandra, P., Singh, Y. (2015). Comparison of M5’ Model Tree with MLR in the Development of Fault Prediction Models Involving Interaction Between Metrics. In: Elleithy, K., Sobh, T. (eds) New Trends in Networking, Computing, E-learning, Systems Sciences, and Engineering. Lecture Notes in Electrical Engineering, vol 312. Springer, Cham. https://doi.org/10.1007/978-3-319-06764-3_19

DOI: https://doi.org/10.1007/978-3-319-06764-3_19
Published: 08 November 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-06763-6
Online ISBN: 978-3-319-06764-3
eBook Packages: Engineering, Engineering (R0)
