Arabian Journal for Science and Engineering, Volume 43, Issue 8, pp 3875–3885

Evolutionary Computation-Based Techniques Over Multiple Data Sets: An Empirical Assessment

  • Manju Khari
  • Prabhat Kumar
Research Article-Special Issue-Computer Engineering and Computer Science


In software testing, organizations want to predict faults in their software systems before deployment, which improves delivered quality and reduces maintenance effort. Many software metrics and statistical models have been developed for this purpose; the approach is known as defect prediction, the process of identifying defects in a software program prior to its deployment. In recent times, a class of learners called evolutionary computation (EC) techniques has emerged. These techniques apply the Darwinian principle of ‘survival of the fittest’. This study empirically assesses the performance of EC techniques in predicting software defects over multiple data sets. It compares the predictive capability of 16 EC techniques for evaluating the relationship between object-oriented metrics and defect prediction. The developed models are validated using 7 data sets obtained from open source software systems developed by the Software Foundation. On investigating predictive capability and comparative performance, a majority of the EC techniques proved highly effective, and DTG (a hybridized algorithm) was the best-performing technique. The results indicate that EC techniques are effective and can benefit testers working on defect prediction in the future.
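
As a rough illustration of the ‘survival of the fittest’ principle mentioned above, the following minimal sketch evolves the weights of a simple linear classifier that labels a class as defect-prone from two metric values. It is not one of the 16 techniques assessed in the study. The data are synthetic, and the function names, the two stand-in metrics, and all parameter values are assumptions chosen only for illustration.

```python
import random


def make_synthetic_data(n, rng):
    """Toy data only (an assumption): two made-up metric values and a defect label."""
    data = []
    for _ in range(n):
        coupling = rng.uniform(0, 10)    # stand-in for a coupling metric
        complexity = rng.uniform(0, 10)  # stand-in for a complexity metric
        defective = 1 if coupling + complexity > 10 else 0
        data.append(((coupling, complexity), defective))
    return data


def predict(weights, metrics):
    """Linear rule: predict 'defective' when the weighted sum is positive."""
    w1, w2, bias = weights
    return 1 if w1 * metrics[0] + w2 * metrics[1] + bias > 0 else 0


def fitness(weights, data):
    """Fraction of examples the candidate classifies correctly."""
    return sum(predict(weights, x) == y for x, y in data) / len(data)


def evolve(data, rng, pop_size=30, generations=40, mutation_scale=0.5):
    """Keep the fitter half each generation and refill with mutated offspring."""
    population = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=lambda w: fitness(w, data), reverse=True)
        history.append(fitness(population[0], data))
        survivors = population[: pop_size // 2]  # selection: fittest half survives
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.sample(survivors, 2)
            # Uniform crossover followed by Gaussian mutation of every gene.
            child = [rng.choice(pair) + rng.gauss(0, mutation_scale)
                     for pair in zip(a, b)]
            children.append(child)
        population = survivors + children
    best = max(population, key=lambda w: fitness(w, data))
    return best, history


if __name__ == "__main__":
    rng = random.Random(0)
    data = make_synthetic_data(200, rng)
    best, history = evolve(data, rng)
    print("best weights:", best)
    print("accuracy on the toy data:", fitness(best, data))
```

Because the fitter half of the population is carried over unchanged, the best fitness never decreases from one generation to the next. A real study would evaluate on held-out data rather than on the data used to evolve the model.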


Software defects; Software metrics; Evolutionary computation; Software quality; Defect prediction; Metrics; Defects





Copyright information

© King Fahd University of Petroleum & Minerals 2017

Authors and Affiliations

  1. Department of Computer Science, National Institute of Technology Patna, Patna, India
