An Ensemble of Optimal Trees for Software Development Effort Estimation

  • Conference paper

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 66)

Abstract

Accurate estimation of software development effort plays a pivotal role in managing and controlling software development projects efficiently and effectively. Several software development effort estimation (SDEE) models have been proposed in the literature, including machine learning techniques. However, none of these models has proved to be powerful in all situations, and their performance varies from one dataset to another. To overcome the weaknesses of single estimation techniques, ensemble methods have recently been employed and evaluated in SDEE. In this paper, we develop an ensemble of optimal trees for software development effort estimation. We conduct an empirical study to evaluate and compare the performance of this optimal trees ensemble using five popular datasets and the 30% hold-out validation method. The results show that the proposed ensemble outperforms regression trees and random forest models in terms of MMRE, MdMRE and Pred(0.25) on all datasets used in this paper.
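The three accuracy criteria named in the abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used in the SDEE literature: the magnitude of relative error (MRE) of a project is |actual - predicted| / actual, MMRE and MdMRE are its mean and median over the test projects, and Pred(0.25) is the fraction of projects whose MRE is at most 0.25. The function names and the effort values are hypothetical and are not taken from the paper.

```python
from statistics import mean, median

def mre(actual, predicted):
    """Magnitude of relative error for each (actual, predicted) pair."""
    return [abs(a - p) / a for a, p in zip(actual, predicted)]

def mmre(actual, predicted):
    """Mean MRE over all projects (lower is better)."""
    return mean(mre(actual, predicted))

def mdmre(actual, predicted):
    """Median MRE over all projects (lower is better)."""
    return median(mre(actual, predicted))

def pred(actual, predicted, level=0.25):
    """Fraction of projects with MRE <= level (higher is better)."""
    errors = mre(actual, predicted)
    return sum(e <= level for e in errors) / len(errors)

# Hypothetical effort values, for illustration only (not data from the paper).
actual = [100.0, 200.0, 400.0, 800.0]
predicted = [110.0, 150.0, 400.0, 1200.0]

print(mmre(actual, predicted))   # mean of the MREs 0.1, 0.25, 0.0, 0.5
print(mdmre(actual, predicted))  # median of the same four MREs
print(pred(actual, predicted))   # share of projects with MRE <= 0.25
```

Under a 30% hold-out scheme, such functions would be applied only to the held-out projects, with the model fitted on the remaining 70%.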

Corresponding author

Correspondence to Zakrani Abdelali.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Abdelali, Z., Hicham, M., Abdelwahed, N. (2019). An Ensemble of Optimal Trees for Software Development Effort Estimation. In: Khoukhi, F., Bahaj, M., Ezziyyani, M. (eds) Smart Data and Computational Intelligence. AIT2S 2018. Lecture Notes in Networks and Systems, vol 66. Springer, Cham. https://doi.org/10.1007/978-3-030-11914-0_6
