Abstract
Accurate estimation of software development effort plays a pivotal role in managing and controlling software development projects efficiently and effectively. Several software development effort estimation (SDEE) models have been proposed in the literature, including machine learning techniques. However, none of these models has proved to be powerful in all situations, and their performance varies from one dataset to another. To overcome the weaknesses of single estimation techniques, ensemble methods have recently been employed and evaluated in SDEE. In this paper, we develop an ensemble of optimal trees for software development effort estimation. We conducted an empirical study to evaluate and compare the performance of this optimal trees ensemble using five popular datasets and the 30% hold-out validation method. The results show that the proposed ensemble outperforms regression trees and random forest models in terms of MMRE, MdMRE and Pred(0.25) on all datasets used in this paper.
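For readers unfamiliar with the accuracy criteria named in the abstract, the following is a minimal sketch of how they are commonly computed, assuming the usual definitions: the magnitude of relative error (MRE) of a project is |actual - predicted| / actual, MMRE and MdMRE are its mean and median over the test projects, and Pred(0.25) is the fraction of projects whose MRE is at most 0.25. The function names, the `holdout_split` helper, the reading of "30% hold-out" as reserving 30% of the projects for testing, and the effort values are illustrative assumptions, not the authors' code or data.

```python
import random
from statistics import mean, median


def mre(actual, predicted):
    """Magnitude of relative error per project: |actual - predicted| / actual."""
    return [abs(a - p) / a for a, p in zip(actual, predicted)]


def mmre(actual, predicted):
    """Mean MRE over all projects (lower is better)."""
    return mean(mre(actual, predicted))


def mdmre(actual, predicted):
    """Median MRE over all projects (lower is better)."""
    return median(mre(actual, predicted))


def pred(actual, predicted, level=0.25):
    """Fraction of projects whose MRE is at most `level` (higher is better)."""
    errors = mre(actual, predicted)
    return sum(e <= level for e in errors) / len(errors)


def holdout_split(items, test_fraction=0.3, seed=0):
    """Hypothetical helper: shuffle, then hold out `test_fraction` for testing.

    Assumes "30% hold-out" means 30% of the projects form the test set.
    Returns (train, test).
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up efforts (e.g. person-hours), for illustration only.
actual = [100.0, 200.0, 400.0, 800.0]
predicted = [110.0, 150.0, 400.0, 400.0]

print(f"MMRE       = {mmre(actual, predicted):.4f}")
print(f"MdMRE      = {mdmre(actual, predicted):.4f}")
print(f"Pred(0.25) = {pred(actual, predicted):.2f}")

train, test = holdout_split(range(10))
print(f"train size = {len(train)}, test size = {len(test)}")
```

Under these definitions, a model is preferred when it yields lower MMRE and MdMRE and a higher Pred(0.25) on the held-out projects.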
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Abdelali, Z., Hicham, M., Abdelwahed, N. (2019). An Ensemble of Optimal Trees for Software Development Effort Estimation. In: Khoukhi, F., Bahaj, M., Ezziyyani, M. (eds) Smart Data and Computational Intelligence. AIT2S 2018. Lecture Notes in Networks and Systems, vol 66. Springer, Cham. https://doi.org/10.1007/978-3-030-11914-0_6
DOI: https://doi.org/10.1007/978-3-030-11914-0_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11913-3
Online ISBN: 978-3-030-11914-0
eBook Packages: Intelligent Technologies and Robotics (R0)