An Ensemble of Optimal Trees for Software Development Effort Estimation

Abdelali, Zakrani; Hicham, Moutachaouik; Abdelwahed, Namir

doi:10.1007/978-3-030-11914-0_6

Conference paper
First Online: 01 March 2019

Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 66)

Abstract

Accurate estimation of software development effort plays a pivotal role in managing and controlling software development projects efficiently and effectively. Several software development effort estimation (SDEE) models have been proposed in the literature, including machine learning techniques. However, none of these models has proved to be powerful in all situations, and their performance varies from one dataset to another. To overcome the weaknesses of single estimation techniques, ensemble methods have recently been employed and evaluated in SDEE. In this paper, we develop an ensemble of optimal trees for software development effort estimation. We conducted an empirical study to evaluate and compare the performance of this optimal trees ensemble using five popular datasets and the 30% hold-out validation method. The results show that the proposed ensemble outperforms regression trees and random forest models in terms of MMRE, MdMRE and Pred(0.25) on all datasets used in this paper.
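The three accuracy criteria named in the abstract have standard definitions in the effort estimation literature, and a minimal sketch of them may help readers who are new to the field. The code below is an illustration rather than the authors' implementation: the function names, the `holdout_split` helper, and the sample effort values are all hypothetical, and reading "30% hold-out" as reserving 30% of the projects for testing is an assumption.

```python
import random
from statistics import mean, median


def mre(actual, predicted):
    """Magnitude of relative error per project: |actual - predicted| / actual."""
    return [abs(a - p) / a for a, p in zip(actual, predicted)]


def mmre(actual, predicted):
    """Mean magnitude of relative error (lower is better)."""
    return mean(mre(actual, predicted))


def mdmre(actual, predicted):
    """Median magnitude of relative error (lower is better)."""
    return median(mre(actual, predicted))


def pred(actual, predicted, level=0.25):
    """Fraction of projects whose MRE is at most `level` (higher is better)."""
    errors = mre(actual, predicted)
    return sum(e <= level for e in errors) / len(errors)


def holdout_split(rows, test_fraction=0.30, seed=0):
    """Shuffle the rows and reserve `test_fraction` of them for testing.

    Assumption: 30% hold-out means 30% test data and 70% training data.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Made-up effort values (e.g. person-hours), used only to exercise the functions.
actual = [100.0, 200.0, 400.0, 800.0]
predicted = [110.0, 150.0, 400.0, 1200.0]

train, test = holdout_split(list(range(10)))

print("MMRE:", mmre(actual, predicted))
print("MdMRE:", mdmre(actual, predicted))
print("Pred(0.25):", pred(actual, predicted))
print("train/test sizes:", len(train), len(test))
```

MMRE and MdMRE summarise the relative error, so smaller values indicate a better model, while Pred(0.25) counts the share of estimates within 25% of the actual effort, so larger values are better.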

Author information

Authors and Affiliations

ENSAM, 150 Boulevard Nile, Casablanca, Morocco
Zakrani Abdelali & Moutachaouik Hicham
Faculté des Sciences Ben M’sik, Bd Driss El Harti, Casablanca, Morocco
Namir Abdelwahed

Corresponding author

Correspondence to Zakrani Abdelali.

Editor information

Editors and Affiliations

Faculty of Sciences and Technologies, Mohammedia, Morocco
Faddoul Khoukhi
Faculty of Sciences and Technologies, Settat, Morocco
Mohamed Bahaj
Faculty of Sciences and Technologies, Boukhalef Tangier, Morocco
Mostafa Ezziyyani

About this paper

Cite this paper

Abdelali, Z., Hicham, M., Abdelwahed, N. (2019). An Ensemble of Optimal Trees for Software Development Effort Estimation. In: Khoukhi, F., Bahaj, M., Ezziyyani, M. (eds) Smart Data and Computational Intelligence. AIT2S 2018. Lecture Notes in Networks and Systems, vol 66. Springer, Cham. https://doi.org/10.1007/978-3-030-11914-0_6

DOI: https://doi.org/10.1007/978-3-030-11914-0_6
Published: 01 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-11913-3
Online ISBN: 978-3-030-11914-0
eBook Packages: Intelligent Technologies and Robotics; Intelligent Technologies and Robotics (R0)
