Diabetes Mellitus Prediction Using Ensemble Machine Learning Techniques

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 1192)


Diabetes is a non-communicable disease whose prevalence is increasing at an alarming rate, making it one of today's major diseases. It can cause serious complications, including blurred vision, myopia, burning extremities, and kidney and heart failure. Diabetes occurs when the blood sugar level exceeds a certain threshold or the body cannot produce sufficient insulin to regulate it. Patients therefore need to know that they are affected so that they can get proper treatment to control the disease, which makes it important to predict and classify diabetes at an early stage. In this study, two ensemble machine learning algorithms, Bagging and Decorate, are used to classify diabetes, and their performance is compared. The collected dataset has 340 instances, each with 26 features. Bagging classified the types of diabetes with 95.59% accuracy, whereas Decorate achieved 98.53%.
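As a rough illustration of Bagging (bootstrap aggregating), one of the two ensemble methods named above, the sketch below trains several simple classifiers on bootstrap resamples of a dataset and combines them by majority vote. It is not the authors' implementation: the decision-stump base learner, the synthetic three-feature data, the ensemble size, and the names `train_stump`, `bagging_fit`, and `bagging_predict` are all assumptions made for this example. Decorate, which additionally trains on artificially generated examples, is not shown.

```python
import random
from collections import Counter


def train_stump(X, y):
    """Fit a decision stump: the one feature/threshold split with the fewest training errors."""
    best = None  # (errors, feature index, threshold, label if <= threshold, label if > threshold)
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # If nothing falls to the right of the threshold, reuse the left label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = (sum(lab != left_label for lab in left)
                      + sum(lab != right_label for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    _, f, t, left_label, right_label = best
    return lambda row: left_label if row[f] <= t else right_label


def bagging_fit(X, y, n_estimators=11, seed=0):
    """Train one base learner per bootstrap sample (drawn with replacement, same size as X)."""
    rng = random.Random(seed)
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(train_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def bagging_predict(models, row):
    """Combine the base learners by majority vote."""
    votes = Counter(model(row) for model in models)
    return votes.most_common(1)[0][0]


# Synthetic toy data (an assumption, not the paper's 340 x 26 dataset):
# 60 instances, 3 features, label determined by the first feature.
rng = random.Random(42)
X = [[rng.random() for _ in range(3)] for _ in range(60)]
y = [1 if row[0] > 0.5 else 0 for row in X]

models = bagging_fit(X, y, n_estimators=11, seed=0)
predictions = [bagging_predict(models, row) for row in X]
accuracy = sum(p == t for p, t in zip(predictions, y)) / len(y)
print(f"training accuracy on toy data: {accuracy:.2%}")
```

An odd number of base learners is used here so that a two-class majority vote cannot tie. The accuracy printed is measured on the training data of a toy problem and says nothing about the figures reported in the abstract.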


Machine Learning, Bagging, Decorate, Diabetes Mellitus, Classification, Ensemble learning, Algorithms, Prediction



Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. North Western University, Khulna, Bangladesh
  2. Jashore University of Science and Technology, Khulna, Bangladesh
