Abstract
People with diabetes should be examined periodically to create a thorough clinical record and enable preventive measures. Machine learning is currently the preferred technology for early-stage disease detection. Although various machine learning models are available, the most accurate model for predicting the risk of early-stage diabetes is still unknown. This study compares 14 machine learning models (neural network, logistic regression, SVC, gradient boosting classifier, extra trees classifier, bagging classifier, AdaBoost classifier, Gaussian NB, MLP classifier, XGB classifier, LGBM classifier, k-nearest neighbor classifier, decision tree classifier, and random forest classifier) to determine the best-suited algorithm for diabetes risk prediction. The results show that the random forest with extra trees classifier delivered the best accuracy for predicting diabetes, at 99.04%. The gradient boosting and bagging classifier models produced the next-best result, with an accuracy of 97.12%.
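The comparison described above ranks models by accuracy, the fraction of test samples a model labels correctly. The following is a minimal sketch of that ranking step only, not the authors' pipeline. The labels, the predictions, and the names `model_a`, `model_b`, `model_c`, `accuracy`, and `rank_models` are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rank_models(
    y_true: Sequence[int], predictions: Dict[str, Sequence[int]]
) -> List[Tuple[str, float]]:
    """Return (model name, accuracy) pairs sorted from best to worst."""
    scores = [(name, accuracy(y_true, y_pred)) for name, y_pred in predictions.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical toy test set (1 = at risk, 0 = not at risk); not data from the study.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]

# Hypothetical predictions from three placeholder models.
predictions = {
    "model_a": [1, 0, 1, 1, 0, 1, 0, 0, 1, 0],  # 10 of 10 correct
    "model_b": [1, 0, 1, 0, 0, 1, 0, 0, 1, 0],  # 9 of 10 correct
    "model_c": [0, 0, 1, 0, 0, 1, 1, 0, 1, 0],  # 7 of 10 correct
}

ranking = rank_models(y_true, predictions)
for name, score in ranking:
    print(f"{name}: {score:.2%}")
```

In practice the predictions would come from fitted classifiers evaluated on a held-out test split. Accuracy alone can also mislead when the classes are imbalanced, so a fuller comparison would usually report further metrics alongside it.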
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Darmawan, I., Gunawan, R.I., Rahmatulloh, A. (2024). Model Accuracy Test for Early Stage of Diabetes Risk Prediction with Data Science Approach. In: Asirvatham, D., Gonzalez-Longatt, F.M., Falkowski-Gilski, P., Kanthavel, R. (eds) Evolutionary Artificial Intelligence. ICEASSM 2017. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-99-8438-1_5
DOI: https://doi.org/10.1007/978-981-99-8438-1_5
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8437-4
Online ISBN: 978-981-99-8438-1
eBook Packages: Intelligent Technologies and Robotics (R0)