Predicting the Risk Factor for Developing Chronic Kidney Disease Using a 3-Stage Prediction Model

  • Hossam Medhat Aly
  • Mohamed Aborizka
  • Soha Safwat Labib
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1130)


Chronic Kidney Disease (CKD) is considered one of the major high-risk chronic diseases affecting human health, and it can cause death in its late stages. Moreover, treating CKD patients is costly, and the cost depends on the stage of the disease. It is therefore important not only to detect the disease in its early stages, but also to assess and predict early how likely individuals are to be affected in the future. In this research, a 3-Stage predictor is introduced to help predict the risk factor for developing CKD during healthcare screening, based on a questionnaire and some laboratory tests. By categorizing parameters into stages, it also aims to reduce or eliminate the cost of unjustified tests, so that tests are run only when the assessment needs them. A comparison between 12 classifiers led to choosing the 3 classifiers used in designing the 3-Stage model, based on the best accuracy and prediction speed. The 3-Stage model is designed using Bagged, Boosted and Medium Trees classifiers. The model was assessed on a dataset collected from the Centers for Disease Control and Prevention (CDC) in the United States. The trained 3-Stage model achieved 99.97% accuracy when predicting around 3K cases, in comparison with a 1-Stage model.
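
The staged idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how one stage hands over to the next, so the stopping rule below (stop as soon as a stage is confident) is an assumption. The parameter names `q_score`, `lab_a` and `lab_b`, the thresholds `low` and `high`, and the toy scoring functions are all hypothetical stand-ins for the trained Bagged, Boosted and Medium Trees classifiers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Features = Dict[str, float]


@dataclass
class Stage:
    name: str
    required: List[str]                         # parameters first collected at this stage
    predict_proba: Callable[[Features], float]  # returns an estimated P(at risk)


def make_collector(record: Features) -> Callable[[List[str]], Features]:
    """Simulate data collection: hand over only the parameters that are asked for."""
    return lambda names: {n: record[n] for n in names}


def staged_predict(
    stages: List[Stage],
    collect: Callable[[List[str]], Features],
    low: float = 0.2,    # hypothetical "confidently not at risk" threshold
    high: float = 0.8,   # hypothetical "confidently at risk" threshold
) -> Tuple[bool, float, List[str]]:
    """Run stages in order and stop at the first confident one (assumed rule)."""
    known: Features = {}
    used: List[str] = []
    prob = 0.5
    for stage in stages:
        # Later, costlier parameters are only collected if this stage is reached.
        known.update(collect(stage.required))
        prob = stage.predict_proba(known)
        used.append(stage.name)
        if prob <= low or prob >= high:
            break
    return prob >= 0.5, prob, used


# Toy scoring functions standing in for trained classifiers (hypothetical).
STAGES = [
    Stage("questionnaire", ["q_score"], lambda f: f["q_score"]),
    Stage("basic_labs", ["lab_a"], lambda f: (f["q_score"] + f["lab_a"]) / 2),
    Stage("extended_labs", ["lab_b"],
          lambda f: (f["q_score"] + f["lab_a"] + f["lab_b"]) / 3),
]

if __name__ == "__main__":
    record = {"q_score": 0.5, "lab_a": 0.9, "lab_b": 0.7}
    at_risk, prob, used = staged_predict(STAGES, make_collector(record))
    print(at_risk, used)
```

In this sketch a record that the questionnaire stage already scores confidently never triggers the laboratory stages, which is one way to read the abstract's goal of avoiding unjustified test costs.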


Keywords: Machine learning, Healthcare, CKD, Risk factor, Data mining



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Hossam Medhat Aly (1, corresponding author)
  • Mohamed Aborizka (1)
  • Soha Safwat Labib (2)
  1. Arab Academy for Science, Technology and Maritime Transport, Cairo, Egypt
  2. The Egyptian Chinese University, Cairo, Egypt
