Establishment of Risk Prediction Model for Retinopathy in Type 2 Diabetic Patients

  • Jianzhuo Yan
  • Xiaoxue Du (Email author)
  • Yongchuan Yu
  • Hongxia Xu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11976)


Diabetic retinopathy (DR) is a complication of diabetes mellitus. It is an important manifestation of diabetic microangiopathy and a major cause of vision loss in middle-aged and elderly people worldwide. A risk prediction model for diabetic retinopathy can identify high-risk groups and give early warning of the disease, which can effectively reduce the medical cost of diabetes. The experimental data were derived from the electronic medical records of a tertiary hospital in Beijing from 2013 to 2017 and include 29 inspection indicators. In this study, we compared predictive models for type 2 diabetes mellitus complicated with retinopathy and selected the random forest method to construct the risk prediction model. The weight of each indicator is analyzed with a linear regression algorithm, and the combination of inspection indicators with the highest accuracy is selected. Optimizing the random forest model in this way increased the accuracy of the classification prediction model by 3.7264%. The predictive model provides a basis for early diagnosis of diabetic retinopathy and for optimization of the diagnostic process.
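The selection procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn and NumPy, substitutes synthetic data from `make_classification` for the hospital records, and treats the forest size, the fold count, and the choice to rank indicators by the absolute value of their standardized regression coefficient as illustrative assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for 29 inspection indicators (NOT the hospital data).
X, y = make_classification(
    n_samples=200, n_features=29, n_informative=8, random_state=0
)

# Step 1: weight each standardized indicator with a linear regression
# fitted against the 0/1 label, then rank by absolute coefficient.
X_std = StandardScaler().fit_transform(X)
weights = np.abs(LinearRegression().fit(X_std, y).coef_)
ranking = np.argsort(weights)[::-1]  # largest weight first


def rf_accuracy(columns):
    """Mean 3-fold cross-validated accuracy of a random forest on `columns`."""
    model = RandomForestClassifier(n_estimators=30, random_state=0)
    return cross_val_score(
        model, X[:, columns], y, cv=3, scoring="accuracy"
    ).mean()


# Step 2: score the top-k indicators for every k and keep the best subset.
scores = [rf_accuracy(ranking[:k]) for k in range(1, X.shape[1] + 1)]
best_k = int(np.argmax(scores)) + 1
selected = ranking[:best_k]
baseline = scores[-1]            # accuracy with all 29 indicators
best_score = scores[best_k - 1]  # accuracy with the selected subset

print(f"all indicators: {baseline:.4f}")
print(f"top {best_k} indicators: {best_score:.4f}")
```

The sketch ranks indicators on the full data set before cross-validating, which leaks label information into the selection step. A more careful evaluation would repeat the ranking inside each training fold or hold out a separate test set.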


Keywords: Type 2 diabetic retinopathy; Risk prediction model; Random forest algorithm; Linear regression analysis



This work is supported by the CERNET Innovation Project (No. NGII20170719) and the Beijing Municipal Education Commission.




Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Jianzhuo Yan (1)
  • Xiaoxue Du (1), Email author
  • Yongchuan Yu (1)
  • Hongxia Xu (1)

  1. Faculty of Information Technology, Beijing University of Technology, Beijing, China
