Derivation of a Novel Diabetes Risk Score Using Semantic Discretization for Indian Population

  • Conference paper
Ambient Communications and Computer Systems

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 696))

Abstract

The objective of this study is to derive a simple yet effective type 2 Diabetes Risk Score Tool for the Indian population using semantic discretization and machine learning techniques. The dataset used for training and validation is taken from the Annual Health Survey and contains health-related information on over 1.65 million people from 284 districts of India. This is the first study of its kind that truly represents the Indian population. A combination of feature selection techniques is used to find the minimal subset of attributes that contribute most to determining the class attribute. Continuous independent variables (the various diabetes risk factors) are discretized using a semantic discretization technique. The discretized dataset is then used to derive a Weighted Diabetes Risk Score for each risk factor. An optimal cutoff value for the Total Weighted Diabetes Risk Score (TWDRS) is determined from evaluation parameters such as sensitivity, specificity, prediction accuracy, and the proportion of the population classified as high risk. The dataset contains 16,38,923 records, of which the 7,42,605 that meet our selection criteria are used in this study. Experimental results show that at the optimal cut point, TWDRS ≥ 19, sensitivity is 72.55%, specificity is 61.99%, and the proportion of the population at high risk is 39.29%.
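
The scoring pipeline summarized above (discretize each risk factor, sum the per-factor weights, then evaluate a cutoff) can be sketched roughly as follows. This is only an illustration under stated assumptions: the bin edges, weights, risk factors, and sample records are hypothetical and are not the values derived in the paper, and plain threshold binning stands in for the semantic discretization technique.

```python
from bisect import bisect_right

# Hypothetical bin edges and weights, for illustration only (NOT the paper's values).
AGE_EDGES = [35, 50]        # bins: <35, 35-49, >=50
AGE_WEIGHTS = [0, 8, 12]
BMI_EDGES = [23.0, 27.5]    # bins: <23, 23-27.4, >=27.5
BMI_WEIGHTS = [0, 5, 9]


def discretize(value, edges):
    """Return the index of the bin holding value; bin i covers [edges[i-1], edges[i])."""
    return bisect_right(edges, value)


def total_score(age, bmi):
    """Sum the per-factor weights of the bins the person falls into."""
    return (AGE_WEIGHTS[discretize(age, AGE_EDGES)]
            + BMI_WEIGHTS[discretize(bmi, BMI_EDGES)])


def evaluate_cutoff(records, cutoff):
    """Evaluate the rule 'score >= cutoff means high risk'.

    records: list of (score, has_diabetes) pairs containing both classes.
    Returns (sensitivity, specificity, fraction flagged as high risk).
    """
    tp = fn = tn = fp = 0
    for score, diabetic in records:
        flagged = score >= cutoff
        if diabetic and flagged:
            tp += 1
        elif diabetic:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    high_risk = (tp + fp) / len(records)
    return sensitivity, specificity, high_risk


# Made-up sample: (age, BMI, has_diabetes)
people = [(60, 29.0, True), (45, 24.0, True), (30, 22.0, False),
          (52, 21.0, False), (40, 28.0, False)]
records = [(total_score(age, bmi), diabetic) for age, bmi, diabetic in people]

for cutoff in (10, 13, 20):
    print(cutoff, evaluate_cutoff(records, cutoff))
```

Sweeping the cutoff over the range of possible scores and comparing the resulting sensitivity, specificity, and high-risk proportion is one way to arrive at a cut point such as the reported TWDRS ≥ 19; the exact selection rule used in the paper is not stated in the abstract.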

Author information

Corresponding author

Correspondence to Omprakash Chandrakar.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Chandrakar, O., Saini, J.R. (2018). Derivation of a Novel Diabetes Risk Score Using Semantic Discretization for Indian Population. In: Perez, G., Tiwari, S., Trivedi, M., Mishra, K. (eds) Ambient Communications and Computer Systems. Advances in Intelligent Systems and Computing, vol 696. Springer, Singapore. https://doi.org/10.1007/978-981-10-7386-1_29

  • DOI: https://doi.org/10.1007/978-981-10-7386-1_29

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-7385-4

  • Online ISBN: 978-981-10-7386-1

  • eBook Packages: Engineering, Engineering (R0)
