Univariate data-driven models for glucose level prediction of CGM sensor dataset for T1DM management


The advent of machine learning has had a remarkable impact on healthcare. Diabetes mellitus is a metabolic disorder that poses a severe threat to human health and is a public health problem worldwide. In 1980, 108 million adults worldwide had diabetes; by 2040 the number is expected to reach 642 million. Hence, extensive interdisciplinary research drawing on statistics, machine learning, artificial intelligence, visualization, etc. is carried out for better management of diabetes. This paper focuses on time series forecasting algorithms. Data-driven time series models are used to derive meaningful information from large volumes of blood glucose level and related data for precise forecasting of upcoming blood glucose fluctuations. Such forecasts not only allow the patient and physician to be informed beforehand, so that complications can be averted, but also aid in predicting the response to certain medications. Univariate data-driven models from time series machine learning are implemented on two continuous glucose monitoring (CGM) sensor datasets: the Libre Pro dataset of 10 patients and the Ohio T1DM dataset of 6 patients. The algorithms are compared using root mean squared error (RMSE), mean absolute percentage error (MAPE) and Theil’s U, which are statistical measures, and Clarke’s error grid, which is a clinical analysis, for prediction horizons from 15 to 45 min. On the Libre Pro dataset, Holt’s Linear AAN algorithm with alpha and beta of 0.99 gave the least error among the exponential smoothing algorithms, with an RMSE of 7.98 mg/dl for the 15 min, 19.47 mg/dl for the 30 min and 28.40 mg/dl for the 45 min prediction horizon. Theil’s U coefficient was 0.12 for the 15 min, 0.39 for the 30 min and 0.72 for the 45 min prediction horizon. The Autoregressive Integrated Moving Average (ARIMA) algorithm gave the best performance evaluation results, with an RMSE of 7.07 mg/dl and a MAPE of 3.98 for the 15 min horizon. The performance results were on par when these algorithms were tested on the Ohio T1DM dataset, where the ARIMA algorithm again gave the best performance evaluation results, with an RMSE of 13.14 mg/dl and a MAPE of 8.213 for the 15 min horizon. The difference in the error coefficients for the Ohio dataset was due to missing data.
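
As an illustration of the evaluation described above, the following is a minimal sketch of Holt's linear method (additive trend, no seasonality) scored with RMSE, MAPE and a Theil's U style ratio under an expanding walk-forward scheme. It is not the authors' code: it is written in Python, the function names are hypothetical, the initialisation of level and trend is one simple choice among several, and the glucose values are invented for the example. The abstract does not state which form of Theil's U was used, so the ratio of model RMSE to the RMSE of a naive "last value" forecast is assumed here. A 15 min sampling interval is also assumed, so that horizons of 1, 2 and 3 steps stand for 15, 30 and 45 min.

```python
import math


def holt_forecast(history, horizon, alpha=0.99, beta=0.99):
    """Forecast `horizon` steps ahead with Holt's linear (additive trend) method."""
    if len(history) < 2:
        raise ValueError("need at least two observations")
    # Assumed initialisation: first value as level, first difference as trend.
    level = history[0]
    trend = history[1] - history[0]
    for y in history[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
    return level + horizon * trend


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def theils_u(actual, predicted, naive):
    """Assumed form: model RMSE divided by the RMSE of a naive forecast."""
    return rmse(actual, predicted) / rmse(actual, naive)


def walk_forward(series, horizon, min_train=4, alpha=0.99, beta=0.99):
    """Expanding-window walk-forward evaluation at a fixed horizon (in steps)."""
    actual, predicted, naive = [], [], []
    for t in range(min_train, len(series) - horizon + 1):
        history = series[:t]
        actual.append(series[t + horizon - 1])
        predicted.append(holt_forecast(history, horizon, alpha, beta))
        naive.append(history[-1])  # persistence: last observed value
    return actual, predicted, naive


if __name__ == "__main__":
    # Invented CGM readings in mg/dl, assumed to be 15 min apart.
    glucose = [110, 114, 121, 130, 138, 143, 145, 141, 134, 126, 120, 117]
    for steps in (1, 2, 3):
        a, p, n = walk_forward(glucose, steps)
        print(f"{15 * steps} min: RMSE={rmse(a, p):.2f} mg/dl, "
              f"MAPE={mape(a, p):.2f}%, U={theils_u(a, p, n):.2f}")
```

With alpha and beta close to 1, as in the reported setting of 0.99, the level tracks the latest reading and the trend tracks the latest difference, so the forecast is close to a straight-line extrapolation of the last two points. Under the assumed definition, a ratio below 1 means the model beats the naive forecast at that horizon.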


Figures 1–24 (Figure 2 courtesy of Jnana Sanjeevini Diabetes Hospital and Medical Center, Bangalore).




We acknowledge C Marling, Ohio University, for helping us with the Ohio T1DM dataset; Vibha Rao from Jnana Sanjeevini, Bangalore, for providing us with the Libre Pro dataset; and Dr Abhijit Bhogaraj, endocrinologist, who validated the results obtained in this paper.

Author information



Corresponding author

Correspondence to Rekha Phadke.


Cite this article

Phadke, R., Prasad, V., Nagaraj, H.C. et al. Univariate data-driven models for glucose level prediction of CGM sensor dataset for T1DM management. Sādhanā 45, 46 (2020). https://doi.org/10.1007/s12046-020-1277-8



  • data-driven models
  • overlapping walk forward validation window
  • R
  • RMSE
  • Theil’s U
  • time series forecasting