Combining data augmentation, EDAs and grammatical evolution for blood glucose forecasting
The ideal solution for patients with type 1 diabetes mellitus is the widespread adoption of artificial pancreas systems. An artificial pancreas controls the blood glucose levels of a person with diabetes, improving their quality of life. At the core of the system, an algorithm forecasts future glucose levels as a function of food ingestion and insulin bolus sizes. In previous work, several evolutionary computation techniques have been proposed as modeling or identification techniques in this area. One of the main obstacles researchers face when training such models is the lack of sufficient data. As in many other fields of medicine, collecting data from real patients is not easy, since the environmental and patient conditions must be controlled. In this paper, we propose three evolutionary algorithms that generate synthetic glucose time series from a patient's real data, so that the models can be trained with an augmented data set. The synthetic time series are used to train grammatical evolution models that work together in an ensemble. Experimental results show that, when data are scarce, grammatical evolution models obtain more accurate and robust predictions with data augmentation. In particular, we reduce the number of potentially dangerous predictions to 0 for a 30 min horizon, 2.5% for 60 min, 3.6% for 90 min and 5.5% for 2 h. The ensemble approach presented in this paper showed excellent performance compared not only with a classical approach such as ARIMA but also with other grammatical evolution approaches. We tested our techniques with data from real patients.
Keywords: Grammatical evolution, Diabetes, Time series forecasting, Data augmentation
This research is supported by the Spanish Ministry of Science and Innovation (TIN2014-54806-R). The authors would like to thank the staff in the Principe de Asturias Hospital at Alcala de Henares for their support and assistance with this project. Special thanks also go to Maria Aranzazu Aramendi Zurimendi and Remedios Martinez Rodriguez.