Gaussian Process Regression with Categorical Inputs for Predicting the Blood Glucose Level

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 539)

Abstract

In diabetes treatment, the blood glucose level is a key quantity for evaluating a patient’s condition. Typically, patients record measurements of the blood glucose level themselves and annotate them with symbolic quantities such as the date, the timestamp, and a measurement code (insulin dose, food intake, exercise). In clinical practice, predicting the blood glucose level under different conditions is an important task and plays a crucial role in personalized treatment. This paper describes a predictive model for the blood glucose level based on Gaussian processes. A covariance function is proposed to deal with categorical inputs. The usefulness of the presented model is demonstrated on real-life datasets concerning 10 patients. The experimental results show that the proposed model achieves a small predictive error, measured by the Mean Absolute Error criterion, even for small training samples.
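
To make the pipeline summarized in the abstract concrete, the following is a minimal sketch of Gaussian process regression on purely categorical inputs, scored with the Mean Absolute Error. It is an illustration under stated assumptions, not the covariance function proposed in the paper: the overlap kernel, the constant mean, the hyperparameters `theta` and `noise_var`, the number of category levels, and the synthetic data are all hypothetical choices made for this example.

```python
import numpy as np

def overlap_kernel(A, B, theta, signal_var=1.0):
    """Assumed stand-in kernel: signal_var * exp(-sum_d theta[d] * [a_d != b_d]).

    A is (n, D) and B is (m, D), both holding integer category codes;
    theta is a (D,) vector of non-negative mismatch penalties.
    """
    mismatch = A[:, None, :] != B[None, :, :]  # (n, m, D) booleans
    return signal_var * np.exp(-(mismatch * theta).sum(axis=-1))

def gp_posterior_mean(X_train, y_train, X_test, theta, noise_var=0.1):
    """Posterior mean of a GP with a constant mean (an assumption here)."""
    y_mean = y_train.mean()
    K = overlap_kernel(X_train, X_train, theta) + noise_var * np.eye(len(X_train))
    K_star = overlap_kernel(X_test, X_train, theta)
    alpha = np.linalg.solve(K, y_train - y_mean)
    return y_mean + K_star @ alpha

def mean_absolute_error(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

# Synthetic toy data (made up for illustration, not the patients' records).
rng = np.random.default_rng(0)
n = 60
X = np.column_stack([
    rng.integers(0, 7, n),  # day of the week
    rng.integers(0, 4, n),  # period of the day (4 levels assumed)
    rng.integers(0, 3, n),  # measurement code (3 levels assumed)
])
code_effect = np.array([110.0, 150.0, 95.0])  # hypothetical level per code
y = code_effect[X[:, 2]] + rng.normal(0.0, 10.0, n)

X_train, y_train = X[:40], y[:40]
X_test, y_test = X[40:], y[40:]
theta = np.array([0.1, 0.1, 2.0])  # hand-picked, not fitted

y_pred = gp_posterior_mean(X_train, y_train, X_test, theta)
mae = mean_absolute_error(y_test, y_pred)
print(f"MAE on held-out toy data: {mae:.2f}")
```

In practice the hyperparameters would be fitted, for example by maximizing the marginal likelihood, rather than set by hand as above.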

Notes

  1. Further, in the experiment, we consider only three inputs (\(D=3\)) that are typical in diabetes treatment, namely the day of the week, the period of the day, and a measurement code. However, the presented idea is formulated in the general case, for any number of inputs.

  2. A kernel function is a symmetric function for which the Gram matrix with elements \(k(\mathbf {x}_{n}, \mathbf {x}_{m})\) is positive semidefinite for any set \(\{\mathbf {x}_{n}\}_{n=1}^{N}\) [20].

  3. By reasonably small we mean up to \(N=1000\).

  4. We have omitted the day of the week and the period of the day for two reasons. First, we wanted the mean function to have fewer parameters. Second, in preliminary experiments, also including \(x_{1}\) and \(x_{2}\) resulted in no significant change in the performance of the GP.
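
The kernel property stated in Note 2 can be checked numerically for the \(D=3\) setting of Note 1. The sketch below is hedged: it uses a simple overlap kernel as a stand-in (an assumption, not the covariance function of the paper), hypothetical level counts for the three inputs, and hand-picked penalties `theta`. It builds the Gram matrix over every combination of categories and tests symmetry and positive semidefiniteness.

```python
import itertools
import numpy as np

def overlap_kernel(x, z, theta):
    """Assumed stand-in kernel: exp(-sum_d theta[d] * [x_d != z_d])."""
    return float(np.exp(-np.sum(theta * (x != z))))

# Hypothetical level counts: 7 days, 4 periods of the day, 3 measurement codes.
levels = (7, 4, 3)
X = np.array(list(itertools.product(*(range(n) for n in levels))))
theta = np.array([0.5, 1.0, 2.0])  # hand-picked mismatch penalties

# Gram matrix with elements k(x_n, x_m) over all 7 * 4 * 3 = 84 inputs.
K = np.array([[overlap_kernel(x, z, theta) for z in X] for x in X])

is_symmetric = np.allclose(K, K.T)
min_eig = np.linalg.eigvalsh(K).min()  # eigvalsh assumes a symmetric matrix
is_psd = min_eig >= -1e-10             # small tolerance for round-off

print("symmetric:", is_symmetric)
print("positive semidefinite:", is_psd)
```

A numerical check on one set of points does not prove validity for every set. For this particular stand-in, validity follows because it is a product of one valid kernel per input dimension.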

References

  1. Agresti, A.: An Introduction to Categorical Data Analysis. Wiley-Interscience, New York (2007)

  2. Alemdar, H., Ersoy, C.: Wireless sensor networks for healthcare: a survey. Comput. Netw. 54(15), 2688–2710 (2010)

  3. Ažman, K., Kocijan, J.: Application of Gaussian processes for black-box modelling of biosystems. ISA Trans. 46, 443–457 (2007)

  4. Billard, L., Diday, E.: From the statistics of data to the statistics of knowledge: symbolic data analysis. J. Am. Stat. Assoc. 98(462), 470–487 (2003)

  5. Bishop, C.: Pattern Recognition and Machine Learning. Elsevier, Amsterdam (2006)

  6. Breiman, L., Friedman, J., Olshen, R., Stone, C., Steinberg, D., Colla, P.: CART: Classification and Regression Trees. Wadsworth, Belmont (1983)

  7. Chu, W., Ghahramani, Z., Falciani, F., Wild, D.: Biomarker discovery in microarray gene expression data with Gaussian processes. Bioinformatics 21(16), 3385–3393 (2005)

  8. Daemen, A., De Moor, B.: Development of a kernel function for clinical data. In: Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC 2009), pp. 5913–5917. IEEE (2009)

  9. De Gaetano, A., Arino, O.: Mathematical modelling of the intravenous glucose tolerance test. J. Math. Biol. 40, 136–168 (2000)

  10. Fischer, I., Meinl, T.: Graph based molecular data mining - an overview. In: IEEE International Conference on Systems, Man and Cybernetics, vol. 5, pp. 4578–4582. IEEE (2004)

  11. Frank, A., Asuncion, A.: UCI machine learning repository (2010). http://archive.ics.uci.edu/ml

  12. Gärtner, T.: A survey of kernels for structured data. ACM SIGKDD Explor. Newsl. 5(1), 49–58 (2003)

  13. Grzech, A., Juszczyszyn, K., Swiatek, P., Mazurek, C., Sochan, A.: Applications of the future internet engineering project. In: International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel & Distributed Computing (SNPD), pp. 635–642. IEEE (2012)

  14. Huang, Z.: Extensions to the k-means algorithm for clustering large data sets with categorical values. Data Min. Knowl. Discov. 2(3), 283–304 (1998)

  15. Hyndman, R., Koehler, A.: Another look at measures of forecast accuracy. Int. J. Forecast. 22(4), 679–688 (2006)

  16. Iannario, M.: Preliminary estimators for a mixture model of ordinal data. Adv. Data Anal. Classif. 6, 163–184 (2012)

  17. Likar, B., Kocijan, J.: Predictive control of a gas-liquid separation plant based on a Gaussian process model. Comput. Chem. Eng. 31, 142–152 (2007)

  18. Makosso-Kallyth, S., Diday, E.: Adaptation of interval PCA to symbolic histogram variables. Adv. Data Anal. Classif. 6, 1–13 (2012)

  19. Rasmussen, C.E., Williams, C.K.I.: Gaussian Processes for Machine Learning. MIT Press, London (2006)

  20. Shawe-Taylor, J., Cristianini, N.: Kernel Methods for Pattern Analysis. Cambridge University Press, Cambridge (2004)

  21. Srinivasan, A., King, R.D.: Feature construction with inductive logic programming: a study of quantitative predictions of biological activity aided by structural attributes. Data Min. Knowl. Discov. 3(1), 37–57 (1999)

  22. Tomczak, J., Gonczarek, A.: Decision rules extraction from data stream in the presence of changing context for diabetes treatment. Knowl. Inf. Syst. 34, 521–546 (2013)

  23. Tomczak, J., Świątek, J., Latawiec, K.: Gaussian process regression as a predictive model for quality-of-service in web service systems. arXiv preprint arXiv:1207.6910 (2012)

  24. Turner, R., Deisenroth, M.P., Rasmussen, C.E.: System identification in Gaussian process dynamical systems. In: Görür, D. (ed.) NIPS Workshop on Nonparametric Bayes. Whistler, Canada (2009)

  25. Węglarz-Tomczak, E., Vassiliou, S., Mucha, A.: Discovery of potent and selective inhibitors of human aminopeptidases ERAP1 and ERAP2 by screening libraries of phosphorus-containing amino acid and dipeptide analogues. Bioorg. Med. Chem. Lett. 26(16), 4122–4126 (2016)

  26. World Health Organization: Definition and diagnosis of diabetes mellitus and intermediate hyperglycemia. Report of a WHO/IDF Consultation (2006)

  27. Zięba, M., Świątek, J.: Ensemble classifier for solving credit scoring problems. IFIP AICT 372, 59–66 (2012)

Acknowledgements

The research is partially supported by a grant co-financed by the Ministry of Science and Higher Education in Poland.

Author information

Corresponding author

Correspondence to Jakub M. Tomczak.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Tomczak, J.M. (2017). Gaussian Process Regression with Categorical Inputs for Predicting the Blood Glucose Level. In: Świątek, J., Tomczak, J. (eds) Advances in Systems Science. ICSS 2016. Advances in Intelligent Systems and Computing, vol 539. Springer, Cham. https://doi.org/10.1007/978-3-319-48944-5_10

  • DOI: https://doi.org/10.1007/978-3-319-48944-5_10

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-48943-8

  • Online ISBN: 978-3-319-48944-5

  • eBook Packages: Engineering, Engineering (R0)
