
Using machine learning to predict nosocomial infections and medical accidents in a NICU

Original Paper published in Health and Technology.

Abstract

Background

Adult studies have shown that nursing overtime and unit overcrowding are associated with increased adverse patient events, but there is little evidence for the Neonatal Intensive Care Unit (NICU). We investigate the main determinants of nosocomial infections and medical accidents in a NICU using state-of-the-art machine learning techniques. Our analysis is based on a retrospective study of the 7,438 neonates admitted to the CHU de Québec NICU (capacity of 51 beds) from 10 April 2008 to 28 March 2013. Daily administrative data on nursing overtime hours, total regular hours, number of admissions, and patient characteristics, as well as information on nosocomial infections and on the timing and type of medical errors, were retrieved from various hospital-level datasets.

Methods

We use a generalized mixed effects regression tree model with random effects (GMERT-RI) to build prediction trees for the two outcomes. Neonates' characteristics and daily exposure to numerous covariates are used in the model. GMERT-RI is suitable for binary outcomes and is a recent extension of the standard tree-based method. The model allows us to determine the most important predictors.

Results

Diagnosis-related group level, regular hours of work, overtime, admission rates, birth weight and occupancy rates are the main predictors for both outcomes. On the other hand, gestational age, C-section, multiple births, medical/surgical and number of admissions are poor predictors.

Conclusion

The GMERT-RI algorithm is a powerful tool. It is well suited to unearthing potential correlations in the context of unbalanced panel data and discrete health outcomes, two common features of clinical data. In the particular setting of a NICU, we find that institutional features (overtime hours, occupancy rates, etc.) are drivers just as important as neonate-specific medical conditions in predicting medical accidents and health care associated infections. From an operational point of view, prediction trees can complement traditional management tools in preventing undesirable health outcomes in the NICU.


Data availability

Confidential, proprietary hospital-level data; they cannot be shared.

Code availability

Public domain R packages.

Notes

  1. ML has been used to identify disease onset, classify disease severity, and predict epileptic seizures, among other applications. It is fast becoming a hybrid physician-support tool thanks to the vast amount of data generated in healthcare systems. Although machine learning can prove a powerful tool, there is potential for misuse; model performance can be inflated through overfitting and, consequently, the model will not generalize to the greater population. But a number of recent methods – including the one we use – have been proposed to expand the applicability of machine learning tools and ensure robustness of results for within-subject factors and random effects (see Schultz et al. [18]).

  2. The logistic regression is often used as a baseline model against which to gauge more sophisticated machine learning approaches. It is often appropriate for clinical outcomes when using a small set of variables (see Gao et al. [14]). More sophisticated regularized variants of the logistic regression (e.g., lasso-regularized and ridge logistic regressions) make it possible to remove uninformative variables and/or identify near-linear relationships between some subsets (see Tibshirani [15]); a lasso example is sketched after these notes. Yet, the main disadvantage of logistic regression is that it may require large sample sizes to achieve reliable performance, particularly in the presence of high-dimensional variable sets (see Schultz et al. [18]).

  3. This occurred whenever a nurse either started her shift earlier than planned or finished later than scheduled. Working beyond 16 consecutive hours per day was prohibited.

  4. Reporting the information on the timing as well as the type of MA is mandatory.

  5. The Appendix provides additional details on the GMERT-RI algorithm of Hajjem et al. [13]. The R codes are available in the supplementary material of Hajjem et al. [13].

  6. It corresponds to the largest cross-validation error that remains below the sum of the minimum cross-validation error and the standard deviation of the error for that tree (an R sketch of this pruning rule is given after these notes).

  7. The misclassification rate (MCR) is given by \(MCR=\left( \sum _{i=1}^{N^{(v)} } \sum _{t=1}^{T^{(v)}_i} \mid y_{it} - \widehat{y}_{it} \mid \right) /T^{(v)}\) where \(\widehat{y}_{it}\) is the predicted class of observation t in cluster i: \(\widehat{y}_{it} = \text {Bernoulli} \left( \widehat{\mu }_{it} \right)\) with \(\widehat{\mu }_{it} = \left( 1 + \exp \left( - \widehat{f}(X^{\prime }_{it}) - Z^{\prime }_{it} \widehat{u}_i \right) \right) ^{-1}\). \(\widehat{f}(X^{\prime }_{it})\) is the predicted fixed component that results from the tree and \(Z^{\prime }_{it} \widehat{u}_i\) is its predicted random part corresponding to its cluster. \(N^{(v)}\) is the number of clusters in the validation set, \(T^{(v)}_i\) is the size of cluster i and \(T^{(v)}\) is the total number of observations in the validation set. A short R version of this computation is given after these notes.

  8. In a Monte Carlo simulation study with random effects, Hajjem et al. [13] have shown that the mixed-effects classification trees give better results than the usual classification trees even with a misspecified random component part.

  9. Gini impurity is a measure of how often a randomly chosen element from the set would be incorrectly labeled if it were randomly labeled according to the distribution of labels in the subset. The Gini impurity can be computed by summing the probability \(p_i\) of an item with label i being chosen times the probability \((1-p_i)\) of a mistake in categorizing that item. To compute Gini impurity \(I_G\) for a set of items with J classes, suppose \(i \in \left\{ 1, 2, ...,J \right\}\) and let \(p_i\) be the fraction of items labeled with class i, then \(I_G = 1 - \sum _{i=1}^{J} p^{2}_i\). In our case, \(J=2\) for accident (1) and no accident (0). A short R implementation is given after these notes.

  10. We use the recent R package ROCR, which makes it possible to create cutoff-parameterized 2D performance curves by freely combining any two of over 25 performance measures (see the usage sketch after these notes).

  11. To save on space, probabilities smaller than 1% appear as “0” and those above 99% appear as “1” inside the nodes.

  12. Figures 3 and 4 only give nodes and ancestors for predicted probabilities \(> 75\%\).

  13. These numbers correspond to the sum of the observations in each cell of the terminal nodes, or leaves. On the left-hand side this gives 126 = 22 + 13 + 37 + 13 + 13 + 28, while on the right-hand side it corresponds to 152 = 20 + 22 + 33 + 17 + 28 + 17 + 15.

  14. Note that the model slightly overestimates the true number of infections, i.e. 372 instead of 272. On the other hand, if we focus on events with probabilities strictly larger than 80% then overestimation is reduced significantly, i.e. from 372 to 289.
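
The sketches below illustrate several of the notes above; all data are simulated placeholders rather than the study data, and the package choices are ours. First, for note 2, a lasso-regularized logistic regression can be fitted with the glmnet package; covariates with no predictive content are shrunk exactly to zero.

```r
library(glmnet)

# Simulated stand-in for a set of daily covariates and a binary outcome.
set.seed(1)
x <- matrix(rnorm(500 * 10), ncol = 10,
            dimnames = list(NULL, paste0("x", 1:10)))
y <- rbinom(500, 1, plogis(x[, 1] - 0.5 * x[, 2]))

# alpha = 1 gives the lasso penalty; cross-validation picks the penalty strength.
cv_fit <- cv.glmnet(x, y, family = "binomial", alpha = 1)
coef(cv_fit, s = "lambda.1se")   # uninformative covariates get coefficient zero
```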
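
For note 6, the pruning rule can be applied to the complexity table produced by rpart; the package's built-in kyphosis data are used purely as a placeholder for the NICU outcomes.

```r
library(rpart)

# Grow a deliberately large classification tree, then prune it with the rule of note 6.
fit    <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
                method = "class", cp = 0, xval = 10)
cp_tab <- fit$cptable

# Threshold: minimum cross-validation error plus its standard deviation.
thresh <- min(cp_tab[, "xerror"]) +
  cp_tab[which.min(cp_tab[, "xerror"]), "xstd"]

# Keep the tree whose cross-validation error is the largest one below the threshold.
ok       <- cp_tab[, "xerror"] <= thresh
best_row <- which(ok)[which.max(cp_tab[ok, "xerror"])]
pruned   <- prune(fit, cp = cp_tab[best_row, "CP"])
```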
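
For note 7, the misclassification rate is a short function; eta_fixed and eta_random are hypothetical inputs standing for the tree's predicted fixed part \(\widehat{f}(X^{\prime }_{it})\) and the cluster's predicted random part \(Z^{\prime }_{it} \widehat{u}_i\).

```r
# Misclassification rate on a validation set, following the formula in note 7.
mcr <- function(y, eta_fixed, eta_random) {
  mu_hat <- plogis(eta_fixed + eta_random)      # predicted probability
  y_hat  <- rbinom(length(mu_hat), 1, mu_hat)   # predicted class: Bernoulli draw, as in the note
  mean(abs(y - y_hat))                          # share of misclassified observations
}
```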
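
For note 9, the Gini impurity of a set of labels takes two lines.

```r
# Gini impurity; with two classes (accident = 1, no accident = 0) it is 1 - p0^2 - p1^2.
gini_impurity <- function(labels) {
  p <- prop.table(table(labels))   # class shares p_i
  1 - sum(p^2)
}
gini_impurity(c(0, 0, 1, 1, 1))    # 1 - (0.4^2 + 0.6^2) = 0.48
```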
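
Finally, for note 10, a typical ROCR call turns predicted probabilities and observed outcomes into a cutoff-parameterized ROC curve; the inputs below are again simulated.

```r
library(ROCR)

# Simulated probabilities and outcomes standing in for the model's validation output.
set.seed(1)
labels <- rbinom(300, 1, 0.3)
probs  <- plogis(-1 + 2 * labels + rnorm(300))

pred <- prediction(probs, labels)
perf <- performance(pred, measure = "tpr", x.measure = "fpr")  # ROC curve
plot(perf)
performance(pred, "auc")@y.values[[1]]                         # area under the curve
```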

References

  1. Tucker J, Tarnow-Mordi W, Gould C, Parry G, Marlow N, on behalf of the UK Neonatal Staffing Study collaborative group. UK neonatal intensive care services in 1996. Arch Dis Child Fetal Neonatal Ed. 1999;80:F233–F234.

  2. Polin RA, Denson S, Brady MT. Strategies for prevention of health care–associated infections in the NICU. Pediatrics. 2012;129(4):e1085–93.

  3. Beltempo M, Lacroix G, Cabot M, Blais R, Piedboeuf B. Association of nursing overtime, nurse staffing and unit occupancy with medical incidents and outcomes of very preterm infants. J Perinatol. 2017;38:175. https://doi.org/10.1038/jp.2017.146.

  4. Russell RB, Green NS, Steiner CA, Meikle S, Howse JL, Poschman K, Dias T, Potetz L, Davidoff MJ, Damus K, Petrini JR. Cost of hospitalization for preterm and low birth weight infants in the United States. Pediatrics. 2007;120(1):1–9.

  5. Beltempo M, Lacroix G, Cabot M, Piedboeuf B. Factors and costs associated with the use of registered nurse overtime in the neonatal intensive care unit. Pediatrics and Neonatal Nursing Open Journal. 2016;4:17–23.

  6. Berney B, Needleman J. Trends in nurse overtime, 1995–2002. Policy Polit Nurs Pract. 2005;6:183–90.

  7. Bae S-H. Presence of nurse mandatory overtime regulations and nurse and patient outcomes. Nursing Economic$. 2013;31(2):59–89.

  8. Lin H. Revisiting the relationship between nurse staffing and quality of care in nursing homes: An instrumental variables approach. J Health Econ. 2014;37:13–24.

  9. Cimiotti JP, Aiken LH, Sloane DM, Wu ES. Nurse staffing, burnout, and health care-associated infection. Am J Infect Control. 2012;40(6):486–90.

  10. Trinkoff AM, Johantgen M, Storr CL, Gurses AP, Liang Y, Han K. Nurses’ work schedule characteristics, nurse staffing, and patient mortality. Nurs Res. 2011;60(1):1–8.

  11. Beltempo M, Bresson G, Étienne J-M, Lacroix G. Infections, accidents and nursing overtime in a neonatal intensive care unit. Eur J Health Econ. 2021.

  12. Clarke SLN, Parmesar K, Saleem MA, Ramanan AV. Future of machine learning in paediatrics. Arch Dis Child. 2021;1–6.

  13. Hajjem A, Larocque D, Bellavance F. Generalized mixed effects regression trees. Statist Probab Lett. 2017;126:114–8.

  14. Gao C, Sun H, Wang T, Tang M, Bohnen NI, Müller MLTM, Herman T, Giladi N, Kalinin A, Spino C, et al. Model-based and model-free machine learning techniques for diagnostic prediction and classification of clinical outcomes in Parkinson’s disease. Sci Rep. 2018;8(1):1–21.

  15. Tibshirani R. Regression shrinkage and selection via the lasso. J Roy Stat Soc: Ser B (Methodol). 1996;58(1):267–88.

  16. Hsiao C. An Econometrician’s perspective on Big Data. In: Li T, Pesaran MH, Terrell D, editors. Essays in Honor of Cheng Hsiao. Emerald Publishing Limited; 2020. p. 413–23.

  17. Bresson G. Comments on “An econometrician’s perspective on big data” by Cheng Hsiao. In: Li T, Pesaran MH, Terrell D, editors. Essays in Honor of Cheng Hsiao. Emerald Publishing Limited; 2020. p 431–43.

  18. Schultz BG, Joukhadar Z, Nattala U, Quiroga MDM, Bolk F, Vogel AP. Best practices for supervised machine learning when examining biomarkers in clinical populations. In: Moustafa AA, editor. Big Data in Psychiatry & Neurology. Elsevier; 2021. p. 1–34.

  19. Fédération Interprofessionnelle de la Santé du Québec. Convention collective 2011-2015, article 19.01. 2011.

  20. Hajjem A, Bellavance F, Larocque D. Mixed-effects random forest for clustered data. J Stat Comput Simul. 2014;84(6):1313–28.

  21. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. New York: Springer-Verlag; 2009.

  22. Hugonnet S, Chevrolet J-C, Pittet D. The effect of workload on infection risk in critically ill patients. Crit Care Med. 2007;35(1):76–81.

  23. Firth D. Bias reduction of maximum likelihood estimates. Biometrika. 1993;80(1):27–38.

  24. King G, Zeng L. Logistic regression in rare events data. Polit Anal. 2001;9(2):137–63.

  25. Bradburn MJ, Deeks JJ, Berlin JA, Russell Localio A. Much ado about nothing: a comparison of the performance of meta-analytical methods with rare events. Stat Med. 2007;26(1):53–77.

  26. Hegelich S. Decision trees and random forests: machine learning techniques to classify rare events. European Policy Analysis. 2016;2(1):98–120.

  27. Zhao Y, Wong ZS-Y, Tsui KL. A framework of rebalancing imbalanced healthcare data for rare events’ classification: a case of look-alike sound-alike mix-up incident detection. J Healthc Eng. 2018;2018:1–11.

  28. Fujiwara K, Huang Y, Hori K, Nishioji K, Kobayashi M, Kamaguchi M, Kano M. Over-and under-sampling approach for extremely imbalanced and small minority data problem in health record analysis. Front Public Health. 2020;8(178):1–15.

  29. Wang HY. Logistic regression for massive data with rare events. In: International Conference on Machine Learning. Proceedings of Machine Learning Research. 2020. p. 9829–36.

Author information

Corresponding author

Correspondence to Guy Lacroix.

Ethics declarations

Ethical approval

This is an observational study. The Research Ethics Board of the Centre Universitaire de l’Hôpital de Québec (CHU de Québec) has approved the research that has been conducted for this study.

Conflicts of interest

The authors have no relevant financial or non-financial interests to disclose.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

We are grateful to the participants of the research seminar at the Institut de Science Financière et d’Assurances (Institute of Financial Sciences and Insurance), Lyon, France, for their comments and remarks. The usual disclaimer applies.

Appendix: the GMERT-RI algorithm

The GMERT-RI algorithm of [13] is defined as follows. Recall that for the generalized linear mixed model (GLMM),

  • \(y_{it} \mid u_i\) belongs to the exponential family of distributions.

  • \(\mu _{it}=E \left[ y_{it} \mid u_i \right]\) and \(g \left( \mu _{it} \right) = \eta _{it} = X_{it} \beta + Z_{it} u_{i}\) for some known link function g. In our case, we use the logit link. \(\mu _{it} = \frac{e^{ \eta _{it} }}{1 + e^{ \eta _{it} }}\) and \(g \left( \mu _{it} \right) = \log \left( \frac{\mu _{it} }{1 - \mu _{it} } \right) = \eta _{it}\). So, \(\mu _{i}=E \left[ y_{i} \mid u_i \right]\) and \(g \left( \mu _{i} \right) = \eta _{i} = X_{i} \beta + Z_{i} u_{i}\) with \(u_{i} \sim N \left( 0, \Sigma \right)\).

  • \(Cov \left[ y_{i} \mid u_i \right] =\sigma ^2 v \left( \mu _{i} \right)\) where \(\sigma ^2\) is a dispersion parameter and \(v \left( \mu _{i} \right) = diag \left[ v_{i1}, ..., v_{iT_i} \right] = diag \left[ v \left( \mu _{i1} \right) , ..., v \left( \mu _{iT_i} \right) \right]\) where \(v \left( . \right)\) is a known variance function.

The generalized mixed effects regression tree (GMERT-RI) model proposed by [13] can be written as \(\eta _{i} = f\left( X_{i} \right) + Z_{i} u_{i}\) with \(u_{i} \sim N \left( 0, \Sigma \right)\), where the linear fixed part \(X_{i} \beta\) is replaced by a function \(f\left( X_{i} \right)\) estimated with a standard regression tree model. A first-order Taylor-series expansion yields the linearized response variable \(\widetilde{y}_i = g \left( \mu _{i} \right) + \left( y_i - \mu _i \right) g^{\prime } \left( \mu _{i} \right)\), and the mixed effects regression tree (MERT) pseudo-model is defined as \(\widetilde{y}_i = f\left( X_{i} \right) + Z_{i} u_{i} + e_i\). The GMERT-RI algorithm is essentially the penalized quasi-likelihood (PQL) algorithm used to fit GLMMs, in which the weighted linear mixed effects (LME) pseudo-model is replaced by a weighted MERT pseudo-model, so that the fixed part \(f\left( X_{i} \right)\) is estimated with a standard regression tree. The GMERT-RI algorithm of [13] is the following:

Algorithm 1. GMERT-RI algorithm (see Hajjem et al. [13], p. 115).
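
To make the alternation concrete, here is a minimal R sketch of the iteration, restricted to a random intercept and the logit link. It only illustrates the PQL-with-tree idea and is not the authors' implementation (which is available in the supplementary material of Hajjem et al. [13]); the function name, the fixed number of iterations and the use of a weighted lmer fit for the random-intercept step are our own simplifications.

```r
library(rpart)   # regression tree for the fixed part f(X)
library(lme4)    # weighted linear mixed model for the random intercepts

# Minimal sketch of the GMERT-RI iteration (random intercept, logit link); illustrative only.
gmert_ri <- function(y, X, cluster, n_iter = 20) {
  cluster <- as.character(cluster)
  u_hat   <- setNames(rep(0, length(unique(cluster))), unique(cluster))
  f_hat   <- rep(qlogis(mean(y)), length(y))          # start from the marginal log-odds
  for (k in seq_len(n_iter)) {                        # a convergence check would normally go here
    eta    <- f_hat + u_hat[cluster]
    mu     <- pmin(pmax(plogis(eta), 1e-6), 1 - 1e-6)
    w      <- mu * (1 - mu)                           # PQL working weights
    ytilde <- eta + (y - mu) / w                      # linearized (working) response

    # (i) regression tree for the fixed part, fitted to the working response minus the random part
    dat   <- data.frame(ytilde_adj = ytilde - u_hat[cluster], X)
    tree  <- rpart(ytilde_adj ~ ., data = dat, weights = w, method = "anova")
    f_hat <- predict(tree, dat)

    # (ii) random intercepts from a weighted LME pseudo-model with the tree as an offset
    lmm <- lmer(ytilde ~ 1 + (1 | cluster), weights = w, offset = f_hat,
                data = data.frame(ytilde, cluster, w, f_hat))
    re  <- ranef(lmm)$cluster
    u_hat[rownames(re)] <- fixef(lmm)[["(Intercept)"]] + re[, "(Intercept)"]
  }
  list(tree = tree, u_hat = u_hat)
}
```

Step (i) plays the role of the tree fit in the MERT pseudo-model and step (ii) that of the weighted LME fit; in practice the loop is stopped when the random-effect estimates stabilize rather than after a fixed number of iterations.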

As shown by [13], the GMERT-RI model can be used to predict the response for two categories of new observations: those who belong to a cluster included in the sample used to fit the model and those excluded from the sample. To predict the response for a new observation from the first category, one uses both its corresponding fixed component prediction \(\widehat{f}\left( X_i \right)\) and the predicted random part \(Z_i \widehat{u}_{i}\) corresponding to its cluster. This is a cluster-specific estimate. For the latter category, one can only use its corresponding fixed component prediction (i.e., the random part is set to 0).
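
Continuing the hypothetical sketch above, prediction for the two categories of new observations could look as follows: the random part is added only when the observation's cluster was seen when fitting the model, and is set to zero otherwise.

```r
# Hypothetical companion to the gmert_ri() sketch above.
predict_gmert_ri <- function(fit, X_new, cluster_new) {
  f_new <- predict(fit$tree, data.frame(X_new))               # fixed component from the tree
  known <- as.character(cluster_new) %in% names(fit$u_hat)    # cluster seen in the fitting sample?
  u_new <- ifelse(known, fit$u_hat[as.character(cluster_new)], 0)
  plogis(f_new + u_new)   # cluster-specific probability, or population-level when the cluster is new
}
```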


Cite this article

Beltempo, M., Bresson, G. & Lacroix, G. Using machine learning to predict nosocomial infections and medical accidents in a NICU. Health Technol. 13, 75–87 (2023). https://doi.org/10.1007/s12553-022-00723-1

