Effective large for gestational age prediction using machine learning techniques with monitoring biochemical indicators
A newborn with a birth weight above the 90th percentile for the same gestational age is termed large for gestational age. Large for gestational age infants can suffer serious complications during and after the antepartum period when the condition is not identified early. Earlier recognition of large for gestational age infants could slow progression and prevent further complications. In medical science, prevention and mitigation of disease require examination of biochemical indicators. Machine learning has emerged as a tool to predict large for gestational age infants from the most deterministic characteristics. This study aims to identify the most deterministic biochemical indicators for large for gestational age prediction with minimal computational overhead. To the best of our knowledge, this is the first study to identify the most deterministic risk factors associated with large for gestational age and to develop a large for gestational age prediction model using machine learning techniques. To develop an efficient prediction model, we conducted three groups of experiments that considered basic machine learning methods, feature selection, and imbalanced data, respectively. Support vector machine, logistic regression, Naive Bayes and Random Forest were trained using tenfold cross-validation on a large for gestational age dataset. We selected precision and area under the curve as performance evaluation metrics. Information gain, an entropy-based feature selection method, was adopted to rank features. We introduced an ensemble data imbalance technique in the last group of experiments. In each group of experiments, support vector machine performed best among the machine learning classifiers, producing the highest prediction precision score of 85%. All of the classifiers performed best with the subset of thirty top-ranked features, which validates the applied method for recognizing the most deterministic risk factors associated with large for gestational age prediction.
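The entropy-based ranking mentioned in the abstract can be illustrated with a minimal sketch. This is not the study's implementation: the feature names (`indicator_a`, `indicator_b`), the toy labels, and the assumption that each indicator has already been discretized into categories are all hypothetical. Continuous biochemical measurements would need binning first.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(feature_values, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return (feature name, information gain) pairs, highest gain first."""
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: 1 = large for gestational age, 0 = not.
labels = [1, 1, 0, 0, 1, 0, 0, 0]
columns = {
    # Assumed discretized indicator that separates the classes perfectly.
    "indicator_a": ["high", "high", "low", "low", "high", "low", "low", "low"],
    # Assumed indicator that carries little information about the label.
    "indicator_b": ["x", "y", "x", "y", "x", "y", "x", "y"],
}

ranking = rank_features(columns, labels)
for name, gain in ranking:
    print(f"{name}: {gain:.4f}")
```

Keeping only the top-ranked entries of `ranking` corresponds to choosing a ranked feature subset, such as the thirty-feature subset the abstract reports; the classifiers would then be trained on those columns alone.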
Keywords: Large for gestational age; Feature selection; Machine learning; Risk factors; Prediction model; Data imbalance; Ensemble technique
This work is supported by the National Key Research and Development Program of China (project No. 2017YFB1400803).