A metabolomics-based approach for non-invasive diagnosis of chromosomal anomalies
Chromosomal anomalies (CA) are the most frequent fetal anomalies.
To evaluate the diagnostic performance of a machine learning ensemble model based on the maternal serum metabolomic fingerprint of fetal aneuploidies during the second trimester.
This is a case-control pilot study. Metabolomic profiles were obtained from the serum of 328 mothers (220 controls and 108 cases) using gas chromatography coupled to mass spectrometry. Eight machine learning classification models were built and optimized, and an ensemble model was built from them using a voting scheme. All samples were randomly divided into two sets: one was used as the training set, the other for diagnostic performance assessment.
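The random split and the voting scheme described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the split ratio, the type of vote, or how ties are resolved, so the 50/50 split, the simple majority vote, the tie rule, and the toy threshold "models" are all assumptions introduced here.

```python
import random
from collections import Counter


def split_samples(samples, train_fraction=0.5, seed=0):
    """Randomly divide samples into a training set and a held-out test set.

    train_fraction=0.5 is an assumption; the source only says "two sets".
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def majority_vote(predictions):
    """Return the label predicted by most models.

    On a tie, the label that appears first in `predictions` wins
    (an assumption; the source does not describe tie handling).
    """
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(models, sample):
    """Ask every model for a label and combine the labels by majority vote."""
    return majority_vote([model(sample) for model in models])


# Hypothetical stand-ins for trained classifiers: each one thresholds a
# single made-up feature value. Real models would be fitted on the training set.
models = [
    (lambda x, t=t: "case" if x > t else "control")
    for t in (0.2, 0.4, 0.6)
]

train_ids, test_ids = split_samples(range(328))
print(len(train_ids), len(test_ids))
print(ensemble_predict(models, 0.5))
```

With an even number of models, such as the eight used in the study, a plain majority vote can tie, so a real implementation would need an explicit tie rule or a weighted or probability-based vote.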
The ensemble machine learning model correctly classified all cases and controls. The accuracy was the same for trisomy 21 and trisomy 18, and the other CA were also correctly detected. Elaidic, stearic, linolenic, myristic, benzoic, citric and glyceric acids, mannose, 2-hydroxy butyrate, phenylalanine, proline, alanine and 3-methyl histidine were selected as the most relevant metabolites for class separation.
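For readers unfamiliar with how diagnostic performance on a held-out set is usually summarized, the sketch below computes accuracy, sensitivity, and specificity from true and predicted labels. The function name and the label lists are hypothetical and chosen only for illustration; they are not data from the study. The code assumes both classes are present in the true labels.

```python
def diagnostic_performance(true_labels, predicted_labels, positive="case"):
    """Compute accuracy, sensitivity, and specificity for a binary test.

    Assumes at least one positive and one negative true label;
    otherwise a division by zero occurs.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # fraction of true cases detected
        "specificity": tn / (tn + fp),  # fraction of controls correctly cleared
    }


# Illustrative labels only (not study data).
truth = ["case", "case", "control", "control", "control"]

# A classifier that labels every sample correctly scores 1.0 on all three measures.
perfect = diagnostic_performance(truth, truth)

# A classifier with one missed case and one false alarm.
imperfect = diagnostic_performance(
    truth, ["case", "control", "control", "control", "case"]
)
print(perfect)
print(imperfect)
```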
The proposed model, based on the maternal serum metabolomic fingerprint of fetal aneuploidies during the second trimester, correctly identified all cases of chromosomal abnormalities. Overall, this preliminary analysis suggests a metabolic environment conducive to increased oxidative stress and a disturbance in fetal central nervous system development. Maternal serum metabolomics can be a promising tool in screening for chromosomal defects. Moreover, metabolomics allows us to extend our knowledge of the biochemical alterations that are caused by aneuploidies and are responsible for the observed phenotypes.
Keywords: Chromosomal abnormalities; Gas chromatography mass spectrometry; Machine learning; Metabolomics; Screening test
The authors thank Prof. Giuseppe Castaldo of the Department of Biochemistry and Medical Biotechnology, University of Naples Federico II, Naples (Italy), for his kind intellectual support and for revising the manuscript.
This research was funded by UNISA/FARB 2016 (awarded to MG).
Compliance with ethical standards
Conflict of interest
J. Troisi, G. Scala, and M. Guida hold an Italian patent for the diagnostic test described in the manuscript (Patent No. 0001423755/2016) and have applied for a PCT extension. All the other authors have no conflict of interest.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. The study was approved by the ethics committee CEI OORR San Giovanni di Dio e Ruggi D’Aragona (IRB No. 4/2013) and a written consent form was signed by each participant.