A Research and Application Based on Gradient Boosting Decision Tree
Hand, foot, and mouth disease (HFMD) is an infectious intestinal disease that damages people’s health; severe cases can lead to cardiorespiratory failure or death.
Identifying severe cases of HFMD is therefore important. This paper proposes a real-time, automatic, and efficient prediction system for severe HFMD identification, based on multi-source data (structured and unstructured) and a gradient boosting decision tree (GBDT). A missing-data imputation method based on the GBDT model is also proposed.
Experimental results show that our model identifies severe HFMD with an area under the ROC curve (AUC) of 0.94, which is 17% better than that of PCIS.
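The abstract does not spell out how the GBDT-based imputation works, so the following is only a minimal sketch of the general idea, under stated assumptions: a gradient-boosted ensemble of depth-1 regression trees (stumps) with squared loss is trained on the records where a value is observed, and then predicts that value where it is missing. The data are synthetic, the other columns are assumed to be fully observed, and all names (`impute_column`, `fit_gbdt`, the `lab` column) are hypothetical rather than taken from the paper.

```python
import random

def fit_stump(X, residuals):
    """Find the single-feature threshold split that minimises squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            lmean, rmean = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    if best is None:  # no valid split exists: fall back to a constant prediction
        mean = sum(residuals) / len(residuals)
        return (0, float("inf"), mean, mean)
    return best[1:]

def predict_stump(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value

def fit_gbdt(X, y, n_rounds=40, learning_rate=0.1):
    """Gradient boosting with squared loss: each stump fits the current residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss the negative gradient is simply the residual.
        residuals = [t - p for t, p in zip(y, preds)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, row)
                 for p, row in zip(preds, X)]
    return base, learning_rate, stumps

def predict_gbdt(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)

def impute_column(rows, target_col):
    """Fill None entries in target_col with predictions from the other columns.

    Assumption: every column other than target_col is fully observed.
    """
    feature_cols = [c for c in range(len(rows[0])) if c != target_col]
    observed = [r for r in rows if r[target_col] is not None]
    model = fit_gbdt([[r[c] for c in feature_cols] for r in observed],
                     [r[target_col] for r in observed])
    filled = []
    for r in rows:
        r = list(r)
        if r[target_col] is None:
            r[target_col] = predict_gbdt(model, [r[c] for c in feature_cols])
        filled.append(r)
    return filled

# Synthetic demo: a hypothetical lab value that depends on feature x0.
random.seed(0)
truth, rows = [], []
for _ in range(120):
    x0, x1 = random.uniform(0, 10), random.uniform(0, 10)
    lab = (5.0 if x0 > 5 else 1.0) + random.gauss(0, 0.1)
    truth.append(lab)
    rows.append([x0, x1, None if random.random() < 0.2 else lab])

filled = impute_column(rows, target_col=2)
missing = [i for i, r in enumerate(rows) if r[2] is None]
observed_mean = (sum(r[2] for r in rows if r[2] is not None)
                 / (len(rows) - len(missing)))
gbdt_mae = sum(abs(filled[i][2] - truth[i]) for i in missing) / len(missing)
mean_mae = sum(abs(observed_mean - truth[i]) for i in missing) / len(missing)
print("GBDT imputation MAE:", gbdt_mae)
print("Mean imputation MAE:", mean_mae)
```

The comparison against mean imputation is included only to show why a model-based fill can help when the missing value depends on other fields. A practical system would more likely use deeper trees from an established library and handle several incomplete columns at once.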
Keywords: Severe HFMD, Disease identification, Missing data, Machine learning
We would like to thank Guangzhou Women and Children Medical Center for providing clinical data for this research.
This research is supported by the National Natural Science Foundation of China (NSFC), grant No. 61471176; the Pearl River Nova Program of Guangzhou, grant No. 201610010199; the Science Foundation for Excellent Youth Scholars of Guangdong Province, grant No. YQ2015046; the Science and Technology Planning Project of Guangdong Province, grant Nos. 2017A010101015, 2017B030308009, and 2017KZ010101; the Special Project for Youth Top-notch Scholars of Guangdong Province, grant No. 2016TQ03X100; and the Joint Foundation of BLUEDON Information Security Technologies Co., grant Nos. LD20170204 and LD20170207.