A Comparative Study on Machine Classification Model in Lung Cancer Cases Analysis
Because machine classification models perform differently on medical data, this paper applies several classification methods to lung cancer data collected from a hospital information system (HIS). Using the R language, the data are analyzed with the decision tree, Bagging, Adaboost, SVM, KNN, and neural network algorithms in order to explore the advantages and disadvantages of each. The results indicate that, on this lung cancer data, the Adaboost and neural network algorithms achieve relatively high accuracy and good diagnostic performance.
Keywords: Machine classification model; Cross validation; Adaboost algorithm; Neural network
1. Major Scientific Research Project in Higher School in Hebei Province (Grant No. ZD201310.-85), 2. Funding Project of Science and Technology Research and Development in Hebei North University (Grant No. ZD201301), 3. Major Projects in Hebei Food and Drug Administration (Grant No. ZD2015017), 4. Major Funding Project in Hebei Health Department (Grant No. ZL20140127), 5. Youth Funding Science and Technology Projects in Hebei Higher School (Grant No. QN2016192), with Hebei Province Population Health Information Engineering Technology Research Center.