Abstract
Current research on the classification of uncertain data focuses mainly on structural changes to classification algorithms. Existing methods have achieved encouraging results; however, they do not strike an effective trade-off between accuracy and running time, and they have poor portability. This paper proposes a new framework that addresses the classification of uncertain data from the data-processing perspective. The proposed algorithm represents the distribution of the raw data by a sampling method, so that the uncertain data are converted into deterministic data. The framework is suitable for any classifier; XGBoost is adopted as the specific classifier in this paper. The experimental results show that the proposed method is an effective way of handling the classification problem for uncertain data.
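The sampling idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the sampling scheme, so everything here is an assumption: each uncertain feature is modelled as an independent Gaussian given by a (mean, standard deviation) pair, each record is replaced by a fixed number of sampled deterministic rows, and a small nearest-centroid classifier stands in for XGBoost so that the example needs only the standard library. The names `sample_uncertain_dataset`, `fit_centroids`, and `predict` are hypothetical and do not come from the paper.

```python
import random


def sample_uncertain_dataset(records, labels, n_samples=5, seed=0):
    """Expand uncertain records into deterministic rows by sampling.

    Each record is a list of (mean, std) pairs, one per feature.
    The Gaussian model is an assumption made for this sketch only.
    """
    rng = random.Random(seed)
    rows, row_labels = [], []
    for record, label in zip(records, labels):
        for _ in range(n_samples):
            # Draw one concrete value per feature from its distribution.
            rows.append([rng.gauss(mu, sigma) for mu, sigma in record])
            row_labels.append(label)
    return rows, row_labels


def fit_centroids(rows, row_labels):
    """Stand-in classifier: compute the mean vector of each class."""
    sums, counts = {}, {}
    for row, label in zip(rows, row_labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for i, value in enumerate(row):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


def predict(centroids, row):
    """Return the label whose centroid is closest to the row."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(centroids[label], row))
    return min(centroids, key=squared_distance)


# Toy uncertain data set: two features, two well-separated classes.
records = [
    [(0.0, 0.1), (0.0, 0.1)],
    [(0.2, 0.1), (-0.1, 0.1)],
    [(5.0, 0.1), (5.0, 0.1)],
    [(4.8, 0.1), (5.1, 0.1)],
]
labels = [0, 0, 1, 1]

rows, row_labels = sample_uncertain_dataset(records, labels, n_samples=20, seed=42)
centroids = fit_centroids(rows, row_labels)
```

Because the sampled rows are ordinary deterministic feature vectors, any off-the-shelf classifier could be trained on `rows` and `row_labels` in place of the nearest-centroid stand-in; that is the portability argument the abstract makes.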
Acknowledgments
This research work is funded by the National Key Research and Development Project of China (2016YFB0801003).
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Huang, J., Li, Y., Qi, K., Li, F. (2020). An Efficient Classification Method of Uncertain Data with Sampling. In: Liang, Q., Liu, X., Na, Z., Wang, W., Mu, J., Zhang, B. (eds) Communications, Signal Processing, and Systems. CSPS 2018. Lecture Notes in Electrical Engineering, vol 516. Springer, Singapore. https://doi.org/10.1007/978-981-13-6504-1_43
DOI: https://doi.org/10.1007/978-981-13-6504-1_43
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6503-4
Online ISBN: 978-981-13-6504-1
eBook Packages: Engineering, Engineering (R0)