An Efficient Classification Method of Uncertain Data with Sampling

Huang, Jinchao; Li, Yulin; Qi, Kaiyue; Li, Fangqi

doi:10.1007/978-981-13-6504-1_43

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 516)

Included in the following conference series:

International Conference in Communications, Signal Processing, and Systems

Abstract

Current research on classifying uncertain data focuses mainly on structural changes to the classification algorithms. Existing methods have achieved encouraging results; however, they do not strike an effective trade-off between accuracy and running time, and they are not readily portable. This paper proposes a new framework that addresses the classification of uncertain data from the data-processing side. The proposed algorithm represents the distribution of the raw data by sampling, so that the uncertain data are converted into deterministic data. The framework is suitable for all classifiers; XGBoost is adopted as the specific classifier in this paper. The experimental results show that the proposed method is an effective way of handling the classification problem for uncertain data.
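
The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each uncertain attribute is described by a Gaussian (mean, standard deviation) pair, and it uses a simple nearest-centroid classifier as a stand-in for XGBoost so that the example needs only the Python standard library. All function names and the toy records are hypothetical.

```python
import random
from typing import Dict, List, Tuple

# An uncertain record: one (mean, std) pair per attribute, plus a class label.
# The Gaussian form is an assumption of this sketch, not taken from the paper.
UncertainRecord = Tuple[List[Tuple[float, float]], int]
Row = Tuple[List[float], int]


def sample_record(record: UncertainRecord, n_samples: int,
                  rng: random.Random) -> List[Row]:
    """Draw n_samples deterministic points from one uncertain record."""
    attrs, label = record
    return [([rng.gauss(mu, sigma) for mu, sigma in attrs], label)
            for _ in range(n_samples)]


def to_deterministic(records: List[UncertainRecord], n_samples: int,
                     seed: int = 0) -> List[Row]:
    """Convert uncertain records into an ordinary labeled training set."""
    rng = random.Random(seed)
    rows: List[Row] = []
    for record in records:
        rows.extend(sample_record(record, n_samples, rng))
    return rows


def fit_centroids(rows: List[Row]) -> Dict[int, List[float]]:
    """Stand-in classifier: compute the mean point of each class."""
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for x, y in rows:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, value in enumerate(x):
            acc[i] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids: Dict[int, List[float]], x: List[float]) -> int:
    """Return the label whose centroid is closest to x (squared distance)."""
    return min(centroids,
               key=lambda y: sum((a - b) ** 2 for a, b in zip(centroids[y], x)))


# Toy uncertain data set (hypothetical values): two well-separated classes.
records: List[UncertainRecord] = [
    ([(0.0, 0.5), (0.0, 0.5)], 0),
    ([(0.5, 0.5), (-0.5, 0.5)], 0),
    ([(5.0, 0.5), (5.0, 0.5)], 1),
    ([(4.5, 0.5), (5.5, 0.5)], 1),
]
rows = to_deterministic(records, n_samples=50, seed=42)
centroids = fit_centroids(rows)
```

Because the sampled rows form an ordinary labeled data set, any classifier that accepts deterministic feature vectors could replace the nearest-centroid step, which is the portability the abstract refers to.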

Acknowledgments

This research work is funded by the National Key Research and Development Project of China (2016YFB0801003).

Author information

Authors and Affiliations

School of Cyber Security, Shanghai Jiao Tong University, 800 Dong Chuan Road, Shanghai, 200240, China
Jinchao Huang
School of Computer Engineering, University of Illinois at Urbana-Champaign, 508 E University Ave., Champaign, IL, 61820, USA
Yulin Li
School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, 800 Dong Chuan Road, Shanghai, 200240, China
Kaiyue Qi & Fangqi Li

Corresponding author

Correspondence to Jinchao Huang.

Editor information

Editors and Affiliations

Department of Electrical Engineering, University of Texas at Arlington, Arlington, TX, USA
Qilian Liang
School of Information and Communication Engineering, Dalian University of Technology, Dalian, China
Xin Liu
School of Information Science and Technology, Dalian Maritime University, Dalian, China
Zhenyu Na
College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin, China
Wei Wang
College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin, China
Jiasong Mu
College of Electronic and Communication Engineering, Tianjin Normal University, Tianjin, China
Baoju Zhang

About this paper

Cite this paper

Huang, J., Li, Y., Qi, K., Li, F. (2020). An Efficient Classification Method of Uncertain Data with Sampling. In: Liang, Q., Liu, X., Na, Z., Wang, W., Mu, J., Zhang, B. (eds) Communications, Signal Processing, and Systems. CSPS 2018. Lecture Notes in Electrical Engineering, vol 516. Springer, Singapore. https://doi.org/10.1007/978-981-13-6504-1_43

DOI: https://doi.org/10.1007/978-981-13-6504-1_43
Published: 14 August 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-6503-4
Online ISBN: 978-981-13-6504-1
eBook Packages: Engineering, Engineering (R0)
