An Efficient Classification Method of Uncertain Data with Sampling

  • Conference paper
Communications, Signal Processing, and Systems (CSPS 2018)

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 516)

Abstract

Current research on classifying uncertain data focuses mainly on structural changes to classification algorithms. Existing methods have achieved encouraging results; however, they do not strike an effective trade-off between accuracy and running time, and they are not very portable. This paper proposes a new framework that addresses the classification of uncertain data from the data-processing side. The proposed algorithm represents the distribution of the raw data by sampling, which converts the uncertain data into deterministic data. The framework is suitable for any classifier; XGBoost is adopted as the specific classifier in this paper. The experimental results show that the proposed method is an effective way of handling the classification problem for uncertain data.
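The sampling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each uncertain attribute is described by a Gaussian (mean, standard deviation) pair, that every sampled point inherits its object's label, and that a test object is classified by majority vote over its samples. The names `sample_uncertain`, `expand_dataset`, and `predict_uncertain` are hypothetical, and a simple nearest-centroid classifier stands in for XGBoost so the example needs only the standard library.

```python
import random
from collections import defaultdict


def sample_uncertain(obj, k, rng):
    """Draw k deterministic points from one uncertain object.

    obj is a list of (mean, std) pairs, one per attribute.
    The Gaussian form is an assumption made for this sketch.
    """
    return [[rng.gauss(mu, sigma) for mu, sigma in obj] for _ in range(k)]


def expand_dataset(objects, labels, k, rng):
    """Replace every uncertain object with k sampled points sharing its label."""
    X, y = [], []
    for obj, label in zip(objects, labels):
        for point in sample_uncertain(obj, k, rng):
            X.append(point)
            y.append(label)
    return X, y


class NearestCentroid:
    """Stand-in classifier; any classifier for deterministic data could replace it."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for point, label in zip(X, y):
            groups[label].append(point)
        # One centroid per class: the per-attribute mean of its points.
        self.centroids = {
            label: [sum(col) / len(pts) for col in zip(*pts)]
            for label, pts in groups.items()
        }
        return self

    def predict_one(self, point):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(point, centroid))

        return min(self.centroids, key=lambda lab: sq_dist(self.centroids[lab]))


def predict_uncertain(model, obj, k, rng):
    """Classify an uncertain object by majority vote over its k samples."""
    votes = defaultdict(int)
    for point in sample_uncertain(obj, k, rng):
        votes[model.predict_one(point)] += 1
    return max(votes, key=votes.get)


rng = random.Random(0)
# Toy uncertain objects with two attributes each (illustrative values only).
train_objects = [
    [(0.0, 0.5), (0.0, 0.5)],
    [(0.5, 0.5), (-0.5, 0.5)],
    [(5.0, 0.5), (5.0, 0.5)],
    [(4.5, 0.5), (5.5, 0.5)],
]
train_labels = ["a", "a", "b", "b"]

X, y = expand_dataset(train_objects, train_labels, k=20, rng=rng)
model = NearestCentroid().fit(X, y)
prediction = predict_uncertain(model, [(4.8, 0.5), (5.1, 0.5)], k=20, rng=rng)
```

Because the expansion step yields ordinary feature vectors, the classifier is interchangeable, which is the portability the abstract refers to; the number of samples per object, `k`, is the knob that would trade accuracy against running time in such a setup.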

Acknowledgments

This research work is funded by the National Key Research and Development Project of China (2016YFB0801003).

Author information

Corresponding author

Correspondence to Jinchao Huang.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Huang, J., Li, Y., Qi, K., Li, F. (2020). An Efficient Classification Method of Uncertain Data with Sampling. In: Liang, Q., Liu, X., Na, Z., Wang, W., Mu, J., Zhang, B. (eds) Communications, Signal Processing, and Systems. CSPS 2018. Lecture Notes in Electrical Engineering, vol 516. Springer, Singapore. https://doi.org/10.1007/978-981-13-6504-1_43

  • DOI: https://doi.org/10.1007/978-981-13-6504-1_43

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-13-6503-4

  • Online ISBN: 978-981-13-6504-1

  • eBook Packages: Engineering, Engineering (R0)
