A New Semi-supervised Learning Based Ensemble Classifier for Recurring Data Stream

  • Conference paper
Pervasive Computing and the Networked World (ICPCA/SWS 2013)

Part of the book series: Lecture Notes in Computer Science (LNCCN, volume 8351)

Abstract

Stream data are fast, real-time, unbounded, and change over time. In this paper, we propose a semi-supervised learning based ensemble classifier for the recurring concept drift problem in data streams. Our base classifiers use both labeled and unlabeled instances as training points, so that they learn more efficiently from limited data samples. Historical information is kept as part of the weight decision factor when building the ensemble, which helps keep the size of the ensemble within a reasonable range without losing recurring features. The empirical study shows that our approach outperforms the general ensemble model and is suitable for classifying massive recurring stream data.
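The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a nearest-centroid base learner with one round of self-training on the unlabeled points, and a member weight that blends past weight with accuracy on the newest labeled chunk. The names `HistoryWeightedEnsemble`, `alpha`, and `max_size` are hypothetical and do not come from the paper.

```python
import math
from collections import defaultdict


def centroids(points, labels):
    """Return the mean vector of each class."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(points, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def nearest(cents, x):
    """Return the label of the closest centroid (Euclidean distance)."""
    return min(cents, key=lambda y: math.dist(cents[y], x))


def train_semi_supervised(labeled, labels, unlabeled):
    """Assumed base learner: pseudo-label the unlabeled points, then refit on all points."""
    cents = centroids(labeled, labels)
    pseudo = [nearest(cents, x) for x in unlabeled]
    return centroids(labeled + unlabeled, labels + pseudo)


def accuracy(cents, points, labels):
    """Fraction of labeled points classified correctly (expects a non-empty chunk)."""
    hits = sum(nearest(cents, x) == y for x, y in zip(points, labels))
    return hits / len(points)


class HistoryWeightedEnsemble:
    """Hypothetical ensemble whose member weights remember past performance."""

    def __init__(self, max_size=3, alpha=0.5):
        self.max_size = max_size  # upper bound on the number of members
        self.alpha = alpha        # share of the old weight kept at each update
        self.members = []         # list of [model, weight] pairs

    def update(self, labeled, labels, unlabeled):
        # Blend each member's historical weight with its accuracy on the new chunk.
        for member in self.members:
            recent = accuracy(member[0], labeled, labels)
            member[1] = self.alpha * member[1] + (1 - self.alpha) * recent
        # Train a new member on the current chunk and add it to the ensemble.
        model = train_semi_supervised(labeled, labels, unlabeled)
        self.members.append([model, accuracy(model, labeled, labels)])
        # Keep only the highest-weighted members so the ensemble stays bounded.
        self.members.sort(key=lambda m: m[1], reverse=True)
        del self.members[self.max_size:]

    def predict(self, x):
        """Weighted vote over all members."""
        votes = defaultdict(float)
        for model, weight in self.members:
            votes[nearest(model, x)] += weight
        return max(votes, key=votes.get)


# Usage: concept A, then concept B (labels swapped), then concept A again.
points = [[0.0, 0.0], [4.0, 4.0]]
extra = [[0.5, 0.5], [3.5, 3.5]]  # unlabeled points from the same chunk
ens = HistoryWeightedEnsemble(max_size=3, alpha=0.5)
history = []
for chunk_labels in ([0, 1], [1, 0], [0, 1]):
    ens.update(points, chunk_labels, extra)
    history.append(ens.predict([0.2, 0.2]))
print(history)
```

In this toy run the member trained on the first concept is down-weighted when the second concept arrives but is not discarded, so it contributes again when the first concept recurs. That is the kind of behavior the abstract attributes to keeping historical information in the weights.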



Supported by the Fundamental Research Funds for the Central Universities (No. 2012-II-015).


Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, B., Chen, D., Zu, Q., Mao, Y., Pan, Y., Zhang, X. (2014). A New Semi-supervised Learning Based Ensemble Classifier for Recurring Data Stream. In: Zu, Q., Vargas-Vera, M., Hu, B. (eds) Pervasive Computing and the Networked World. ICPCA/SWS 2013. Lecture Notes in Computer Science, vol 8351. Springer, Cham. https://doi.org/10.1007/978-3-319-09265-2_77

  • DOI: https://doi.org/10.1007/978-3-319-09265-2_77

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09264-5

  • Online ISBN: 978-3-319-09265-2

  • eBook Packages: Computer Science, Computer Science (R0)
