Abstract
Stream data are fast, arrive in real time, are unbounded, and change over time. In this paper, we propose a semi-supervised-learning-based ensemble classifier for the recurring concept drift problem in data streams. Our base classifiers use both labeled and unlabeled instances as training points to learn more efficiently from limited data samples. Historical information is kept as one factor in the weight decision when building the ensemble, which helps keep the ensemble at a reasonable size without losing recurring features. The empirical study shows that our approach outperforms the general ensemble model and is suitable for classifying massive recurring stream data.
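The abstract does not give the weighting formula or the base learner, so the following is only a minimal sketch of the general idea under stated assumptions. The nearest-centroid learner with one round of self-training, the names `CentroidClassifier`, `HistoryWeightedEnsemble`, `max_size` and `history_factor`, and the choice of "best past accuracy" as the historical factor are all hypothetical and are not taken from the paper.

```python
from collections import Counter


class CentroidClassifier:
    """Toy base learner (assumption): nearest class centroid, refined by
    pseudo-labeling the unlabeled points once (simple self-training)."""

    def __init__(self, labeled, unlabeled=()):
        # labeled: list of (features, label); unlabeled: list of features
        self.centroids = self._fit(labeled)
        pseudo = [(x, self.predict(x)) for x in unlabeled]
        self.centroids = self._fit(list(labeled) + pseudo)

    @staticmethod
    def _fit(pairs):
        sums, counts = {}, Counter()
        for x, y in pairs:
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[y] += 1
        return {y: [v / counts[y] for v in s] for y, s in sums.items()}

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda y: sq_dist(self.centroids[y]))


def accuracy(clf, labeled):
    return sum(clf.predict(x) == y for x, y in labeled) / len(labeled)


class HistoryWeightedEnsemble:
    """Ensemble whose member weight mixes accuracy on the newest chunk with
    the best accuracy seen so far (hypothetical stand-in for the paper's
    historical factor)."""

    def __init__(self, max_size=5, history_factor=0.5):
        self.max_size = max_size
        self.history_factor = history_factor
        self.members = []  # dicts with keys: clf, recent, history

    def weight(self, m):
        h = self.history_factor
        return (1 - h) * m["recent"] + h * m["history"]

    def update(self, labeled, unlabeled=()):
        # Re-score existing members on the newest labeled chunk.
        for m in self.members:
            m["recent"] = accuracy(m["clf"], labeled)
            m["history"] = max(m["history"], m["recent"])
        # Train a new member on labeled plus unlabeled instances.
        clf = CentroidClassifier(labeled, unlabeled)
        acc = accuracy(clf, labeled)
        self.members.append({"clf": clf, "recent": acc, "history": acc})
        # Keep the ensemble bounded; history protects recurring concepts.
        self.members.sort(key=self.weight, reverse=True)
        del self.members[self.max_size:]

    def predict(self, x):
        votes = Counter()
        for m in self.members:
            votes[m["clf"].predict(x)] += self.weight(m)
        return max(votes, key=votes.get)
```

In this sketch, a member that performs poorly on the current chunk still keeps part of its weight through its history term, so it can survive pruning and contribute again if its concept recurs.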
The Fundamental Research Funds for the Central Universities (NO.2012-II-015)
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, B., Chen, D., Zu, Q., Mao, Y., Pan, Y., Zhang, X. (2014). A New Semi-supervised Learning Based Ensemble Classifier for Recurring Data Stream. In: Zu, Q., Vargas-Vera, M., Hu, B. (eds) Pervasive Computing and the Networked World. ICPCA/SWS 2013. Lecture Notes in Computer Science, vol 8351. Springer, Cham. https://doi.org/10.1007/978-3-319-09265-2_77
DOI: https://doi.org/10.1007/978-3-319-09265-2_77
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09264-5
Online ISBN: 978-3-319-09265-2
eBook Packages: Computer Science, Computer Science (R0)