Abstract
When constructing data stream algorithms, three aspects have to be taken into consideration: accuracy, running time, and required memory. In many cases, however, the fastest algorithms are less accurate than methods that require more computational power and more time for data analysis. Therefore, to enhance the performance of algorithms that, in the data stream scenario, must be characterized by low memory requirements and short learning time, one can use an ensemble approach. Roughly speaking, the decision made by an ensemble of algorithms can be seen as a decision based on the opinions of a few specialists. In real life nobody is infallible, so to improve the decision-making process people often make a final decision after consulting several different people. A vivid example is the diagnosis of an illness. When someone gets bad news, they often go to other doctors for a second, third, or fourth opinion, and so on, until they are sure about the diagnosis.
© 2020 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Rutkowski, L., Jaworski, M., Duda, P. (2020). The General Procedure of Ensembles Construction in Data Stream Scenarios. In: Stream Data Mining: Algorithms and Their Probabilistic Properties. Studies in Big Data, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-030-13962-9_12
DOI: https://doi.org/10.1007/978-3-030-13962-9_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-13961-2
Online ISBN: 978-3-030-13962-9
eBook Packages: Intelligent Technologies and Robotics (R0)