The General Procedure of Ensembles Construction in Data Stream Scenarios

  • Leszek Rutkowski
  • Maciej Jaworski
  • Piotr Duda
Part of the Studies in Big Data book series (SBD, volume 56)


When constructing data stream algorithms, three aspects have to be taken into consideration: accuracy, running time, and required memory. In many cases, however, the fastest algorithms are less accurate than methods that require high computational power and more time for data analysis. Therefore, to enhance the performance of algorithms that, in a data stream scenario, must have low memory requirements and short learning times, one can use an ensemble approach. Roughly speaking, a decision made by an ensemble of algorithms can be seen as a decision based on the opinions of a few specialists. In real life nobody is infallible, so to improve the decision-making process people often make a final decision after consulting several different people. A vivid example is the diagnosis of an illness: when someone gets bad news, they often go to other doctors for a second, third, or fourth opinion, and so on, until they are sure about the diagnosis.
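The "few specialists" analogy can be illustrated with a minimal majority-vote sketch. This is not the procedure proposed in the chapter; it only assumes that each ensemble member is a function mapping a feature vector to a label. The name `majority_vote`, the three threshold rules, and the feature values are hypothetical and chosen purely for illustration.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence

# Assumption: an ensemble member is any callable from features to a label.
Classifier = Callable[[Sequence[float]], Hashable]


def majority_vote(members: Sequence[Classifier], x: Sequence[float]) -> Hashable:
    """Return the label predicted by the largest number of ensemble members."""
    votes = Counter(member(x) for member in members)
    # most_common(1) yields [(label, count)]; ties go to the label seen first.
    return votes.most_common(1)[0][0]


# Three hypothetical "specialists", each a cheap threshold rule on
# made-up features (x[0]: temperature, x[1]: some second measurement).
members = [
    lambda x: "ill" if x[0] > 37.5 else "healthy",
    lambda x: "ill" if x[1] > 100 else "healthy",
    lambda x: "ill" if x[0] + x[1] / 10 > 48 else "healthy",
]

prediction = majority_vote(members, [38.2, 90])
```

Each member here is fast and uses almost no memory, which matches the stream constraints named above. The vote combines them so that a single member's mistake does not decide the outcome.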



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Leszek Rutkowski (1, 2)
  • Maciej Jaworski (1)
  • Piotr Duda (1)

  1. Institute of Computational Intelligence, Czestochowa University of Technology, Częstochowa, Poland
  2. Information Technology Institute, University of Social Sciences, Lodz, Poland
