Introduction and Overview of the Main Results of the Book

  • Leszek Rutkowski (email author)
  • Maciej Jaworski
  • Piotr Duda
Chapter
Part of the Studies in Big Data book series (SBD, volume 56)

Abstract

In recent decades we have observed an exponential increase in the amount of available digital data generated in various areas of human activity. This growth is much faster than the growth in available processing capabilities. Apart from their large volume, the data produced by modern sources are often dynamic and generated at very high rates. Designing new data mining algorithms able to deal with the streaming nature of such data is therefore a major challenge. Data stream mining has become a very important domain of computer science and finds applications in many areas, e.g. engineering and industrial processes, robotics, sensor networks, social networks, spam filtering, and credit card transaction flows. In this book we present a unique approach to data stream mining problems, with emphasis on the theoretical foundations of the considered algorithms. In contrast to the vast majority of heuristic methods previously presented in the literature, this book focuses on algorithms that are mathematically justified. It should be noted, however, that heuristic solutions cannot be completely abandoned, since they often lead to satisfactory practical results. Therefore, the mathematically justified algorithms presented in this book are sometimes slightly tuned and modified in a heuristic way to increase their final accuracy.

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Leszek Rutkowski (1, 2), email author
  • Maciej Jaworski (1)
  • Piotr Duda (1)

  1. Institute of Computational Intelligence, Czestochowa University of Technology, Częstochowa, Poland
  2. Information Technology Institute, University of Social Sciences, Lodz, Poland
