Introduction and Overview of the Main Results of the Book

Stream Data Mining: Algorithms and Their Probabilistic Properties

Part of the book series: Studies in Big Data ((SBD,volume 56))

Abstract

Recent decades have seen an exponential increase in the amount of available digital data, generated in various areas of human activity. This growth is much faster than the growth in available processing capabilities. Apart from their large volume, the data produced by modern sources are often dynamic and generated at very high rates. Designing new data mining algorithms able to deal with the streaming nature of such data is therefore a major challenge. Data stream mining has become a very important domain of computer science and finds applications in many areas, e.g. engineering and industrial processes, robotics, sensor networks, social networks, spam filtering, and credit card transaction flows. In this book we present a unique approach to data stream mining problems, putting emphasis on the theoretical background of the considered algorithms. In contrast to the vast majority of heuristic methods previously presented in the literature, this book focuses on algorithms that are mathematically justified. It should be noted, however, that heuristic solutions cannot be abandoned completely, since they often lead to satisfactory practical results. Therefore, the mathematically justified algorithms presented in this book are sometimes slightly tuned and modified in a heuristic way to increase their final accuracy.

Author information

Corresponding author

Correspondence to Leszek Rutkowski.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Rutkowski, L., Jaworski, M., Duda, P. (2020). Introduction and Overview of the Main Results of the Book. In: Stream Data Mining: Algorithms and Their Probabilistic Properties. Studies in Big Data, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-030-13962-9_1
