Part of the book series: Studies in Big Data (SBD, volume 56)

Abstract

In this book, we studied the problem of data stream mining, which has recently become an important and challenging topic in computer science research. The reason is the enormous growth in the volume of data generated in various areas of human activity. Data streams [1, 2, 3] are potentially infinite in size and often arrive at the system at very high rates. It is therefore not possible to store all the data in memory, and appropriate algorithms should use synopsis structures to compress the information gathered from past data. Data stream mining algorithms should also be fast enough to keep up with the incoming data. Most often they are incremental, i.e. each data element is processed at most once. Alternatively, the data stream can be analyzed in a block-based manner. Another feature of data streams is that the underlying data distribution may change over time, a phenomenon known in the literature as ‘concept drift’ [4, 5]. A good data stream mining method should be able to react to different types of changes. The algorithms studied in this book fall into three groups: those based on decision trees, probabilistic neural networks, and ensemble methods. A separate part of the book was devoted to each group.
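The incremental, bounded-memory processing described above can be illustrated with a minimal Python sketch. It is not one of the algorithms studied in the book: it only shows a constant-size synopsis (a running mean and variance maintained with Welford's update) that is refreshed once per arriving element, so no past data needs to be stored. The names `RunningStats` and `summarize` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Constant-memory synopsis of a numeric stream (Welford's update)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Fold one element into the synopsis; the element itself is not kept."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        """Unbiased sample variance of the elements seen so far."""
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


def summarize(stream) -> RunningStats:
    """Read each element of an iterable stream exactly once."""
    stats = RunningStats()
    for x in stream:
        stats.update(x)
    return stats
```

Because `summarize` accepts any iterable, it also works on a generator that yields elements as they arrive. A synopsis of this kind weights all past elements equally, so on its own it does not react to concept drift; handling drift would require an additional mechanism such as windowing or forgetting.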


References

  1. Gama, J.: Knowledge Discovery from Data Streams, 1st edn. Chapman and Hall/CRC, United Kingdom (2010)

  2. Lemaire, V., Salperwyck, C., Bondu, A.: A survey on supervised classification on data streams. In: European Business Intelligence Summer School, pp. 88–125. Springer, Berlin (2014)

  3. Garofalakis, M., Gehrke, J., Rastogi, R. (eds.): Data Stream Management: Processing High-Speed Data Streams. Data-Centric Systems and Applications. Springer, Cham (2016)

  4. Tsymbal, A.: The problem of concept drift: definitions and related work, Technical report, Department of Computer Science, Trinity College Dublin (2004)

  5. Gama, J., Žliobaitė, I., Bifet, A., Pechenizkiy, M., Bouchachia, A.: A survey on concept drift adaptation. ACM Comput. Surv. 46(4), 44:1–44:37 (2014)

  6. Domingos, P., Hulten, G.: Mining high-speed data streams. In: Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 71–80 (2000)

  7. Rutkowski, L., Pietruczuk, L., Duda, P., Jaworski, M.: Decision trees for mining data streams based on the McDiarmid’s bound. IEEE Trans. Knowl. Data Eng. 25(6), 1272–1279 (2013)

  8. Matuszyk, P., Krempl, G., Spiliopoulou, M.: Correcting the usage of the Hoeffding inequality in stream mining. In: Tucker, A., Höppner, F., Siebes, A., Swift, S. (eds.) Advances in Intelligent Data Analysis XII. Lecture Notes in Computer Science, vol. 8207, pp. 298–309. Springer, Berlin (2013)

  9. De Rosa, R., Cesa-Bianchi, N.: Splitting with confidence in decision trees with application to stream mining. In: 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2015)

  10. Jaworski, M., Duda, P., Rutkowski, L.: New splitting criteria for decision trees in stationary data streams. IEEE Trans. Neural Netw. Learn. Syst. 29, 2516–2529 (2018)

  11. Rutkowski, L., Jaworski, M., Pietruczuk, L., Duda, P.: A new method for data stream mining based on the misclassification error. IEEE Trans. Knowl. Data Eng. 26(5), 1048–1059 (2015)

  12. De Rosa, R., Cesa-Bianchi, N.: Confidence decision trees via online and active learning for streaming data. J. Artif. Intell. Res. 60(60), 1031–1055 (2017)

  13. Rutkowski, L.: Generalized regression neural networks in time-varying environment. IEEE Trans. Neural Netw. 15(3), 576–596 (2004)

  14. Pietruczuk, L., Rutkowski, L., Jaworski, M., Duda, P.: The Parzen kernel approach to learning in non-stationary environment. In: 2014 International Joint Conference on Neural Networks (IJCNN), pp. 3319–3323 (2014)

  15. Duda, P., Jaworski, M., Rutkowski, L.: Knowledge discovery in data streams with the orthogonal series-based generalized regression neural networks. Inf. Sci. 460–461, 497–518 (2017)

  16. Duda, P., Jaworski, M., Rutkowski, L.: Convergent time-varying regression models for data streams: tracking concept drift by the recursive parzen-based generalized regression neural networks. Int. J. Neural Syst. 28(02), 1750048 (2018)

  17. Rutkowski, L.: Adaptive probabilistic neural-networks for pattern classification in time-varying environment. IEEE Trans. Neural Netw. 15(4), 811–827 (2004)

  18. Jaworski, M., Duda, P., Rutkowski, L., Najgebauer, P., Pawlak, M.: Heuristic regression function estimation methods for data streams with concept drift. Lecture Notes in Computer Science, vol. 10246, pp. 726–737 (2017)

  19. Jaworski, M.: Regression function and noise variance tracking methods for data streams with concept drift. Int. J. Appl. Math. Comput. Sci. 28(3), 559–567 (2018)

  20. Pietruczuk, L., Rutkowski, L., Jaworski, M., Duda, P.: A method for automatic adjustment of ensemble size in stream data mining. In: 2016 International Joint Conference on Neural Networks (IJCNN), pp. 9–15 (2016)

  21. Pietruczuk, L., Rutkowski, L., Jaworski, M., Duda, P.: How to adjust an ensemble size in stream data mining? Inf. Sci. 381, 46–54 (2017)

  22. Duda, P., Jaworski, M., Rutkowski, L.: Online GRNN-based ensembles for regression on evolving data streams. In: Huang, T., Lv, J., Sun, C., Tuzikov, A.V. (eds.) Advances in Neural Networks – ISNN 2018, pp. 221–228. Springer International Publishing, Cham (2018)

Author information

Corresponding author

Correspondence to Leszek Rutkowski.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Rutkowski, L., Jaworski, M., Duda, P. (2020). Final Remarks and Challenging Problems. In: Stream Data Mining: Algorithms and Their Probabilistic Properties. Studies in Big Data, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-030-13962-9_15
