Probabilistic Neural Networks for the Streaming Data Classification

Stream Data Mining: Algorithms and Their Probabilistic Properties

Part of the book series: Studies in Big Data (SBD, volume 56)

Abstract

Most of the data stream mining algorithms proposed in the literature so far are devoted to the classification task [1,2,3]. Although many methods exist for classifying static datasets, they can hardly be adapted to data streams. This is due to the characteristic features of data streams: a potentially infinite volume, a high rate of data arrival, and the occurrence of concept drift.

References

  1. Aggarwal, C.: Data Streams: Models and Algorithms. Springer, New York (2007)

  2. Gama, J.: A survey on learning from data streams: current and future trends. Prog. Artif. Intell. 1(1), 45–55 (2012)

  3. Bifet, A., Gavalda, R., Holmes, G., Pfahringer, B.: Machine Learning for Data Streams with Practical Examples in MOA. MIT Press, Cambridge, MA, USA (2018)

  4. Aha, D.W., Kibler, D., Albert, M.K.: Instance-based learning algorithms. Mach. Learn. 6(1), 37–66 (1991)

  5. Law, Y.-N., Zaniolo, C.: An adaptive nearest neighbor classification algorithm for data streams. Lect. Notes Comput. Sci. 3721, 108–120 (2005)

  6. Aggarwal, C., Han, J., Wang, J., Yu, P.S.: On demand classification of data streams. In: Proceedings of the 10th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 503–508 (2004)

  7. Ramírez-Gallego, S., Krawczyk, B., García, S., Woźniak, M., Benítez, J.M., Herrera, F.: Nearest neighbor classification for high-speed big data streams using Spark. IEEE Trans. Syst. Man Cybernet. Syst. 47, 2727–2739 (2017)

  8. Yuan, J., Wang, Z., Sun, Y., Zhang, W., Jiang, J.: An effective pattern-based Bayesian classifier for evolving data stream. Neurocomputing 295, 17–28 (2018)

  9. Krawczyk, B., Wozniak, M.: Weighted naive Bayes classifier with forgetting for drifting data streams. In: 2015 IEEE International Conference on Systems, Man, and Cybernetics, Oct 2015, pp. 2147–2152 (2015)

  10. Gama, J.: Accurate decision trees for mining high-speed data streams. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 523–528. ACM Press (2003)

  11. Kirkby, R.: Improving Hoeffding Trees. Ph.D. thesis, University of Waikato (2007)

  12. Bifet, A., Kirkby, R.: Data stream mining: a practical approach. Tech. Rep., The University of Waikato (2009)

  13. Bouckaert, R.R.: Voting massive collections of Bayesian network classifiers for data streams. In: Australian Conference on Artificial Intelligence, Sattar, A., Kang, B.H. (eds.), vol. 4304 of Lecture Notes in Computer Science, pp. 243–252. Springer (2006)

  14. Ratnapinda, P., Druzdzel, M.J.: Learning discrete Bayesian network parameters from continuous data streams: what is the best strategy? J. Appl. Logic 13(4), Part 2, 628–642 (2015)

  15. Leite, D., Costa, P., Gomide, F.: Evolving granular neural network for semi-supervised data stream classification. In: Proceedings of the International Joint Conference on Neural Networks, pp. 1–8. IEEE (2010)

  16. Leite, D., Costa, P., Gomide, F.: Evolving granular neural networks from fuzzy data streams. Neural Netw. 38, 1–16 (2013)

  17. Bodyanskiy, Y., Vynokurova, O., Pliss, I., Setlak, G., Mulesa, P.: Fast learning algorithm for deep evolving GMDH-SVM neural network in data stream mining tasks. In: 2016 IEEE First International Conference on Data Stream Mining Processing (DSMP), Aug 2016, pp. 257–262 (2016)

  18. Read, J., Perez-Cruz, F., Bifet, A.: Deep learning in partially-labeled data streams. In: Proceedings of the 30th Annual ACM Symposium on Applied Computing, SAC ’15, New York, NY, USA, pp. 954–959. ACM (2015)

  19. Ororbia II, A.G., Lee Giles, C., Reitter, D.: Online semi-supervised learning with deep hybrid Boltzmann machines and denoising autoencoders. CoRR abs/1511.06964 (2015)

  20. Pratama, M., Angelov, P.P., Lu, J., Lughofer, E., Seera, M., Lim, C.P.: A randomized neural network for data streams. In: 2017 International Joint Conference on Neural Networks (IJCNN), pp. 3423–3430 (2017)

  21. Domingos, P., Hulten, G.: Mining high-speed data streams. In: Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 71–80 (2000)

  22. Rutkowski, L., Pietruczuk, L., Duda, P., Jaworski, M.: Decision trees for mining data streams based on the McDiarmid’s bound. IEEE Trans. Knowl. Data Eng. 25(6), 1272–1279 (2013)

  23. Matuszyk, P., Krempl, G., Spiliopoulou, M.: Correcting the usage of the Hoeffding inequality in stream mining. In: A. Tucker, F. Höppner, A. Siebes, S. Swift (eds.) Advances in Intelligent Data Analysis XII, vol. 8207 Lecture Notes in Computer Science, pp. 298–309. Springer, Berlin, Heidelberg (2013)

  24. Rutkowski, L., Jaworski, M., Pietruczuk, L., Duda, P.: A new method for data stream mining based on the misclassification error. IEEE Trans. Neural Netw. Learn. Syst. 26(5), 1048–1059 (2015)

  25. Bifet, A.: Adaptive Stream Mining: Pattern Learning and Mining from Evolving Data Streams. IOS Press (2010)

  26. Bifet, A., Zhang, J., Fan, W., He, C., Zhang, J., Qian, J., Holmes, G., Pfahringer, B.: Extremely fast decision tree mining for evolving data streams. In: Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’17, New York, NY, USA, pp. 1733–1742. ACM (2017)

  27. Hulten, G., Spencer, L., Domingos, P.: Mining time-changing data streams. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 97–106 (2001)

  28. Rutkowski, L., Jaworski, M., Pietruczuk, L., Duda, P.: Decision trees for mining data streams based on the Gaussian approximation. IEEE Trans. Knowl. Data Eng. 26(1), 108–119 (2014)

  29. Rutkowski, L., Jaworski, M., Pietruczuk, L., Duda, P.: The CART decision tree for mining data streams. Inf. Sci. 266, 1–15 (2014)

  30. Vinayagasundaram, B., Aarthi, R.J., Saranya, P.A.: Efficient Gaussian decision tree method for concept drift data stream. In: 2015 3rd International Conference on Signal Processing, Communication and Networking (ICSCN), pp. 1–5 (2015)

  31. De Rosa, R., Cesa-Bianchi, N.: Splitting with confidence in decision trees with application to stream mining. In: 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1–8 (2015)

  32. De Rosa, R., Cesa-Bianchi, N.: Confidence decision trees via online and active learning for streaming data. J. Artif. Intell. Res. 60, 1031–1055 (2017)

  33. Jaworski, M., Duda, P., Rutkowski, L.: New splitting criteria for decision trees in stationary data streams. IEEE Trans. Neural Netw. Learn. Syst. 29, 2516–2529 (2018)

  34. Hashemi, S., Yang, Y.: Flexible decision tree for data stream classification in the presence of concept change, noise and missing values. Data Min. Knowl. Discov. Springer 19(1), 95–131 (2009)

  35. Jankowski, D., Jackowski, K., Cyganek, B.: Learning decision trees from data streams with concept drift. Procedia Comput. Sci. 80, 1682–1691 (2016); International Conference on Computational Science 2016, ICCS 2016, 6-8 June 2016, San Diego, California, USA

  36. Kuncheva, L.I.: Classifier ensembles for detecting concept change in streaming data: overview and perspectives. In: Proceedings of the 2nd Workshop SUEMA, ECAI, pp. 5–9 (2008)

  37. Krawczyk, B., Minku, L.L., Gama, J., Stefanowski, J., Woźniak, M.: Ensemble learning for data stream analysis: a survey. Inf. Fusion 37, 132–156 (2017)

  38. Street, W.N., Kim, Y.: A streaming ensemble algorithm (SEA) for large-scale classification. In: Proceedings of the 7th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’01, New York, NY, USA, pp. 377–382 (2001)

  39. Wang, H., Fan, W., Yu, P.S., Han, J.: Mining concept-drifting data streams using ensemble classifiers. In: Proceedings of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’03, New York, NY, USA, pp. 226–235 (2003)

  40. Nishida, K., Yamauchi, K., Omori, T.: ACE: adaptive classifiers-ensemble system for concept-drifting environments. In: N. C. Oza, R. Polikar, J. Kittler, F. Roli (eds.) Multiple Classifier Systems, vol. 3541. Lecture Notes in Computer Science, pp. 176–185. Springer (2005)

  41. Krawczyk, B., Cano, A.: Online ensemble learning with abstaining classifiers for drifting and noisy data streams. Appl. Soft Comput. 68, 677–692 (2018)

  42. Bertini Junior, J.R., do Carmo Nicoletti, M.: An iterative boosting-based ensemble for streaming data classification. Inf. Fusion 45, 66–78 (2019)

  43. Elwell, R., Polikar, R.: Incremental learning of concept drift in nonstationary environments. IEEE Trans. Neural Netw. 22(10), 1517–1531 (2011)

  44. He, H., Chen, S., Li, K., Xu, X.: Incremental learning from stream data. IEEE Trans. Neural Netw. 22(12), 1901–1914 (2011)

  45. Minku, L.L., Yao, X.: DDD: a new ensemble approach for dealing with concept drift. IEEE Trans. Knowl. Data Eng. 24(4), 619–633 (2012)

  46. Wozniak, M.: Accuracy based weighted aging ensemble (AB-WAE) algorithm for data stream classification. In: 2017 IEEE 4th International Conference on Soft Computing Machine Intelligence (ISCMI), pp. 21–24 (2017)

  47. Abdulsalam, H., Skillicorn, D.B., Martin, P.: Classification using streaming random forests. IEEE Trans. Knowl. Data Eng. 23(1), 22–36 (2011)

  48. Attar, V., Sinha, P., Wankhade, K.: A fast and light classifier for data streams. Evol. Syst. 3(1), 199–207 (2010)

  49. Bifet, A., Holmes, G., Pfahringer, B., Kirkby, R., Gavaldà, R.: New ensemble methods for evolving data streams. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD ’09, New York, NY, USA, pp. 139–148 (2009)

  50. Li, P.P., Hu, X., Wu, X.: Mining concept-drifting data streams with multiple semi-random decision trees. In: Tang, C., Ling, C.X., Zhou, X., Cercone, N., Li, X. (eds.) ADMA, vol. 5139, Lecture Notes in Computer Science, pp. 733–740. Springer (2008)

  51. Liu, X., Li, Q., Li, T., Chen, D.: Differentially private classification with decision tree ensemble. Appl. Soft Comput. 62, 807–816 (2018)

  52. Pietruczuk, L., Rutkowski, L., Jaworski, M., Duda, P.: A method for automatic adjustment of ensemble size in stream data mining. In: 2016 International Joint Conference on Neural Networks (IJCNN), pp. 9–15 (2016)

  53. Pietruczuk, L., Rutkowski, L., Jaworski, M., Duda, P.: How to adjust an ensemble size in stream data mining? Inf. Sci. 381, 46–54 (2017)

  54. Albert, A., Gardner, L.: Stochastic Approximation and Nonlinear Regression. The MIT Press (1967)

  55. Bendat, J., Piersol, A.: Random Data Analysis and Measurement Procedures. Wiley-Interscience, New York (1971)

  56. Kotu, V., Deshpande, B.: Predictive Analytics and Data Mining: Concepts and Practice with RapidMiner. Morgan Kaufmann (2015)

  57. Dong, G., Liu, H.: Feature Engineering for Machine Learning and Data Analytics. Chapman & Hall (2018)

  58. Duda, R., Hart, P., Stork, D.: Pattern Classification. Wiley, London (2001)

  59. Wolverton, C., Wagner, T.: Asymptotically optimal discriminant functions for pattern classification. IEEE Trans. Inf. Theory 15(2), 258–265 (1969)

  60. Walter, G.: Properties of Hermite series estimation of probability density. Ann. Stat. 5, 1258–1264 (1977)

  61. Rao, P., Thornby, J.: A robust point estimate in a generalized regression model. Ann. Math. Stat. 40, 1784–1790 (1969)

  62. Greblicki, W.: Asymptotically optimal pattern recognition procedures with density estimate. IEEE Trans. Inf. Theory 24, 250–251 (1978)

  63. Stein, E.: Singular Integrals and Differentiability Properties of Functions. Princeton Univ. Press, Princeton, New Jersey (1970)

  64. Wheeden, R., Zygmund, A.: Measure and Integral. Marcel Dekker, Inc., New York and Basel (1977)

  65. Devroye, L., Györfi, L.: Nonparametric Density Estimation: The \(L_1\) View. Wiley, New York (1985)

  66. Rutkowski, L.: Sequential estimates of probability densities by orthogonal series and their application in pattern classification. IEEE Trans. Syst. Man Cybernet. SMC-10(12), 918–920 (1980)

  67. Devroye, L., Wagner, T.: On the convergence of kernel estimators of regression functions with applications in discrimination. Zeitschrift für Wahrscheinlichkeitstheorie und verwandte Gebiete 51, 15–21 (1980)

  68. Greblicki, W., Pawlak, M.: Classification using the Fourier series estimate of multivariate density function. IEEE Trans. Syst. Man Cybernet. (1981)

  69. Rutkowski, L.: On Bayes risk consistent pattern recognition procedures in a quasi-stationary environment. IEEE Trans. Pattern Anal. Mach. Intell. PAMI-4(1), 84–87 (1982)

  70. Vajda, I., Györfi, L., Györfi, Z.: A strong law of large numbers and some applications. Studia Sci. Math. Hung. 12, 233–244 (1977)

  71. Rutkowski, L.: Adaptive probabilistic neural networks for pattern classification in time-varying environment. IEEE Trans. Neural Netw. 15(2) (2004)

  72. Duda, P., Rutkowski, L., Jaworski, M.: On the Parzen kernel-based probability density function learning procedures over time-varying streaming data with applications to pattern classification. IEEE Trans. Cybernet. (2019)

Author information

Corresponding author

Correspondence to Leszek Rutkowski.

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Rutkowski, L., Jaworski, M., Duda, P. (2020). Probabilistic Neural Networks for the Streaming Data Classification. In: Stream Data Mining: Algorithms and Their Probabilistic Properties. Studies in Big Data, vol 56. Springer, Cham. https://doi.org/10.1007/978-3-030-13962-9_11
