Data Stream Clustering in Conditions of an Unknown Amount of Classes

  • Conference paper

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 754)

Abstract

An on-line modified X-means method is proposed for data stream clustering when the number of clusters is not known a priori. The approach is based on an ensemble of clustering neural networks built from T. Kohonen's self-organizing maps. Each clustering network has a different number of neurons, which links the number of clusters to the quality of the clustering process. All ensemble members process the sequentially fed data in parallel. The effectiveness of the clustering process is assessed with the Caliński-Harabasz index. The self-learning algorithm uses a similarity measure of a special type. A main feature of the proposed method is the absence of a competition step, i.e. no winning neuron is determined. A number of experiments were carried out to investigate the properties of the proposed system. The experimental results confirm that the system can be used to solve a wide range of data mining tasks in which data sets are processed on-line. The proposed ensemble system is computationally simple, and data sets are processed faster because the ensemble members can be tuned in parallel.
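The selection idea in the abstract can be illustrated with a minimal sketch: several clustering models with different numbers of prototypes process the same stream, and the Caliński-Harabasz index picks the best one. This is not the authors' algorithm. The abstract does not specify the special similarity measure or how learning proceeds without a competition step, so the sketch assumes a conventional winner-take-all update with a 1/count step size. The names `calinski_harabasz`, `online_pass`, `nearest`, and `select_cluster_count` are hypothetical.

```python
import numpy as np


def calinski_harabasz(X, labels):
    """Between-cluster over within-cluster dispersion; higher is better."""
    n = X.shape[0]
    clusters = np.unique(labels)
    k = clusters.size
    if k < 2 or k >= n:
        return 0.0  # the index is undefined for one cluster or all singletons
    overall_mean = X.mean(axis=0)
    between, within = 0.0, 0.0
    for c in clusters:
        members = X[labels == c]
        centroid = members.mean(axis=0)
        between += members.shape[0] * np.sum((centroid - overall_mean) ** 2)
        within += np.sum((members - centroid) ** 2)
    if within == 0.0:
        return float("inf")
    return float((between / (k - 1)) / (within / (n - k)))


def nearest(centroids, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return int(np.argmin(np.sum((centroids - x) ** 2, axis=1)))


def online_pass(X, k):
    """One sequential pass over the stream with k prototypes.

    ASSUMPTION: standard winner-take-all learning, used here only as a
    stand-in; the paper's method reportedly has no competition step.
    """
    centroids = X[:k].astype(float).copy()  # first k samples as initial prototypes
    counts = np.ones(k)
    for x in X[k:]:
        j = nearest(centroids, x)
        counts[j] += 1
        centroids[j] += (x - centroids[j]) / counts[j]  # running-mean update
    return centroids


def select_cluster_count(X, candidates):
    """Score each ensemble member and return the best k with all scores."""
    scores = {}
    for k in candidates:  # members are independent, so they could run in parallel
        centroids = online_pass(X, k)
        labels = np.array([nearest(centroids, x) for x in X])
        scores[k] = calinski_harabasz(X, labels)
    best_k = max(scores, key=scores.get)
    return best_k, scores
```

Calling `select_cluster_count(X, [2, 3, 4, 5])` on an array `X` of shape `(n_samples, n_features)` returns the candidate with the highest index value together with the score of every member. Because each member only reads the stream and keeps its own prototypes, the loop over `candidates` is the part that would run in parallel in an ensemble setting.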

Author information

Corresponding author

Correspondence to Polina Zhernova.

Copyright information

© 2019 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Zhernova, P., Deyneko, A., Deyneko, Z., Pliss, I., Ahafonov, V. (2019). Data Stream Clustering in Conditions of an Unknown Amount of Classes. In: Hu, Z., Petoukhov, S., Dychka, I., He, M. (eds) Advances in Computer Science for Engineering and Education. ICCSEEA 2018. Advances in Intelligent Systems and Computing, vol 754. Springer, Cham. https://doi.org/10.1007/978-3-319-91008-6_41
