Clustering Over Data Streams Based on Growing Neural Gas

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9078)

Abstract

Clustering data streams requires partitioning observations continuously under memory and time constraints. In this paper we present a new algorithm, called G-Stream, for clustering data streams in a single pass over the data. G-Stream is based on growing neural gas, which allows us to discover clusters of arbitrary shape without any assumption on the number of clusters. Using a reservoir and applying a fading function improves the quality of the clustering. The performance of the proposed algorithm is evaluated on public data sets.
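As a rough illustration of how a fading function can be combined with a prototype update in a one-pass setting, the Python sketch below moves the nearest prototype toward each arriving point, decays every prototype's weight with age, and discards prototypes whose weight becomes negligible. It is not the authors' implementation: the exponential form 2^(-decay_rate * elapsed), the function names `fade` and `update`, and all parameter values are assumptions made for illustration, and the edge management, node insertion, and reservoir of the full algorithm are left out.

```python
import math


def fade(weight, elapsed, decay_rate=0.1):
    """Exponentially fade a weight (assumed form): weight * 2^(-decay_rate * elapsed)."""
    return weight * 2.0 ** (-decay_rate * elapsed)


def update(nodes, x, now, lr=0.5, decay_rate=0.1, min_weight=0.1):
    """Process one stream point x arriving at time `now` (single pass, in place).

    `nodes` is a list of dicts with keys "pos" (list of floats),
    "weight" (float) and "t" (time of the last weight update).
    """
    # 1. Fade every prototype according to the time since its last update.
    for n in nodes:
        n["weight"] = fade(n["weight"], now - n["t"], decay_rate)
        n["t"] = now

    # 2. Find the prototype nearest to x (Euclidean distance).
    winner = min(nodes, key=lambda n: math.dist(n["pos"], x))

    # 3. Move the winner toward x and reinforce its weight.
    winner["pos"] = [p + lr * (xi - p) for p, xi in zip(winner["pos"], x)]
    winner["weight"] += 1.0

    # 4. Discard prototypes whose faded weight has become negligible.
    nodes[:] = [n for n in nodes if n["weight"] >= min_weight]


nodes = [
    {"pos": [0.0, 0.0], "weight": 1.0, "t": 0},
    {"pos": [10.0, 10.0], "weight": 1.0, "t": 0},
]
update(nodes, [2.0, 0.0], now=10)  # the first prototype wins and moves toward x
update(nodes, [1.0, 0.0], now=40)  # the unused second prototype fades out
```

With these assumed settings a prototype's weight halves every 10 time units unless it keeps winning, so regions of the input space that stop receiving data are gradually forgotten.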

Author information

Corresponding author

Correspondence to Mohammed Ghesmoune.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Ghesmoune, M., Lebbah, M., Azzag, H. (2015). Clustering Over Data Streams Based on Growing Neural Gas. In: Cao, T., Lim, EP., Zhou, ZH., Ho, TB., Cheung, D., Motoda, H. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2015. Lecture Notes in Computer Science, vol 9078. Springer, Cham. https://doi.org/10.1007/978-3-319-18032-8_11

  • DOI: https://doi.org/10.1007/978-3-319-18032-8_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-18031-1

  • Online ISBN: 978-3-319-18032-8

  • eBook Packages: Computer Science (R0)
