A Comparative Study of Density-based Clustering Algorithms on Data Streams: Micro-clustering Approaches

Intelligent Control and Innovative Computing

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 110)

Abstract

Clustering is a challenging problem in data stream mining. A clustering algorithm must read a data stream in a single pass, with limited time and memory, while the stream itself may change over time. Different clustering algorithms have been developed for data streams. Density-based algorithms are a notable group of clustering methods that can find clusters of arbitrary shape and also handle outliers. In recent years, density-based clustering algorithms have been adopted for data streams. However, when clustering a data stream it is impossible to store the entire stream. Micro-clustering is a summarization method that records synopsis information about a data stream. Various algorithms apply micro-clustering methods to cluster data streams. In this paper, we concentrate on the density-based clustering algorithms that use micro-clustering methods, and we refer to them as density-micro clustering algorithms. We review the algorithms in detail and compare them based on different characteristics.
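To make the idea of a micro-cluster as a stream synopsis concrete, the following is a minimal sketch and not any specific algorithm reviewed in the chapter. It assumes the common cluster-feature form of a summary (a point count, a per-dimension linear sum, and a per-dimension squared sum); the class name `MicroCluster` and its methods are hypothetical names chosen for illustration. Each arriving point updates the summary in constant time and is then discarded, which is what allows single-pass processing with bounded memory.

```python
import math


class MicroCluster:
    """Hypothetical additive summary of the points absorbed so far.

    Assumption: the synopsis is a cluster-feature triple
    (count, linear sum, squared sum); the raw points are never stored.
    """

    def __init__(self, dim):
        self.n = 0                       # number of absorbed points
        self.linear_sum = [0.0] * dim    # per-dimension sum of values
        self.squared_sum = [0.0] * dim   # per-dimension sum of squared values

    def insert(self, point):
        """Absorb one point by updating the three summary statistics."""
        self.n += 1
        for i, x in enumerate(point):
            self.linear_sum[i] += x
            self.squared_sum[i] += x * x

    def center(self):
        """Centroid recovered from the summary alone."""
        return [s / self.n for s in self.linear_sum]

    def radius(self):
        """Root of the summed per-dimension variance around the centroid."""
        variance = sum(
            ss / self.n - (ls / self.n) ** 2
            for ls, ss in zip(self.linear_sum, self.squared_sum)
        )
        return math.sqrt(max(variance, 0.0))  # guard against rounding below 0


mc = MicroCluster(dim=2)
for p in [(0.0, 0.0), (2.0, 0.0)]:
    mc.insert(p)
print(mc.center(), mc.radius())
```

Because the three statistics are plain sums, two such summaries could be merged by adding them component-wise. Density-based stream algorithms typically add further details, such as time-decayed weights, which this sketch leaves out.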



Author information

Corresponding author

Correspondence to Amineh Amini.

Copyright information

© 2012 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Amini, A., Wah, T.Y. (2012). A Comparative Study of Density-based Clustering Algorithms on Data Streams: Micro-clustering Approaches. In: Ao, S., Castillo, O., Huang, X. (eds) Intelligent Control and Innovative Computing. Lecture Notes in Electrical Engineering, vol 110. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1695-1_21

  • DOI: https://doi.org/10.1007/978-1-4614-1695-1_21

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-1694-4

  • Online ISBN: 978-1-4614-1695-1

  • eBook Packages: Engineering, Engineering (R0)
