A Data Stream Subspace Clustering Algorithm

  • Conference paper
Intelligent Computation in Big Data Era (ICYCSEE 2015)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 503)

Abstract

The main aim of data stream subspace clustering is to find clusters in subspaces accurately and in reasonable time. Existing data stream subspace clustering algorithms are highly sensitive to their parameters. To address these flaws, we propose SCRP, a new data stream subspace clustering algorithm. SCRP clusters quickly and is insensitive to outliers. When the data stream changes, the changes are recorded in a data structure named Region-tree, and the corresponding statistical information is updated, so SCRP can adjust its clustering results promptly. In experiments on real and synthetic datasets, SCRP outperforms existing data stream subspace clustering algorithms in both clustering precision and clustering speed, and it scales well with the number of clusters and dimensions.
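The abstract does not specify how Region-tree is laid out, so the following is only a rough sketch of the general idea it describes: keeping per-region statistics that are updated as stream points arrive. It is not the SCRP algorithm. The class names `RegionStats` and `RegionGrid`, the fixed region `width`, the fading factor `decay`, and the density `threshold` are all assumptions introduced for illustration, and a flat per-dimension table stands in for the tree.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class RegionStats:
    """Summary statistics for one grid region (hypothetical layout)."""
    count: float = 0.0      # faded number of points seen in the region
    last_update: int = 0    # time step of the most recent update


class RegionGrid:
    """Flat stand-in for a region tree: one table of regions per dimension."""

    def __init__(self, n_dims, width=1.0, decay=0.5):
        self.n_dims = n_dims
        self.width = width      # side length of a region (assumed parameter)
        self.decay = decay      # per-step fading factor in (0, 1] (assumed)
        self.regions = [defaultdict(RegionStats) for _ in range(n_dims)]

    def _current(self, stats, now):
        # Fade the stored count forward to the current time step.
        return stats.count * self.decay ** (now - stats.last_update)

    def insert(self, point, now):
        """Record one arriving point in the region it hits in each dimension."""
        for dim, value in enumerate(point):
            key = int(value // self.width)
            stats = self.regions[dim][key]
            stats.count = self._current(stats, now) + 1.0
            stats.last_update = now

    def dense_regions(self, dim, now, threshold):
        """Return region indices in `dim` whose faded count reaches `threshold`."""
        return sorted(
            key for key, stats in self.regions[dim].items()
            if self._current(stats, now) >= threshold
        )
```

Because older counts fade at every step, regions that stop receiving points drop below the threshold over time, which is one simple way a summary structure can follow changes in a stream without storing the points themselves.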

Copyright information

© 2015 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yu, X., Xu, X., Lin, L. (2015). A Data Stream Subspace Clustering Algorithm. In: Wang, H., et al. Intelligent Computation in Big Data Era. ICYCSEE 2015. Communications in Computer and Information Science, vol 503. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46248-5_41

  • DOI: https://doi.org/10.1007/978-3-662-46248-5_41

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-46247-8

  • Online ISBN: 978-3-662-46248-5

  • eBook Packages: Computer Science, Computer Science (R0)
