A Clustering Algorithm Based on Generalized Stars

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4571)

Abstract

We present Generalized Star (GStar), a new algorithm for document clustering. It generalizes the Star algorithm proposed by Aslam et al., which they and other researchers have since improved. The method introduces a new concept of star that allows a different star-shaped form and yields better overlapping clusters. Evaluation experiments on standard document collections show that the proposed algorithm outperforms previously defined methods and obtains a smaller number of clusters. Because GStar is relatively simple to implement and efficient, we advocate its use for tasks that require clustering, such as information organization, browsing, topic tracking, and new topic detection.
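
For orientation, the baseline that GStar generalizes can be sketched briefly. The code below is a minimal illustration of the classic Star clustering idea attributed to Aslam et al., not the GStar algorithm itself, whose star definition is not given in this abstract. The function names `cosine` and `star_clusters`, the sparse term-weight dictionaries, and the threshold parameter `sigma` are assumptions made for the example.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def star_clusters(vectors, sigma):
    """Greedy star cover of the similarity graph thresholded at sigma.

    Returns a list of (center, satellites) pairs. A satellite may belong
    to several stars, which is what makes the clusters overlap.
    """
    n = len(vectors)
    # Build the thresholded similarity graph (hypothetical representation).
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if cosine(vectors[i], vectors[j]) >= sigma:
            adj[i].add(j)
            adj[j].add(i)

    unmarked = set(range(n))
    clusters = []
    # Visit vertices by decreasing degree; ties are broken by index.
    for v in sorted(range(n), key=lambda i: (-len(adj[i]), i)):
        if v in unmarked:
            # v becomes a star center; all its neighbours are satellites.
            clusters.append((v, sorted(adj[v])))
            unmarked.discard(v)
            unmarked -= adj[v]
    return clusters
```

Calling `star_clusters(docs, 0.5)` on a list of term-weight dictionaries returns one `(center, satellites)` pair per star. Because every vertex ends up either as a center or adjacent to one, the stars cover the whole collection, and no two centers are adjacent to each other.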

References

  1. Aslam, J., Pelekhov, K., Rus, D.: Static and Dynamic Information Organization with Star Clusters. In: Proceedings of the 1998 Conference on Information Knowledge Management, Baltimore (1998)

  2. Aslam, J., Pelekhov, K., Rus, D.: Using Star Clusters for Filtering. In: Proceedings of the Ninth International Conference on Information and Knowledge Management, USA (2000)

  3. Aslam, J., Pelekhov, K., Rus, D.: The Star Clustering Algorithm for Static and Dynamic Information Organization. Journal of Graph Algorithms and Applications 8(1), 95–129 (2004)

  4. Banerjee, A., Krumpelman, C., Basu, S., Mooney, R., Ghosh, J.: Model Based Overlapping Clustering. In: KDD 2005. Proceedings of International Conference on Knowledge Discovery and Data Mining (2005)

  5. Charikar, M., Chekuri, C., Feder, T., Motwani, R.: Incremental Clustering and Dynamic Information Retrieval. In: Proceedings of the 29th Symposium on Theory of Computing (1997)

  6. Cutting, D., Karger, D., Pedersen, J.: Constant Interaction-time Scatter/Gather Browsing of Very Large Document Collections. In: Proceedings of the 16th SIGIR (1993)

  7. Duda, R., Hart, P., Stork, D.: Pattern Classification. John Wiley & Sons Inc., West Sussex (2001)

  8. Gil-García, R.J., Badía-Contelles, J.M., Pons-Porrata, A.: Extended Star Clustering Algorithm. In: Sanfeliu, A., Ruiz-Shulcloper, J. (eds.) CIARP 2003. LNCS, vol. 2905, pp. 480–487. Springer, Heidelberg (2003)

  9. Gil-García, R.J., Badía-Contelles, J.M., Pons-Porrata, A.: Parallel Algorithm for Extended Star Clustering. In: Sanfeliu, A., Martínez Trinidad, J.F., Carrasco Ochoa, J.A. (eds.) CIARP 2004. LNCS, vol. 3287, pp. 402–409. Springer, Heidelberg (2004)

  10. Kuncheva, L., Hadjitodorov, S.: Using Diversity in Cluster Ensembles. In: Proceedings of IEEE SMC 2004, The Netherlands (2004)

  11. van Rijsbergen, C.J.: Information Retrieval, 2nd edn. Butterworths, London (1979)

  12. Zhong, S., Ghosh, J.: A Comparative Study of Generative Models for Document Clustering. In: Proceedings of SDM Workshop on Clustering High Dimensional Data and Its Applications (2003)

Editor information

Petra Perner

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Suárez, A.P., Pagola, J.E.M. (2007). A Clustering Algorithm Based on Generalized Stars. In: Perner, P. (ed.) Machine Learning and Data Mining in Pattern Recognition. MLDM 2007. Lecture Notes in Computer Science, vol 4571. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73499-4_19

  • DOI: https://doi.org/10.1007/978-3-540-73499-4_19

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73498-7

  • Online ISBN: 978-3-540-73499-4

  • eBook Packages: Computer Science, Computer Science (R0)
