A Speed-Up Hierarchical Compact Clustering Algorithm for Dynamic Document Collections

  • Reynaldo Gil-García
  • Aurora Pons-Porrata
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5856)

Abstract

In this paper, an accelerated version of the Dynamic Hierarchical Compact (DHC) algorithm is presented. Our approach exploits the cluster hierarchy that has already been built to reduce the number of similarity computations. Experimental results on several benchmark text collections show that the proposed method is significantly faster than DHC while achieving approximately the same clustering quality.

Keywords

hierarchical clustering, dynamic clustering

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Reynaldo Gil-García (1)
  • Aurora Pons-Porrata (1)
  1. Center for Pattern Recognition and Data Mining, Universidad de Oriente, Santiago de Cuba, Cuba
