
Hierarchical clustering of text documents

  • Control Systems and Information Technologies
Automation and Remote Control

Abstract

We consider using compression algorithms to compute similarity distances between objects as a way to solve the clustering problem. We propose a practical hierarchical clustering mechanism that constructs a binary tree of object dependencies, similar to a taxonomy.
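The idea in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption rather than the authors' method: it uses the normalized compression distance from Cilibrasi and Vitanyi's "Clustering by Compression", takes `zlib` as a stand-in compressor, and builds the binary tree by greedy average-linkage merging. The function names (`csize`, `ncd`, `leaves`, `cluster`) are hypothetical.

```python
import zlib
from itertools import combinations


def csize(data: bytes) -> int:
    """Compressed length in bytes; zlib is an assumed stand-in compressor."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance:
    (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    cx, cy = csize(x), csize(y)
    return (csize(x + y) - min(cx, cy)) / max(cx, cy)


def leaves(tree):
    """List the document indices stored in a nested-tuple tree."""
    if isinstance(tree, int):
        return [tree]
    left, right = tree
    return leaves(left) + leaves(right)


def cluster(docs):
    """Greedy agglomerative clustering (average linkage is an assumption).

    Takes a non-empty list of byte strings and returns a binary tree as
    nested 2-tuples whose leaves are indices into docs.
    """
    n = len(docs)
    # Pairwise distances between the original documents, keyed by (i, j), i < j.
    dist = {(i, j): ncd(docs[i], docs[j]) for i, j in combinations(range(n), 2)}

    def linkage(a, b):
        # Average distance over all cross pairs of leaves in subtrees a and b.
        pairs = [(min(i, j), max(i, j)) for i in leaves(a) for j in leaves(b)]
        return sum(dist[p] for p in pairs) / len(pairs)

    nodes = list(range(n))
    while len(nodes) > 1:
        # Find the two closest subtrees and merge them into one binary node.
        i, j = min(combinations(range(len(nodes)), 2),
                   key=lambda p: linkage(nodes[p[0]], nodes[p[1]]))
        merged = (nodes[i], nodes[j])
        nodes = [t for k, t in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes[0]
```

Calling `cluster` on a list of documents encoded as bytes returns a nested tuple of indices (of the general shape `((0, 1), 2)` for three documents), which can be read as a taxonomy-like binary tree. The sketch recomputes linkages on every pass, so it suits small collections only.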



Author information

Corresponding author

Correspondence to L. S. Lomakina.

Additional information

Original Russian Text © L.S. Lomakina, V.B. Rodionov, A.S. Surkova, 2012, published in Sistemy Upravleniya i Informatsionnye Tekhnologii, 2012, No. 3, pp. 39–44.

About this article

Cite this article

Lomakina, L.S., Rodionov, V.B. & Surkova, A.S. Hierarchical clustering of text documents. Autom Remote Control 75, 1309–1315 (2014). https://doi.org/10.1134/S000511791407011X

  • DOI: https://doi.org/10.1134/S000511791407011X
