Document and Term Clustering

  • Gerald Kowalski


Clustering plays many important roles in Information Retrieval. It can support both search functions and the display of results. On very large data sets, clustering can quickly become computationally infeasible, so many systems limit the number of items that can be clustered. The choice of clustering technique also affects the resources required: in general, techniques that need less computation produce less accurate clusters. Clustering can be applied to items, creating document clusters that can be used to suggest additional items or to visualize search results. To a lesser extent, clustering can be applied to the words in items to automatically generate a statistical thesaurus. This chapter describes the major clustering techniques and discusses how to create hierarchical clusters.
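
As a rough illustration of item clustering and its cost, the sketch below groups short texts using single-link clustering. It is not taken from the chapter: the Jaccard similarity on word sets, the threshold value, and the names `jaccard` and `single_link_clusters` are all assumptions chosen for brevity. The pairwise loop compares every item with every other item, which hints at why clustering grows expensive on large collections.

```python
from itertools import combinations

def jaccard(a, b):
    """Similarity of two term sets: shared terms divided by all terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def single_link_clusters(items, threshold):
    """Single-link grouping (illustrative assumption, not the chapter's code):
    two items share a cluster if a chain of pairs, each with
    similarity >= threshold, connects them."""
    term_sets = [set(text.lower().split()) for text in items]
    parent = list(range(len(items)))  # union-find forest over item indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Compare every pair of items: O(n^2) similarity computations.
    for i, j in combinations(range(len(items)), 2):
        if jaccard(term_sets[i], term_sets[j]) >= threshold:
            parent[find(i)] = find(j)

    clusters = {}
    for i in range(len(items)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

docs = [
    "document clustering for search results",
    "clustering search results for display",
    "statistical thesaurus from term clustering",
    "automatic thesaurus from term statistics",
]
clusters = single_link_clusters(docs, threshold=0.3)
print(clusters)
```

Lowering the threshold lets weaker links join clusters together, the chaining behaviour typical of single link; raising it leaves more items in clusters of their own.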


Keywords: Hierarchical Cluster, Cluster Technique, Cluster Process, Single Link, Relevant Item
These keywords were added by machine and not by the authors.

Copyright information

© Springer US 2011

Authors and Affiliations

  • Gerald Kowalski (Ashburn, USA)
