Abstract
Clustering plays many important roles in Information Retrieval. It can support both search and the display of results. On very large data sets, clustering can quickly become computationally infeasible, so many systems limit the number of items that can be clustered. The choice of clustering technique also affects the resources needed, and reducing the computation a technique requires generally reduces the accuracy of the resulting clusters as well. Clustering can be applied to items, creating document clusters that can be used to suggest additional items or to visualize search results. To a lesser extent, clustering can be applied to the words in items to automatically generate a statistical thesaurus. The major clustering techniques are described, along with a discussion of how to create hierarchical clusters.
Copyright information
© 2011 Springer US
About this chapter
Cite this chapter
Kowalski, G. (2011). Document and Term Clustering. In: Information Retrieval Architecture and Algorithms. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-7716-8_6
DOI: https://doi.org/10.1007/978-1-4419-7716-8_6
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-7715-1
Online ISBN: 978-1-4419-7716-8
eBook Packages: Computer Science, Computer Science (R0)