Clustering for Post Hoc Information Retrieval
Clustering is a technique for grouping similar objects together based on common attributes. In information retrieval (IR), it has been applied to different tasks in the retrieval process and to different objects of interest (e.g., documents, authors, index terms). Attributes used for clustering may include the terms assigned to documents and their co-occurrences, the documents themselves if the focus is on index terms, or linkages (e.g., hypertext links between Web documents, citations or co-citations within documents, documents accessed). Clustering in IR facilitates browsing and relevance assessment of retrieved documents and may reveal unexpected relationships among the clustered objects.
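As a rough illustration of grouping retrieved documents by shared terms, the sketch below uses single-pass clustering over raw term-frequency vectors with cosine similarity. The algorithm choice, the function names (`term_vector`, `cosine`, `single_pass_cluster`), the 0.3 threshold, and the sample documents are all assumptions made for this example; the entry itself does not prescribe a particular method.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a document as a bag of lowercase whitespace-separated terms."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def single_pass_cluster(docs, threshold=0.3):
    """Assign each document to the most similar existing cluster,
    or start a new cluster if no similarity reaches the threshold.

    The threshold value is an illustrative assumption, not a recommendation.
    """
    clusters = []  # each cluster: {"centroid": Counter, "members": [doc index]}
    for i, text in enumerate(docs):
        vec = term_vector(text)
        best, best_sim = None, threshold
        for cluster in clusters:
            sim = cosine(vec, cluster["centroid"])
            if sim >= best_sim:
                best, best_sim = cluster, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [i]})
        else:
            # Summed term counts act as the centroid; cosine ignores scale.
            best["centroid"].update(vec)
            best["members"].append(i)
    return [cluster["members"] for cluster in clusters]


# Hypothetical result list for an ambiguous query such as "jaguar".
retrieved = [
    "jaguar car engine speed",
    "jaguar cat jungle habitat",
    "car engine repair speed",
    "jungle cat habitat prey",
]
groups = single_pass_cluster(retrieved)
print(groups)  # [[0, 2], [1, 3]]
```

Here the two senses of the query term end up in separate groups, which is the kind of structure that can make a result list easier to browse. Single-pass clustering depends on document order and on the threshold, so a real system would likely use a more robust method.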
A fundamental challenge of information retrieval (IR) that continues today is how to best match user queries with documents in a queried collection. Many mathematical models have been developed over the years to facilitate the matching process. The...