Query Expansion Using PRF-CBD Approach for Documents Retrieval
Query expansion has been widely used to improve the effectiveness of document retrieval. In this work, we attempt to identify additional terms for query expansion from the initial set of documents retrieved for the original query, with the help of the Clustering-by-Directions (CBD) algorithm proposed by Kaczmarek. The CBD algorithm is based on a tag cloud of associated terms placed in a radial arrangement, which provides a clue to the direction of user intent in which the search can be continued effectively. The CBD approach outputs a set of terms, from which we take the top k terms to expand the given query. The importance of each selected expansion term is computed with respect to the number of terms in the radial of the selected directions. The experiments were conducted on the FIRE 2012 ad hoc data collection, and we performed monolingual document retrieval in three major languages: Bengali, Hindi and English.
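The general pseudo relevance feedback loop described above (retrieve documents for the original query, pick the top k candidate terms, weight them, and add them to the query) can be illustrated with a minimal sketch. This is not the CBD algorithm: the clustering step is replaced here by a plain term-frequency count over the feedback documents, and the weighting (a term's share of the counts among the selected terms) is an assumption made only for illustration. The function name `expand_query` and the toy documents are hypothetical.

```python
from collections import Counter


def expand_query(query_terms, feedback_docs, k=3):
    """Return {term: weight} for the original query plus top-k expansion terms.

    Assumption: candidate terms are ranked by raw frequency in the feedback
    documents. This stands in for the CBD step, which is not reproduced here.
    """
    original = set(query_terms)

    # Count every term in the feedback documents that is not already in the query.
    counts = Counter(
        term
        for doc in feedback_docs
        for term in doc
        if term not in original
    )

    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:k]
    total = sum(count for _, count in ranked)

    # Original terms keep full weight; expansion terms share a normalized weight.
    weights = {term: 1.0 for term in query_terms}
    for term, count in ranked:
        weights[term] = count / total
    return weights


# Toy feedback set: each document is a list of already-tokenized terms.
docs = [
    ["query", "expansion", "feedback", "retrieval"],
    ["feedback", "retrieval", "cluster"],
    ["retrieval", "ranking", "feedback"],
]
expanded = expand_query(["query", "expansion"], docs, k=2)
```

In this sketch the expanded query is a term-to-weight mapping that a weighted retrieval model could consume; in the approach described in the abstract, both the term selection and the weights would instead come from the CBD directions.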
Keywords: Pseudo Relevance Feedback, Query Terms Expansion, Clustering-By-Directions, Query Terms Weighting