Query Expansion Using PRF-CBD Approach for Documents Retrieval

  • R. Rajendra Prasath
  • Sudeshna Sarkar
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8251)

Abstract

Query expansion has been widely used to improve the effectiveness of document retrieval. In this work, we attempt to identify additional terms for query expansion from the initial set of documents retrieved for the original query, with the help of the Clustering-by-Directions (CBD) algorithm proposed by Kaczmarek [2]. The CBD algorithm is based on a tag cloud of associated terms that are placed in a radial arrangement and provide a clue to the direction of user intent in which the search can be continued effectively. From the set of terms output by the CBD approach, we take the top k terms to expand the given query. The importance of these selected expansion terms is computed with respect to the number of terms in the radial of the selected directions. The experiments were conducted on the FIRE 2012 ad hoc data collection, and we performed monolingual document retrieval in three major languages: Bengali, Hindi and English.
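
The general pseudo relevance feedback step that the abstract builds on can be illustrated with a minimal sketch. It is not the CBD algorithm and not the authors' implementation: candidate terms are ranked by raw frequency in the top-ranked documents instead of by CBD directions, and the weighting (each expansion term's share of the selected counts) is an assumption standing in for the radial-based weighting described above. The function name `expand_query` and the parameters `n_feedback` and `k` are hypothetical.

```python
from collections import Counter


def expand_query(query_terms, ranked_docs, n_feedback=2, k=3):
    """Generic pseudo relevance feedback sketch (NOT the CBD algorithm).

    query_terms: list of terms in the original query.
    ranked_docs: initially retrieved documents, best first, each a list of terms.
    Returns a dict mapping each query and expansion term to a weight.
    """
    query_set = set(query_terms)
    counts = Counter()
    # Assume the top n_feedback documents are relevant and count their terms,
    # skipping terms that already appear in the query.
    for doc in ranked_docs[:n_feedback]:
        counts.update(t for t in doc if t not in query_set)

    # Keep the k most frequent candidates; break ties alphabetically
    # so the result is deterministic.
    top = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    total = sum(c for _, c in top) or 1

    # Assumed weighting: original terms get 1.0, each expansion term gets
    # its share of the selected candidates' total count.
    weights = {t: 1.0 for t in query_terms}
    for term, c in top:
        weights[term] = c / total
    return weights


if __name__ == "__main__":
    docs = [
        ["query", "expansion", "feedback", "terms"],
        ["feedback", "terms", "retrieval"],
        ["unrelated"],
    ]
    print(expand_query(["query"], docs, n_feedback=2, k=2))
```

In the approach the abstract describes, the frequency ranking would be replaced by the term set produced by CBD, and the weights by a function of the number of terms in the radial of each selected direction.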

Keywords

  • Pseudo Relevance Feedback
  • Query Terms Expansion
  • Clustering-By-Directions
  • Query Terms Weighting

References

  1. Crabtree, D., Andreae, P., Gao, X.: Understanding query aspects with applications to interactive query expansion. In: IEEE/WIC/ACM International Conference on Web Intelligence, pp. 691–695 (2007)
  2. Kaczmarek, A.: Interactive query expansion with the use of clustering-by-directions algorithm. IEEE Trans. on Industrial Electronics 58(8), 3168–3173 (2011)
  3. Manning, C.D., Raghavan, P., Schütze, H.: Introduction to Information Retrieval. Cambridge University Press, New York (2008)
  4. Robertson, S., Zaragoza, H.: The probabilistic relevance framework: BM25 and beyond. Found. Trends Inf. Retr. 3(4), 333–389 (2009)
  5. Robertson, S.E., Walker, S.: Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval. In: Proc. of the 17th ACM SIGIR Conference on Research and Development in IR, SIGIR 1994, pp. 232–241. Springer-Verlag New York, Inc., New York (1994)
  6. Zhang, B., Du, Y., Li, H., Wang, Y.: Query expansion based on topics. In: Fifth International Conference on Fuzzy Systems and Knowledge Discovery, vol. 2, pp. 610–614 (2008)

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • R. Rajendra Prasath (1)
  • Sudeshna Sarkar (1)
  1. Department of Computer Science and Engineering, Indian Institute of Technology, Kharagpur, India
