Feature-Based Query Expansion

  • Donald Metzler
Chapter
Part of The Information Retrieval Series book series (INRE, volume 27)

Abstract

This chapter demonstrates how feature-based models can be extended to query expansion using a technique known as latent concept expansion (LCE). The approach has three key benefits: it goes beyond the bag-of-words assumption, it can employ arbitrary features during the query expansion process, and it can expand with a variety of concept types beyond unigrams. In addition to the basic LCE model, the chapter describes a number of extensions, including generalized LCE and LCE using hierarchical Markov random fields (MRFs) that encode document structure during the expansion process.

Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  1. Natural Language Group, Information Sciences Institute, University of Southern California, Marina del Rey, USA
