Information Retrieval, Volume 16, Issue 3, pp 307–330

A learning approach to optimizing exploration–exploitation tradeoff in relevance feedback



Abstract

Relevance feedback is an effective technique for improving search accuracy in interactive information retrieval. In this paper, we study an optimization problem in interactive feedback: the tradeoff between presenting the search results with the highest immediate utility to a user (but not necessarily the most useful for collecting feedback) and presenting the results with the best potential for collecting useful feedback (but not necessarily the most useful documents from the user's perspective). Optimizing this exploration–exploitation tradeoff is key to maximizing the overall utility of relevance feedback to a user over an entire feedback session. We formally frame the tradeoff as a problem of optimizing the diversification of search results, since relevance judgments on more diversified results have been shown to be more useful for relevance feedback. We propose a machine learning approach that adaptively optimizes the diversification of search results for each query so as to maximize the overall utility over the entire session. Experimental results on three representative retrieval test collections show that the proposed learning approach effectively optimizes the exploration–exploitation tradeoff and outperforms the traditional relevance feedback approach, which performs only exploitation without exploration.
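The abstract describes exploration as diversifying results to elicit more informative judgments, with the degree of diversification learned per query. As a rough illustration only (the paper's actual learning model is not reproduced here), the following sketch implements MMR-style greedy diversification in which a hypothetical exploitation weight `lam` interpolates between ranking purely by estimated relevance (`lam = 1`, pure exploitation) and promoting novel documents for feedback collection (smaller `lam`, more exploration); all function and parameter names are illustrative assumptions.

```python
# Illustrative sketch only: MMR-style diversified reranking with an
# exploitation weight `lam`. This is NOT the paper's learning algorithm;
# the paper learns how much to diversify for each query, which in this
# framing would amount to predicting `lam` from query features.
from typing import Callable, Dict, List


def diversified_ranking(
    docs: List[str],
    relevance: Dict[str, float],              # estimated relevance per document
    similarity: Callable[[str, str], float],  # pairwise similarity in [0, 1]
    lam: float,                               # 1.0 = pure exploitation, 0.0 = pure exploration
    k: int,
) -> List[str]:
    """Greedily pick documents maximizing a blend of relevance and novelty."""
    selected: List[str] = []
    candidates = set(docs)
    while candidates and len(selected) < k:
        def mmr_score(d: str) -> float:
            # Novelty penalty: similarity to the closest already-selected doc.
            max_sim = max((similarity(d, s) for s in selected), default=0.0)
            return lam * relevance[d] - (1.0 - lam) * max_sim
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    docs = ["d1", "d2", "d3"]
    rel = {"d1": 0.9, "d2": 0.85, "d3": 0.4}
    sim = lambda a, b: 1.0 if {a, b} == {"d1", "d2"} else 0.1
    print(diversified_ranking(docs, rel, sim, lam=1.0, k=2))  # ['d1', 'd2']: pure exploitation
    print(diversified_ranking(docs, rel, sim, lam=0.5, k=2))  # ['d1', 'd3']: diversified
```

Under this framing, a fixed `lam` would over- or under-explore on different queries; the paper's contribution is to adapt the degree of diversification query by query so that utility is maximized over the whole feedback session.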


Keywords

Interactive retrieval models · Feedback · Diversification · User modeling



Acknowledgments

We thank the anonymous reviewers for their useful comments. This material is based upon work supported by the National Science Foundation under Grant Numbers IIS-0713581, CNS-0834709, and CNS-1028381, by NIH/NLM Grant 1 R01 LM009153-01, and by a Sloan Research Fellowship. Maryam Karimzadehgan was supported by a Google PhD Fellowship. Any opinions, findings, conclusions, or recommendations expressed in this material are the authors' and do not necessarily reflect those of the sponsors.



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

1. Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, USA
