Abstract
Estimating the probability of relevance for a document is fundamental in information retrieval. From a theoretical point of view, risk exists in the estimation process, in the sense that the estimated probabilities may not be the actual ones precisely. The estimation risk is often considered to be dependent on the rank. For example, the probability ranking principle assumes that ranking documents in the order of decreasing probability of relevance can optimize the rank effectiveness. This implies that a precise estimation can yield an optimal rank. However, an optimal (or even ideal) rank does not always guarantee that the estimated probabilities are precise. This means that part of the estimation risk is rank-independent. It imposes practical risks in the applications, such as pseudo relevance feedback, where different estimated probabilities of relevance in the first-round retrieval will make a difference even when two ranks are identical. In this paper, we will explore the effect and the modeling of such rank-independent risk. A risk management method is proposed to adaptively adjust the rank-independent risk. Experimental results on several TREC collections demonstrate the effectiveness of the proposed models for both pseudo-relevance feedback and relevance feedback.
Keywords
This is a preview of subscription content, log in via an institution.
Buying options
Tax calculation will be finalised at checkout
Purchases are for personal use only
Learn about institutional subscriptionsPreview
Unable to display preview. Download preview PDF.
References
Croft, W.B., Cronen-Townsend, S., Lavrenko, V.: Relevance feedback and personalization: A language modeling perspective. In: DELOS Workshop: Personalisation and Recommender Systems in Digital Libraries (2001)
Dillon, J.V., Collins-Thompson, K.: A unified optimization framework for robust pseudo-relevance feedback algorithms. In: CIKM, pp. 1069–1078 (2010)
Lafferty, J.D., Zhai, C.: Document language models, query models, and risk minimization for information retrieval. In: SIGIR, pp. 111–119 (2001)
Lafferty, J.D., Zhai, C.: Probabilistic relevance models based on document and query generation. In: Language Modeling and Information Retrieval, pp. 1–10 (2003)
Lavrenko, V., Croft, W.B.: Relevance-based language models. In: SIGIR, pp. 120–127 (2001)
Lv, Y., Zhai, C.: Adaptive relevance feedback in information retrieval. In: CIKM, pp. 255–264 (2009)
Manmatha, R., Rath, T., Feng, F.: Modeling score distributions for combining the outputs of search engines. In: SIGIR, pp. 267–275 (2001)
Maron, M.E., Kuhns, J.L.: On relevance, probabilistic indexing and information retrieval. J. ACM 7, 216–244 (1960)
Ogilvie, P., Callan, J.: Experiments using the lemur toolkit. In: TREC 2002, pp. 103–108 (2002)
Ponte, J.M., Croft, W.B.: A language modeling approach to information retrieval. In: SIGIR, pp. 275–281 (1998)
Robertson, S.E.: The probability ranking principle in IR. Journal of Documentation, 294–304 (1977)
Robertson, S.E., Zaragoza, H.: The probabilistic relevance framework: Bm25 and beyond. Foundations and Trends in Information Retrieval 3(4), 333–389 (2009)
van Rijsbergen, C.J.: Information Retrieval. Butterworths (1979)
Wang, J., Zhu, J.: Portfolio theory of information retrieval. In: SIGIR, pp. 115–122 (2009)
Zhai, C., Lafferty, J.D.: A study of smoothing methods for language models applied to ad hoc information retrieval. In: SIGIR, pp. 334–342 (2001)
Zhai, C., Lafferty, J.D.: A risk minimization framework for information retrieval. Inf. Process. Manage. 42(1), 31–55 (2006)
Zhu, J., Wang, J., Cox, I.J., Taylor, M.J.: Risky business: modeling and exploiting uncertainty in information retrieval. In: SIGIR, pp. 99–106 (2009)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, P., Song, D., Wang, J., Zhao, X., Hou, Y. (2011). On Modeling Rank-Independent Risk in Estimating Probability of Relevance. In: Salem, M.V.M., Shaalan, K., Oroumchian, F., Shakery, A., Khelalfa, H. (eds) Information Retrieval Technology. AIRS 2011. Lecture Notes in Computer Science, vol 7097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25631-8_2
Download citation
DOI: https://doi.org/10.1007/978-3-642-25631-8_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25630-1
Online ISBN: 978-3-642-25631-8
eBook Packages: Computer ScienceComputer Science (R0)