An Investigation into the Use of Document Scores for Optimisation over Rank-Biased Precision

  • Conference paper
Information Retrieval Technology (AIRS 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10648)


Abstract

When a document retrieval system receives a query, a relevance model assigns each document a score based on its relevance to the query. Relevance models have parameters that should be tuned to optimise the accuracy of the model for the document set and the expected queries, where accuracy is computed using an information retrieval evaluation function. Unfortunately, evaluation functions contain a discontinuous mapping from document scores to document ranks, which makes optimisation of relevance models difficult with gradient-based methods. In this article, we identify that the evaluation function Rank-Biased Precision (RBP) converts document scores to ranks and then to weights. We therefore investigate the utility of bypassing the conversion to ranks, converting document scores directly into RBP weights, for relevance model tuning. We find that using transformed BM25 document scores in place of the RBP weights provides an equivalent optimisation function for mean and median RBP, so this document-score-based RBP can be used as a surrogate for tuning relevance models.
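To make the score-to-rank-to-weight pipeline concrete, the sketch below computes standard RBP, (1 - p) * sum_i rel_i * p^(i-1), alongside a score-based surrogate in which the geometric rank weights are replaced by a transform of the document scores. This is a minimal illustration only: the function names and the min-max transform of the BM25 scores are assumptions made for this sketch, and the transformation actually used in the paper is not given in the abstract.

import numpy as np

def rbp(relevance, p=0.8):
    """Standard rank-biased precision (Moffat and Zobel):
    (1 - p) * sum_i rel_i * p**(i - 1), with the relevance judgements
    given in rank order, i.e. sorted by descending document score."""
    relevance = np.asarray(relevance, dtype=float)
    weights = p ** np.arange(len(relevance))      # p**(rank - 1)
    return (1.0 - p) * np.sum(relevance * weights)

def score_based_rbp(scores, relevance, p=0.8):
    """Hypothetical score-based surrogate: replace the rank-derived
    weights p**(rank - 1) with weights obtained directly from the
    document scores, bypassing the discontinuous score-to-rank sort.
    The min-max transform below is an illustrative assumption only."""
    scores = np.asarray(scores, dtype=float)
    relevance = np.asarray(relevance, dtype=float)
    # Monotone transform of the BM25 scores into [0, 1]; unlike the
    # rank-based weights, this is differentiable in the scores.
    w = (scores - scores.min()) / (scores.max() - scores.min() + 1e-12)
    return (1.0 - p) * np.sum(relevance * w)

# Toy example: BM25 scores in descending order with binary judgements.
scores = np.array([12.3, 9.1, 7.4, 2.0])
rel = np.array([1, 0, 1, 0])
print(rbp(rel), score_based_rbp(scores, rel))

Because the surrogate never sorts the documents, it remains a smooth function of the relevance model's parameters, which is what makes it usable with gradient-based tuning.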



Author information


Correspondence to Sunil Randeni.



Copyright information

© 2017 Springer International Publishing AG

About this paper


Cite this paper

Randeni, S., Matawie, K.M., Park, L.A.F. (2017). An Investigation into the Use of Document Scores for Optimisation over Rank-Biased Precision. In: Sung, W.K., et al. (eds.) Information Retrieval Technology. AIRS 2017. Lecture Notes in Computer Science, vol. 10648. Springer, Cham. https://doi.org/10.1007/978-3-319-70145-5_15


  • DOI: https://doi.org/10.1007/978-3-319-70145-5_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70144-8

  • Online ISBN: 978-3-319-70145-5

  • eBook Packages: Computer Science, Computer Science (R0)
