An Investigation into the Use of Document Scores for Optimisation over Rank-Biased Precision

  • Conference paper
Information Retrieval Technology (AIRS 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10648)


Abstract

When a document retrieval system receives a query, a relevance model assigns each document a score based on its relevance to the query. Relevance models have parameters that should be tuned to optimise the accuracy of the model for the document set and the expected queries, where accuracy is computed using an information retrieval evaluation function. Unfortunately, evaluation functions contain a discontinuous mapping from document scores to document ranks, which makes optimisation of relevance models difficult with gradient-based methods. In this article, we identify that the evaluation function Rank-Biased Precision (RBP) converts document scores to ranks and then to weights. We therefore investigate the utility of bypassing the conversion to ranks, converting document scores directly into RBP weights, for relevance model tuning. We find that using transformed BM25 document scores in place of the RBP weights provides an equivalent optimisation function for mean and median RBP, so this document-score-based RBP can be used as a surrogate for tuning relevance models.
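To make the score-to-rank-to-weight pipeline concrete, the sketch below computes standard RBP, (1 - p) * sum_i rel_i * p^(i-1), alongside a score-based surrogate in which the geometric rank weights are replaced by a transform of the document scores. This is a minimal illustration only: the function names and the min-max transform of the BM25 scores are assumptions made for this sketch, and the transformation actually used in the paper is not given in the abstract.

import numpy as np

def rbp(relevance, p=0.8):
    """Standard rank-biased precision (Moffat and Zobel):
    (1 - p) * sum_i rel_i * p**(i - 1), with the relevance judgements
    given in rank order, i.e. sorted by descending document score."""
    relevance = np.asarray(relevance, dtype=float)
    weights = p ** np.arange(len(relevance))      # p**(rank - 1)
    return (1.0 - p) * np.sum(relevance * weights)

def score_based_rbp(scores, relevance, p=0.8):
    """Hypothetical score-based surrogate: replace the rank-derived
    weights p**(rank - 1) with weights obtained directly from the
    document scores, bypassing the discontinuous score-to-rank sort.
    The min-max transform below is an illustrative assumption only."""
    scores = np.asarray(scores, dtype=float)
    relevance = np.asarray(relevance, dtype=float)
    # Monotone transform of the BM25 scores into [0, 1]; unlike the
    # rank-based weights, this is differentiable in the scores.
    w = (scores - scores.min()) / (scores.max() - scores.min() + 1e-12)
    return (1.0 - p) * np.sum(relevance * w)

# Toy example: BM25 scores in descending order with binary judgements.
scores = np.array([12.3, 9.1, 7.4, 2.0])
rel = np.array([1, 0, 1, 0])
print(rbp(rel), score_based_rbp(scores, rel))

Because the surrogate never sorts the documents, it remains a smooth function of the relevance model's parameters, which is what makes it usable with gradient-based tuning.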



Author information


Correspondence to Sunil Randeni.



Copyright information

© 2017 Springer International Publishing AG

About this paper


Cite this paper

Randeni, S., Matawie, K.M., Park, L.A.F. (2017). An Investigation into the Use of Document Scores for Optimisation over Rank-Biased Precision. In: Sung, W.K., et al. (eds.) Information Retrieval Technology. AIRS 2017. Lecture Notes in Computer Science, vol. 10648. Springer, Cham. https://doi.org/10.1007/978-3-319-70145-5_15


  • DOI: https://doi.org/10.1007/978-3-319-70145-5_15

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-70144-8

  • Online ISBN: 978-3-319-70145-5

  • eBook Packages: Computer Science, Computer Science (R0)
