Selective Labeling and Incomplete Label Mitigation for Low-Cost Evaluation

Hui, Kai; Berberich, Klaus

doi:10.1007/978-3-319-23826-5_14

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9309)

Included in the following conference series:

International Symposium on String Processing and Information Retrieval

Abstract

Information retrieval evaluation heavily relies on human effort to assess the relevance of result documents. Recent years have seen efforts and good progress to reduce the human effort and thus lower the cost of evaluation. Selective labeling strategies carefully choose a subset of result documents to label, for instance, based on their aggregate rank in results; strategies to mitigate incomplete labels seek to make up for missing labels, for instance, predicting them using machine learning methods. How different strategies interact, though, is unknown.

In this work, we study the interaction of several state-of-the-art strategies for selective labeling and incomplete label mitigation on four years of TREC Web Track data (2011–2014). Moreover, we propose and evaluate MaxRep as a novel selective labeling strategy, which has been designed so as to select effective training data for missing label prediction.
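The two families of strategies named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not reproduce MaxRep. It assumes that "aggregate rank" means the mean rank of a document across system runs, and it uses a condensed list (unlabeled documents are dropped before scoring) as one simple stand-in for incomplete label mitigation. The function names, the penalty rank for documents missing from a run, and the toy data are all hypothetical.

```python
def select_by_aggregate_rank(runs, budget):
    """Select `budget` documents with the best (lowest) mean rank across runs.

    Assumption: a document absent from a run gets rank len(run) + 1 there.
    """
    docs = {doc for run in runs for doc in run}

    def mean_rank(doc):
        total = 0
        for run in runs:
            total += (run.index(doc) + 1) if doc in run else (len(run) + 1)
        return total / len(runs)

    # Ties are broken by document id so the selection is deterministic.
    return sorted(docs, key=lambda d: (mean_rank(d), d))[:budget]


def precision_at_k_condensed(run, labels, k):
    """Precision@k on the condensed list: unlabeled documents are removed
    from the ranking before the top k are scored."""
    condensed = [doc for doc in run if doc in labels]
    return sum(labels[doc] for doc in condensed[:k]) / k


# Toy data: two system runs and a labeling budget of two documents.
runs = [["d1", "d2", "d3"], ["d2", "d4", "d1"]]
selected = select_by_aggregate_rank(runs, budget=2)

# Hypothetical assessor judgments for the selected documents only.
labels = {"d2": 1, "d1": 0}
score = precision_at_k_condensed(["d3", "d2", "d1"], labels, k=1)
```

In this toy case the unlabeled document `d3` is skipped instead of being counted as non-relevant, so the new run is scored on the labeled documents alone. Treating unlabeled documents as non-relevant would instead penalize any system that retrieves documents outside the labeled subset.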

Author information

Authors and Affiliations

Max Planck Institute for Informatics, Saarbrücken, Germany
Kai Hui & Klaus Berberich

Corresponding author

Correspondence to Kai Hui.

Editor information

Editors and Affiliations

King's College London, London, United Kingdom
Costas Iliopoulos
University of Helsinki, Helsinki, Finland
Simon Puglisi
University College London, London, United Kingdom
Emine Yilmaz

About this paper

Cite this paper

Hui, K., Berberich, K. (2015). Selective Labeling and Incomplete Label Mitigation for Low-Cost Evaluation. In: Iliopoulos, C., Puglisi, S., Yilmaz, E. (eds) String Processing and Information Retrieval. SPIRE 2015. Lecture Notes in Computer Science, vol 9309. Springer, Cham. https://doi.org/10.1007/978-3-319-23826-5_14

DOI: https://doi.org/10.1007/978-3-319-23826-5_14
Published: 05 September 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23825-8
Online ISBN: 978-3-319-23826-5
eBook Packages: Computer Science (R0)
