Abstract
Information retrieval evaluation relies heavily on human effort to assess the relevance of result documents. Recent years have seen good progress in reducing this effort and thus lowering the cost of evaluation. Selective labeling strategies carefully choose a subset of result documents to label, for instance, based on their aggregate rank in results; strategies to mitigate incomplete labels seek to make up for missing labels, for instance, by predicting them using machine learning methods. How different strategies interact, though, is unknown.
In this work, we study the interaction of several state-of-the-art strategies for selective labeling and incomplete label mitigation on four years of TREC Web Track data (2011–2014). Moreover, we propose and evaluate MaxRep, a novel selective labeling strategy designed to select effective training data for missing label prediction.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Hui, K., Berberich, K. (2015). Selective Labeling and Incomplete Label Mitigation for Low-Cost Evaluation. In: Iliopoulos, C., Puglisi, S., Yilmaz, E. (eds) String Processing and Information Retrieval. SPIRE 2015. Lecture Notes in Computer Science, vol 9309. Springer, Cham. https://doi.org/10.1007/978-3-319-23826-5_14
DOI: https://doi.org/10.1007/978-3-319-23826-5_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23825-8
Online ISBN: 978-3-319-23826-5
eBook Packages: Computer Science, Computer Science (R0)