On Evaluating Query Performance Predictors

  • Yu Huang
  • Tiejian Luo
  • Xiang Wang
  • Kai Hui
  • Wen-Jie Wang
  • Ben He
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8351)

Abstract

Query performance prediction (QPP) aims to estimate query difficulty without relevance assessment information. The quality of a predictor is evaluated by the correlation coefficient between its predicted values and the actual Average Precision (AP). The Pearson correlation coefficient, Spearman's Rho and Kendall's Tau are the most popular measures of this correlation. Previous work has shown that these measures are not sufficiently equitable or appropriate for evaluating predictor quality. In this paper, we add two novel measures, the Maximal Information Coefficient (MIC) and Brownian Distance Correlation (Dcor), to the evaluation of predictor quality and compare them with the three traditional measures to observe the differences. We conduct a series of experiments on several standard TREC datasets and analyze the results. The experimental results reveal that MIC and Dcor lead to different conclusions in some cases, which makes them useful supplements for evaluating predictor quality. Furthermore, the measures differ in their sensitivity to changes in the predictors' parameters in our experiments, and we analyze these differences.
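As an illustration of the evaluation setup described in the abstract, the following minimal sketch computes four of the five correlation measures between a predictor's scores and per-query AP. It is not the authors' code: the function names, the choice of the tau-a variant of Kendall's Tau, and the sample data are all assumptions made for this example. MIC is omitted because it requires a grid-search procedure that does not fit in a short sketch.

```python
import math

def pearson(x, y):
    """Pearson linear correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r

def spearman(x, y):
    """Spearman's Rho: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))

def kendall_tau_a(x, y):
    """Kendall's Tau (tau-a variant): (concordant - discordant) / all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def _centered_distances(v):
    """Pairwise |v_i - v_j| matrix, double-centered by row, column and grand mean."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row = [sum(r) / n for r in d]  # equals the column means: d is symmetric
    grand = sum(row) / n
    return [[d[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]

def distance_correlation(x, y):
    """Sample (Brownian) distance correlation for one-dimensional samples."""
    a, b = _centered_distances(x), _centered_distances(y)
    n = len(x)

    def mean_prod(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dvar = mean_prod(a, a) * mean_prod(b, b)
    if dvar <= 0:
        return 0.0
    return math.sqrt(max(mean_prod(a, b), 0.0) / math.sqrt(dvar))

# Hypothetical data: predictor scores and per-query AP with a U-shaped relation.
scores = [1.0, 2.0, 3.0, 4.0, 5.0]
ap = [0.8, 0.2, 0.0, 0.2, 0.8]
for name, fn in [("Pearson", pearson), ("Spearman", spearman),
                 ("Kendall tau-a", kendall_tau_a), ("Dcor", distance_correlation)]:
    print(name, round(fn(scores, ap), 3))
```

On data like this, where the dependence is strong but neither linear nor monotonic, the linear and rank-based coefficients stay close to zero while distance correlation is clearly positive. This is the kind of disagreement between measures that the abstract refers to.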

Keywords

Query performance prediction, Information retrieval, Benchmark, Maximal information coefficient, Brownian distance correlation, Comparison

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Yu Huang (1)
  • Tiejian Luo (1)
  • Xiang Wang (1)
  • Kai Hui (1)
  • Wen-Jie Wang (1)
  • Ben He (1)

  1. University of Chinese Academy of Sciences, Beijing, China