Precision-Oriented Effectiveness Measures
Precision-oriented evaluation in information retrieval considers the relevance of the top n search results, for small n, using a set of relevance judgments that need not be complete. Such “shallow” evaluation is consistent with a user who only cares about the top-ranked documents. Relaxing the requirement to identify all relevant documents for every query means that certain measures, such as recall at n, cannot be applied. However, it also allows evaluation on a very large corpus, where employing human relevance assessors to find the complete relevant set for each query would be too expensive. Both aspects of precision-oriented evaluation, the shallow viewing of results and the large corpus, are associated with Web search, where results are typically presented as a top-10 list and the corpus may contain tens of billions of documents.
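As an illustration of shallow evaluation, the following minimal sketch computes precision at n from an incomplete set of judgments. The function name `precision_at_n`, the document identifiers, and the choice to count unjudged documents as non-relevant are assumptions made for this example; they are one common convention, not the only way to handle missing judgments.

```python
from typing import Mapping, Sequence


def precision_at_n(ranking: Sequence[str],
                   judgments: Mapping[str, bool],
                   n: int = 10) -> float:
    """Fraction of the top-n ranked documents that are judged relevant.

    Assumption: a document with no judgment is counted as non-relevant.
    The denominator is always n, even if fewer than n results are returned.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    top = ranking[:n]
    relevant = sum(1 for doc_id in top if judgments.get(doc_id, False))
    return relevant / n


# Hypothetical ranking for one query; "d9" has no judgment at all.
ranking = ["d3", "d7", "d1", "d9", "d4"]
judgments = {"d3": True, "d7": False, "d1": True, "d4": True}

p_at_5 = precision_at_n(ranking, judgments, n=5)  # 3 relevant of 5 -> 0.6
p_at_2 = precision_at_n(ranking, judgments, n=2)  # 1 relevant of 2 -> 0.5
```

Only the documents that actually appear in the top n need to be judged, which is why this kind of measure remains usable on a very large corpus. Recall at n, by contrast, would need the total number of relevant documents for the query, which incomplete judgments do not provide.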
The Cranfield II experiments in 1963 were a landmark effort in information retrieval evaluation. A test collection...
- 1. Buckley C, Voorhees EM. Retrieval evaluation with incomplete information. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2007. p. 25–32.
- 2. Clarke CLA, Scholer F, Soboroff I. The TREC 2005 terabyte track. In: Proceedings of the 5th Text Retrieval Conference; 2005.
- 3. Cleverdon C. The Cranfield tests on index language devices. Readings in information retrieval. San Francisco: Morgan Kaufmann; 1997. p. 47–59.
- 4. Hawking D, Craswell N. The very large collection and web tracks. In: Voorhees E, Harman D, editors. TREC experiment and evaluation in information retrieval. Cambridge, MA: MIT Press; 2005. p. 199–231.
- 5. Voorhees EM, Harman DK. TREC: experiment and evaluation in information retrieval. Cambridge, MA: MIT Press; 2005.
- 6. Yilmaz E, Aslam JA. Estimating average precision with incomplete and imperfect judgments. In: Proceedings of the 15th ACM International Conference on Information and Knowledge Management; 2006. p. 102–11.