Expected reading effort in focused retrieval evaluation
This study introduces a novel framework for evaluating passage and XML retrieval. The framework focuses on the user's effort to localize relevant content in a result document. The effort is measured under a system-guided reading order of the documents and is calculated as the quantity of text the user is expected to browse through. More specifically, this study develops evaluation metrics for retrieval methods that follow a fetch-and-browse approach: in the fetch phase, documents are ranked in decreasing order of their document score, as in document retrieval; in the browse phase, for each retrieved document, a set of non-overlapping passages representing the relevant text within the document is retrieved. In other words, the passages of the document are reorganized so that the best-matching passages are read first, in sequential order. We introduce an application scenario motivating the framework and propose sample metrics based on it. These metrics provide a basis for comparing the effectiveness of traditional document retrieval with passage/XML retrieval, and they illuminate the benefit of passage/XML retrieval.
Keywords: Passage retrieval · XML retrieval · Evaluation · Metrics · Small screen devices
The study was supported by the Academy of Finland under grants #115480 and #130482.