Retrieval Models Versus Retrievability

Bashir, Shariq; Rauber, Andreas

doi:10.1007/978-3-662-53817-3_7

Part of the book series: The Information Retrieval Series (INRE, volume 37)

Abstract

Retrievability is an important measure in information retrieval (IR) that can be used to analyse retrieval models and document collections. Rather than focusing only on the small set of documents covered by relevance judgements, retrievability examines what is retrieved, how frequently it is retrieved and how much effort is needed to retrieve it. Such a measure is of particular interest in recall-oriented retrieval settings (e.g. patent or legal retrieval), because in this context a document needs to be retrieved before it can be judged for relevance. If a retrieval model makes some patents hard to find, patent searchers could miss relevant documents purely because of the bias of the retrieval model. In this chapter we explain the concept of retrievability in information retrieval, how it can be estimated, and how it can be used to analyse the retrieval bias of retrieval models. We also show how retrievability relates to effectiveness by analysing the relationship between retrievability and effectiveness measures, and how the retrievability measure can be used to improve effectiveness.
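
As a rough illustration of how such an estimate might look in practice, the sketch below counts how many queries retrieve each document within a rank cutoff, and then summarises the inequality of those counts with a Gini coefficient. This is a minimal sketch under stated assumptions, not the chapter's own procedure: the function names `retrievability` and `gini`, the toy rankings, the cutoff-based counting and the particular Gini normalisation (dividing by n times the total) are all choices made here for illustration.

```python
from collections import Counter


def retrievability(rankings, cutoff):
    """Count, for each document, how many queries rank it within the cutoff.

    `rankings` maps a query id to its ranked list of document ids
    (hypothetical input format, assumed for this sketch).
    """
    scores = Counter()
    for ranked_docs in rankings.values():
        for doc_id in ranked_docs[:cutoff]:
            scores[doc_id] += 1
    return scores


def gini(values):
    """Gini coefficient of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Toy collection and query results, invented purely for illustration.
collection = ["d1", "d2", "d3", "d4"]
rankings = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d1", "d3", "d2"],
    "q3": ["d1", "d2", "d4"],
}

scores = retrievability(rankings, cutoff=2)
# Documents never retrieved within the cutoff must still be counted, as zeros.
r = [scores[doc_id] for doc_id in collection]
bias = gini(r)
print(r, round(bias, 3))
```

Under these assumptions, a higher Gini value indicates that retrieval is concentrated on fewer documents. Including unretrieved documents with a score of zero matters, since leaving them out would understate the bias.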

Author information

Authors and Affiliations

Department of Computer Science, Mohammad Ali Jinnah University, Islamabad, Pakistan
Shariq Bashir
TU Wien, Vienna, Austria
Andreas Rauber

Corresponding author

Correspondence to Andreas Rauber.

Editor information

Editors and Affiliations

Institute for Software Engineering & Interactive Systems, Vienna University of Technology, Vienna, Austria
Mihai Lupu
Research Platform Responsible Research and Innovation in Academic Practice, University of Vienna, Vienna, Austria
Katja Mayer
Information & Society Research Division, National Institute of Informatics, Tokyo, Japan
Noriko Kando
Patinformatics, LLC, Dublin, Ohio, USA
Anthony J. Trippe

About this chapter

Cite this chapter

Bashir, S., Rauber, A. (2017). Retrieval Models Versus Retrievability. In: Lupu, M., Mayer, K., Kando, N., Trippe, A. (eds) Current Challenges in Patent Information Retrieval. The Information Retrieval Series, vol 37. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53817-3_7

DOI: https://doi.org/10.1007/978-3-662-53817-3_7
Published: 26 March 2017
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-53816-6
Online ISBN: 978-3-662-53817-3
eBook Packages: Computer Science, Computer Science (R0)
