Document Filtering for Fast Ranking

Persin, Michael

doi:10.1007/978-1-4471-2099-5_35

Abstract

Ranking techniques are effective for finding answers in document collections, but the cost of evaluating ranked queries can be unacceptably high. We propose an evaluation technique that reduces both main memory usage and query evaluation time, based on early recognition of which documents are likely to be highly ranked. Our experiments show that, for our test data, the proposed technique evaluates queries in 20% of the time and 2% of the memory taken by the standard inverted file implementation, without degradation in retrieval effectiveness.
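
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of filtering documents early during ranked query evaluation, not the method described in the paper. It assumes a simple tf·idf weighting with no length normalisation, processes the rarest query terms first, and creates a score accumulator for a new document only when its contribution is at least a fraction of the best partial score seen so far. The names `build_index`, `filtered_rank`, and `insert_frac` are hypothetical and were chosen for illustration.

```python
import heapq
import math
from collections import defaultdict


def build_index(docs):
    """Build a toy inverted index: term -> list of (doc_id, tf) pairs."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        counts = defaultdict(int)
        for token in text.lower().split():
            counts[token] += 1
        for term, tf in counts.items():
            index[term].append((doc_id, tf))
    return dict(index)


def filtered_rank(query, index, n_docs, k=10, insert_frac=0.5):
    """Rank documents, creating accumulators only for promising documents.

    Assumption (not from the paper): a document that has no accumulator yet
    gets one only if its contribution for the current term is at least
    insert_frac times the best partial score so far. Documents that already
    have an accumulator are always updated. insert_frac=0.0 disables the
    filter and gives ordinary exhaustive evaluation.

    Returns (top_k, n_accumulators), where top_k is a list of
    (doc_id, score) pairs in decreasing score order.
    """
    terms = [t for t in set(query.lower().split()) if t in index]
    # Rarest terms first: they carry the highest idf weight.
    terms.sort(key=lambda t: len(index[t]))

    accumulators = {}
    best = 0.0
    for term in terms:
        idf = math.log(1.0 + n_docs / len(index[term]))
        for doc_id, tf in index[term]:
            contribution = tf * idf
            if doc_id in accumulators:
                accumulators[doc_id] += contribution
            elif contribution >= insert_frac * best:
                accumulators[doc_id] = contribution
            else:
                continue  # filtered out: no accumulator is allocated
            best = max(best, accumulators[doc_id])

    top_k = heapq.nlargest(k, accumulators.items(), key=lambda kv: kv[1])
    return top_k, len(accumulators)


if __name__ == "__main__":
    docs = ["cat cat cat dog", "cat dog", "dog dog bird", "fish"]
    index = build_index(docs)
    print(filtered_rank("cat dog", index, len(docs), k=2, insert_frac=0.0))
    print(filtered_rank("cat dog", index, len(docs), k=2, insert_frac=0.5))
```

In this sketch the memory saving comes from allocating fewer accumulators. A time saving would additionally require skipping postings that cannot pass the filter, which the sketch does not attempt.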

Author information

Authors and Affiliations

CITRI, Dept. of Computer Science, RMIT, 723 Swanston St., Carlton, Victoria, 3053, Australia
Michael Persin

Editor information

Editors and Affiliations

Department of Computer Science, University of Massachusetts, 01003, Amherst, MA, USA
Bruce W. Croft
Department of Computer Science, University of Glasgow, G12 8RZ, 8–17 Lilybank Gardens, Glasgow, Scotland
C. J. van Rijsbergen

About this paper

Cite this paper

Persin, M. (1994). Document Filtering for Fast Ranking. In: Croft, B.W., van Rijsbergen, C.J. (eds) SIGIR ’94. Springer, London. https://doi.org/10.1007/978-1-4471-2099-5_35

DOI: https://doi.org/10.1007/978-1-4471-2099-5_35
Publisher Name: Springer, London
Print ISBN: 978-3-540-19889-5
Online ISBN: 978-1-4471-2099-5
eBook Packages: Springer Book Archive
