Inverted Files

Synonyms

Full text inverted index; Inverted index; Postings file

Definition

An inverted file is an index data structure that maps content to its location within a database file, a document, or a set of documents. It normally comprises (i) a vocabulary containing all the distinct words found in a text and (ii) for each word t of the vocabulary, a list containing statistics about the occurrences of t in the text. This list is known as the inverted list of t. The inverted file is the most popular data structure used in document retrieval systems to support full-text search.
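
The two components named in the definition can be illustrated with a minimal Python sketch. It is not taken from the entry: the function names (`tokenize`, `build_inverted_file`, `search_all`), the sample documents, and the choice to store word positions as the per-document "statistics" are all assumptions made for illustration. Real systems typically add compression, ranking information, and on-disk layouts that are not shown here.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_file(docs):
    """Map each distinct word (the vocabulary) to its inverted list.

    Assumption: the inverted list of a word t is stored as
    {doc_id: [positions]}, so the frequency of t in a document
    is simply len(positions).
    """
    index = defaultdict(dict)
    for doc_id, text in enumerate(docs):
        for pos, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(pos)
    return dict(index)


def search_all(index, query):
    """Return the sorted ids of documents that contain every query word."""
    postings = [set(index.get(word, {})) for word in tokenize(query)]
    return sorted(set.intersection(*postings)) if postings else []


# Hypothetical three-document collection.
docs = [
    "the inverted file maps words to documents",
    "a vocabulary holds the distinct words",
    "each word has an inverted list",
]
index = build_inverted_file(docs)

print(index["inverted"])                    # inverted list of "inverted"
print(search_all(index, "inverted words"))  # documents containing both words
```

In this sketch a conjunctive query is answered by intersecting the sets of document ids found in the inverted lists of the query words, which is why the vocabulary lookup has to be fast and the lists are kept per word.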

Historical Background

Efforts to index electronic texts appear in the literature from the earliest days of computing systems. For example, descriptions of electronic information search systems able to index and search text can be found in the early 1950s [1].

In a seminal book published in 1968, Gerard Salton laid out the basis for modern information retrieval systems [2],...

Recommended Reading

  1. Luhn HP. A statistical approach to mechanized encoding and searching of literary information. IBM J Res Dev. 1957;1(4):309–17.

  2. Salton G. Automatic information organization and retrieval. New York: McGraw-Hill; 1968.

  3. de Moura ES, dos Santos CF, Fernandes DR, Silva AS, Calado P, Nascimento MA. Improving web search efficiency via a locality based static pruning method. In: Proceedings of the 14th International World Wide Web Conference; 2005. p. 235–44.

  4. Baeza-Yates R, Ribeiro-Neto B. Modern information retrieval. 2nd ed. Reading: Addison Wesley; 2011.

  5. Anh V, Moffat A. Pruned query evaluation using pre-computed impacts. In: Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2006. p. 372–9.

  6. Anh V, de Kretser O, Moffat A. Vector-space ranking with effective early termination. In: Proceedings of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2001. p. 35–42.

  7. Strohman T, Croft WB. Efficient document retrieval in main memory. In: Proceedings of the 30th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2007. p. 175–82.

  8. Ding S, Suel T. Faster top-k document retrieval using block-max indexes. In: Proceedings of the 34th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2011. p. 993–1002.

  9. Rossi C, de Moura ES, Carvalho AL, da Silva AS. Fast document-at-a-time query processing using two-tier indexes. In: Proceedings of the 36th International ACM SIGIR Conference on Research and Development in Information Retrieval; 2013. p. 183–92.

  10. Fontoura M, Josifovski V, Liu J, Venkatesan S, Zhu X, Zien J. Evaluation strategies for top-k queries over memory-resident inverted indexes. Proc VLDB Endow. 2011;4(12):1213–24.

  11. Long X, Suel T. Three-level caching for efficient query processing in large Web search engines. In: Proceedings of the 14th International World Wide Web Conference; 2005. p. 257–66.

  12. Kaszkiel M, Zobel J. Term-ordered query evaluation versus document-ordered query evaluation for large document databases. In: Proceedings of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1998. p. 343–4.

Author information

Correspondence to Edleno Silva De Moura.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

Cite this entry

De Moura, E.S., Cristo, M.A. (2018). Inverted Files. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1136
