Encyclopedia of Database Systems

Living Edition
| Editors: Ling Liu, M. Tamer Özsu

Inverted Files

  • Edleno Silva De Moura
  • Marco Antonio Cristo
Living reference work entry
DOI: https://doi.org/10.1007/978-1-4899-7993-3_1136-2



An inverted file is an index data structure that maps content to its locations within a database file, a document, or a set of documents. It is normally composed of (i) a vocabulary that contains all the distinct words found in a text and (ii) for each word t of the vocabulary, a list that records statistics about the occurrences of t in the text. This list is known as the inverted list of t. The inverted file is the most popular data structure used in document retrieval systems to support full-text search.
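The two components described above can be illustrated with a minimal sketch. It assumes a toy in-memory collection, simple lowercase word tokenization, and inverted lists that store a document identifier, a term frequency, and word positions; the function names build_inverted_file and search are hypothetical and chosen only for this example, and real systems typically compress the lists and keep them on disk.

```python
import re
from collections import defaultdict


def build_inverted_file(docs):
    """Return a dict mapping each distinct word to its inverted list.

    Each inverted list holds one entry per document containing the word:
    (doc_id, term_frequency, positions). This layout is an assumption made
    for illustration; other statistics could be stored instead.
    """
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        # Collect the positions of every word within this document.
        positions = defaultdict(list)
        for pos, word in enumerate(re.findall(r"\w+", text.lower())):
            positions[word].append(pos)
        # Append one posting per distinct word; doc_ids stay in increasing order.
        for word, pos_list in positions.items():
            index[word].append((doc_id, len(pos_list), pos_list))
    return dict(index)


def search(index, words):
    """Return the sorted ids of documents that contain every query word."""
    doc_sets = [{doc_id for doc_id, _, _ in index.get(w.lower(), [])} for w in words]
    if not doc_sets:
        return []
    return sorted(set.intersection(*doc_sets))


# Toy collection (hypothetical data, for illustration only).
docs = ["to be or not to be", "to do is to be", "do be do"]
index = build_inverted_file(docs)
vocabulary = sorted(index)  # the distinct words of the collection
```

Here the keys of index play the role of the vocabulary, and index["be"] is the inverted list of the word "be". A conjunctive query such as search(index, ["to", "do"]) is answered by intersecting the document identifiers of the relevant lists, without scanning the text itself.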

Historical Background

Efforts to index electronic texts appear in the literature from the earliest days of computing systems. For example, descriptions of electronic information search systems able to index and search text can be found in the early 1950s [1].

In a seminal 1968 book, Gerard Salton laid out the basis for modern information retrieval systems [2],...


Recommended Reading

  1. Luhn HP. A statistical approach to mechanized encoding and searching of literary information. IBM J Res Dev. 1957;1:309–17.
  2. Salton G. Automatic information organization and retrieval. New York: McGraw-Hill; 1968.
  3. de Moura ES, dos Santos CF, Fernandes DR, Silva AS, Calado P, Nascimento MA. Improving web search efficiency via a locality based static pruning method. Proceedings of 14th International World Wide Web Conference; 2005. p. 235–44.
  4. Baeza-Yates R, Ribeiro-Neto B. Modern information retrieval. 2nd ed. Reading: Addison Wesley; 2011.
  5. Anh V, Moffat A. Pruned query evaluation using pre-computed impacts. ACM SIGIR. 2006. p. 372–9.
  6. Anh V, Kretser O de, Moffat A. Vector-space ranking with effective early termination. ACM SIGIR. 2001. p. 35–42.
  7. Strohman T, Croft WB. Efficient document retrieval in main memory. ACM SIGIR. 2007. p. 175–82.
  8. Ding S, Suel T. Faster top-k document retrieval using block-max indexes. ACM SIGIR. 2011. p. 993–1002.
  9. Rossi C, de Moura ES, Carvalho AL, da Silva AS. Fast document-at-a-time query processing using two-tier indexes. ACM SIGIR. 2013. p. 183–92.
  10. Fontoura M, Josifovski V, Liu J, Venkatesan S, Zhu X, Zien J. Evaluation strategies for top-k queries over memory-resident inverted indexes. PVLDB. 2011;4(12):1213–24.
  11. Long X, Suel T. Three-level caching for efficient query processing in large Web search engines. Proceedings of 14th International World Wide Web Conference; 2005. p. 257–66.
  12. Kaszkiel M, Zobel J. Term-ordered query evaluation versus document-ordered query evaluation for large document databases. Proceedings of 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 1998. p. 343–4.

Copyright information

© Springer Science+Business Media LLC 2017

Authors and Affiliations

  • Edleno Silva De Moura (1)
  • Marco Antonio Cristo (2)
  1. Federal University of Amazonas, Manaus, Brazil
  2. FUCAPI, Manaus, Brazil

Section editors and affiliations

  • Mario A. Nascimento (1)
  1. Dept. of Computing Science, Univ. of Alberta, Edmonton, Canada