Signature Files

Encyclopedia of Database Systems

Definition

A signature file is an index structure that supports fast search over text data. It is typically very compact and aims to minimize disk accesses at query time. Query processing proceeds in two stages: filtering, in which false negatives are guaranteed not to occur but false positives may, and query refinement, in which the false positives are removed.
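
The two stages can be illustrated with a minimal sketch. It assumes superimposed coding, in which each word is hashed to a few bit positions and the word masks of a text block are OR-ed into one block signature; the entry itself does not prescribe an encoding, so this is one possible choice. The names `SIG_BITS`, `BITS_PER_WORD`, `word_signature`, `block_signature`, `build_signature_file`, and `search` are hypothetical and exist only for this example, and words are split on whitespace without handling punctuation.

```python
import hashlib

SIG_BITS = 64      # signature width in bits (illustrative value, an assumption)
BITS_PER_WORD = 3  # bit positions hashed per word (illustrative value)


def word_signature(word: str) -> int:
    """Hash a word to a bit mask with at most BITS_PER_WORD bits set."""
    sig = 0
    for i in range(BITS_PER_WORD):
        digest = hashlib.sha256(f"{i}:{word.lower()}".encode("utf-8")).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def block_signature(text: str) -> int:
    """OR together the masks of all whitespace-separated words in a block."""
    sig = 0
    for word in text.split():
        sig |= word_signature(word)
    return sig


def build_signature_file(blocks):
    """The 'signature file' here is simply one signature per text block."""
    return [block_signature(block) for block in blocks]


def search(term, blocks, signatures):
    """Return (candidates, hits) for a single-term query."""
    query = word_signature(term)
    # Stage 1, filtering: linear scan of the signatures. A block containing
    # the term always has all query bits set, so no false negatives occur;
    # other words may set the same bits, so false positives are possible.
    candidates = [i for i, sig in enumerate(signatures) if sig & query == query]
    # Stage 2, refinement: inspect the actual text to drop false positives.
    hits = [i for i in candidates if term.lower() in blocks[i].lower().split()]
    return candidates, hits


blocks = [
    "signature files allow fast search",
    "inverted files are a standard index",
    "disk seeks are expensive",
]
signatures = build_signature_file(blocks)
candidates, hits = search("signature", blocks, signatures)
```

In this sketch `hits` is always a subset of `candidates`. How many extra candidates the filter lets through depends on the signature width and on how many words are superimposed per block.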

Historical Background

Efficient and effective text indexing is a well-known and long-standing problem in information retrieval. While inverted files are the de facto standard for text indexing, in the early days their storage overhead was not acceptable for larger datasets. In addition, accessing an inverted file on disk may require a relatively large number of (expensive) disk seeks. The main motivation for signature files is to allow fast filtering of text by linearly scanning the signature file to find the text segments that may contain the queried term(s). Given that the found segments may be false positives, a refinement step is...



Author information

Corresponding author

Correspondence to Mario A. Nascimento.

Copyright information

© 2018 Springer Science+Business Media, LLC, part of Springer Nature

About this entry

Cite this entry

Nascimento, M.A. (2018). Signature Files. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1138
