Definition
A signature file is an index structure that allows fast search over text data. It is typically very compact and aims to minimize disk accesses at query time. Query processing is performed in two stages: filtering, in which false negatives are guaranteed not to occur but false positives may, and query refinement, in which the false positives are removed.
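The two stages can be illustrated with a minimal sketch. It assumes superimposed coding, one common way to build signatures; the definition above does not prescribe a particular scheme. The signature width, the number of bits set per term, and all function names are hypothetical choices made for illustration only.

```python
import hashlib

SIG_BITS = 64       # signature width in bits (assumed for illustration)
BITS_PER_TERM = 3   # hash positions set per term (assumed for illustration)


def term_signature(term):
    """Map a term to a bit mask with at most BITS_PER_TERM bits set."""
    sig = 0
    for i in range(BITS_PER_TERM):
        digest = hashlib.sha256(f"{i}:{term.lower()}".encode()).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def block_signature(text):
    """Superimpose (bitwise OR) the signatures of all terms in a text block."""
    sig = 0
    for term in text.split():
        sig |= term_signature(term)
    return sig


def filter_candidates(query_terms, signatures):
    """Stage 1: linear scan of the signatures.

    A block qualifies if every bit of the query signature is set in its
    signature. Blocks containing all query terms always qualify (no false
    negatives); other blocks may qualify by coincidence (false positives).
    """
    query_sig = 0
    for term in query_terms:
        query_sig |= term_signature(term)
    return [i for i, s in enumerate(signatures) if (s & query_sig) == query_sig]


def refine(query_terms, blocks, candidates):
    """Stage 2: check the actual text of each candidate to drop false positives."""
    wanted = {t.lower() for t in query_terms}
    return [i for i in candidates
            if wanted <= {w.lower() for w in blocks[i].split()}]


def search(query_terms, blocks, signatures):
    """Run filtering followed by refinement."""
    return refine(query_terms, blocks, filter_candidates(query_terms, signatures))


blocks = [
    "signature files allow fast search",
    "inverted files are a standard",
    "disk seeks are expensive",
]
signatures = [block_signature(b) for b in blocks]
matches = search(["signature", "files"], blocks, signatures)
```

In this sketch only the list of integer signatures is scanned during filtering; the text blocks are read only for the candidates that survive, which is where the saving in disk accesses would come from in a disk-based setting.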
Historical Background
Efficient and effective text indexing is a well-known and long-standing problem in information retrieval. While inverted files are a de facto standard for text indexing, in the early days their storage overhead was not acceptable for larger datasets. In addition, accessing an inverted file on disk may require a relatively large number of (expensive) disk seeks. The main motivation for signature files is to allow fast filtering of text by linearly scanning the signature file to find text segments that may contain the queried term(s). Given that the found segments may be false positives, a refinement step is...
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Nascimento, M.A. (2018). Signature Files. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_1138
DOI: https://doi.org/10.1007/978-1-4614-8265-9_1138
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering