Abstract
We present a new static dictionary that is very fast and compact, and also extremely easy to implement. A combination of properties makes this algorithm very attractive for applications requiring large static dictionaries:
- High performance, with membership queries taking O(1)-time with a near-optimal constant.
- Continued high performance in external memory, with queries requiring only 1-2 disk seeks. If the dictionary has n items in \(\{0, \ldots, m-1\}\) and d is the number of bytes retrieved from disk on each read, then the average number of seeks is \(\min\left(1.63, 1 + O\left( \frac{\sqrt{n} \log m}{d} \right)\right)\).
- Efficient use of space, storing n items from a universe of size m in \(n \log m - \frac{1}{2} n \log n + O\left(n + \log \log m\right)\) bits. We prove this space bound with a novel application of the Kolmogorov-Smirnov distribution.
- Simplicity, with a 20-line pseudo-code construction algorithm and 4-line query algorithm.
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schneider, S., Spertus, M. (2009). A Simple, Fast, and Compact Static Dictionary. In: Dong, Y., Du, DZ., Ibarra, O. (eds) Algorithms and Computation. ISAAC 2009. Lecture Notes in Computer Science, vol 5878. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10631-6_86
DOI: https://doi.org/10.1007/978-3-642-10631-6_86
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10630-9
Online ISBN: 978-3-642-10631-6
eBook Packages: Computer Science, Computer Science (R0)