A Simple, Fast, and Compact Static Dictionary

  • Conference paper
Algorithms and Computation (ISAAC 2009)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5878)

Abstract

We present a new static dictionary that is very fast and compact, while also being extremely easy to implement. A combination of properties makes this algorithm very attractive for applications requiring large static dictionaries:

  1. High performance, with membership queries taking O(1)-time with a near-optimal constant.

  2. Continued high performance in external memory, with queries requiring only 1-2 disk seeks. If the dictionary has n items in \(\left\{ 0, ..., m\!-\!1 \right\}\) and d is the number of bytes retrieved from disk on each read, then the average number of seeks is \(\min\left(1.63, 1 + O\left( \frac{\sqrt{n} \log m}{d} \right)\right)\).

  3. Efficient use of space, storing n items from a universe of size m in \(n \log m - \frac{1}{2} n \log n + O\left(n + \log \log m\right)\) bits. We prove this space bound with a novel application of the Kolmogorov-Smirnov distribution.

  4. Simplicity, with a 20-line pseudo-code construction algorithm and 4-line query algorithm.
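
To give a feel for the two bounds stated above, the following is a rough numerical sketch, not the paper's algorithm. It evaluates only the leading terms of the space bound and compares them with a plain list of n fixed-width keys and with the information-theoretic minimum \(\log_2 \binom{m}{n}\). It assumes base-2 logarithms, drops the \(O(n + \log \log m)\) term, and uses a hypothetical constant `c` in place of the constant hidden by the O-notation in the seek bound. All function names are illustrative.

```python
import math


def leading_space_bits(n: int, m: int) -> float:
    """Leading terms n*log2(m) - (1/2)*n*log2(n) of the stated space bound.

    The O(n + log log m) term is omitted because its constant is not
    given in the abstract.
    """
    return n * math.log2(m) - 0.5 * n * math.log2(n)


def explicit_list_bits(n: int, m: int) -> int:
    """Bits used by a plain array of n keys, each ceil(log2(m)) bits wide."""
    return n * math.ceil(math.log2(m))


def info_theoretic_min_bits(n: int, m: int) -> float:
    """log2 of C(m, n): the minimum bits needed to identify an n-subset."""
    ln_binom = math.lgamma(m + 1) - math.lgamma(n + 1) - math.lgamma(m - n + 1)
    return ln_binom / math.log(2)


def average_seeks_bound(n: int, m: int, d: int, c: float = 1.0) -> float:
    """min(1.63, 1 + c*sqrt(n)*log2(m)/d), with c a hypothetical constant
    standing in for the one hidden by the O-notation."""
    return min(1.63, 1.0 + c * math.sqrt(n) * math.log2(m) / d)


if __name__ == "__main__":
    n, m = 2**20, 2**32  # about a million 32-bit keys
    print("explicit list      :", explicit_list_bits(n, m), "bits")
    print("leading-term bound :", leading_space_bits(n, m), "bits")
    print("log2 C(m, n)       :", round(info_theoretic_min_bits(n, m)), "bits")
    for d in (4096, 2**20):
        print(f"seek bound, d={d}:", average_seeks_bound(n, m, d))
```

With these example parameters the leading terms fall between the information-theoretic minimum and the plain list. The seek estimate stays at the 1.63 cap for small reads and approaches 1 only when d is large relative to \(\sqrt{n} \log m\).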

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schneider, S., Spertus, M. (2009). A Simple, Fast, and Compact Static Dictionary. In: Dong, Y., Du, DZ., Ibarra, O. (eds) Algorithms and Computation. ISAAC 2009. Lecture Notes in Computer Science, vol 5878. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10631-6_86

  • DOI: https://doi.org/10.1007/978-3-642-10631-6_86

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10630-9

  • Online ISBN: 978-3-642-10631-6

  • eBook Packages: Computer Science, Computer Science (R0)
