
Succinct Data Structures for Retrieval and Approximate Membership (Extended Abstract)

  • Conference paper
Automata, Languages and Programming (ICALP 2008)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 5125)

Abstract

The retrieval problem is the problem of associating data with keys in a set. Formally, the data structure must store a function \(f\colon U\to \{0,1\}^r\) that has specified values on the elements of a given set S ⊆ U, |S| = n, but may have any value on elements outside S. All known methods (e.g., those based on perfect hash functions) induce a space overhead of Θ(n) bits over the optimum, regardless of the evaluation time. We show that for any k, query time O(k) can be achieved using space that is within a factor \(1 + e^{-k}\) of optimal, asymptotically for large n. The expected time to construct the data structure is O(n). If we allow logarithmic evaluation time, the additive overhead can be reduced to O(log log n) bits with high probability. A general reduction transfers the results on retrieval into analogous results on approximate membership, a problem traditionally addressed using Bloom filters. Thus we obtain space bounds arbitrarily close to the lower bound for this problem as well. The evaluation procedures of our data structures are extremely simple. For the results stated above we assume free access to fully random hash functions. This assumption can be justified using space o(n) to simulate full randomness on a RAM.
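The reduction from approximate membership to retrieval can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the construction from the paper: all function names are hypothetical, SHA-256 stands in for the fully random hash functions the abstract assumes, and the table is sized with an arbitrary slack factor of 1.25 rather than the \(1 + e^{-k}\) factor. Each key is hashed to k table positions, and the table is filled by solving one global linear system over GF(2) so that the XOR of a key's entries equals its stored value. This naive solve makes no attempt at the O(n) expected construction time. An approximate membership structure is then obtained by storing an r-bit fingerprint of each key in the retrieval structure and comparing at query time.

```python
import hashlib


def _hash(tag, key, modulus):
    """Deterministic stand-in for a fully random hash function (assumption)."""
    digest = hashlib.sha256(f"{tag}|{key}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % modulus


def _positions(key, m, k, seed):
    """The k table positions probed for `key` (repeats are allowed)."""
    return [_hash(f"pos{seed}.{i}", key, m) for i in range(k)]


def build_retrieval(values, k=3, slack=1.25, max_tries=50):
    """Build a table so that XOR-ing a key's k entries yields values[key].

    `values` maps keys to non-negative ints. `slack` and `max_tries` are
    illustrative parameters, not taken from the paper.
    """
    m = max(k, int(slack * len(values)) + 1)
    for seed in range(max_tries):
        pivots = {}  # highest set bit -> (row bitmask, right-hand side)
        consistent = True
        for key, val in values.items():
            mask = 0
            for p in _positions(key, m, k, seed):
                mask ^= 1 << p  # repeated positions cancel, as in retrieve()
            # Reduce the row against existing pivots (Gaussian elimination).
            while mask:
                top = mask.bit_length() - 1
                if top not in pivots:
                    pivots[top] = (mask, val)
                    break
                pmask, pval = pivots[top]
                mask ^= pmask
                val ^= pval
            else:
                if val:  # row reduced to 0 = nonzero: unsolvable, new seed
                    consistent = False
                    break
        if not consistent:
            continue
        # Back-substitution: free entries stay 0, pivots in increasing order.
        table = [0] * m
        for top in sorted(pivots):
            mask, val = pivots[top]
            rest = mask ^ (1 << top)
            while rest:
                low = rest & -rest
                val ^= table[low.bit_length() - 1]
                rest ^= low
            table[top] = val
        return table, seed, k
    raise RuntimeError("no solvable system found; increase slack or max_tries")


def retrieve(table, seed, k, key):
    """Evaluate f(key): XOR of k table entries. Arbitrary for keys outside S."""
    acc = 0
    for p in _positions(key, len(table), k, seed):
        acc ^= table[p]
    return acc


def build_filter(keys, r=8):
    """Approximate membership: store an r-bit fingerprint of every key."""
    values = {x: _hash("fp", x, 1 << r) for x in keys}
    table, seed, k = build_retrieval(values)
    return table, seed, k, r


def maybe_contains(filt, key):
    """True for every stored key; true for other keys with probability
    about 2**-r, provided the hashes behave like random functions."""
    table, seed, k, r = filt
    return retrieve(table, seed, k, key) == _hash("fp", key, 1 << r)
```

With `filt = build_filter(keys, r=8)`, every stored key passes `maybe_contains(filt, key)`, while a key outside the set passes only when its fingerprint happens to match the value the table produces for it. The sketch uses about 1.25·n·r bits for the table, so it does not reach the space bounds stated in the abstract.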

The main ideas for this paper were conceived while the authors were participating in the 2006 Seminar on Data Structures at IBFI Schloss Dagstuhl, Germany.



Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Dietzfelbinger, M., Pagh, R. (2008). Succinct Data Structures for Retrieval and Approximate Membership (Extended Abstract). In: Aceto, L., Damgård, I., Goldberg, L.A., Halldórsson, M.M., Ingólfsdóttir, A., Walukiewicz, I. (eds) Automata, Languages and Programming. ICALP 2008. Lecture Notes in Computer Science, vol 5125. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70575-8_32

  • DOI: https://doi.org/10.1007/978-3-540-70575-8_32

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70574-1

  • Online ISBN: 978-3-540-70575-8

  • eBook Packages: Computer Science, Computer Science (R0)
