Abstract
Bloom filtering is an important technique for space-efficient storage of a conservative approximation of a set S. The stored set may have up to some specified number of “false positive” members, but all elements of S are included. In this paper we consider lossy dictionaries, which are also allowed to have “false negatives”. The aim is to maximize the weight of the included keys within a given space constraint. This relaxation allows a very fast and simple data structure that makes almost optimal use of memory. Because our data structure is more time-efficient than Bloom filters, we believe it is well suited to replacing them in some applications. Also, the fact that it supports information associated with keys paves the way for new uses, as illustrated by an application in lossy image compression.
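To make the notions of false positives, false negatives, and key weights concrete, the following is a minimal toy sketch in Python. It is not the construction from the paper: it assumes a single table with one hash function per key, stores only a short fingerprint and an associated value in each cell, and lets the heaviest key win whenever several keys map to the same cell. The names `ToyLossyDict`, `build`, `lookup`, `num_cells`, and `fp_bits`, as well as the use of BLAKE2b as the hash function, are illustrative assumptions.

```python
import hashlib


def _hash(key: str, salt: str) -> int:
    """Deterministic 64-bit hash of `key` under `salt` (illustrative choice)."""
    digest = hashlib.blake2b(
        key.encode(), digest_size=8, salt=salt.encode()
    ).digest()
    return int.from_bytes(digest, "big")


class ToyLossyDict:
    """Toy lossy dictionary: one cell per key, heaviest key wins a cell.

    Assumption: this is a simplified illustration, not the paper's scheme.
    """

    def __init__(self, num_cells: int, fp_bits: int = 8):
        self.num_cells = num_cells
        self.fp_mask = (1 << fp_bits) - 1
        # Each cell holds (fingerprint, value) or None if empty.
        self.cells = [None] * num_cells

    def _cell(self, key: str) -> int:
        return _hash(key, "cell") % self.num_cells

    def _fingerprint(self, key: str) -> int:
        return _hash(key, "fp") & self.fp_mask

    @classmethod
    def build(cls, weighted_items, num_cells: int, fp_bits: int = 8):
        """Build from an iterable of (key, value, weight) triples."""
        d = cls(num_cells, fp_bits)
        # Process keys by decreasing weight; the first key to claim a cell
        # keeps it, so lighter colliding keys are dropped (false negatives).
        for key, value, _weight in sorted(weighted_items, key=lambda t: -t[2]):
            i = d._cell(key)
            if d.cells[i] is None:
                d.cells[i] = (d._fingerprint(key), value)
        return d

    def lookup(self, key: str):
        """Return the stored value, or None if the key is reported absent.

        A key that was never stored can still match a fingerprint by
        chance, which yields a false positive.
        """
        entry = self.cells[self._cell(key)]
        if entry is not None and entry[0] == self._fingerprint(key):
            return entry[1]
        return None


if __name__ == "__main__":
    items = [("alpha", 1, 10.0), ("beta", 2, 5.0), ("gamma", 3, 1.0)]
    d = ToyLossyDict.build(items, num_cells=4)
    for key, _value, _weight in items:
        print(key, d.lookup(key))
```

In this sketch the space budget is fixed by `num_cells` and `fp_bits`, and the greedy order only approximates the goal of maximizing the total weight of the keys that are kept. A shorter fingerprint saves space at the cost of more false positives.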
Partially supported by the IST Programme of the EU under contract number IST-1999-14186 (ALCOM-FT).
Basic Research in Computer Science (http://www.brics.dk), funded by the Danish National Research Foundation.
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pagh, R., Rodler, F.F. (2001). Lossy Dictionaries. In: auf der Heide, F.M. (eds) Algorithms — ESA 2001. ESA 2001. Lecture Notes in Computer Science, vol 2161. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44676-1_25
DOI: https://doi.org/10.1007/3-540-44676-1_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42493-2
Online ISBN: 978-3-540-44676-7
eBook Packages: Springer Book Archive