Abstract
The main topic of this book is implementing hash tables; it’s only secondarily about hash functions. This is why you have assumed a priori that you have uniformly distributed hash keys. In reality, this is unlikely to be the case; real data are rarely random samples from the space of possible data values. In this chapter, you will learn about commonly used heuristic hash functions. In the next chapter, you will see an approach to achieving stronger probabilistic guarantees.
Notes
1. For all but two of the tables, those of size 64 and size 67, this means that the load is higher than 1, so this would work only with chained hashing. The purpose of the examples in this chapter, however, is merely to show how keys are distributed over bins with tables of different sizes, so don’t worry about conflict resolution and load.
2. When I say deterministic here, I mean that a hash function should always produce the same output on the same input. There are plenty of randomized hash functions, in the sense that they use random numbers as part of their construction. You fix these random numbers when you use the function to hash application keys. You can change from one hash function to another by picking new random numbers, but you can’t change them at arbitrary times if you want your function to consistently give you the same output for the same input. Universal hashing, which will be discussed in the next chapter, uses random numbers to create deterministic hash functions.
3. The simplest hash function I have seen was used to hash ASCII strings and used only the first character. Standard ASCII has only 128 characters (7 bits per character), and Extended ASCII has 256, so such a function can produce at most 128 or 256 distinct values. That is not the worst part, however. Common words, such as variable names in a program, do not use the full set of ASCII characters, so even fewer values occur in practice. Using only the first character of a string is a very poor hash function.
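The last two notes can be illustrated with a small sketch. It is not code from the book: the function names (`first_char_hash`, `make_string_hash`, `bins_used`), the sample keys, and the choice of a polynomial hash with a random odd multiplier are all assumptions made for illustration. The first function is the first-character scheme from note 3. The second draws a random multiplier once and then behaves deterministically, in the sense of note 2.

```python
import random


def first_char_hash(key: str) -> int:
    """Hash that looks only at the first character (the poor scheme in note 3)."""
    return ord(key[0]) if key else 0


def make_string_hash(seed: int):
    """Build a hash function whose multiplier is drawn at random once.

    Hypothetical helper: after construction the multiplier never changes,
    so the returned function is deterministic (note 2).
    """
    rng = random.Random(seed)
    a = rng.randrange(1, 2**32, 2)  # random odd multiplier, fixed from here on

    def h(key: str) -> int:
        value = 0
        for ch in key:
            # Simple polynomial hash over all characters, kept to 32 bits.
            value = (value * a + ord(ch)) % 2**32
        return value

    return h


def bins_used(hash_function, keys, table_size: int) -> int:
    """Count how many distinct bins the keys land in for a given table size."""
    return len({hash_function(k) % table_size for k in keys})


# Illustrative keys that look like variable names; they share few first letters.
keys = ["index", "item", "count", "cursor", "name", "node",
        "next", "length", "left", "right", "result", "total"]

h = make_string_hash(seed=1)

print(bins_used(first_char_hash, keys, 64))  # 6: one bin per distinct first letter
print(bins_used(h, keys, 64))                # depends on the multiplier drawn
print(h("index") == h("index"))              # True: same input, same output
```

With these twelve keys, the first-character hash can never use more than six bins, however large the table is, because the keys start with only six different letters. A function that reads every character has no such built-in ceiling, although how well it spreads keys still depends on the multiplier and the table size.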
Copyright information
© 2019 Thomas Mailund
About this chapter
Cite this chapter
Mailund, T. (2019). Heuristic Hash Functions. In: The Joys of Hashing. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-4066-3_6
DOI: https://doi.org/10.1007/978-1-4842-4066-3_6
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-4065-6
Online ISBN: 978-1-4842-4066-3
eBook Packages: Professional and Applied Computing, Apress Access Books, Professional and Applied Computing (R0)