Heuristic Hash Functions

The Joys of Hashing

Abstract

The main topic of this book is implementing hash tables; it’s only secondarily about hash functions. This is why you have so far assumed that your hash keys are uniformly distributed. In reality, this is unlikely to be the case; real data are rarely random samples from the space of possible data values. In this chapter, you will learn about commonly used heuristic hash functions. In the next chapter, you will see an approach to achieving stronger probabilistic guarantees.

Notes

  1. For all but two of the tables, that of size 64 and that of size 67, this means that the load is higher than 1, so this obviously will only work for chained hashing. The purpose of the examples in this chapter, however, is merely to show how keys are distributed over bins with tables of different sizes, so don’t worry about conflict resolution and load.

  2. When I say deterministic here, I mean that a hash function should always produce the same output on the same input. There are plenty of randomized hash functions, in the sense that they use random numbers as part of their construction. You fix these random numbers when you use the function to hash application keys. You can change from one hash function to another by picking new random numbers, but you can’t change them at arbitrary times if you want your function to consistently give you the same output for the same input. Universal hashing, which will be discussed in the next chapter, uses random numbers to create deterministic hash functions.

  3. The simplest hash function I have seen was used to hash ASCII strings and only used the first character. Standard ASCII has only 128 characters (7 bits per character), and Extended ASCII has 256, so such a function can produce at most that many distinct hash values. That is not the worst part, however. If you hash common words, such as variable names in a program, then they do not use the full set of ASCII characters. Using only the first character of a string is a very poor hash function.
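The points in notes 2 and 3 can be illustrated with a minimal sketch in Python. Everything in it is an assumption made for illustration rather than code from the chapter: the function names (`first_char_hash`, `make_seeded_hash`, `bins_used`), the sample identifiers, and the simple multiply-and-add string hash used as the seeded function. The first function hashes a string by its first character only. The second draws a random multiplier once, when the function is constructed, and then returns a function that always gives the same output for the same input.

```python
import random


def first_char_hash(key: str) -> int:
    """Hash a non-empty ASCII string by its first character only (note 3)."""
    return ord(key[0])


def make_seeded_hash(seed: int, bits: int = 32):
    """Build a deterministic hash function from random parameters (note 2).

    Hypothetical multiply-and-add string hash, not the chapter's function.
    The random multiplier is drawn once, here, and never changes afterwards.
    """
    rng = random.Random(seed)
    mask = (1 << bits) - 1
    multiplier = rng.randrange(1, 1 << bits) | 1  # force an odd multiplier

    def hash_fn(key: str) -> int:
        value = 0
        for byte in key.encode("ascii"):
            value = (value * multiplier + byte) & mask
        return value

    return hash_fn


def bins_used(hash_fn, keys, table_size: int) -> int:
    """Count how many distinct bins the keys land in for a given table size."""
    return len({hash_fn(key) % table_size for key in keys})


# Assumed sample of identifier-like keys; only four distinct first letters.
EXAMPLE_KEYS = ["index", "item", "i", "count", "counter",
                "cache", "size", "start", "string", "tmp"]

if __name__ == "__main__":
    seeded = make_seeded_hash(seed=2019)
    for size in (64, 67):
        print(size,
              bins_used(first_char_hash, EXAMPLE_KEYS, size),
              bins_used(seeded, EXAMPLE_KEYS, size))
```

With these sample keys, `first_char_hash` can never occupy more than four bins, whatever the table size, because the keys start with only four different letters. Calling `make_seeded_hash` again with the same seed gives a function with the same multiplier; picking a new seed gives a different function, which is the sense in which you can switch hash functions but not in the middle of using a table.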

Copyright information

© 2019 Thomas Mailund

About this chapter

Cite this chapter

Mailund, T. (2019). Heuristic Hash Functions. In: The Joys of Hashing. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-4066-3_6