Locality-Adaptive Parallel Hash Joins Using Hardware Transactional Memory

Shanbhag, Anil; Pirk, Holger; Madden, Sam

doi:10.1007/978-3-319-56111-0_7

Anil Shanbhag¹⁸,
Holger Pirk¹⁸ &
Sam Madden¹⁸

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 10195))

Included in the following conference series:

745 Accesses
1 Citations

Abstract

Previous work [1] has claimed that the best performing implementation of in-memory hash joins is based on (radix-)partitioning of the build-side input. Indeed, despite the overhead of partitioning, the benefits from increased cache-locality and synchronization free parallelism in the build-phase outweigh the costs when the input data is randomly ordered. However, many datasets already exhibit significant spatial locality (i.e., non-randomness) due to the way data items enter the database: through periodic ETL or trickle loaded in the form of transactions. In such cases, the first benefit of partitioning — increased locality — is largely irrelevant. In this paper, we demonstrate how hardware transactional memory (HTM) can render the other benefit, freedom from synchronization, irrelevant as well.

Specifically, using careful analysis and engineering, we develop an adaptive hash join implementation that outperforms parallel radix-partitioned hash joins as well as sort-merge joins on data with high spatial locality. In addition, we show how, through lightweight (less than 1% overhead) runtime monitoring of the transaction abort rate, our implementation can detect inputs with low spatial locality and dynamically fall back to radix-partitioning of the build-side input. The result is a hash join implementation that is more than 3 times faster than the state-of-the-art on high-locality data and never more than 1% slower.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 39.99; Price excludes VAT (USA)

Softcover Book: USD 54.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Notes

1.
https://sites.google.com/site/murmurhash.
2.
Earlier implementations suffered from a bug that caused Intel to deactivate the feature in a microcode update.
3.
Note that fully shuffled unique data items have even worse locality than uniform randomly generated (non-unique) data items because there is zero probability for re-accessing a data item.
4.
The ratio is even higher for smaller datatypes.
5.
We considered breaking it down by abort code but found no useful correlation (see Appendix A).

References

Balkesen, C., et al.: Main-memory hash joins on multi-core CPUs: tuning to the underlying hardware. In: ICDE (2013)
Google Scholar
Blanas, S., Li, Y., Patel, J.M.: Design and evaluation of main memory hash join algorithms for multi-core CPUs. In: SIGMOD (2011)
Google Scholar
Hennessy, J.L., Patterson, D.A.: Computer Architecture: A Quantitative Approach. Elsevier, Amsterdam (2011)
MATH Google Scholar
Herlihy, M., Moss, J.E.B.: Transactional memory: architectural support for lock-free data structures. ACM (1993)
Google Scholar
Jacobi, C., Slegel, T., Greiner, D.: Transactional memory architecture and implementation for IBM system Z. In: MICRO (2012)
Google Scholar
Kim, C., et al.: Sort vs. hash revisited: fast join implementation on modern multi-core CPUs. In: VLDB (2009)
Google Scholar
Leis, V., Kemper, A., Neumann, T.: Exploiting hardware transactional memory in main-memory databases. In: ICDE (2014)
Google Scholar
Makreshanski, D., Levandoski, J., Stutsman, R.: To lock, swap, or elide: on the interplay of hardware transactional memory and lock-free indexing. Proc. VLDB Endow. 8(11), 1298–1309 (2015)
Article Google Scholar
Manegold, S., Boncz, P., Kersten, M.: Optimizing main-memory join on modern hardware. In: TKDE (2002)
Google Scholar
Peters, T.: Description of timsort. http://bugs.python.org/file4451/timsort.txt
Rogaway, P., Shrimpton, T.: Cryptographic hash-function basics: definitions, implications, and separations for preimage resistance, second-preimage resistance, and collision resistance. In: Roy, B., Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp. 371–388. Springer, Heidelberg (2004). doi:10.1007/978-3-540-25937-4_24
Chapter Google Scholar
Shavit, N., Touitou, D.: Software transactional memory. Distrib. Comput. 10, 99–116 (1997)
Article Google Scholar
Tran, K.Q., Blanas, S., Naughton, J.F.: On transactional memory, spinlocks, and database transactions. In: ADMS (2010)
Google Scholar
Yoo, R.M., et al.: Performance evaluation of Intel transactional synchronization extensions for high-performance computing. In: SC (2013)
Google Scholar

Download references

Author information

Authors and Affiliations

MIT CSAIL, Cambridge, USA
Anil Shanbhag, Holger Pirk & Sam Madden

Authors

Anil Shanbhag
View author publications
You can also search for this author in PubMed Google Scholar
Holger Pirk
View author publications
You can also search for this author in PubMed Google Scholar
Sam Madden
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Anil Shanbhag .

Editor information

Editors and Affiliations

Ohio State University, Columbus, Ohio, USA
Spyros Blanas
IBM Thomas J Watson Research Center, Yorktown Heights, New York, USA
Rajesh Bordawekar
Oracle Cor., Redwood Shores, California, USA
Tirthankar Lahiri
Computer Science Department, Microsoft Corporation, Redmond, Washington, USA
Justin Levandoski
Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
Andrew Pavlo

Appendices

A Transaction Abort Breakdown

When studying the pseudocode of our approach in Sect. 4.6, a reader may note that one might use the status code to determine the reason for aborted transactions and use this as runtime feedback. Figure 8 shows a breakdown of the reason for aborts observed with when running the TSize-Adaptive implementation (when varying the shuffle window size). Capacity aborts occur if transaction working set exceeds L1 cache size or if more than A cache lines of the same cache set are accessed, where A is the L1 cache associativity. Conflict abort happens if two transactions read/write sets overlap. The main reason we cannot use this information as runtime feedback is that most transaction aborts have return code set to 0, i.e., giving no information about the reason for abort (\(RC = 0\) line) and degree of noise is high for the other return codes.

B Non-unique Inputs

While we consider the problem of efficient conflict handling out of scope of this paper, we still consider it important to establish that the presented techniques do not prevent conflict handling. To illustrate this, consider Fig. 9 which corresponds to Fig. 7 run on uniform randomly generated integers in the domain 1 to n (the size of the input) which, naturally, includes duplicate values. As in Fig. 7, the experiment is to perform the full join but only counting the number of matches. As stated in Sect. 4, we use bucket-chaining to handle overflows of the buckets and re-inserting to handle aborted transaction. The figure shows a similar pattern to Fig. 7 but exposes a suboptimal configuration of the threshold for switching to the radix-partitioned implementation. This indicates that a conflict-aware fallback strategy may be worthwhile.

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Shanbhag, A., Pirk, H., Madden, S. (2017). Locality-Adaptive Parallel Hash Joins Using Hardware Transactional Memory. In: Blanas, S., Bordawekar, R., Lahiri, T., Levandoski, J., Pavlo, A. (eds) Data Management on New Hardware. ADMS IMDM 2016 2016. Lecture Notes in Computer Science(), vol 10195. Springer, Cham. https://doi.org/10.1007/978-3-319-56111-0_7

Download citation

DOI: https://doi.org/10.1007/978-3-319-56111-0_7
Published: 23 March 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-56110-3
Online ISBN: 978-3-319-56111-0
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics