
Parallel Hash Join Algorithms for Dynamic Load Balancing in a Shared Disks Cluster

  • Conference paper
Computational Science and Its Applications - ICCSA 2006 (ICCSA 2006)

Part of the book series: Lecture Notes in Computer Science ((LNTCS,volume 3984))

Abstract

Most previous parallel join algorithms assume a shared nothing (SN) cluster, where each database partition is owned by a single processing node. While an SN cluster can interconnect a large number of nodes and support a geographically distributed environment, it offers weaker support for load balancing and system availability than a shared disks (SD) cluster. In this paper, we first propose a dynamic load balancing strategy that exploits the characteristics of an SD cluster. We then parallelize conventional hash join algorithms using this strategy, and we evaluate the resulting parallel join algorithms with a simulation model of an SD cluster. The experimental results show that the proposed parallel join algorithms have greater potential for dynamic load balancing, owing to the inherent flexibility of the SD cluster.
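The abstract does not spell out the proposed strategy, so the following is only a minimal sketch of the general idea it relies on, not the authors' algorithm. It assumes that, because every node in an SD cluster can read every partition, join work can be handed out bucket by bucket from a shared task queue, so a node that finishes early simply takes the next bucket. All names here (`partition`, `join_bucket`, `parallel_hash_join`, `num_nodes`, `num_buckets`) are hypothetical, threads stand in for cluster nodes, and in-memory dictionaries stand in for the shared disks.

```python
import queue
import threading
from collections import defaultdict


def partition(rows, num_buckets):
    """Hash-partition rows into buckets on the join key (column 0)."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[hash(row[0]) % num_buckets].append(row)
    return buckets


def join_bucket(r_rows, s_rows):
    """Build a hash table on the R bucket, then probe it with the S bucket."""
    table = defaultdict(list)
    for r in r_rows:
        table[r[0]].append(r)
    return [(r, s) for s in s_rows for r in table.get(s[0], [])]


def parallel_hash_join(r, s, num_nodes=3, num_buckets=8):
    """Join r and s; each simulated node pulls buckets until none remain."""
    # Assumption: both bucket sets live on storage that every node can read.
    r_buckets = partition(r, num_buckets)
    s_buckets = partition(s, num_buckets)

    tasks = queue.Queue()
    for bucket_id in range(num_buckets):
        tasks.put(bucket_id)

    results = []
    buckets_done = [0] * num_nodes  # how many buckets each node processed
    lock = threading.Lock()

    def node(node_id):
        while True:
            try:
                bucket_id = tasks.get_nowait()
            except queue.Empty:
                return  # no work left for this node
            matches = join_bucket(r_buckets.get(bucket_id, []),
                                  s_buckets.get(bucket_id, []))
            with lock:
                results.extend(matches)
                buckets_done[node_id] += 1

    threads = [threading.Thread(target=node, args=(i,))
               for i in range(num_nodes)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, buckets_done


if __name__ == "__main__":
    r = [(1, "a"), (2, "b"), (2, "c"), (3, "d")]
    s = [(2, "x"), (3, "y"), (4, "z")]
    joined, per_node = parallel_hash_join(r, s)
    print(sorted(joined))
    print(per_node)
```

In an SN cluster the bucket-to-node assignment would be fixed by data ownership, so a skewed bucket would hold up its owner; in this sketch the per-node counts in `buckets_done` vary from run to run because assignment happens at run time.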



Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Moon, A., Cho, H. (2006). Parallel Hash Join Algorithms for Dynamic Load Balancing in a Shared Disks Cluster. In: Gavrilova, M.L., et al. Computational Science and Its Applications - ICCSA 2006. ICCSA 2006. Lecture Notes in Computer Science, vol 3984. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751649_23

  • DOI: https://doi.org/10.1007/11751649_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-34079-9

  • Online ISBN: 978-3-540-34080-5

  • eBook Packages: Computer Science, Computer Science (R0)
