Performance Analysis of a Parallel Sort Merge Join on Cluster Architectures

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3719)

Abstract

We developed a concise but comprehensive analytical model for the well-known sort merge join algorithm on cost-effective cluster architectures.

We concentrate on a limited number of characteristic parameters to keep the analytical model clear and focused. We believe that a meaningful model can be built on only three characteristic parameter sets, describing the main memory size, the I/O bandwidth, and the disk bandwidth. We justify our approach with a practical implementation and a comparison of theoretical and measured performance values.
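
For readers unfamiliar with the algorithm being modeled, the following is a minimal sequential sketch of a sort merge join in Python. It is an illustration only, not the parallel cluster implementation analyzed in the paper; the function name `sort_merge_join` and the parameters `key_left` and `key_right` are hypothetical, and the sketch assumes both inputs fit in main memory.

```python
from typing import Any, Callable, List, Sequence, Tuple


def sort_merge_join(
    left: Sequence[Any],
    right: Sequence[Any],
    key_left: Callable[[Any], Any],
    key_right: Callable[[Any], Any],
) -> List[Tuple[Any, Any]]:
    """Equi-join two in-memory relations (hypothetical helper, illustration only)."""
    # Sort phase: order both relations on their join keys.
    ls = sorted(left, key=key_left)
    rs = sorted(right, key=key_right)

    out: List[Tuple[Any, Any]] = []
    i = j = 0
    # Merge phase: advance two cursors through the sorted relations.
    while i < len(ls) and j < len(rs):
        kl, kr = key_left(ls[i]), key_right(rs[j])
        if kl < kr:
            i += 1
        elif kl > kr:
            j += 1
        else:
            # Find the run of equal keys on each side.
            i_end = i
            while i_end < len(ls) and key_left(ls[i_end]) == kl:
                i_end += 1
            j_end = j
            while j_end < len(rs) and key_right(rs[j_end]) == kl:
                j_end += 1
            # Emit the cross product of the two runs.
            for a in ls[i:i_end]:
                for b in rs[j:j_end]:
                    out.append((a, b))
            i, j = i_end, j_end
    return out


if __name__ == "__main__":
    r = [(3, "c"), (1, "a"), (2, "b"), (2, "bb")]
    s = [(2, "x"), (4, "y"), (2, "z"), (1, "w")]
    for pair in sort_merge_join(r, s, lambda t: t[0], lambda t: t[0]):
        print(pair)
```

In this sketch the sort phase dominates the cost for unsorted inputs, while the merge phase is a single pass over both relations plus the output size for duplicate keys.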

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Schikuta, E. (2005). Performance Analysis of a Parallel Sort Merge Join on Cluster Architectures. In: Hobbs, M., Goscinski, A.M., Zhou, W. (eds) Distributed and Parallel Computing. ICA3PP 2005. Lecture Notes in Computer Science, vol 3719. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564621_31

  • DOI: https://doi.org/10.1007/11564621_31

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29235-7

  • Online ISBN: 978-3-540-32071-5

  • eBook Packages: Computer Science, Computer Science (R0)
