Abstract
We developed a concise but comprehensive analytical model for the well-known sort-merge join algorithm on cost-effective cluster architectures.
We concentrate on a limited number of characteristic parameters to keep the analytical model clear and focused. We believe that a meaningful model can be built upon only three characteristic parameter sets, describing the main memory size, the I/O bandwidth, and the disk bandwidth. We justify our approach with a practical implementation and a comparison of the theoretical and measured performance values.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Schikuta, E. (2005). Performance Analysis of a Parallel Sort Merge Join on Cluster Architectures. In: Hobbs, M., Goscinski, A.M., Zhou, W. (eds) Distributed and Parallel Computing. ICA3PP 2005. Lecture Notes in Computer Science, vol 3719. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564621_31
DOI: https://doi.org/10.1007/11564621_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29235-7
Online ISBN: 978-3-540-32071-5
eBook Packages: Computer Science, Computer Science (R0)