Abstract
A non-equijoin of relations R and S is a band join if the join predicate requires values in the join attribute of R to fall within a specified band about the values in the join attribute of S. Traditionally, R and S are split into partitions that are assigned to processors so that the join can be executed concurrently and independently. Because the join is a non-equijoin, some records of R (or S) must be replicated across two or more partitions. This replication may lead to poor performance, especially when the number of records to be replicated is large. This paper presents a new algorithm, called the pipelined band join, that avoids data replication in secondary storage by dynamically creating partitions during join computation through pipelining. A preliminary study indicates that the proposed algorithm outperforms the conventional method.
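To make the band predicate and the replication problem concrete, the following is a minimal single-machine sketch in Python. It is not the pipelined band join proposed in the paper. It assumes the band is written as s - c1 <= r <= s + c2 for join-attribute values r and s, and that the conventional method range-partitions on the join attribute. The names `band_join`, `partitions_for`, `c1`, `c2`, and `boundaries` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_left, bisect_right


def band_join(r_values, s_values, c1, c2):
    """Return all pairs (r, s) satisfying s - c1 <= r <= s + c2.

    Illustrative only: sorts R once, then binary-searches the band
    around each S value. Assumes c1 >= 0 and c2 >= 0.
    """
    r_sorted = sorted(r_values)
    pairs = []
    for s in s_values:
        lo = bisect_left(r_sorted, s - c1)   # first r >= s - c1
        hi = bisect_right(r_sorted, s + c2)  # one past last r <= s + c2
        pairs.extend((r, s) for r in r_sorted[lo:hi])
    return pairs


def partitions_for(s, boundaries, c1, c2):
    """List the range partitions of R that an S value must be sent to.

    Assumption: partition i holds the R values v for which
    bisect_right(boundaries, v) == i, with `boundaries` sorted ascending.
    An S value whose band [s - c1, s + c2] spans a boundary is
    replicated to more than one partition.
    """
    first = bisect_right(boundaries, s - c1)
    last = bisect_right(boundaries, s + c2)
    return list(range(first, last + 1))


if __name__ == "__main__":
    print(band_join([1, 5, 10, 12], [4, 11], 1, 1))
    print(partitions_for(11, [10, 20], 1, 1))  # band lies inside one partition
    print(partitions_for(10, [10, 20], 1, 1))  # band straddles boundary 10
```

In this sketch, an S value near a partition boundary is sent to every partition its band overlaps, which is the replication cost the abstract describes; a wider band or more partitions increases the number of replicated records.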
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lu, H., Tan, KL. (1995). Pipelined band join in shared-nothing systems. In: Kanchanasut, K., Lévy, JJ. (eds) Algorithms, Concurrency and Knowledge. ACSC 1995. Lecture Notes in Computer Science, vol 1023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60688-2_48
DOI: https://doi.org/10.1007/3-540-60688-2_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60688-8
Online ISBN: 978-3-540-49262-7
eBook Packages: Springer Book Archive