Abstract
We consider queries over large object-oriented databases in which one class of objects contains references to another class of objects. To answer such a query efficiently, the database system must follow object pointers from a large collection of objects in a way that minimizes the I/O cost. Traditional techniques require significant redundant I/O when both the referencing class and the referenced class are substantially larger than main memory. We propose a new technique for processing a class of object-oriented queries that is an adaptation of the Jive-Join algorithm of Ross and Li. Our algorithm applies as long as the number of disk blocks in the referenced relation is roughly of the order of the square of the number of blocks that fit in main memory. The cost of the algorithm is at most one pass through each input class extension, one pass through an index file if there is an index, and two passes through a temporary file that contains the object identifier of the referenced object for each output tuple. Almost all of the I/O is sequential, resulting in minimal seek and rotational latencies. We analyze the cost of our algorithm and compare it with a naive algorithm and with an adaptation of an algorithm due to Valduriez. We demonstrate that, under a wide range of circumstances, our algorithm performs significantly better than its competitors. The performance improvement is most dramatic when memory is small and the input class extensions are very large.
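The cost bound and the memory condition stated above can be turned into a small back-of-the-envelope calculator. The following Python sketch is illustrative only and is not taken from the paper: the class `Workload`, its field names, the functions `io_upper_bound` and `roughly_applicable`, and the `slack` constant are all hypothetical, and reading "roughly of the order of the square" as an upper bound with an adjustable constant factor is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Sizes in disk blocks. All field names are hypothetical."""
    referencing_blocks: int  # extension of the class that holds the references
    referenced_blocks: int   # extension of the class being pointed to
    temp_blocks: int         # temporary file of referenced-object identifiers
    index_blocks: int = 0    # 0 when no index is used


def io_upper_bound(w: Workload) -> int:
    """Upper bound on blocks transferred, following the abstract:
    one pass through each input class extension, one pass through the
    index (if any), and two passes through the temporary file."""
    return (w.referencing_blocks
            + w.referenced_blocks
            + w.index_blocks
            + 2 * w.temp_blocks)


def roughly_applicable(w: Workload, memory_blocks: int, slack: float = 1.0) -> bool:
    """Rough applicability test (an assumption, not the paper's exact condition):
    the referenced extension should be no larger than about memory_blocks squared.
    `slack` stands in for the unspecified constant factor."""
    return w.referenced_blocks <= slack * memory_blocks ** 2


if __name__ == "__main__":
    w = Workload(referencing_blocks=1_000_000,
                 referenced_blocks=500_000,
                 temp_blocks=20_000)
    print(io_upper_bound(w))            # 1540000
    print(roughly_applicable(w, 1000))  # True:  1000**2 >= 500000
    print(roughly_applicable(w, 500))   # False: 500**2  <  500000
```

The bound counts block transfers only. It does not model seek or rotational latency, which the abstract argues stay small because almost all of the I/O is sequential.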
This research was supported by a grant from the AT&T Foundation, by a David and Lucile Packard Foundation Fellowship in Science and Engineering, by a Sloan Foundation Fellowship, by NSF grants IRI-9209029 and CDA-90-24735, and by an NSF Young Investigator award.
References
R. Agrawal et al. Quest: A project on database mining. In Proceedings of the ACM SIGMOD Conference, page 514, May 1994.
K. Brown et al. Resource allocation and scheduling for mixed database workloads. Technical Report 1095, University of Wisconsin, Madison, 1992.
D. S. Batory. On searching transposed files. ACM Transactions on Database Systems, 4(4):531–544, 1979.
J. Dozier. Access to data in NASA's Earth Observing System. In Proceedings of the ACM SIGMOD Conference, page 1, June 1992.
D. J. DeWitt, J. F. Naughton, and D. A. Schneider. Parallel sorting on a shared-nothing architecture using probabilistic splitting. In Proceedings of the Conference on Parallel and Distributed Information Systems, pages 280–291, 1991.
L. M. Haas, M. J. Carey, and M. Livny. Seeking the truth about ad hoc join costs. Technical Report RJ9368, IBM Almaden Research Center, 1993.
C. Nyberg et al. Alphasort: A RISC machine sort. In Proceedings of the ACM SIGMOD Conference, pages 233–242, May 1994.
K. Ross and Z. Li. Efficiently joining multiple large relations. Submitted for publication, 1995.
K. A. Ross and Z. Li. Jive-join and Smash-join: Efficient join techniques for large relations and small main memory. Submitted for publication, 1995.
S. Seshadri and J. F. Naughton. Sampling issues in parallel database systems. (manuscript), 1991.
P. Valduriez. Join indices. ACM Transactions on Database Systems, 12(2):218–246, 1987.
S. B. Yao. Approximating block accesses in database organizations. Communications of the ACM, 20(4):260–261, 1977.
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ross, K.A. (1995). Efficiently following object references for large object collections and small main memory. In: Ling, T.W., Mendelzon, A.O., Vieille, L. (eds) Deductive and Object-Oriented Databases. DOOD 1995. Lecture Notes in Computer Science, vol 1013. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60608-4_35
DOI: https://doi.org/10.1007/3-540-60608-4_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60608-6
Online ISBN: 978-3-540-48460-8
eBook Packages: Springer Book Archive