Abstract
Users today are struggling to integrate a broad range of information sources that provide different levels of query capabilities. Currently, data sources with different and limited capabilities are accessed either by writing rich functional wrappers for the more primitive sources, or by dealing with all sources at a “lowest common denominator”. This paper explores a third approach, in which a mediator ensures that sources receive queries they can handle, while still taking advantage of all of the query power of each source. We propose an architecture that enables this, and identify a key component of that architecture, the Capabilities-Based Rewriter (CBR). The CBR takes as input a description of the capabilities of a data source and a query targeted for that data source. From these, the CBR determines component queries to be sent to the source, commensurate with its abilities, and computes a plan for combining their results using joins, unions, selections, and projections. We provide a language to describe the query capability of data sources and a plan generation algorithm. Our description language and plan generation algorithm are schema independent and handle SPJ (select-project-join) queries. We also extend the CBR with a cost-based optimizer. The net effect is that we prune without losing completeness. Finally, we compare the implementation of a CBR for the Garlic project with the algorithms proposed in this paper.
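To make the rewriting idea concrete, the following is a minimal sketch in Python and not the paper's algorithm or description language. It assumes a deliberately simplified capability description (the set of attributes on which a source can evaluate equality selections) and a query that is a conjunction of equality conditions. The names `SourceCapability`, `Plan`, `rewrite`, and `execute`, as well as the toy rows, are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceCapability:
    """Hypothetical, simplified capability description for one source."""
    name: str
    selectable: frozenset  # attributes the source can filter on by equality


@dataclass
class Plan:
    """A two-step plan: a component query for the source, then mediator work."""
    source_conditions: dict    # conditions shipped to the source
    mediator_conditions: dict  # conditions the mediator applies afterwards


def rewrite(capability, conditions):
    """Split a conjunctive selection into a source part and a mediator part."""
    pushed = {a: v for a, v in conditions.items()
              if a in capability.selectable}
    residual = {a: v for a, v in conditions.items()
                if a not in capability.selectable}
    return Plan(pushed, residual)


def matches(row, conditions):
    """True if the row satisfies every equality condition."""
    return all(row.get(a) == v for a, v in conditions.items())


def execute(plan, source_rows):
    """Simulate the source answering its component query, then filter the rest."""
    from_source = [r for r in source_rows
                   if matches(r, plan.source_conditions)]
    return [r for r in from_source if matches(r, plan.mediator_conditions)]


# Toy data, invented for this example only.
rows = [
    {"author": "A", "year": 1988, "title": "T1"},
    {"author": "A", "year": 1989, "title": "T2"},
    {"author": "B", "year": 1989, "title": "T3"},
]

# This source can only select on "author"; "year" must be handled by the mediator.
capability = SourceCapability("library", frozenset({"author"}))
plan = rewrite(capability, {"author": "A", "year": 1989})
result = execute(plan, rows)
```

In this sketch the source receives only the condition it can handle (`author`), so its selection power is still used, while the mediator applies the remaining condition (`year`) to the returned rows. The approach described in the abstract additionally covers joins, unions, and projections, and chooses among alternative plans with a cost-based optimizer, none of which is modeled here.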
Research partially supported by Wright Laboratories, Wright Patterson AFB, ARPA Contract F33615-93-C1337.
This paper is an extended version of a paper published at the International Conference on Parallel and Distributed Information Systems, December 1996.
Copyright information
© 1998 Springer Science+Business Media New York
About this chapter
Cite this chapter
Papakonstantinou, Y., Gupta, A., Haas, L. (1998). Capabilities-Based Query Rewriting in Mediator Systems. In: Naughton, J.F., Weikum, G. (eds) Parallel and Distributed Information Systems. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-6132-0_4
DOI: https://doi.org/10.1007/978-1-4757-6132-0_4
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-5026-0
Online ISBN: 978-1-4757-6132-0
eBook Packages: Springer Book Archive