Query Processing in Data Integration Systems
Adaptive query processing; Distributed query processing; Query processing for mediators
In (virtual) data integration, also known as enterprise information integration, queries are posed over a virtual mediated schema and answered on-the-fly using data from remote sources, which may themselves be DBMSs, Web sites, or applications. This requires two main stages: query reformulation, where the user’s query is composed with schema mappings to produce a combined (distributed) query, and query optimization and execution, where that query is executed efficiently across the sources.
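The two stages can be illustrated with a minimal sketch. It assumes a global-as-view style mapping, in which each mediated relation is defined as a union of source queries; the entry itself does not commit to a mapping formalism. The relation name `Movie`, the two in-memory "sources", and all function names are hypothetical and chosen only for illustration. Optimization is omitted entirely.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]

# Hypothetical remote sources, stood in for by in-memory lists.
SOURCE_A: List[Row] = [{"title": "Alien", "year": 1979}]
SOURCE_B: List[Row] = [{"name": "Heat", "released": 1995}]


def scan_a() -> Iterable[Row]:
    # Source A already uses the mediated schema's attribute names.
    return iter(SOURCE_A)


def scan_b() -> Iterable[Row]:
    # Source B uses different attribute names; the mapping renames them.
    for r in SOURCE_B:
        yield {"title": r["name"], "year": r["released"]}


# Assumed schema mappings: mediated relation -> list of source queries
# whose union defines it.
MAPPINGS: Dict[str, List[Callable[[], Iterable[Row]]]] = {
    "Movie": [scan_a, scan_b],
}


def reformulate(relation: str) -> List[Callable[[], Iterable[Row]]]:
    """Stage 1: compose the query with the mappings, yielding source scans."""
    return MAPPINGS[relation]


def execute(plan: List[Callable[[], Iterable[Row]]],
            predicate: Callable[[Row], bool]) -> List[Row]:
    """Stage 2: run each source scan and filter the combined rows."""
    return [row for scan in plan for row in scan() if predicate(row)]


def answer(relation: str, predicate: Callable[[Row], bool]) -> List[Row]:
    """Answer a selection query posed over the mediated schema."""
    return execute(reformulate(relation), predicate)
```

For example, `answer("Movie", lambda r: r["year"] > 1990)` consults both sources but returns only the row that originated in the second one, renamed into the mediated schema. A real system would additionally push the selection down to sources that can evaluate it.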
The query optimization and execution problem for data integration is, in principle, quite similar to that for distributed databases. However, it is actually significantly more complex because (1) remote data sources may have different data models and their own query capabilities; (2) statistics on the data at each source may be unavailable; (3) remote data sources may require the requestor...
1. Smith JM, Bernstein PA, Dayal U, Goodman N, Landers TA, Lin KWT, Wong E. Multibase: integrating heterogeneous database systems. In: Proceedings of the AFIPS National Computer Conference; 1981. p. 487–99.
2. Levy AY, Rajaraman A, Ordille JJ. Querying heterogeneous information sources using source descriptions. In: Proceedings of the 22th International Conference on Very Large Data Bases; 1996. p. 251–62.