Abstract
With the emergence of the Web as a universal data repository, research has recently focused on data integration and data translation, and a common data model for semistructured data has been established. It is increasingly recognized, however, that a common schema model is also necessary to support tasks such as query formulation, decomposition, and optimization, or the declarative specification of data translation. In this paper we elaborate on the theoretical foundations of a middleware schema model. We present expressive and flexible schema definition languages, and investigate properties such as their expressive power and the complexity of decision problems that are significant in the context of data translation and integration.
The work is supported by the Israeli Ministry of Science and by the Academy of Arts and Sciences.
© 1999 Springer-Verlag Berlin Heidelberg
Cite this paper
Beeri, C., Milo, T. (1999). Schemas for Integration and Translation of Structured and Semi-structured Data. In: Beeri, C., Buneman, P. (eds) Database Theory — ICDT’99. ICDT 1999. Lecture Notes in Computer Science, vol 1540. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-49257-7_19
DOI: https://doi.org/10.1007/3-540-49257-7_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65452-0
Online ISBN: 978-3-540-49257-3