Abstract
As the WWW becomes a major source of information, interest has grown not only in searching for information but also in reusing it, either in new pages or directly within applications. Unfortunately, HTML tags provide little structure for identifying and extracting information, since they are used mostly for presentation. Moreover, the simple link mechanism of the Web does not support the controlled traversal of links to related pages. A particularly promising development is the proposed new standard, XML, which could bring the power of SGML to the Web while keeping the simplicity of HTML. In this paper we present a system and a language that allow information from various sources, including databases and SGML-like documents, to be reused by combining it dynamically to produce a virtual document. The language uses a tree-like structure to represent both information objects and link objects. The paper focuses on the selection and traversal of XML links to extract information from linked pages. The strength of our approach is that it is SGML-compliant, which makes it ready to take full advantage of XML for reusing information from the Web as soon as XML is widely used.
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Vercoustre, AM., Paradis, F. (1998). Reuse of linked documents through virtual document prescriptions. In: Hersch, R.D., André, J., Brown, H. (eds) Electronic Publishing, Artistic Imaging, and Digital Typography. RIDT 1998. Lecture Notes in Computer Science, vol 1375. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0053295
DOI: https://doi.org/10.1007/BFb0053295
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64298-5
Online ISBN: 978-3-540-69718-3
eBook Packages: Springer Book Archive