Abstract
To pose effective queries to Web sites, some form of site data model must be implicitly or explicitly shared by users. Many approaches try to compensate for the lack of such a common model by considering the hypertextual structure of Web sites; unfortunately, this structure usually has little to do with data semantics. In this paper a different technique is proposed that allows for both navigational and logical/conceptual description of Web sites. The data model is that of WG-log, a query language based on the graph-oriented database model of GOOD (Gyssens et al. 1994) and G-log (Paredaens et al. 1995), which allows the description of data manipulation primitives via (sets of) graph(s). The WG-log description of a Web site schema is lexically based on standard hypermedia design languages, thus allowing for easy schema generation by current hypermedia authoring environments. The use of WG-log for queries allows graphic query construction with respect to both the navigational and the logical parts of schemata. Site schemata are managed by Schema Robots, which assist clients in identifying and retrieving a set of candidate schemata. On the basis of the set of candidate schemata, the client may then query individual Web sites; extensive data caching is used to avoid the flooding that would result from an excessive number of candidates. A remote Query Manager process, running side by side with standard Web servers, manages query execution and handles the presentation of the results to the client. Our scheme is particularly suited for Intranets, while allowing for a smooth migration of Internet Web sites as more and more of them are produced on the basis of hypermedia design and generation methodologies.
References
Altavista, Inc. (1997) Alta Vista Search Index http://www.altavista.digital.com.
Atzeni P., Mecca G. and Merialdo P. (1997) Semistructured and Structured Data in the Web: Going Back and Forth http://www.research.att.com/suciu/workshop-papers.html.
Balasubramanian V., Ma B.M. and Yoo J. (1995) A Systematic Approach to Designing a WWW Application Communications of the ACM 38(8), 47-48.
Yuwono B. and Lee D.L. (1996) WISE: A World Wide Web Resource Database System IEEE Transactions on Knowledge and Data Engineering 8(4), 548-554.
Damiani E. and Fugini M.G. (1995) Automatic Thesaurus Construction Supporting Fuzzy Retrieval of Reusable Software in Proceedings of ACM SAC '95, Nashville.
Dreilinger D. (1997) SavvySearch Home Page http://www.lycos.com.
Fletcher J. (1990) Jumpstation FrontPage http://www.stir.ac.uk/js.
Fraternali P. and Paolini P. (1997) Autoweb: Automatic Generation of Web Applications from Declarative Specifications http://www.ing.unico.it/Autoweb.
Garcia-Molina H. and Hammer J. (1997) Integrating and Accessing Heterogeneous Information Sources in Tsimmis in Proceedings of ADBIS 97, St. Petersburg.
Garzotto F., Mainetti L. and Paolini P. (1995) Hypermedia Design Languages Evaluation Issues Communications of the ACM 38(8), 74-86.
Giannotti F., Manco G. and Pedreschi D. (1997) A Deductive Data Model for Representing and Querying Semistructured Data in Proceedings of the ICLP 97 Post-Conference Workshop on Logic Programming Tools for Internet Applications, Leuven.
Gyssens M., Paredaens J., Van den Bussche J. and Van Gucht D. (1994) A Graph-oriented Object Database Model IEEE Transactions on Knowledge and Data Engineering, 6(4), 572-586.
Hamilton S. (1997) E-Commerce for the 21st Century IEEE Computer, 30(5), 44-47.
Hu J., Nicholson D., Mungall C., Hillyard A. and Archibald A. (1996) WebinTool: A Generic Web to Database Interface Building Tool in Proceedings of the 1996 DEXA Workshop.
Isakowitz T., Stohr E.A. and Balasubramanian P. (1995) RMM: a Language for Structured Hypermedia Design Communications of the ACM, 38(8).
Kogan Y., Michaeli D., Sagiv Y. and Shmueli O. (1997) Utilizing the Multiple Facets of WWW Contents in Proceedings of the 1997 NGITS Workshop.
Konopnicki D. and Shmueli O. (1995) W3QL: A Query System for the World Wide Web in Proceedings of the 21st International Conference on Very Large Databases, Zurich.
Lakshmanan L., Sadri F. and Subramanian I. (1996) A Declarative Language for Querying and Restructuring the Web in Proceedings of the 1996 IEEE RIDE-NDS Workshop.
Loke S.W. and Davison A. (1997) LogicWeb: Enhancing the Web with Logic Programming http://www.cs.mu.oz.au/~swloke/papers/lw.ps.gz.
Lucarella D. and Zanzi A. (1996) A Visual Retrieval Environment for Hypermedia Information Systems ACM Transactions on Information Systems, 14(1).
Luke S., Spector L., Rager D. and Hendler J. (1997) Ontology-based Web Agents http://www.cs.umd.edu/projects/plus/SHOE.
Lycos, Inc. (1997) The Lycos Catalog of the Internet http://www.lycos.com.
Lycos, Inc. (1997) Point http://www.pointcom.com.
McBryan O.A. (1994) WWWW and GENVL: Tools for Taming the Web, in Proceedings of the First Annual WWW Conference, Geneva.
The McKinley Group, Inc. (1997) Magellan Internet Guide http://www.cs.colostate.edu/dreiling/smartform.html/.
Mendelzon A., Mihaila G. and Milo T. (1996) Querying the World Wide Web, in Proceedings of the Conference on Parallel and Distributed Information Systems, Toronto.
Paredaens J., Peelman P. and Tanca L. (1995) G-log: A Graph-based Query Language IEEE Transactions on Knowledge and Data Engineering, (7), 436-453.
Paredaens J., Peelman P. and Tanca L. (1997) Merging Graph-Based and Rule-Based Computation, Data and Knowledge Engineering, to appear.
Pinkerton B. (1997) Finding What People Want: Experiences with WebCrawler, in Proceedings of the Second Annual WWW/Mosaic Conference, Geneva.
Quarterdeck Inc. (1997) Web Compass Fact Sheet http://www.arachnid.gdeck.com/gdeck/products/webcompass/.
Sacca D. and Zaniolo C. (1990) Stable Models and Non-Determinism in Logic Programs with Negation in Proceedings of the 1990 PODS Conference.
Selberg E. and Etzioni O. (1995) Multiservice Search and Comparison Using MetaCrawler in Proceedings of the Fourth International WWW Conference.
Torlone R. (1996) Linguaggi di Interrogazione per il World Wide Web in Proceedings of SEBD '96, S. Miniato.
Yahoo, Inc. (1997) Yahoo! http://www.yahoo.com.
World Wide Web Consortium (1997) HTML 4.0 Specification Working Draft http://www.w3.org/TR/WD-html.
© 1998 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Damiani, E., Tanca, L. (1998). Semantic Approaches to Structuring and Querying Web Sites. In: Spaccapietra, S., Maryanski, F. (eds) Data Mining and Reverse Engineering. IFIP — The International Federation for Information Processing. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-35300-5_2
DOI: https://doi.org/10.1007/978-0-387-35300-5_2
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4757-4910-6
Online ISBN: 978-0-387-35300-5
eBook Packages: Springer Book Archive