Abstract
Finding relevant information with search engines that index large portions of the World-Wide Web is often a frustrating task. Because the available information is so diverse, those search engines have to rely on techniques developed in the field of information retrieval (IR).
More limited domains of the Internet contain large collections of documents that are highly structured and multimedia in character. Furthermore, the content of such a collection can be assumed to be more closely related. This allows the more precise and advanced query formulation techniques commonly used in a database environment to be applied to the Web. The Webspace Method focuses on such document collections and offers an approach for modelling and searching large collections of documents, based on a conceptual schema.
This article focuses on the evaluation of a retrieval performance experiment, carried out to examine the advances of the webspace search engine compared to a standard search engine that uses a widely accepted IR model. When searching document collections with the Webspace Method, retrieval performance, measured in terms of recall and precision, can improve by up to a factor of two.
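As a reminder of the two measures named above, the following minimal sketch computes precision and recall for a single query using their standard set-based definitions. It is not taken from the experiment itself: the function name `precision_recall` and the document identifiers are hypothetical and serve only as illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)
    # Guard against division by zero when either set is empty.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical document ids, for illustration only.
relevant_docs = {"d1", "d2", "d3", "d4"}
retrieved_docs = ["d1", "d2", "d5", "d6"]

precision, recall = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case two of the four retrieved documents are relevant and two of the four relevant documents are retrieved, so both measures equal 0.5.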
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
van Zwol, R., Apers, P.M.G. (2002). Retrieval Performance Experiment with the Webspace Method. In: Eaglestone, B., North, S., Poulovassilis, A. (eds) Advances in Databases. BNCOD 2002. Lecture Notes in Computer Science, vol 2405. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45495-0_18
DOI: https://doi.org/10.1007/3-540-45495-0_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43905-9
Online ISBN: 978-3-540-45495-3
eBook Packages: Springer Book Archive