Retrieval Performance Experiment with the Webspace Method

  • Conference paper
Advances in Databases (BNCOD 2002)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2405)

Abstract

Finding relevant information with search engines that index large portions of the World-Wide Web is often a frustrating task. Because the available information is so diverse, those search engines have to rely on techniques developed in the field of information retrieval (IR).

More limited domains of the Internet contain large collections of documents that are highly structured and multimedia in character. Furthermore, the content within such a collection can be assumed to be more closely related. This allows the more precise and advanced query formulation techniques commonly used in a database environment to be applied to the Web. The Webspace Method focuses on such document collections and offers an approach for modelling and searching them, based on a conceptual schema.

The main focus of this article is the evaluation of a retrieval performance experiment, carried out to examine the advances of the webspace search engine compared to a standard search engine that uses a widely accepted IR model. When document collections are searched using the Webspace Method, retrieval performance, measured in terms of recall and precision, can improve by up to a factor of two.
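
As a reminder of how the two measures named above are usually defined, the following is a minimal sketch, not the evaluation code used in the experiment. The function name `precision_recall` and all document identifiers are hypothetical, and the numbers are illustrative only, not results from the paper. Precision is the fraction of retrieved documents that are relevant; recall is the fraction of relevant documents that are retrieved.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: iterable of document ids returned by a search engine
    relevant:  iterable of document ids judged relevant for the query
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = len(retrieved_set & relevant_set)  # relevant documents that were retrieved
    # Guard against empty sets to avoid division by zero.
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical relevance judgments and result lists (not data from the paper).
relevant_docs = {"d1", "d2", "d3", "d4"}
engine_a_results = ["d1", "d5", "d6", "d7"]
engine_b_results = ["d1", "d2", "d8", "d9"]

print(precision_recall(engine_a_results, relevant_docs))  # (0.25, 0.25)
print(precision_recall(engine_b_results, relevant_docs))  # (0.5, 0.5)
```

In this made-up example the second result list doubles both measures, which is the kind of "factor of two" difference the abstract refers to. Published experiments typically report precision at several recall levels averaged over many queries rather than a single pair of values.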

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

van Zwol, R., Apers, P.M.G. (2002). Retrieval Performance Experiment with the Webspace Method. In: Eaglestone, B., North, S., Poulovassilis, A. (eds) Advances in Databases. BNCOD 2002. Lecture Notes in Computer Science, vol 2405. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45495-0_18

  • DOI: https://doi.org/10.1007/3-540-45495-0_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-43905-9

  • Online ISBN: 978-3-540-45495-3

  • eBook Packages: Springer Book Archive
