Encyclopedia of Database Systems

2018 Edition
Editors: Ling Liu, M. Tamer Özsu

XML Retrieval

  • Mounia Lalmas
  • Andrew Trotman
Reference work entry
DOI: https://doi.org/10.1007/978-1-4614-8265-9_474

Synonyms

Content-oriented XML retrieval; Focused retrieval; Structured document retrieval; Structured text retrieval

Definition

Text documents often contain a mixture of structured and unstructured content. One way to format this mixed content is according to the adopted W3C standard for information repositories and exchanges, the eXtensible Mark-up Language (XML). In contrast to HTML, which is mainly layout-oriented, XML follows the fundamental concept of separating the logical structure of a document from its layout. This logical document structure can be exploited to allow a more focused sub-document retrieval.

XML retrieval breaks away from the traditional retrieval unit of a document as a single large (text) block. It aims to implement focused retrieval strategies that return document components, i.e., XML elements, instead of whole documents in response to a user query. This focused retrieval strategy is believed to be of particular benefit for information repositories...
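To make the idea of element-level retrieval concrete, the following is a minimal sketch and not a method described in this entry. It treats every XML element as a candidate retrieval unit and scores it by the fraction of its tokens that match a query term. The sample document, the function names (`element_text`, `rank_elements`) and the scoring formula are all assumptions chosen for illustration; real XML retrieval systems use considerably more refined ranking models.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this illustration.
DOC = """<article>
  <title>Database systems</title>
  <section>
    <title>Indexing</title>
    <p>B-trees support range queries over ordered keys.</p>
  </section>
  <section>
    <title>XML retrieval</title>
    <p>Focused retrieval returns XML elements instead of whole documents.</p>
  </section>
</article>"""


def element_text(elem):
    """Return all text inside an element, with whitespace collapsed."""
    return " ".join("".join(elem.itertext()).split())


def rank_elements(xml_string, query):
    """Score every element (not just the whole document) against the query.

    Assumed toy score: matching tokens divided by element length in tokens.
    Elements with no matching token are omitted from the result.
    """
    root = ET.fromstring(xml_string)
    terms = set(query.lower().split())
    results = []
    for elem in root.iter():  # the root and all of its descendants
        text = element_text(elem)
        tokens = text.lower().replace(".", " ").split()
        if not tokens:
            continue
        matches = sum(1 for token in tokens if token in terms)
        if matches:
            results.append((matches / len(tokens), elem.tag, text))
    # Highest-scoring elements first.
    results.sort(key=lambda r: r[0], reverse=True)
    return results


if __name__ == "__main__":
    for score, tag, text in rank_elements(DOC, "focused retrieval"):
        print(f"{score:.3f}  <{tag}>  {text[:60]}")
```

With the query "focused retrieval", the whole `article` element is ranked last, below the matching `section` and `p` elements, which is the behaviour focused retrieval aims for. The short `title` element ranks first under this naive length-normalised score, which hints at why element length needs careful treatment in practice.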



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Yahoo! Inc., London, UK
  2. University of Otago, Dunedin, New Zealand

Section editors and affiliations

  • Jaap Kamps (1)
  1. University of Amsterdam, Amsterdam, The Netherlands