Synonyms
Controlling overlap; Removing overlap
Definition
In semi-structured text retrieval, overlap processing techniques reduce the amount of overlapping, and therefore redundant, information returned to the user. Redundancy in result lists arises from the nested structure of semi-structured documents, in which the same text fragment may appear in several marked-up elements (see Fig. 1). Consequently, when a retrieval system performs a focused search on such documents and uses the marked-up elements as retrieval objects, the result list very often contains overlapping elements. In retrieval applications that assume the user does not want to see the same information twice, it may be necessary to reduce or completely remove this overlap and return a ranked list of non-overlapping elements. Thus, depending on the underlying user model and retrieval application, different overlap processing techniques are used in order to decide, given a...
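To make the notion of overlap concrete, the following minimal sketch shows one simple way a ranked list could be reduced to non-overlapping elements. It is an illustration under stated assumptions, not necessarily the technique this entry goes on to describe: elements are identified by hypothetical path tuples such as `("article[1]", "sec[2]")`, two elements are taken to overlap when one path is a prefix of the other, and the filter greedily keeps the highest-scored element of each overlapping group. The names `overlaps` and `remove_overlap` are invented for this example.

```python
from typing import List, Tuple

# Hypothetical element identifier: the path from the document root to the
# element, e.g. ("article[1]", "sec[2]", "p[1]").
Path = Tuple[str, ...]


def overlaps(a: Path, b: Path) -> bool:
    """Return True if one element is an ancestor of, or equal to, the other."""
    n = min(len(a), len(b))
    return a[:n] == b[:n]


def remove_overlap(ranked: List[Tuple[Path, float]]) -> List[Tuple[Path, float]]:
    """Greedy filter (an assumed strategy): visit elements by descending score
    and keep an element only if it overlaps none of those already kept."""
    kept: List[Tuple[Path, float]] = []
    for path, score in sorted(ranked, key=lambda r: r[1], reverse=True):
        if not any(overlaps(path, kept_path) for kept_path, _ in kept):
            kept.append((path, score))
    return kept


# Made-up scores, for illustration only.
results = [
    (("article[1]", "sec[2]"), 0.9),
    (("article[1]", "sec[2]", "p[1]"), 0.8),  # nested inside sec[2]
    (("article[1]",), 0.7),                   # contains sec[2]
    (("article[1]", "sec[3]", "p[4]"), 0.6),  # disjoint from sec[2]
]
filtered = remove_overlap(results)
```

Here `filtered` retains `sec[2]` and the paragraph in `sec[3]`, since the other two elements share text with `sec[2]`. A different user model might instead prefer the most specific or the most exhaustive element on each path, which is why the choice of technique depends on the application.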
Recommended Reading
Clarke CLA. Controlling overlap in content-oriented XML retrieval. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2005. p. 314–21.
Geva S. GPX – gardens point XML IR at INEX 2005. In: Proceedings of the 4th International Workshop of the Initiative for the Evaluation of XML Retrieval; 2006. p. 240–53.
Kazai G, Lalmas M, de Vries AP. The overlap problem in content-oriented XML retrieval evaluation. In: Proceedings of the 27th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval; 2004. p. 72–9.
Mass Y, Mandelbrod M. Using the INEX environment as a test bed for various user models for XML retrieval. In: Proceedings of the 4th International Workshop of the Initiative for the Evaluation of XML Retrieval; 2006. p. 187–95.
Mihajlović V, Ramírez G, Westerveld T, Hiemstra D, Blok HE, de Vries AP. TIJAH scratches INEX 2005: vague element selection, image search, overlap and relevance feedback. 2006. p. 72–87.
Sauvagnat K, Hlaoua L, Boughanem M. XFIRM at INEX 2005: ad-hoc and relevance feedback tracks. In: Proceedings of the 4th International Workshop of the Initiative for the Evaluation of XML Retrieval; 2006. p. 88–103.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Ramírez, G. (2018). Processing Overlaps in Structured Text Retrieval. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_279
DOI: https://doi.org/10.1007/978-1-4614-8265-9_279
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering