Aggregation-Based Structured Text Retrieval

Tsikrika, Theodora

doi:10.1007/978-1-4899-7993-3_14-2

Definition

Text retrieval is concerned with the retrieval of documents in response to user queries. This is achieved by (i) representing documents and queries with indexing features that provide a characterisation of their information content, and (ii) defining a function that uses these representations to perform retrieval. Structured text retrieval introduces a finer-grained retrieval paradigm that supports the representation and subsequent retrieval of the individual document components defined by the document’s logical structure. Aggregation-based structured text retrieval defines (i) the representation of each document component as the aggregation of the representation of its own information content and the representations of information content of its structurally related components, and (ii) retrieval of document components based on these (aggregated) representations.
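
The two parts of this definition can be illustrated with a minimal sketch. It is not the method of any particular system. It assumes that the only structurally related components are a component's children, that indexing features are plain term counts, that child representations are scaled by a fixed, hypothetical weight (child_weight), that component names are unique, and that retrieval scores a component by summing its aggregated weights for the query terms. All names in the code are illustrative.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Component:
    """A node in a document's logical structure (article, section, paragraph...)."""
    name: str                       # assumed unique within the document
    text: str = ""                  # the component's own content
    children: list = field(default_factory=list)


def own_representation(component):
    """Represent the component's own content as lower-cased term counts."""
    return Counter(component.text.lower().split())


def aggregate(component, child_weight=0.5, out=None):
    """Map each component name to its aggregated term weights.

    Aggregated representation = own term counts
    + child_weight * (aggregated representation of each child).
    The fixed child_weight is an assumption made for illustration.
    """
    if out is None:
        out = {}
    rep = Counter()
    for term, tf in own_representation(component).items():
        rep[term] += float(tf)
    for child in component.children:
        aggregate(child, child_weight, out)
        for term, weight in out[child.name].items():
            rep[term] += child_weight * weight
    out[component.name] = rep
    return out


def retrieve(root, query, child_weight=0.5):
    """Rank all components by the summed aggregated weight of the query terms."""
    reps = aggregate(root, child_weight)
    terms = query.lower().split()
    scores = {name: sum(rep[t] for t in terms) for name, rep in reps.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# A toy document: one section containing two paragraphs.
p1 = Component("p1", "structured text retrieval")
p2 = Component("p2", "aggregation of evidence")
sec = Component("sec", "retrieval models", [p1, p2])
doc = Component("doc", "", [sec])

ranking = retrieve(doc, "retrieval aggregation")
print(ranking)
```

Under these assumptions the section "sec" ranks first with a score of 2.0. Its aggregated representation covers both query terms, whereas each paragraph on its own matches only one.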

The aim of aggregation-based approaches is to improve retrieval effectiveness by capturing and exploiting the...

Author information

Authors and Affiliations

Center for Mathematics and Computer Science, Amsterdam, The Netherlands
Theodora Tsikrika

Corresponding author

Correspondence to Theodora Tsikrika.

Editor information

Editors and Affiliations

Georgia Institute of Technology College of Computing, Atlanta, Georgia, USA
Ling Liu
University of Waterloo School of Computer Science, Waterloo, Ontario, Canada
M. Tamer Özsu

Section Editor information

University of Amsterdam, Amsterdam, The Netherlands
Jaap Kamps

About this entry

Cite this entry

Tsikrika, T. (2016). Aggregation-Based Structured Text Retrieval. In: Liu, L., Özsu, M. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7993-3_14-2

DOI: https://doi.org/10.1007/978-1-4899-7993-3_14-2
Received: 18 April 2016
Accepted: 29 September 2016
Published: 31 December 2016
Publisher Name: Springer, New York, NY
Online ISBN: 978-1-4899-7993-3
eBook Packages: Springer Reference Computer Sciences; Reference Module Computer Science and Engineering
