Abstract
We present an extension of information retrieval based on Kullback-Leibler divergence (with backoff smoothing) that supports structured queries over structured documents. The proposed method applies to several common retrieval tasks characterized by an implication relationship among texts, including fielded topics and XML documents. We discuss how to choose the method's parameters so that the ranking function can be computed efficiently. Finally, we report experimental results obtained with a loose approximation of the model based on a discriminative selection strategy.
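For orientation, the flat (unstructured) baseline that the abstract builds on can be sketched as follows. This is a minimal illustration, not the authors' model: it assumes a maximum-likelihood unigram query model, a constant discount applied to terms seen in a document, and backoff to the collection model for unseen terms. The function names and the `discount` value are hypothetical, and the structured extension described above is not covered.

```python
import math
from collections import Counter


def collection_model(docs):
    """Maximum-likelihood unigram model over the whole collection."""
    counts = Counter(t for d in docs for t in d)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def backoff_model(doc_tokens, coll, discount=0.7):
    """Backoff-smoothed unigram model for one non-empty document.

    Assumption: seen terms get a discounted ML estimate; unseen terms
    get alpha * p(t|C), with alpha chosen so the model sums to 1.
    """
    counts = Counter(doc_tokens)
    total = sum(counts.values())
    unseen_mass = 1.0 - sum(coll[t] for t in counts)
    if unseen_mass <= 1e-12:
        # Document covers the whole vocabulary: nothing to back off to.
        discount, alpha = 1.0, 0.0
    else:
        alpha = (1.0 - discount) / unseen_mass
    seen = {t: discount * c / total for t, c in counts.items()}

    def prob(term):
        return seen[term] if term in seen else alpha * coll[term]

    return prob


def neg_kl(query_tokens, doc_prob):
    """Negative KL divergence between query model and document model."""
    q_counts = Counter(query_tokens)
    q_total = sum(q_counts.values())
    score = 0.0
    for term, c in q_counts.items():
        p_q = c / q_total
        score -= p_q * math.log(p_q / doc_prob(term))
    return score


def rank(query_tokens, docs, discount=0.7):
    """Return document indices ordered from best to worst match."""
    coll = collection_model(docs)
    # Query terms outside the collection vocabulary are ignored.
    query = [t for t in query_tokens if t in coll]
    scored = [(neg_kl(query, backoff_model(d, coll, discount)), i)
              for i, d in enumerate(docs)]
    return [i for _, i in sorted(scored, reverse=True)]


docs = [
    ["xml", "retrieval", "query"],
    ["speech", "recognizer", "model"],
    ["xml", "document", "structure"],
]
ranking = rank(["xml", "retrieval"], docs)
```

Documents are ranked by increasing divergence from the query model, so a document that contains more of the query terms is ranked ahead of one that has to rely on the backoff estimate for them.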
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Carpineto, C., Romano, G., Caracciolo, C. (2007). Information Theoretic Retrieval with Structured Queries and Documents. In: Fuhr, N., Lalmas, M., Trotman, A. (eds) Comparative Evaluation of XML Information Retrieval Systems. INEX 2006. Lecture Notes in Computer Science, vol 4518. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73888-6_18
DOI: https://doi.org/10.1007/978-3-540-73888-6_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73887-9
Online ISBN: 978-3-540-73888-6
eBook Packages: Computer Science, Computer Science (R0)