Abstract
This paper presents a multi-dependency language modeling approach to information retrieval. The approach extends the basic KL-divergence retrieval approach by introducing a hybrid dependency structure, which comprises syntactic dependency, syntactic proximity dependency, and co-occurrence dependency, to describe dependencies between terms. Term and dependency language models are constructed for both the document and the query. The relevance of a document to a query is then evaluated using the KL-divergence between their corresponding models. The new dependency retrieval model is compared with other traditional retrieval models. Experimental results indicate that it produces significant improvements in retrieval effectiveness.
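As a rough illustration of the basic KL-divergence retrieval approach that the abstract takes as its starting point, the sketch below ranks documents by the negative KL-divergence between a query model and a smoothed document model. It covers unigram term models only; the dependency models described above are not reproduced here. The function names, the Dirichlet smoothing choice, and the value of `mu` are assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def query_model(query_tokens):
    """Maximum-likelihood unigram model of the query."""
    counts = Counter(query_tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def smoothed_doc_prob(term, doc_counts, doc_len, coll_counts, coll_len, mu):
    """Dirichlet-smoothed P(term | document); mu is an assumed parameter."""
    p_coll = coll_counts.get(term, 0) / coll_len
    return (doc_counts.get(term, 0) + mu * p_coll) / (doc_len + mu)


def kl_score(query_tokens, doc_tokens, collection, mu=10.0):
    """Negative KL(query model || document model), summed over query terms.

    Scores closer to zero indicate a better match. Query terms that never
    occur in the collection are skipped, since their smoothed probability
    would be zero.
    """
    q_model = query_model(query_tokens)
    doc_counts = Counter(doc_tokens)
    coll_counts = Counter(t for doc in collection for t in doc)
    coll_len = sum(coll_counts.values())
    score = 0.0
    for term, p_q in q_model.items():
        p_d = smoothed_doc_prob(
            term, doc_counts, len(doc_tokens), coll_counts, coll_len, mu
        )
        if p_d == 0.0:
            continue
        score -= p_q * math.log(p_q / p_d)
    return score


def rank(query_tokens, collection, mu=10.0):
    """Return document indices ordered from best to worst score."""
    scores = [kl_score(query_tokens, doc, collection, mu) for doc in collection]
    return sorted(range(len(collection)), key=lambda i: scores[i], reverse=True)


# Toy collection (hypothetical data, for illustration only).
docs = [
    ["language", "model", "retrieval"],
    ["syntactic", "parser", "grammar"],
]
ranking = rank(["language", "retrieval"], docs)
```

On this toy collection, the first document shares both query terms and is therefore ranked ahead of the second. A dependency-aware variant would presumably add analogous models over term pairs, but the abstract does not give enough detail to sketch that here.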
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cai, K., Chen, C., Bu, J., Qiu, G., Huang, P. (2007). A Multi-dependency Language Modeling Approach to Information Retrieval. In: Washio, T., et al. Emerging Technologies in Knowledge Discovery and Data Mining. PAKDD 2007. Lecture Notes in Computer Science, vol 4819. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-77018-3_48
DOI: https://doi.org/10.1007/978-3-540-77018-3_48
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-77016-9
Online ISBN: 978-3-540-77018-3