Applying Maximum Entropy to Known-Item Email Retrieval

Yahyaei, Sirvan; Monz, Christof

doi:10.1007/978-3-540-78646-7_37

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4956)

Included in the following conference series:

European Conference on Information Retrieval

Abstract

It is becoming increasingly common in information retrieval to combine evidence from multiple sources to compute the retrieval status value of documents. Although this has led to considerable improvements in several retrieval tasks, one outstanding issue is how to estimate the weights that should be associated with the different sources of evidence. In this paper we propose to use maximum entropy in combination with the limited-memory BFGS (L-BFGS) algorithm to estimate feature weights. An evaluation of our approach on the known-item finding task of the TREC Enterprise track shows that it significantly outperforms a standard retrieval baseline and leads to competitive performance.
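The abstract does not spell out the model, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a binary (relevant / not relevant) maximum entropy model, which is equivalent to logistic regression, and fits its feature weights with a small hand-written L-BFGS (two-loop recursion plus a backtracking line search). The feature columns (a bias term, a subject-field score, a body-field score), the toy training data, and the L2 penalty are all hypothetical and chosen purely for illustration.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softplus(z):
    # Numerically stable log(1 + exp(z)).
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def nll_and_grad(w, X, y, l2=0.1):
    """L2-regularised negative log-likelihood of a binary maxent model, with gradient."""
    loss = 0.5 * l2 * dot(w, w)
    grad = [l2 * wi for wi in w]
    for x, label in zip(X, y):
        z = dot(w, x)
        loss += softplus(z) - label * z
        err = sigmoid(z) - label
        for j, xj in enumerate(x):
            grad[j] += err * xj
    return loss, grad

def lbfgs(fun, w0, m=5, max_iter=100, tol=1e-6):
    """Minimise fun (returning loss and gradient) with limited-memory BFGS."""
    w = list(w0)
    loss, g = fun(w)
    history = []  # up to m most recent (s, y, rho) triples, oldest first
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            break
        # Two-loop recursion: approximate (inverse Hessian) * gradient.
        q, alphas = list(g), []
        for s, yv, rho in reversed(history):
            a = rho * dot(s, q)
            alphas.append(a)
            q = [qi - a * yi for qi, yi in zip(q, yv)]
        if history:
            s, yv, _ = history[-1]
            gamma = dot(s, yv) / dot(yv, yv)  # initial Hessian scaling
            q = [gamma * qi for qi in q]
        for (s, yv, rho), a in zip(history, reversed(alphas)):
            b = rho * dot(yv, q)
            q = [qi + (a - b) * si for qi, si in zip(q, s)]
        d = [-qi for qi in q]  # search direction
        # Backtracking line search with the Armijo condition.
        step, slope = 1.0, dot(g, d)
        while True:
            w_new = [wi + step * di for wi, di in zip(w, d)]
            loss_new, g_new = fun(w_new)
            if loss_new <= loss + 1e-4 * step * slope or step < 1e-10:
                break
            step *= 0.5
        s = [a - b for a, b in zip(w_new, w)]
        yv = [a - b for a, b in zip(g_new, g)]
        sy = dot(s, yv)
        if sy > 1e-12:  # keep only curvature pairs that preserve positive definiteness
            history = (history + [(s, yv, 1.0 / sy)])[-m:]
        w, loss, g = w_new, loss_new, g_new
    return w, loss

# Hypothetical training data: [bias, subject-field score, body-field score]
# per (query, message) pair; label 1 marks the known item, 0 any other message.
X = [[1.0, 0.9, 0.2], [1.0, 0.8, 0.7], [1.0, 0.1, 0.6],
     [1.0, 0.2, 0.1], [1.0, 0.7, 0.4], [1.0, 0.3, 0.8]]
y = [1, 1, 0, 0, 1, 0]

objective = lambda w: nll_and_grad(w, X, y)
initial_loss, _ = objective([0.0, 0.0, 0.0])
weights, final_loss = lbfgs(objective, [0.0, 0.0, 0.0])
print("weights:", [round(wi, 3) for wi in weights])
print("loss: %.4f -> %.4f" % (initial_loss, final_loss))
```

At retrieval time, the learned weights would be used to score each candidate message for a query by `sigmoid(dot(weights, features))` and to rank messages by that score. The L2 penalty in this sketch keeps the objective strictly convex, so the weights stay finite even when the toy data are linearly separable.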

Author information

Authors and Affiliations

Department of Computer Science, Queen Mary University of London, London, E1 4NS, UK
Sirvan Yahyaei & Christof Monz

Editor information

Craig Macdonald Iadh Ounis Vassilis Plachouras Ian Ruthven Ryen W. White

About this paper

Cite this paper

Yahyaei, S., Monz, C. (2008). Applying Maximum Entropy to Known-Item Email Retrieval. In: Macdonald, C., Ounis, I., Plachouras, V., Ruthven, I., White, R.W. (eds) Advances in Information Retrieval. ECIR 2008. Lecture Notes in Computer Science, vol 4956. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78646-7_37

DOI: https://doi.org/10.1007/978-3-540-78646-7_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78645-0
Online ISBN: 978-3-540-78646-7
eBook Packages: Computer Science, Computer Science (R0)
