Applying Maximum Entropy to Known-Item Email Retrieval

  • Conference paper
Advances in Information Retrieval (ECIR 2008)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4956)

Abstract

It is becoming increasingly common in information retrieval to combine evidence from multiple sources to compute the retrieval status value of documents. Although this has led to considerable improvements in several retrieval tasks, one outstanding issue is how to estimate the weights that should be associated with the different sources of evidence. In this paper we propose to use maximum entropy in combination with the limited-memory BFGS (L-BFGS) algorithm to estimate feature weights. An evaluation on the known-item finding task of the TREC Enterprise track shows that our approach significantly outperforms a standard retrieval baseline and leads to competitive performance.
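As a rough illustration of the idea in the abstract, the sketch below fits a binary maximum entropy model (equivalent to logistic regression) over per-source retrieval scores and estimates its weights with a small limited-memory BFGS routine. It is not the authors' implementation: the feature columns (a subject-field score, a body score, a bias term), the toy training data, the L2 regularization constant, and all function names are assumptions made for this example.

```python
import math

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def neg_log_likelihood(w, X, y, l2=0.1):
    """Return (loss, gradient) of an L2-regularized binary maxent model."""
    loss = 0.5 * l2 * dot(w, w)
    grad = [l2 * wi for wi in w]
    for x, label in zip(X, y):
        z = dot(w, x)
        # log(1 + exp(z)) - label * z, computed in a numerically stable way
        loss += max(z, 0.0) + math.log1p(math.exp(-abs(z))) - label * z
        p = 1.0 / (1.0 + math.exp(-z)) if z >= 0 else math.exp(z) / (1.0 + math.exp(z))
        for j, xj in enumerate(x):
            grad[j] += (p - label) * xj
    return loss, grad

def lbfgs(f, w, m=5, max_iter=100, tol=1e-6):
    """Minimize f (returning loss and gradient) with limited-memory BFGS."""
    loss, g = f(w)
    history = []  # the m most recent (s, y, rho) correction pairs, oldest first
    for _ in range(max_iter):
        if math.sqrt(dot(g, g)) < tol:
            break
        # Two-loop recursion: approximate inverse Hessian times gradient.
        q, alphas = list(g), []
        for s, yv, rho in reversed(history):
            a = rho * dot(s, q)
            alphas.append(a)
            q = [qi - a * yi for qi, yi in zip(q, yv)]
        if history:
            s, yv, _ = history[-1]
            gamma = dot(s, yv) / dot(yv, yv)
            q = [gamma * qi for qi in q]
        for (s, yv, rho), a in zip(history, reversed(alphas)):
            b = rho * dot(yv, q)
            q = [qi + (a - b) * si for qi, si in zip(q, s)]
        d = [-qi for qi in q]
        # Backtracking line search with the Armijo sufficient-decrease test.
        step, slope = 1.0, dot(g, d)
        while True:
            w_new = [wi + step * di for wi, di in zip(w, d)]
            loss_new, g_new = f(w_new)
            if loss_new <= loss + 1e-4 * step * slope or step < 1e-10:
                break
            step *= 0.5
        s = [a - b for a, b in zip(w_new, w)]
        yv = [a - b for a, b in zip(g_new, g)]
        sy = dot(s, yv)
        if sy > 1e-12:  # keep only pairs that satisfy the curvature condition
            history = (history + [(s, yv, 1.0 / sy)])[-m:]
        w, loss, g = w_new, loss_new, g_new
    return w

# Hypothetical training data: [subject_score, body_score, bias] per
# (query, message) pair; label 1 marks the known item, 0 any other message.
X = [[0.9, 0.4, 1.0], [0.8, 0.7, 1.0], [0.2, 0.6, 1.0],
     [0.1, 0.3, 1.0], [0.7, 0.2, 1.0], [0.3, 0.8, 1.0]]
y = [1, 1, 0, 0, 1, 0]

weights = lbfgs(lambda w: neg_log_likelihood(w, X, y), [0.0, 0.0, 0.0])
print("estimated feature weights:", weights)
```

At retrieval time, messages would be ranked by the weighted sum `dot(weights, features)`. The regularization term keeps the weights finite even when one feature separates the toy data perfectly.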

Editor information

Craig Macdonald, Iadh Ounis, Vassilis Plachouras, Ian Ruthven, Ryen W. White

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Yahyaei, S., Monz, C. (2008). Applying Maximum Entropy to Known-Item Email Retrieval. In: Macdonald, C., Ounis, I., Plachouras, V., Ruthven, I., White, R.W. (eds) Advances in Information Retrieval. ECIR 2008. Lecture Notes in Computer Science, vol 4956. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78646-7_37

  • DOI: https://doi.org/10.1007/978-3-540-78646-7_37

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-78645-0

  • Online ISBN: 978-3-540-78646-7

  • eBook Packages: Computer Science, Computer Science (R0)
