TNO at CLEF-2001: Comparing Translation Resources

  • Conference paper
Evaluation of Cross-Language Information Retrieval Systems (CLEF 2001)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2406)

Abstract

This paper describes the official runs of TNO TPD for CLEF-2001. We participated in the monolingual, bilingual and multilingual tasks. The main contribution of this paper is a systematic comparison of three types of translation resources for bilingual retrieval based on query translation. We compared several techniques based on machine-readable dictionaries and on statistical dictionaries generated from parallel corpora against a baseline, the Babelfish MT service, which is available on the web. The study showed that the topic set is too small to draw reliable conclusions. All three methods have the potential to reach about 90% of the monolingual baseline performance, but their effectiveness is not consistent across language pairs and topic collections. Because each of the individual methods is quite sensitive to missing translations, we tested a combination approach, which yielded consistent improvements, reaching up to 98% of the monolingual baseline.
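
The abstract does not say how the translation resources were combined. As a purely illustrative sketch, assuming each resource yields a weighted list of target-language candidates per query term, one simple combination is to normalize each resource's weights and pool them, so that a term missing from one resource can still be translated by another. The function name `combine_translations`, the data layout, and the toy Dutch-English entries are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def combine_translations(resources):
    """Pool translation candidates from several resources (illustrative only).

    resources: list of dicts mapping a source term to {target term: weight}.
    Returns a dict mapping each source term to target weights that sum to 1.
    """
    pooled = defaultdict(lambda: defaultdict(float))
    for table in resources:
        for source, targets in table.items():
            total = sum(targets.values())
            if total <= 0:
                continue  # skip entries without usable weights
            # Normalize within the resource so each one contributes equally.
            for target, weight in targets.items():
                pooled[source][target] += weight / total

    combined = {}
    for source, targets in pooled.items():
        norm = sum(targets.values())
        combined[source] = {t: w / norm for t, w in targets.items()}
    return combined


# Hypothetical toy tables standing in for the three resource types.
mrd = {"huis": {"house": 1.0, "home": 1.0}}            # dictionary-style
stat = {"huis": {"house": 0.8, "building": 0.2},       # corpus-derived
        "fiets": {"bicycle": 1.0}}
mt = {"fiets": {"bike": 1.0}}                          # single MT output

merged = combine_translations([mrd, stat, mt])
```

In this toy example, `fiets` has no entry in the dictionary-style table but still receives translations from the other two, which is the kind of gap a combination is meant to cover.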

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kraaij, W. (2002). TNO at CLEF-2001: Comparing Translation Resources. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds) Evaluation of Cross-Language Information Retrieval Systems. CLEF 2001. Lecture Notes in Computer Science, vol 2406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45691-0_6

  • DOI: https://doi.org/10.1007/3-540-45691-0_6

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44042-0

  • Online ISBN: 978-3-540-45691-9

  • eBook Packages: Springer Book Archive
