JHU/APL Experiments at CLEF: Translation Resources and Score Normalization

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2406)

Abstract

The Johns Hopkins University Applied Physics Laboratory participated in three of the five tasks of the CLEF-2001 evaluation: monolingual retrieval, bilingual retrieval, and multilingual retrieval. In this paper we describe the fundamental methods we used and present initial results from three experiments. The first investigation examines whether residual inverse document frequency can improve the term weighting methods used with a linguistically-motivated probabilistic model. The second experiment attempts to assess the benefit of various translation resources for cross-language retrieval. Our last effort aims to improve cross-collection score normalization, a task essential for the multilingual problem.

Copyright information

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

McNamee, P., Mayfield, J. (2002). JHU/APL Experiments at CLEF: Translation Resources and Score Normalization. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds) Evaluation of Cross-Language Information Retrieval Systems. CLEF 2001. Lecture Notes in Computer Science, vol 2406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45691-0_17

  • DOI: https://doi.org/10.1007/3-540-45691-0_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-44042-0

  • Online ISBN: 978-3-540-45691-9

  • eBook Packages: Springer Book Archive
