JHU/APL Experiments at CLEF: Translation Resources and Score Normalization

McNamee, Paul; Mayfield, James

doi:10.1007/3-540-45691-0_17

Conference paper
First Online: 01 January 2002

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2406)

Abstract

The Johns Hopkins University Applied Physics Laboratory participated in three of the five tasks of the CLEF-2001 evaluation: monolingual retrieval, bilingual retrieval, and multilingual retrieval. In this paper we describe the fundamental methods we used and present initial results from three experiments. The first investigation examines whether residual inverse document frequency can improve the term weighting methods used with a linguistically motivated probabilistic model. The second experiment attempts to assess the benefit of various translation resources for cross-language retrieval. Our last effort aims to improve cross-collection score normalization, a task essential for the multilingual problem.
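
The abstract does not define residual inverse document frequency. A common formulation, usually attributed to Church and Gale, takes the observed IDF of a term and subtracts the IDF that a Poisson model would predict from the term's total collection frequency. The sketch below assumes that formulation; the function and parameter names are hypothetical, and it is not necessarily the exact weighting used in the paper.

```python
import math


def idf(df: int, n_docs: int) -> float:
    """Observed inverse document frequency, in bits."""
    return -math.log2(df / n_docs)


def residual_idf(df: int, cf: int, n_docs: int) -> float:
    """Observed IDF minus the IDF expected under a Poisson model.

    df     -- number of documents containing the term
    cf     -- total occurrences of the term in the collection
    n_docs -- number of documents in the collection

    Under a Poisson model with rate cf / n_docs, the probability that a
    document contains the term at least once is 1 - exp(-cf / n_docs).
    """
    expected_doc_fraction = 1.0 - math.exp(-cf / n_docs)
    expected_idf = -math.log2(expected_doc_fraction)
    return idf(df, n_docs) - expected_idf


# A term spread evenly (one occurrence per containing document)
# versus a "bursty" term concentrated in a few documents.
even = residual_idf(df=10, cf=10, n_docs=1000)
bursty = residual_idf(df=10, cf=100, n_docs=1000)
print(f"even: {even:.3f}  bursty: {bursty:.3f}")
```

Under this formulation, a term whose occurrences cluster in few documents scores well above zero, while a term distributed roughly as the Poisson model predicts scores near zero.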

Author information

Authors and Affiliations

Applied Physics Lab, Johns Hopkins University, 11100 Johns Hopkins Road, Laurel, MD, 20723-6099, USA
Paul McNamee & James Mayfield

Authors

Paul McNamee
James Mayfield

Editor information

Editors and Affiliations

Consiglio Nazionale delle Ricerche, Istituto di Scienza e Tecnologie, Via G. Moruzzi 1, 56124, Pisa, Italy
Carol Peters
Eurospider Information Technology AG, Schaffhauserstrasse 18, 8006, Zürich, Switzerland
Martin Braschler
E.T.S.I. Industriales, Universidad Nacional de Educación a Distancia, Ciudad Universitaria s/n, 28040, Madrid, Spain
Julio Gonzalo
Informations Zentrum Sozialwissenschaften, Lennestr. 30, 53113, Bonn, Germany
Michael Kluck

About this paper

Cite this paper

McNamee, P., Mayfield, J. (2002). JHU/APL Experiments at CLEF: Translation Resources and Score Normalization. In: Peters, C., Braschler, M., Gonzalo, J., Kluck, M. (eds) Evaluation of Cross-Language Information Retrieval Systems. CLEF 2001. Lecture Notes in Computer Science, vol 2406. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45691-0_17

DOI: https://doi.org/10.1007/3-540-45691-0_17
Published: 02 August 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44042-0
Online ISBN: 978-3-540-45691-9
eBook Packages: Springer Book Archive
