Combining Query Translation and Document Translation in Cross-Language Retrieval

  • Aitao Chen
  • Fredric C. Gey
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3237)

Abstract

This paper describes monolingual, bilingual, and multilingual retrieval experiments on the CLEF 2003 test collection. It compares query translation-based multilingual retrieval with document translation-based multilingual retrieval. In the latter, documents are translated into the query language word by word, using either machine translation systems or statistical translation lexicons derived from parallel texts. On the CLEF 2003 test collection, document translation-based retrieval performs slightly better than query translation-based retrieval, and combining query translation and document translation performs better still.
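The two ideas in the abstract, word-by-word document translation and combining the two retrieval runs, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy lexicon, the function names, and the linear interpolation of scores are hypothetical, and the abstract does not state which combination method the authors actually used.

```python
# Hypothetical toy German-to-English lexicon; a real system would use a
# statistical lexicon derived from parallel text or an MT system.
LEXICON = {"haus": "house", "katze": "cat", "hund": "dog"}


def translate_document(tokens, lexicon):
    """Translate a document word by word into the query language.

    Words missing from the lexicon are kept unchanged (lowercased),
    which is one simple fallback among several possible ones.
    """
    return [lexicon.get(t.lower(), t.lower()) for t in tokens]


def combine_scores(qt_scores, dt_scores, weight=0.5):
    """Linearly interpolate per-document scores from two runs.

    qt_scores: scores from the query translation run.
    dt_scores: scores from the document translation run.
    A document missing from one run contributes 0.0 for that run.
    """
    docs = set(qt_scores) | set(dt_scores)
    return {
        d: weight * qt_scores.get(d, 0.0) + (1 - weight) * dt_scores.get(d, 0.0)
        for d in docs
    }


if __name__ == "__main__":
    print(translate_document(["Haus", "Katze", "Baum"], LEXICON))
    fused = combine_scores({"d1": 1.0, "d2": 0.0}, {"d2": 1.0, "d3": 2.0})
    # Rank documents by the fused score, highest first.
    print(sorted(fused.items(), key=lambda kv: kv[1], reverse=True))
```

In practice the scores from the two runs would need to be on a comparable scale before interpolation, for example through normalization; that step is omitted here to keep the sketch short.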

Keywords

Machine Translation, Query Expansion, English Document, Source Word, Parallel Text
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Aitao Chen (1)
  • Fredric C. Gey (2)
  1. School of Information Management and Systems, University of California at Berkeley, USA
  2. UC Data Archive & Technical Assistance (UC DATA), University of California at Berkeley, USA
