Cross-Language Retrieval via Transitive Translation

  • Chapter
Advances in Information Retrieval

Part of the book series: The Information Retrieval Series (INRE, volume 7)

Abstract

The growing availability of multilingual data in all areas of the public and private sectors is driving an increasing need for systems that facilitate access to multilingual resources. Cross-language retrieval (CLR) technology is one means of addressing this need.

A CLR system must overcome two main hurdles to retrieve effectively across languages. First, it must resolve the ambiguity that arises when mapping the meaning of text from one language to another, both within-language ambiguity and cross-language ambiguity. Second, it must incorporate multilingual resources that enable it to perform that mapping. The difficulty here is that lexical resources are limited in number, and for some language pairs virtually none exist.

This work focuses on a dictionary approach to addressing the problem of limited lexical resources. A dictionary approach is taken since bilingual dictionaries are more prevalent and simpler to apply than other resources. We show that a transitive translation approach, where a third language is employed as an interlingua between the source and target languages, is a viable means of performing CLR between languages for which no bilingual dictionary is available.
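
As a rough illustration of the transitive idea described above, the following minimal sketch chains two bilingual dictionaries through a pivot language. It is not the chapter's implementation: the toy dictionaries, the choice of Spanish, English, and French, and the function names `translate` and `transitive_translate` are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical toy dictionaries; the entries are illustrative only.
SPANISH_TO_ENGLISH: Dict[str, List[str]] = {
    "banco": ["bank", "bench"],
    "libro": ["book"],
}
ENGLISH_TO_FRENCH: Dict[str, List[str]] = {
    "bank": ["banque", "rive"],
    "bench": ["banc"],
    "book": ["livre", "réserver"],
}


def translate(terms: List[str], dictionary: Dict[str, List[str]]) -> List[str]:
    """Replace each term with all of its dictionary translations.

    Terms missing from the dictionary are passed through unchanged.
    Duplicates are dropped while the original order is preserved.
    """
    translated: List[str] = []
    for term in terms:
        for candidate in dictionary.get(term, [term]):
            if candidate not in translated:
                translated.append(candidate)
    return translated


def transitive_translate(
    query: List[str],
    source_to_pivot: Dict[str, List[str]],
    pivot_to_target: Dict[str, List[str]],
) -> List[str]:
    """Translate source -> pivot -> target using two bilingual dictionaries."""
    pivot_terms = translate(query, source_to_pivot)
    return translate(pivot_terms, pivot_to_target)


if __name__ == "__main__":
    print(transitive_translate(["banco"], SPANISH_TO_ENGLISH, ENGLISH_TO_FRENCH))
```

In this toy setup the single Spanish term "banco" yields three French candidates, which suggests how the ambiguity mentioned earlier in the abstract can compound when a query passes through an intermediate language.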

Copyright information

© 2002 Kluwer Academic Publishers

About this chapter

Cite this chapter

Ballesteros, L.A. (2002). Cross-Language Retrieval via Transitive Translation. In: Croft, W.B. (eds) Advances in Information Retrieval. The Information Retrieval Series, vol 7. Springer, Boston, MA. https://doi.org/10.1007/0-306-47019-5_8

  • DOI: https://doi.org/10.1007/0-306-47019-5_8

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-0-7923-7812-9

  • Online ISBN: 978-0-306-47019-6

  • eBook Packages: Springer Book Archive
