Abstract
The growth in availability of multi-lingual data in all areas of the public and private sector is driving an increasing need for systems that facilitate access to multi-lingual resources. Cross-language Retrieval (CLR) technology is a means of addressing this need.
A CLR system must overcome two main hurdles to effective cross-language retrieval. First, it must resolve the ambiguity that arises when mapping the meaning of text across languages, which includes both within-language ambiguity and cross-language ambiguity. Second, it must incorporate multilingual resources that enable it to perform that mapping. The difficulty here is that lexical resources are limited in number, and for some pairs of languages virtually none exist.
This work focuses on a dictionary approach to addressing the problem of limited lexical resources. A dictionary approach is taken since bilingual dictionaries are more prevalent and simpler to apply than other resources. We show that a transitive translation approach, where a third language is employed as an interlingua between the source and target languages, is a viable means of performing CLR between languages for which no bilingual dictionary is available.
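The transitive approach described above can be illustrated with a minimal sketch. It is not the chapter's implementation: the two toy dictionaries, the choice of Spanish, English, and French, and the function names `translate_term` and `transitive_translate` are all assumptions made for illustration. Each source term is looked up in a source-to-pivot dictionary, and every resulting pivot term is then looked up in a pivot-to-target dictionary.

```python
from typing import Dict, List

# Hypothetical toy dictionaries, invented purely for illustration.
SPANISH_TO_ENGLISH: Dict[str, List[str]] = {
    "banco": ["bank", "bench"],
    "rio": ["river"],
}
ENGLISH_TO_FRENCH: Dict[str, List[str]] = {
    "bank": ["banque", "rive"],
    "bench": ["banc"],
    "river": ["fleuve", "riviere"],
}


def translate_term(term: str, dictionary: Dict[str, List[str]]) -> List[str]:
    """Return every dictionary translation of a term.

    A term missing from the dictionary is passed through unchanged
    (an assumption of this sketch, not a claim about the chapter).
    """
    return dictionary.get(term, [term])


def transitive_translate(
    query: List[str],
    source_to_pivot: Dict[str, List[str]],
    pivot_to_target: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Translate each query term source -> pivot -> target.

    Returns a mapping from each source term to its de-duplicated
    target-language candidates, in first-seen order.
    """
    result: Dict[str, List[str]] = {}
    for term in query:
        candidates: List[str] = []
        for pivot_term in translate_term(term, source_to_pivot):
            for target_term in translate_term(pivot_term, pivot_to_target):
                if target_term not in candidates:
                    candidates.append(target_term)
        result[term] = candidates
    return result


if __name__ == "__main__":
    translations = transitive_translate(
        ["banco", "rio"], SPANISH_TO_ENGLISH, ENGLISH_TO_FRENCH
    )
    for source_term, target_terms in translations.items():
        print(source_term, "->", target_terms)
```

In this toy data a single source term ("banco") fans out to three target candidates after passing through the pivot language, which shows how the ambiguity named as the first hurdle can compound across two translation steps.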
Copyright information
© 2002 Kluwer Academic Publishers
About this chapter
Cite this chapter
Ballesteros, L.A. (2002). Cross-Language Retrieval via Transitive Translation. In: Croft, W.B. (eds) Advances in Information Retrieval. The Information Retrieval Series, vol 7. Springer, Boston, MA. https://doi.org/10.1007/0-306-47019-5_8
DOI: https://doi.org/10.1007/0-306-47019-5_8
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-7923-7812-9
Online ISBN: 978-0-306-47019-6
eBook Packages: Springer Book Archive