The application of the comparable corpora in Chinese-English Cross-Lingual Information Retrieval

  • Correspondence
Journal of Computer Science and Technology

Abstract

This paper proposes a novel Chinese-English Cross-Lingual Information Retrieval (CECLIR) model, PME, in which a bilingual dictionary and comparable corpora are used to translate the query terms. The proximity and mutual information of term pairs in the Chinese and English comparable corpora are employed not only to resolve translation ambiguities but also to perform query expansion, so as to deal with out-of-vocabulary terms in CECLIR. The evaluation results show that the query precision of the PME algorithm is about 84.4% of that of monolingual information retrieval.
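The abstract gives no formulas for PME, so the following is only a minimal sketch of the general idea of using mutual information in a target-language corpus to choose among dictionary translations. It is not the authors' algorithm. The function names, the co-occurrence window of 5 tokens, the use of positive pointwise mutual information, and the toy corpus are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import product


def cooccurrence_stats(docs, window=5):
    """Count terms and unordered term pairs that occur within `window` tokens."""
    term_counts = Counter()
    pair_counts = Counter()
    total = 0
    for tokens in docs:
        total += len(tokens)
        term_counts.update(tokens)
        for i, left in enumerate(tokens):
            # The slice bounds the proximity between the two terms.
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    pair_counts[frozenset((left, right))] += 1
    return term_counts, pair_counts, total


def pmi(a, b, term_counts, pair_counts, total):
    """Positive pointwise mutual information; unseen pairs score 0.0 (assumption)."""
    joint = pair_counts[frozenset((a, b))]
    if joint == 0:
        return 0.0
    value = math.log((joint * total) / (term_counts[a] * term_counts[b]))
    return max(0.0, value)


def disambiguate(candidates, term_counts, pair_counts, total):
    """Pick one translation per query term, maximizing the summed pairwise PMI."""
    best, best_score = None, float("-inf")
    for combo in product(*candidates):
        score = sum(
            pmi(a, b, term_counts, pair_counts, total)
            for i, a in enumerate(combo)
            for b in combo[i + 1 :]
        )
        if score > best_score:
            best, best_score = combo, score
    return list(best)


# Hypothetical English side of a comparable corpus, already tokenized.
docs = [
    ["bank", "interest", "rate", "loan"],
    ["bank", "interest", "rate"],
    ["river", "shore", "water"],
    ["shore", "water", "sand"],
]
stats = cooccurrence_stats(docs)
# Hypothetical dictionary candidates for a two-term Chinese query.
chosen = disambiguate([["bank", "shore"], ["interest"]], *stats)
```

In this toy corpus "bank" co-occurs with "interest" while "shore" never does, so the combination containing "bank" receives the higher score. The exhaustive search over candidate combinations is only practical for short queries; longer queries would need a greedy or pruned search.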

Author information

Corresponding author

Correspondence to Du Lin.

Additional information

This research was supported by the National Natural Science Foundation of China (No. 69983009).

DU Lin, born in 1965, is an associate professor. He received the Ph.D. degree from the Institute of Software, The Chinese Academy of Sciences in 1999, and the B.S. and M.S. degrees from Chongqing University in 1987 and 1990 respectively. His current research focuses mainly on Chinese information retrieval and multilingual information retrieval.

ZHANG Yibo was born in 1973. He received the B.S. and M.S. degrees from Central South University of Technology in 1994 and 1998 respectively. He is now a Ph.D. candidate at the Institute of Software, The Chinese Academy of Sciences. His research interests include cross-language information retrieval.

SUN Le, born in 1971, is an assistant professor. He received the B.S., M.S. and Ph.D. degrees from the Nanjing University of Science and Technology in 1991, 1995 and 1998 respectively. His current research interest is machine-assisted translation.

SUN Yufang, born in 1947, is a professor. He received the M.S. degree from the Institute of Software, The Chinese Academy of Sciences in 1983. His research interests include operating systems and Chinese information processing.

About this article

Cite this article

Du, L., Zhang, Y., Sun, L. et al. The application of the comparable corpora in Chinese-English Cross-Lingual Information Retrieval. J. Comput. Sci. & Technol. 16, 351–358 (2001). https://doi.org/10.1007/BF02948983

  • DOI: https://doi.org/10.1007/BF02948983
