Comparing the Retrieval Performance of English and Japanese Text Databases

  • H. Fujii
  • W. B. Croft
Part of the Text, Speech and Language Technology book series (TLTB, volume 11)


Text retrieval systems provide a good test-bed for language processing technologies. Any qualitative or quantitative aspect of a language, i.e., its lexicon, morphology, syntax, semantics and pragmatics, can be applied in these systems. A query, as a representation of the user’s information need, is entered into a retrieval system, and the system retrieves the relevant documents from a full-text database that may hold gigabytes of text. Information retrieval (IR) relies on the linguistic and statistical characteristics of the text. A comparative study of IR performance between two languages will therefore help us understand the role of language in the retrieval process.


Information Retrieval, Retrieval Performance, Query Expansion, Test Collection, Lexical Ambiguity





Copyright information

© Springer Science+Business Media Dordrecht 1999

