Towards Web Mining of Query Translations for Cross-Language Information Retrieval in Digital Libraries

  • Conference paper
Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access (ICADL 2003)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2911))

Abstract

This paper proposes an efficient client-server query translation approach that makes cross-language information retrieval (CLIR) services more feasible to implement in digital library (DL) systems. A centralized query translation server processes translation requests for cross-lingual queries from connected DL systems. To extract translations not covered by standard dictionaries, the server is built on a novel integration of dictionary resources and Web mining methods, including anchor-text and search-result methods. These methods exploit the large volume of multilingual, wide-ranging Web resources as live bilingual corpora to alleviate translation difficulties, and have proven particularly effective for extracting multilingual translation equivalents of query terms that contain proper names or new terminology. The proposed approach was implemented in a query translation engine called LiveTrans, which has demonstrated its feasibility in providing efficient English-Chinese CLIR services for DL systems.
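
The dictionary-first, Web-mining-fallback flow described in the abstract can be illustrated with a minimal Python sketch. It is not the LiveTrans implementation: the toy `DICTIONARY`, the function names `translate_query` and `mine_from_snippets`, and the snippet-counting heuristic are all assumptions made for illustration. The fallback only loosely mirrors the search-result method by counting which Chinese strings appear in snippets that mention the English term. The paper's actual scoring is not given in the abstract, and the anchor-text method is not shown.

```python
import re
from collections import Counter

# Hypothetical toy lexicon; a real server would load a full bilingual dictionary.
DICTIONARY = {
    "digital library": ["數位圖書館"],
    "information retrieval": ["資訊檢索"],
}

# Runs of two or more CJK ideographs, used here as crude translation candidates.
CJK_RUN = re.compile(r"[\u4e00-\u9fff]{2,}")


def mine_from_snippets(term, snippets, top_k=3):
    """Rank Chinese strings by the number of snippets that contain both
    the source term and the string (a stand-in for real Web mining)."""
    counts = Counter()
    for snippet in snippets:
        if term.lower() not in snippet.lower():
            continue  # ignore snippets that do not mention the query term
        # Count each candidate once per snippet.
        counts.update(set(CJK_RUN.findall(snippet)))
    return [candidate for candidate, _ in counts.most_common(top_k)]


def translate_query(term, snippets):
    """Try the dictionary first; fall back to mined candidates for
    terms it does not cover (e.g. proper names)."""
    key = term.lower()
    if key in DICTIONARY:
        return DICTIONARY[key]
    return mine_from_snippets(term, snippets)
```

In this sketch a DL client would send the query term to the server, which would call `translate_query` with snippets fetched from a search engine. The fetching step is omitted because the abstract does not specify it.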

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Lu, WH., Wang, JH., Chien, LF. (2003). Towards Web Mining of Query Translations for Cross-Language Information Retrieval in Digital Libraries. In: Sembok, T.M.T., Zaman, H.B., Chen, H., Urs, S.R., Myaeng, SH. (eds) Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access. ICADL 2003. Lecture Notes in Computer Science, vol 2911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24594-0_8

  • DOI: https://doi.org/10.1007/978-3-540-24594-0_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20608-8

  • Online ISBN: 978-3-540-24594-0

  • eBook Packages: Springer Book Archive
