Abstract
This paper proposes an efficient client-server query translation approach that makes cross-language information retrieval (CLIR) services more feasible to implement in digital library (DL) systems. A centralized query translation server processes translation requests for cross-lingual queries from connected DL systems. To extract translations not covered by standard dictionaries, the server is built on a novel integration of dictionary resources and Web mining methods, including anchor-text and search-result methods. These methods exploit the large volume of multilingual, wide-ranging Web resources as live bilingual corpora to alleviate translation difficulties, and have proven particularly effective for extracting multilingual translation equivalents of query terms containing proper names or new terminology. The proposed approach was implemented in a query translation engine called LiveTrans, which has demonstrated its feasibility in providing efficient English-Chinese CLIR services for DLs.
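The dictionary-first, Web-mining-fallback flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the LiveTrans implementation: the function name `translate_query`, the `mine_candidates` callback, and the toy data are all hypothetical, and a simple frequency count stands in for the scoring method, which the abstract does not specify.

```python
from collections import Counter
from typing import Callable, Dict, List


def translate_query(
    term: str,
    dictionary: Dict[str, List[str]],
    mine_candidates: Callable[[str], List[str]],
    top_k: int = 3,
) -> List[str]:
    """Return up to top_k translations for a query term.

    The dictionary is consulted first; terms it does not cover fall back
    to candidates mined from Web data (hypothetical callback).
    """
    if term in dictionary:
        return dictionary[term][:top_k]
    # Assumption: rank mined candidates by how often they were observed.
    counts = Counter(mine_candidates(term))
    return [candidate for candidate, _ in counts.most_common(top_k)]


# Toy data standing in for a bilingual dictionary and for mined
# anchor-text or search-result candidates (both invented for illustration).
toy_dictionary = {"library": ["圖書館"]}


def toy_miner(term: str) -> List[str]:
    mined = {"Yahoo": ["雅虎", "雅虎", "奇摩"]}
    return mined.get(term, [])


in_dictionary = translate_query("library", toy_dictionary, toy_miner)
mined_only = translate_query("Yahoo", toy_dictionary, toy_miner, top_k=1)
unknown = translate_query("zzz", toy_dictionary, toy_miner)
```

In a client-server deployment of the kind the abstract outlines, a function like this would sit behind the centralized server, so that connected DL systems share one dictionary and one mining back end instead of each maintaining their own.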
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lu, WH., Wang, JH., Chien, LF. (2003). Towards Web Mining of Query Translations for Cross-Language Information Retrieval in Digital Libraries. In: Sembok, T.M.T., Zaman, H.B., Chen, H., Urs, S.R., Myaeng, SH. (eds) Digital Libraries: Technology and Management of Indigenous Knowledge for Global Access. ICADL 2003. Lecture Notes in Computer Science, vol 2911. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24594-0_8
DOI: https://doi.org/10.1007/978-3-540-24594-0_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20608-8
Online ISBN: 978-3-540-24594-0
eBook Packages: Springer Book Archive