Abstract
The volume of information in languages other than English is increasing rapidly on the WWW. To access information in a language other than one's native language, we need Cross-Language Information Retrieval (CLIR). Approaches to CLIR can be classified into three categories: document translation, query translation, and interlingua matching. The dictionary-based query translation approach has been widely used by CLIR researchers. Translation ambiguity and target polysemy are the two major problems of dictionary-based CLIR. In this paper, we investigate part-of-speech and co-occurrence based disambiguation techniques for an English-Hindi CLIR system.
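The abstract does not spell out the disambiguation procedure, so the following is only a generic sketch of how co-occurrence based selection among dictionary translations is commonly set up, not the authors' method. The dictionary entries, the romanized target-language tokens, the co-occurrence counts, and the names `DICTIONARY`, `COOCCURRENCE`, and `translate_query` are all hypothetical placeholders chosen for illustration.

```python
from itertools import combinations, product

# Hypothetical toy bilingual dictionary: source term -> candidate translations.
# The romanized tokens are illustrative placeholders, not vetted translations.
DICTIONARY = {
    "bank": ["baink", "kinara"],  # two senses: financial institution, river edge
    "loan": ["rin"],
}

# Hypothetical pair counts, assumed to come from a target-language corpus.
COOCCURRENCE = {
    frozenset(["baink", "rin"]): 42,
    frozenset(["kinara", "rin"]): 1,
}


def cooccurrence(a, b):
    """Return the (symmetric) co-occurrence count of two target terms."""
    return COOCCURRENCE.get(frozenset([a, b]), 0)


def translate_query(terms):
    """Pick one translation per term so that the chosen set co-occurs most."""
    # Terms missing from the dictionary are passed through unchanged.
    candidates = [DICTIONARY.get(t, [t]) for t in terms]
    best, best_score = None, -1
    # Exhaustive search over all combinations; fine for short queries only.
    for combo in product(*candidates):
        score = sum(cooccurrence(a, b) for a, b in combinations(combo, 2))
        if score > best_score:
            best, best_score = list(combo), score
    return best


print(translate_query(["bank", "loan"]))
```

In this toy setup the ambiguous term "bank" is resolved by its companion term "loan", because the financial sense co-occurs with it far more often in the assumed counts. A part-of-speech filter, the other technique named in the abstract, could be applied beforehand to prune candidates whose tag does not match the source term.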
Copyright information
© 2009 Indian Institute of Information Technology, India
About this paper
Cite this paper
Das, S., Seetha, A., Kumar, M., Rana, J.L. (2009). Disambiguation Strategies for English-Hindi Cross Language Information Retrieval System. In: Tiwary, U.S., Siddiqui, T.J., Radhakrishna, M., Tiwari, M.D. (eds) Proceedings of the First International Conference on Intelligent Human Computer Interaction. Springer, New Delhi. https://doi.org/10.1007/978-81-8489-203-1_30
DOI: https://doi.org/10.1007/978-81-8489-203-1_30
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-8489-404-2
Online ISBN: 978-81-8489-203-1
eBook Packages: Computer Science, Computer Science (R0)