A Hybrid Approach for Transliteration of Name Entities

  • R. C. Balabantaray
  • S. Mohanty
  • R. K. Das

Abstract

Developing a system that translates one language into another is one of the most important research challenges in Artificial Intelligence (AI). In Machine Translation (MT), name entity recognition (NER) is one of the most challenging tasks. In this paper we propose a new statistical method for transliterating the identified name entities, based on linguistic knowledge of the possible conjuncts and diphthongs in the source and target languages. The work presented in this paper is part of a larger effort to develop an MT system that can handle name entities.

Keywords

Machine Translation; Target Language; Name Entity Recognition; Source Language; Statistical Machine Translation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Indian Institute of Information Technology, India 2009

Authors and Affiliations

  • R. C. Balabantaray (1)
  • S. Mohanty (2)
  • R. K. Das (2)

  1. International Institute of Information Technology, Bhubaneswar, Orissa, India
  2. Dept. of Computer Science & Application, Utkal University, Bhubaneswar, Orissa, India
