Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5459)

Abstract

Telugu is the third most spoken language in India and one of the fifteen most spoken languages in the world. However, there is no standardized, widely used input method for Telugu. Since the majority of users of Telugu typing tools on computers are familiar with English, we propose a transliteration-based text input method in which users type Telugu using Roman script. We show that a simple edit-distance-based approach can yield a lightweight system with good efficiency as a text input method. We tested the approach on three datasets: general data, countries and places, and person names. The approach worked considerably well on all three datasets and holds promise as an efficient text input method.
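The abstract names the technique but not its details, so the following is only a minimal sketch of how edit-distance matching could drive a Roman-script input method, not the authors' implementation. It assumes a word list that pairs each Telugu word with one romanized key; the three-entry `LEXICON`, its romanizations, and the `suggest` helper are hypothetical and exist purely for illustration. The distance used is the standard Levenshtein distance with unit costs.

```python
def levenshtein(a: str, b: str) -> int:
    """Standard Levenshtein distance with unit insert/delete/substitute costs."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute ca -> cb (or match)
            ))
        previous = current
    return previous[-1]


# Hypothetical toy lexicon: romanized key -> Telugu word.
# The keys are illustrative romanizations, not taken from the paper.
LEXICON = {
    "telugu": "తెలుగు",
    "hyderabad": "హైదరాబాద్",
    "amma": "అమ్మ",
}


def suggest(typed: str, lexicon=LEXICON, max_results: int = 3):
    """Rank lexicon entries by edit distance between the typed Roman
    string and each romanized key; return (telugu_word, distance) pairs."""
    scored = sorted(
        (levenshtein(typed.lower(), key), key, word)
        for key, word in lexicon.items()
    )
    return [(word, dist) for dist, _key, word in scored[:max_results]]


if __name__ == "__main__":
    # A user who drops a vowel still reaches the intended word first.
    print(suggest("telgu"))
```

Ranking by distance rather than requiring an exact key match is what lets such a system tolerate spelling variation in the Roman input; a realistic lexicon would be far larger and would likely need an index or a distance cutoff to stay responsive.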


Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sowmya, V.B., Varma, V. (2009). Transliteration Based Text Input Methods for Telugu. In: Li, W., Mollá-Aliod, D. (eds) Computer Processing of Oriental Languages. Language Technology for the Knowledge-based Economy. ICCPOL 2009. Lecture Notes in Computer Science, vol 5459. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00831-3_12

  • DOI: https://doi.org/10.1007/978-3-642-00831-3_12

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00830-6

  • Online ISBN: 978-3-642-00831-3

  • eBook Packages: Computer Science, Computer Science (R0)
