Disambiguation Based on Wordnet for Transliteration of Arabic Numerals for Korean TTS

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 3878)

Abstract

Transliterating Arabic numerals into their spoken form is not straightforward. Arabic numerals occur frequently in scientific and informative texts and carry significant meaning. Because the reading of an Arabic numeral depends largely on its context, generating the accurate pronunciation of Arabic numerals is one of the critical criteria in evaluating TTS systems. In this paper, (1) contextual, pattern, and arithmetic features are extracted from a transliterated corpus; (2) ambiguities of homographic classifiers are resolved based on the semantic relations in KorLex1.0 (Korean Lexico-Semantic Network); and (3) a classification model for accurate and efficient transliteration of Arabic numerals is proposed in order to improve Korean TTS systems. The proposed model yields 97.3% accuracy, which is 9.5% higher than that of a customized Korean TTS system.
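
To make the three feature families named in the abstract concrete, here is a minimal Python sketch of what extracting contextual, pattern, and arithmetic features for each numeral might look like. It is an illustration only: the abstract does not list the paper's actual features, so the function name `extract_features`, every feature name, and the regular expression are assumptions. The motivating case is that the same digit can be read differently depending on what follows it: in "3시" (3 o'clock) the numeral takes a native Korean reading, while in "30분" (30 minutes) it takes a Sino-Korean one.

```python
import re

# Assumption: a numeral expression is a run of ASCII digits, optionally
# joined by one of the separators . , : / - (e.g. dates, times, ratios).
NUMERAL = re.compile(r"[0-9]+(?:[.,:/-][0-9]+)*")


def extract_features(text):
    """Return one feature dict per numeral expression found in text.

    All feature names are hypothetical and chosen for illustration.
    """
    rows = []
    for m in NUMERAL.finditer(text):
        token = m.group()
        digits = re.sub(r"[^0-9]", "", token)      # digits only
        separators = re.sub(r"[0-9]", "", token)   # non-digit symbols only
        left = text[:m.start()].split()
        right = text[m.end():].split()
        # Only a plain digit run is given a numeric value here.
        value = int(digits) if not separators else None
        rows.append({
            "token": token,
            # Contextual features: neighbouring whitespace-delimited strings.
            # In Korean the classifier is often attached directly to the
            # numeral, so next_word usually starts with it.
            "prev_word": left[-1] if left else "",
            "next_word": right[0] if right else "",
            # Pattern features: surface shape of the numeral expression.
            "num_digits": len(digits),
            "leading_zero": len(digits) > 1 and token.startswith("0"),
            "separators": separators,
            # Arithmetic features: properties of the numeric value.
            "value": value,
            "in_clock_range": value is not None and 0 <= value <= 24,
        })
    return rows


if __name__ == "__main__":
    for row in extract_features("회의는 3시 30분에 시작"):
        print(row["token"], repr(row["next_word"]), row["in_clock_range"])
```

Feature dictionaries of this kind could be fed to any standard classifier. Resolving which sense an ambiguous classifier such as the string after the numeral carries is the step the abstract assigns to KorLex1.0, and it is not attempted in this sketch.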

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Jung, Y., Yoon, A., Kwon, HC. (2006). Disambiguation Based on Wordnet for Transliteration of Arabic Numerals for Korean TTS. In: Gelbukh, A. (ed.) Computational Linguistics and Intelligent Text Processing. CICLing 2006. Lecture Notes in Computer Science, vol 3878. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11671299_38

  • DOI: https://doi.org/10.1007/11671299_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-32205-4

  • Online ISBN: 978-3-540-32206-1

  • eBook Packages: Computer Science, Computer Science (R0)
