A Phonetization Approach for the Forced-Alignment Task in SPPAS

  • Brigitte Bigi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9561)


The phonetization of text corpora requires a sequence of processing steps and resources to convert a normalized text into its constituent phones, so that a given application can exploit the result directly. This paper presents a generic approach to text phonetization and concentrates on the phonetization of unknown words. The aim is to develop a phonetizer for use in forced-alignment applications. The proposed approach is dictionary-based and as language-independent as possible. It is used for French, English, Spanish, Italian, Catalan, Polish, Mandarin Chinese, Taiwanese, Cantonese and Japanese in the SPPAS software, a tool distributed under the terms of the GPL.
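To make the idea of dictionary-based phonetization concrete, here is a minimal sketch in Python. It is an illustration only, not the SPPAS implementation: the toy dictionary, its SAMPA-like symbols, the function names and the greedy left-to-right longest-match strategy for unknown words are all assumptions made for this example.

```python
# Toy pronunciation dictionary (hypothetical entries, SAMPA-like symbols).
PRON_DICT = {
    "the": "D @",
    "cat": "k { t",
    "walk": "w O: k",
    "ing": "I N",
    "s": "s",
}


def phonetize_unknown(word, pron_dict):
    """Phonetize a word missing from the dictionary.

    Assumed strategy: greedy left-to-right longest-match segmentation,
    concatenating the pronunciations of the matched segments.
    """
    phones = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            chunk = word[start:end]
            if chunk in pron_dict:
                phones.append(pron_dict[chunk])
                start = end
                break
        else:
            # No dictionary entry covers this character: skip it.
            start += 1
    return " ".join(phones)


def phonetize(text, pron_dict):
    """Phonetize a normalized, whitespace-separated text, token by token."""
    result = []
    for token in text.lower().split():
        if token in pron_dict:
            # Known word: direct dictionary lookup.
            result.append(pron_dict[token])
        else:
            # Unknown word: fall back to segment matching.
            result.append(phonetize_unknown(token, pron_dict))
    return result


print(phonetize("the cats walking", PRON_DICT))
```

In this sketch, "cats" and "walking" are absent from the dictionary and are assembled from the entries "cat" + "s" and "walk" + "ing". Because the fallback relies only on the dictionary itself, it needs no language-specific rules, which is the sense in which such an approach can stay largely language-independent.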


Keywords: Phonetization, Graphemes-phonemes, Unknown words, LRL



This work has been partly carried out thanks to the support of the French state program ORTOLANG (Ref. Nr. ANR-11-EQPX-0032) funded by the “Investissements d’Avenir” French Government program, managed by the French National Research Agency (ANR). The support is gratefully acknowledged (



Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Laboratoire Parole et Langage, CNRS, Aix-Marseille Université, Aix-en-Provence, France
