A Composition Algorithm of Compact Finite-State Super Transducers for Grapheme-to-Phoneme Conversion

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9924)

Abstract

Minimal deterministic finite-state transducers (MDFSTs) are powerful models that can be used to represent pronunciation dictionaries in a compact form. Intuitively, we would assume that by increasing the size of the dictionary, the size of the MDFSTs would increase as well. However, as we show in the paper, this intuition does not hold for highly inflected languages. With such languages the size of the MDFSTs begins to decrease once the number of words in the represented dictionary reaches a certain threshold. Motivated by this observation, we have developed a new type of FST, called a finite-state super transducer (FSST), and show experimentally that the FSST is capable of representing pronunciation dictionaries with fewer states and transitions than MDFSTs. Furthermore, we show that (unlike MDFSTs) our FSSTs can also accept words that are not part of the represented dictionary. The phonetic transcriptions of these out-of-dictionary words may not always be correct, but the observed error rates are comparable to the error rates of the traditional methods for grapheme-to-phoneme conversion.
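The compression idea behind representing a dictionary as a minimal automaton can be illustrated with a small sketch. It is not the MDFST or FSST construction from the paper. It assumes a toy, invented dictionary with a one-to-one grapheme-to-phoneme alignment, treats each (grapheme, phoneme) pair as a single symbol, builds a trie, and then merges states that accept the same set of suffixes. The function names `build_trie`, `trie_size`, and `minimized_size` and the word list are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]              # one (grapheme, phoneme) symbol
Entry = Tuple[str, List[str]]       # (word, aligned phoneme sequence)


def build_trie(entries: List[Entry]) -> Tuple[List[Dict[Pair, int]], Set[int]]:
    """Build a trie over (grapheme, phoneme) pairs; state 0 is the start state."""
    trans: List[Dict[Pair, int]] = [{}]
    finals: Set[int] = set()
    for word, phones in entries:
        # Assumption: every grapheme aligns with exactly one phoneme.
        assert len(word) == len(phones)
        state = 0
        for pair in zip(word, phones):
            if pair not in trans[state]:
                trans.append({})
                trans[state][pair] = len(trans) - 1
            state = trans[state][pair]
        finals.add(state)
    return trans, finals


def trie_size(trans: List[Dict[Pair, int]]) -> Tuple[int, int]:
    """Return (number of states, number of transitions) of the trie."""
    return len(trans), sum(len(t) for t in trans)


def minimized_size(trans: List[Dict[Pair, int]], finals: Set[int]) -> Tuple[int, int]:
    """Merge states with identical suffix sets and return (states, transitions).

    Works for acyclic automata only: a state's signature is its finality plus
    its outgoing labels paired with the merged ids of the target states.
    """
    registry: Dict[tuple, int] = {}

    def visit(state: int) -> int:
        signature = (
            state in finals,
            tuple(sorted((pair, visit(target)) for pair, target in trans[state].items())),
        )
        if signature not in registry:
            registry[signature] = len(registry)
        return registry[signature]

    visit(0)
    return len(registry), sum(len(sig[1]) for sig in registry)


# Invented toy data: two stems that share the same three endings.
toy_dictionary: List[Entry] = [
    (stem + ending, [c.upper() for c in stem + ending])
    for stem in ("mat", "rat")
    for ending in ("a", "e", "i")
]

trans, finals = build_trie(toy_dictionary)
print(trie_size(trans))               # (13, 12)
print(minimized_size(trans, finals))  # (5, 7)
```

In this toy case the two stems share all of their endings, so the merged automaton needs 5 states instead of 13. Shared inflectional endings of this kind are one plausible reason why minimal automata stay compact for highly inflected languages. The sketch does not reproduce the size decrease or the out-of-dictionary behaviour reported in the abstract.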

Corresponding author

Correspondence to Simon Dobrišek.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Golob, Ž., Žganec Gros, J., Štruc, V., Mihelič, F., Dobrišek, S. (2016). A Composition Algorithm of Compact Finite-State Super Transducers for Grapheme-to-Phoneme Conversion. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2016. Lecture Notes in Computer Science, vol 9924. Springer, Cham. https://doi.org/10.1007/978-3-319-45510-5_43

  • DOI: https://doi.org/10.1007/978-3-319-45510-5_43

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-45509-9

  • Online ISBN: 978-3-319-45510-5

  • eBook Packages: Computer Science, Computer Science (R0)
