A Composition Algorithm of Compact Finite-State Super Transducers for Grapheme-to-Phoneme Conversion

Golob, Žiga; Žganec Gros, Jerneja; Štruc, Vitomir; Mihelič, France; Dobrišek, Simon

doi:10.1007/978-3-319-45510-5_43

Conference paper
First Online: 03 September 2016

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9924)

Abstract

Minimal deterministic finite-state transducers (MDFSTs) are powerful models that can be used to represent pronunciation dictionaries in a compact form. Intuitively, we would assume that by increasing the size of the dictionary, the size of the MDFSTs would increase as well. However, as we show in the paper, this intuition does not hold for highly inflected languages. With such languages the size of the MDFSTs begins to decrease once the number of words in the represented dictionary reaches a certain threshold. Motivated by this observation, we have developed a new type of FST, called a finite-state super transducer (FSST), and show experimentally that the FSST is capable of representing pronunciation dictionaries with fewer states and transitions than MDFSTs. Furthermore, we show that (unlike MDFSTs) our FSSTs can also accept words that are not part of the represented dictionary. The phonetic transcriptions of these out-of-dictionary words may not always be correct, but the observed error rates are comparable to the error rates of the traditional methods for grapheme-to-phoneme conversion.
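The size effect described in the abstract can be illustrated with a toy sketch. It is not the authors' algorithm and it uses a plain acceptor rather than a transducer: it builds a trie over a word list, merges states with identical right languages, and counts the states and transitions of the resulting minimal acyclic automaton. The function name `minimal_acceptor_size` and the two-letter word lists are hypothetical, chosen only to show that adding a word can make the minimal automaton smaller once the word set becomes more regular.

```python
def minimal_acceptor_size(words):
    """Return (states, transitions) of the minimal acyclic DFA accepting `words`.

    Toy illustration only: an acceptor without output symbols, not an MDFST
    or FSST as used in the paper.
    """
    # Build a trie; each node is [is_final, {symbol: child_node}].
    root = [False, {}]
    for word in words:
        node = root
        for symbol in word:
            node = node[1].setdefault(symbol, [False, {}])
        node[0] = True

    registry = {}      # state signature -> canonical state id
    transitions = 0    # transitions leaving canonical states

    def canonical(node):
        nonlocal transitions
        # Two trie nodes are equivalent when they agree on finality and
        # lead to equivalent states on the same symbols.
        signature = (
            node[0],
            tuple(sorted((s, canonical(child)) for s, child in node[1].items())),
        )
        if signature not in registry:
            registry[signature] = len(registry)
            transitions += len(node[1])
        return registry[signature]

    canonical(root)
    return len(registry), transitions


partial = ["aa", "ab", "ba"]          # hypothetical incomplete word set
full = ["aa", "ab", "ba", "bb"]       # the same set with one more word

print(minimal_acceptor_size(partial))  # (4, 5)
print(minimal_acceptor_size(full))     # (3, 4)
```

With three words the two branches after the first letter differ and cannot be merged; adding the fourth word makes them identical, so the automaton loses one state and one transition even though the dictionary grew.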

Author information

Authors and Affiliations

Alpineon Research and Development, Alpineon d.o.o., Ulica Iga Grudna 15, 1000, Ljubljana, Slovenia
Žiga Golob & Jerneja Žganec Gros
Faculty of Electrical Engineering, University of Ljubljana, Tržaška 25, 1000, Ljubljana, Slovenia
Vitomir Štruc, France Mihelič & Simon Dobrišek

Corresponding author

Correspondence to Simon Dobrišek.

Editor information

Editors and Affiliations

Masaryk University, Brno, Czech Republic
Petr Sojka, Aleš Horák, Ivan Kopeček & Karel Pala

About this paper

Cite this paper

Golob, Ž., Žganec Gros, J., Štruc, V., Mihelič, F., Dobrišek, S. (2016). A Composition Algorithm of Compact Finite-State Super Transducers for Grapheme-to-Phoneme Conversion. In: Sojka, P., Horák, A., Kopeček, I., Pala, K. (eds) Text, Speech, and Dialogue. TSD 2016. Lecture Notes in Computer Science, vol 9924. Springer, Cham. https://doi.org/10.1007/978-3-319-45510-5_43

DOI: https://doi.org/10.1007/978-3-319-45510-5_43
Published: 03 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-45509-9
Online ISBN: 978-3-319-45510-5
eBook Packages: Computer Science, Computer Science (R0)
