Abstract
Advances in speech processing research rely on the availability of public resources such as corpora, statistical models, and baseline systems. In contrast to languages such as English, few such resources exist for Brazilian Portuguese. This work describes efforts to narrow this gap. Baseline acoustic models for Brazilian Portuguese were built using the CMU Sphinx toolkit and public-domain resources: speech corpora, a phonetic dictionary, and a language model. Experiments were carried out for dictation and grammar tasks, and the results can be used to support further research. Part of the trained acoustic models and a reference speech corpus were made publicly available.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oliveira, R., Batista, P., Neto, N., Klautau, A. (2012). Baseline Acoustic Models for Brazilian Portuguese Using CMU Sphinx Tools. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_42
DOI: https://doi.org/10.1007/978-3-642-28885-2_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28884-5
Online ISBN: 978-3-642-28885-2
eBook Packages: Computer Science, Computer Science (R0)