Baseline Acoustic Models for Brazilian Portuguese Using CMU Sphinx Tools

Oliveira, Rafael; Batista, Pedro; Neto, Nelson; Klautau, Aldebaro

doi:10.1007/978-3-642-28885-2_42

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7243)

Included in the following conference series:

International Conference on Computational Processing of the Portuguese Language

Abstract

Advances in speech processing research rely on the availability of public resources such as corpora, statistical models and baseline systems. In contrast to languages such as English, few such resources exist specifically for Brazilian Portuguese. This work describes efforts to narrow this gap. Baseline acoustic models for Brazilian Portuguese were built using the CMU Sphinx toolkit and public domain resources: speech corpora, a phonetic dictionary and a language model. Experiments were carried out for dictation and grammar tasks, and the results obtained can support further research. Part of the trained acoustic models and a reference speech corpus were made publicly available.

Author information

Authors and Affiliations

Signal Processing Laboratory, Federal University of Pará, Rua Augusto Correa 1, 66075-110, Belém, PA, Brazil
Rafael Oliveira, Pedro Batista, Nelson Neto & Aldebaro Klautau

Editor information

Editors and Affiliations

UFSCAR, Rod. Washington Luís, 13565-905, São Carlos, Brazil
Helena Caseli
UFRGS, Av. Bento Gonçalves, 9500, 91501-970, Porto Alegre, Brazil
Aline Villavicencio
DETI/IEETA, Universidade de Aveiro, Campus Universitário de Santiago, 3810-193, Aveiro, Portugal
António Teixeira
UC/ IT, DEEC, Universidade de Coimbra, Polo 2, 3030-290, Coimbra, Portugal
Fernando Perdigão

About this paper

Cite this paper

Oliveira, R., Batista, P., Neto, N., Klautau, A. (2012). Baseline Acoustic Models for Brazilian Portuguese Using CMU Sphinx Tools. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_42

DOI: https://doi.org/10.1007/978-3-642-28885-2_42
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28884-5
Online ISBN: 978-3-642-28885-2
eBook Packages: Computer Science; Computer Science (R0)
