Baseline Acoustic Models for Brazilian Portuguese Using CMU Sphinx Tools

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7243)

Abstract

Advances in speech processing research rely on the availability of public resources such as corpora, statistical models, and baseline systems. In contrast to languages such as English, few such resources exist for Brazilian Portuguese. This work describes efforts to narrow that gap. Baseline acoustic models for Brazilian Portuguese were built using the CMU Sphinx toolkit and public domain resources: speech corpora, a phonetic dictionary, and a language model. Experiments were carried out for dictation and grammar tasks, and the results can support further research. Part of the trained acoustic models and a reference speech corpus were made publicly available.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oliveira, R., Batista, P., Neto, N., Klautau, A. (2012). Baseline Acoustic Models for Brazilian Portuguese Using CMU Sphinx Tools. In: Caseli, H., Villavicencio, A., Teixeira, A., Perdigão, F. (eds) Computational Processing of the Portuguese Language. PROPOR 2012. Lecture Notes in Computer Science, vol 7243. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28885-2_42

  • DOI: https://doi.org/10.1007/978-3-642-28885-2_42

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28884-5

  • Online ISBN: 978-3-642-28885-2

  • eBook Packages: Computer Science, Computer Science (R0)
