Phonetic Sequence to Graphemes Conversion Based on DTW and One-Stage Algorithms

Teruszkin, Rafael; Gil Vianna Resende, Fernando

doi:10.1007/11751984_26

Rafael Teruszkin &
Fernando Gil Vianna Resende Jr.

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3960)

Included in the following conference series:

International Workshop on Computational Processing of the Portuguese Language

Abstract

This work proposes an algorithm for converting phonetic sequences into graphemes, using dynamic time warping (DTW) for the recognition of isolated words or closed sentences and the One-Stage algorithm for continuous speech recognition. Most speech recognition systems perform recognition in a single stage, without producing an intermediate phonetic sequence. The proposed solution is hybrid in the sense that it uses HMMs and Viterbi decoding to recognize a phonetic sequence (actually, a sequence of triphones) and then DTW or One-Stage to generate the corresponding graphemes. Experimental results showed an average accuracy of 100% on the recognition of closed sentences and an average word recognition rate of 84% on the continuous speech recognition task.
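
The second stage for isolated words can be illustrated with a minimal sketch: a recognized phone sequence is compared by DTW against the phonetic transcription of every entry in a lexicon, and the closest entry supplies the graphemes. This is not the authors' implementation. The function names, the 0/1 local cost, the use of plain phones instead of triphones, and the toy lexicon with its phone symbols are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def dtw_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Accumulated DTW cost between two phone sequences.

    Assumption: the local cost is 0 for identical phones and 1 otherwise.
    A real system could use a phone confusion matrix instead.
    """
    if not a or not b:
        return float("inf")
    inf = float("inf")
    # cost[i][j] holds the best accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = 0.0 if a[i - 1] == b[j - 1] else 1.0
            # Standard symmetric step pattern: vertical, horizontal, diagonal.
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[len(a)][len(b)]


def phones_to_grapheme(
    phones: Sequence[str], lexicon: Dict[str, List[str]]
) -> Tuple[str, float]:
    """Return the lexicon entry whose transcription is closest under DTW."""
    scored = [(dtw_distance(phones, ref), word) for word, ref in lexicon.items()]
    best_cost, best_word = min(scored)
    return best_word, best_cost


# Hypothetical toy lexicon: graphemes mapped to illustrative phone symbols.
LEXICON = {
    "casa": ["k", "a", "z", "a"],
    "caso": ["k", "a", "z", "u"],
    "gato": ["g", "a", "t", "u"],
}

if __name__ == "__main__":
    # A decoder output in which the first vowel was emitted twice.
    recognized = ["k", "a", "a", "z", "a"]
    print(phones_to_grapheme(recognized, LEXICON))
```

Because DTW allows one reference phone to absorb several recognized phones, the duplicated vowel in the example adds no cost, which is the property that makes it tolerant of insertions in the first-stage output. For continuous speech, the abstract names the One-Stage algorithm instead, which this sketch does not attempt to cover.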

Author information

Authors and Affiliations

Programa de Engenharia Elétrica, COPPE, UFRJ, Brazil
Rafael Teruszkin & Fernando Gil Vianna Resende Jr.
Departamento de Engenharia Eletrônica e de Computação, Escola Politécnica, UFRJ, Brazil
Fernando Gil Vianna Resende Jr.

Editor information

Editors and Affiliations

Pontifícia Universidade do Rio Grande do Sul, Porto Alegre, Brasil
Renata Vieira
Departamento de Informática, Universidade de Évora, Portugal
Paulo Quaresma
NILC-ICMC, University of São Paulo, CP 668P, 13560-970, São Carlos, SP, Brazil
Maria das Graças Volpe Nunes
L2F/INESC-ID Lisboa, Email: qa-clef@l2f.inesc-id.pt, Rua Alves Redol, 9, 1000-029, Lisboa, Portugal
Nuno J. Mamede
Instituto Militar de Engenharia, Praça General Tibúrcio, 80, Rio de Janeiro, Brazil
Cláudia Oliveira
Pontifícia Universidade Católica do Rio de Janeiro, Rua Marquês de São Vicente, 225, Rio de Janeiro, Brazil
Maria Carmelita Dias

Cite this paper

Teruszkin, R., Gil Vianna Resende, F. (2006). Phonetic Sequence to Graphemes Conversion Based on DTW and One-Stage Algorithms. In: Vieira, R., Quaresma, P., Nunes, M.d.G.V., Mamede, N.J., Oliveira, C., Dias, M.C. (eds) Computational Processing of the Portuguese Language. PROPOR 2006. Lecture Notes in Computer Science, vol 3960. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11751984_26

DOI: https://doi.org/10.1007/11751984_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34045-4
Online ISBN: 978-3-540-34046-1
eBook Packages: Computer Science, Computer Science (R0)
