Designing Syllable Models for an HMM Based Speech Recognition System

Proença, Kseniya; Demuynck, Kris; Van Compernolle, Dirk

doi:10.1007/978-3-319-43958-7_25

Conference paper
First Online: 13 August 2016

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9811)

Abstract

In this paper we present novel ways of incorporating syllable information into an HMM-based speech recognition system. Syllable-based acoustic modelling is appealing because syllables exhibit certain acoustic-phonetic dependencies that cannot be modelled in a purely phone-based system. On the other hand, syllable-based systems suffer from data sparsity. We investigate the potential of different acoustic units, such as phones, phone clusters, phones-in-syllables, demi-syllables and syllables, in combination with a variety of back-off schemes. Experimental results are presented on the Wall Street Journal database. When working with traditional frame-based features only, the results show only minor improvements. However, we expect that the developed system will show its full potential when additional segmental features are incorporated at the syllable level.

Author information

Authors and Affiliations

ESAT - PSI, KULeuven, Kasteelpark Arenberg 10, 2441, 3001, Leuven, Belgium
Kseniya Proença & Dirk Van Compernolle
ELIS, UGent, Sint-Pietersnieuwstraat 41, 9000, Ghent, Belgium
Kris Demuynck

Corresponding author

Correspondence to Kseniya Proença.

Editor information

Editors and Affiliations

SPIIRAS, Saint-Petersburg, Russia
Andrey Ronzhin
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova
Budapest University of Technology and Economics, Budapest, Hungary
Géza Németh

About this paper

Cite this paper

Proença, K., Demuynck, K., Van Compernolle, D. (2016). Designing Syllable Models for an HMM Based Speech Recognition System. In: Ronzhin, A., Potapova, R., Németh, G. (eds) Speech and Computer. SPECOM 2016. Lecture Notes in Computer Science, vol 9811. Springer, Cham. https://doi.org/10.1007/978-3-319-43958-7_25

DOI: https://doi.org/10.1007/978-3-319-43958-7_25
Published: 13 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43957-0
Online ISBN: 978-3-319-43958-7
eBook Packages: Computer Science, Computer Science (R0)
