Abstract
This chapter describes a system for the automatic labeling and segmentation of speech signals, starting from the corresponding text. The system uses continuous hidden Markov models (HMMs) to represent a predefined set of acoustic-phonetic units, and pronunciation networks to allow different phonetic realizations of a given sentence. It has been applied to an American English (TIMIT) and an Italian (APASCI) speech database.
This work is a contribution to the MAIA (Modello Avanzato di Intelligenza Artificiale, Advanced Model of Artificial Intelligence) project, which is currently under development at IRST.
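The general idea of aligning a signal against a pronunciation network can be illustrated with a minimal sketch. It is not the system described in the chapter: it assumes one state per phone with an implicit self-loop, no transition probabilities, and per-frame log-likelihoods supplied as toy numbers rather than computed from continuous HMM densities. The function name `align`, the phone labels, and the two-variant network are hypothetical and chosen only for illustration.

```python
import math


def align(loglik, nodes, arcs, starts, finals):
    """Viterbi alignment of frames against a pronunciation network.

    loglik: one dict per frame, mapping phone label -> log-likelihood
    nodes:  phone label of each network node
    arcs:   list of (src, dst) node-index pairs; self-loops are implicit
    starts, finals: node indices where a path may begin or end
    Returns a list of (phone, first_frame, last_frame) segments.
    """
    n_frames, n_nodes = len(loglik), len(nodes)
    neg_inf = -math.inf
    # Predecessors of each node: itself (self-loop) plus incoming arcs.
    preds = [[j] + [s for (s, d) in arcs if d == j] for j in range(n_nodes)]
    score = [[neg_inf] * n_nodes for _ in range(n_frames)]
    back = [[-1] * n_nodes for _ in range(n_frames)]

    for j in starts:
        score[0][j] = loglik[0][nodes[j]]
    for t in range(1, n_frames):
        for j in range(n_nodes):
            best = max(preds[j], key=lambda i: score[t - 1][i])
            if score[t - 1][best] > neg_inf:
                score[t][j] = score[t - 1][best] + loglik[t][nodes[j]]
                back[t][j] = best

    end = max(finals, key=lambda j: score[n_frames - 1][j])
    if score[n_frames - 1][end] == neg_inf:
        raise ValueError("no path through the network fits the frames")

    # Trace the best path back to the first frame.
    path = [end]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()

    # Collapse runs of the same node into labeled segments.
    segments = []
    for t, j in enumerate(path):
        if segments and segments[-1][0] == j:
            segments[-1][2] = t
        else:
            segments.append([j, t, t])
    return [(nodes[j], first, last) for j, first, last in segments]


# Hypothetical network with two pronunciation variants: d (ey | ae) t ax
nodes = ["d", "ey", "ae", "t", "ax"]
arcs = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]


def frame(best_phone, second=None):
    """Toy frame scores: -1 for the best phone, -4 for a runner-up, -5 otherwise."""
    scores = {p: -5.0 for p in nodes}
    if second is not None:
        scores[second] = -4.0
    scores[best_phone] = -1.0
    return scores


frames = [frame("d"), frame("ae", "ey"), frame("ae", "ey"),
          frame("t"), frame("ax"), frame("ax")]

segments = align(frames, nodes, arcs, starts=[0], finals=[4])
for phone, first, last in segments:
    print(phone, first, last)
```

In this toy run the search both selects a pronunciation variant (the branch through "ae" rather than "ey") and places the segment boundaries, which is the combined labeling and segmentation behaviour the abstract refers to. A realistic system would replace the toy scores with HMM state likelihoods and use multi-state phone models.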
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Angelini, B., Brugnara, F., Falavigna, D., Giuliani, D., Gretter, R., Omologo, M. (1995). Automatic Speech Labeling Using Word Pronunciation Networks and Hidden Markov Models. In: Ayuso, A.J.R., Soler, J.M.L. (eds) Speech Recognition and Coding. NATO ASI Series, vol 147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-57745-1_5
DOI: https://doi.org/10.1007/978-3-642-57745-1_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-63344-7
Online ISBN: 978-3-642-57745-1
eBook Packages: Springer Book Archive