Abstract
This paper presents automatic adjustments of the structure of Markov models, aimed either at reducing model complexity or at improving recognition performance. The modifications are tested on a 36-word vocabulary recorded by more than 500 speakers over the telephone network. Model complexity is reduced by merging similar Gaussian functions with an iterative procedure. On word-based models, this yields a 40% reduction in the number of Gaussian functions without altering recognition performance.
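The abstract does not say how similarity between Gaussian functions is measured or how a merged function is computed. As a rough illustration only, the sketch below assumes weighted diagonal-covariance Gaussians stored as `(weight, mean, var)` tuples, a variance-normalised distance between means, and a moment-matched merge. The names `merge_pair`, `distance`, `merge_similar` and the `threshold` parameter are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def merge_pair(g1, g2):
    """Moment-matched merge of two weighted diagonal Gaussians (assumed rule)."""
    w1, m1, v1 = g1
    w2, m2, v2 = g2
    w = w1 + w2
    mean = [(w1 * a + w2 * b) / w for a, b in zip(m1, m2)]
    # The merged variance preserves the second moment of the two-component mixture.
    var = [
        (w1 * (va + (a - m) ** 2) + w2 * (vb + (b - m) ** 2)) / w
        for a, b, va, vb, m in zip(m1, m2, v1, v2, mean)
    ]
    return (w, mean, var)


def distance(g1, g2):
    """Variance-normalised squared distance between means (assumed measure)."""
    _, m1, v1 = g1
    _, m2, v2 = g2
    return sum(
        (a - b) ** 2 / (0.5 * (va + vb))
        for a, b, va, vb in zip(m1, m2, v1, v2)
    )


def merge_similar(gaussians, threshold):
    """Iteratively merge the closest pair until no pair is within threshold."""
    pool = list(gaussians)
    while len(pool) > 1:
        i, j = min(
            combinations(range(len(pool)), 2),
            key=lambda ij: distance(pool[ij[0]], pool[ij[1]]),
        )
        if distance(pool[i], pool[j]) > threshold:
            break
        merged = merge_pair(pool[i], pool[j])
        pool = [g for k, g in enumerate(pool) if k not in (i, j)] + [merged]
    return pool


# Hypothetical example: two near-identical Gaussians and one distant one.
example = [(0.5, [0.0], [1.0]), (0.5, [0.2], [1.0]), (1.0, [10.0], [1.0])]
reduced = merge_similar(example, threshold=1.0)
```

In this toy example the two nearby functions are merged and the distant one is left alone, so the total weight is preserved while the number of functions drops from three to two.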
Recognition performance is improved by dynamically expanding the Markov model. This is achieved mainly by splitting the Gaussian functions that contribute most to the observation probability of the training set, and by discarding infrequently used transitions. After several iterations of the splitting and discarding operators, a 30% reduction of the word error rate is achieved using pseudo-diphone-based models.
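The splitting and discarding operators are described only at a high level here. The sketch below is one possible reading, under stated assumptions: each Gaussian is a `(weight, mean, var)` tuple with diagonal variance, a per-Gaussian `contributions` score is assumed to have been accumulated over the training set, a split perturbs the mean by a fraction `epsilon` of the standard deviation, and a transition counts as infrequently used when its probability falls below `min_prob`. All function names and parameters are hypothetical.

```python
import math


def split_gaussian(g, epsilon=0.2):
    """Split one Gaussian into two by shifting the mean by +/- epsilon * std."""
    w, mean, var = g
    shift = [epsilon * math.sqrt(v) for v in var]
    lower = (w / 2, [m - s for m, s in zip(mean, shift)], list(var))
    upper = (w / 2, [m + s for m, s in zip(mean, shift)], list(var))
    return lower, upper


def split_top_contributors(gaussians, contributions, n_split=1, epsilon=0.2):
    """Split the n_split Gaussians with the highest contribution score."""
    ranked = sorted(
        range(len(gaussians)), key=lambda k: contributions[k], reverse=True
    )
    chosen = set(ranked[:n_split])
    out = []
    for k, g in enumerate(gaussians):
        if k in chosen:
            out.extend(split_gaussian(g, epsilon))
        else:
            out.append(g)
    return out


def prune_transitions(trans, min_prob=0.01):
    """Drop transitions below min_prob, then renormalise each state's arcs."""
    pruned = {}
    for state, arcs in trans.items():
        kept = {dst: p for dst, p in arcs.items() if p >= min_prob}
        total = sum(kept.values())
        pruned[state] = (
            {dst: p / total for dst, p in kept.items()} if total > 0 else {}
        )
    return pruned


# Hypothetical example values, not taken from the paper.
mixture = [(0.7, [0.0], [4.0]), (0.3, [5.0], [1.0])]
expanded = split_top_contributors(mixture, [10.0, 2.0], n_split=1, epsilon=0.5)
transitions = {"s0": {"s0": 0.6, "s1": 0.395, "s2": 0.005}}
kept = prune_transitions(transitions, min_prob=0.01)
```

In practice the model would be re-estimated after each split-and-prune pass, so that the two halves of a split Gaussian can drift apart; that re-estimation step is outside the scope of this sketch.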
© 1992 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Jouvet, D., Mauuary, L., Monné, J. (1992). Automatic adjustments of the Markov models topology for speech recognition applications over the telephone. In: Laface, P., De Mori, R. (eds) Speech Recognition and Understanding. NATO ASI Series, vol 75. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-76626-8_4
DOI: https://doi.org/10.1007/978-3-642-76626-8_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-76628-2
Online ISBN: 978-3-642-76626-8
eBook Packages: Springer Book Archive