On the Performance of SCHMM for Isolated Word Recognition and Rejection

Teixeira, Carlos; Trancoso, Isabel; Serralheiro, Antonio

doi:10.1007/978-3-642-57745-1_16

Part of the book series: NATO ASI Series (NATO ASI F, volume 147)

Abstract

A common problem with isolated word recognition systems arises when an untrained user speaks an unwanted word that lies outside the active vocabulary. Such a word will be recognised as one of the keywords, steering the dialogue in the wrong direction. Garbage or sink models (SM) are a known technique for preventing these extraneous words from being recognised as vocabulary words. Each lexical word in the active vocabulary is represented in the recognition process by at least one word model (WM). A single SM is intended to be a general description of a wide range of lexical items: all those that do not belong to the limited active vocabulary.

Our previous work [3] indicated that multiple SMs can improve the rejection score compared with a single SM in the context of a Continuous Hidden Markov Model (CHMM) with a single observation component. This improvement is related to the vocabulary size: for very small vocabularies there is no advantage in using more than one SM, whereas for larger vocabularies better results can be achieved with multiple models. When searching for the optimal number of SMs, an upper bound seems to be imposed by the amount of speech training material available. This amount should be particularly relevant for training sink models, as they are intended to represent the whole word universe (minus the small keyword vocabulary). The parametric description provided by a single Gaussian distribution is known to be a poor model for the observation probability density function (pdf). However, owing to the restricted amount of speech training material, using mixtures of multiple Gaussians to describe the observation pdfs did not improve our results.

In the present work, we compare the performance of continuous and semi-continuous HMM (SCHMM) recognisers in dealing with the problem of word rejection. The latter type of recogniser has several advantages over the former, both when training material is scarce, which is indeed one of the critical factors in this study, and in terms of computational complexity. This approach combines a common set of pdfs in a codebook with the word or sub-word models themselves. The codebook and the models can easily be initialised and re-estimated separately, using different sets of training material, or mutually optimised using the unified modelling approach described in [1]. Separate software tools for each stage of training and testing were developed, providing a complete SCHMM recognition platform. In the present work some effort was also spent on finding how to combine the initialisation steps. The tests reported here enable us to compare CHMM and SCHMM while using multiple SMs. Another issue to be addressed is the type and amount of speech material to be used for training the SMs. HMM clustering techniques for selecting the speech material used to train each sink model, in the context of multiple sink modelling, are discussed in [2].
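
The two ideas in the abstract, a codebook of pdfs shared by all models and rejection through sink models, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the diagonal-covariance Gaussians, the toy codebook and the decision rule (reject whenever the best sink model scores at least as well as the best word model) are all assumptions made for illustration; the abstract does not specify any of them.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a diagonal-covariance Gaussian evaluated at vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p -= 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def schmm_emission(x, weights, codebook):
    """Semi-continuous observation density for one HMM state.

    Every state draws on the same codebook of Gaussians; only the
    mixture weights are specific to the state.
    """
    return sum(w * gaussian_pdf(x, m, v) for w, (m, v) in zip(weights, codebook))


def recognise_or_reject(word_scores, sink_scores):
    """Return the best keyword, or None when a sink model scores at least as well.

    Assumption: scores are log-likelihoods and the decision is a plain
    comparison of the best word model against the best sink model.
    """
    best_word = max(word_scores, key=word_scores.get)
    best_sink = max(sink_scores.values())
    return best_word if word_scores[best_word] > best_sink else None


# Hypothetical one-dimensional codebook shared by all models: (mean, variance).
codebook = [([0.0], [1.0]), ([3.0], [1.0])]
state_a = [0.9, 0.1]  # state that mostly uses the first codebook entry
state_b = [0.1, 0.9]  # state that mostly uses the second codebook entry

obs = [0.2]
p_a = schmm_emission(obs, state_a, codebook)
p_b = schmm_emission(obs, state_b, codebook)

# Made-up utterance scores: the first utterance is accepted, the second rejected.
accepted = recognise_or_reject({"yes": -10.0, "no": -12.0},
                               {"sink_1": -11.0, "sink_2": -13.0})
rejected = recognise_or_reject({"yes": -10.0, "no": -12.0},
                               {"sink_1": -9.0, "sink_2": -13.0})
```

Because the Gaussians live in the shared codebook, each state contributes only a weight vector, which is one way to see why such a recogniser needs fewer parameters per model when training material is scarce.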

References

[1] X. Huang and M. Jack, “Semi-continuous Markov models for speech signals”, Computer Speech and Language, 3, pp. 239–251, 1989.
[2] C. Teixeira and B. Lindberg, “Word Rejection Experiments on the SUNSTAR Multi-lingual Speech Database”, Proc. RecPad 92 (APRP), pp. 77–83, Coimbra, March 1992.
[3] C. Teixeira and I. Trancoso, “Word Rejection Using Multiple Sink Models”, Proc. ICSLP'92, pp. 1443–1450, Banff, October 1992.

Author information

Authors and Affiliations

INESC/IST, INESC — R. Alves Redol 9, 1000, Lisboa, Portugal
Antonio Serralheiro

Editor information

Editors and Affiliations

Department of Electronics and Technology of Computers Faculty of Sciences, University of Granada, E-18071, Granada, Spain
Antonio J. Rubio Ayuso & Juan M. López Soler

About this paper

Cite this paper

Teixeira, C., Trancoso, I., Serralheiro, A. (1995). On the Performance of SCHMM for Isolated Word Recognition and Rejection. In: Ayuso, A.J.R., Soler, J.M.L. (eds) Speech Recognition and Coding. NATO ASI Series, vol 147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-57745-1_16

DOI: https://doi.org/10.1007/978-3-642-57745-1_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-63344-7
Online ISBN: 978-3-642-57745-1
eBook Packages: Springer Book Archive
