On the Performance of SCHMM for Isolated Word Recognition and Rejection

  • Conference paper
Speech Recognition and Coding

Part of the book series: NATO ASI Series (NATO ASI F, volume 147)

Abstract

A common problem with isolated word recognition systems arises when an untrained user speaks an unwanted word that lies outside the active vocabulary. Such a word will be recognised as one of the keywords, steering the dialogue in the wrong direction. The use of garbage or sink models (SM) is a known technique for preventing such extraneous words from being recognised as vocabulary words. Each lexical word from the active vocabulary is represented in the recognition process by at least one word model (WM). A single SM is intended to be a general description of a wide range of lexical items: all those which do not belong to the limited active vocabulary. Our previous work [3] indicated that multiple SMs can improve the rejection score compared with a single SM in the context of a Continuous Hidden Markov Model (CHMM) with a single observation component. This improvement depends on the vocabulary size: for very small vocabularies there is no advantage in using more than one SM, whereas for larger vocabularies better results can be achieved with multiple models. When searching for the optimal number of SMs, an upper bound seems to be imposed by the amount of speech training material available. This amount is particularly relevant for training sink models, as they are intended to represent the whole word universe (minus the small keyword vocabulary set). The parametric description provided by a single Gaussian distribution is known to be a poor model of the observation probability density function (pdf). However, owing to the restricted amount of speech training material, using mixtures of multiple Gaussians to describe the observation pdfs did not improve our results. In the present work, we compare the performance of continuous and semi-continuous HMM (SCHMM) recognisers in dealing with the problem of word rejection.
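The competition between word models and sink models described above can be illustrated with a minimal sketch. The abstract does not spell out the decision rule, so the rule below (reject when the best sink model scores at least as well as the best word model) is an assumption, as are the function name `recognise_or_reject` and the use of per-model log-likelihood scores as input.

```python
from typing import Dict, Optional


def recognise_or_reject(word_scores: Dict[str, float],
                        sink_scores: Dict[str, float]) -> Optional[str]:
    """Return the best-scoring keyword, or None to signal rejection.

    Both arguments map a model name to a log-likelihood for the same
    utterance (hypothetical inputs; a real recogniser would obtain them
    from Viterbi or forward scoring of each model).
    """
    if not word_scores:
        raise ValueError("at least one word model score is required")

    # Best keyword hypothesis among the active vocabulary.
    best_word = max(word_scores, key=word_scores.get)

    # Best score among the sink models; with no sink model, nothing
    # can be rejected.
    best_sink = max(sink_scores.values(), default=float("-inf"))

    # Assumed rule: a sink model that matches at least as well as the
    # best word model absorbs the utterance, so the word is rejected.
    if best_sink >= word_scores[best_word]:
        return None
    return best_word
```

With several sink models, the rule is unchanged: any one of them can absorb an out-of-vocabulary word, which is one way to read the benefit of multiple SMs for larger vocabularies.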
The SCHMM recogniser has several advantages over the CHMM recogniser, both when training material is scarce, which is indeed one of the critical factors in this study, and in terms of computational complexity. The semi-continuous approach combines a common set of pdfs, held in a codebook, with the word or sub-word models themselves. The codebook and the models can easily be initialised and re-estimated separately, using different sets of training material, or mutually optimised using the unified modelling approach described in [1]. Separate software tools were developed for each stage of training and testing, providing a complete SCHMM recognition platform. In the present work some effort was also spent on finding how to combine the initialisation steps. The tests reported here enable us to compare CHMM and SCHMM while using multiple SMs. Another issue to be addressed is the type and amount of speech material to be used for training the SMs. HMM clustering techniques for selecting the speech material used to train each sink model, in the context of multiple sink modelling, are discussed in [2].
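The shared-codebook idea can be sketched as follows. In a semi-continuous HMM, every state draws on the same set of Gaussian pdfs and differs only in its mixture weights, so the codebook densities are evaluated once per frame and reused by all states of all word and sink models. The diagonal covariances, the function names, and the data layout below are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
import math
from typing import List, Sequence


def diag_gaussian_pdf(x: Sequence[float],
                      mean: Sequence[float],
                      var: Sequence[float]) -> float:
    """Density of a diagonal-covariance Gaussian at feature vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def codebook_densities(x: Sequence[float],
                       means: Sequence[Sequence[float]],
                       variances: Sequence[Sequence[float]]) -> List[float]:
    """Evaluate every codebook pdf once for the current frame."""
    return [diag_gaussian_pdf(x, m, v) for m, v in zip(means, variances)]


def state_observation_prob(densities: Sequence[float],
                           weights: Sequence[float]) -> float:
    """Observation probability of one state: a weighted sum over the
    shared codebook densities, using that state's mixture weights."""
    return sum(c * f for c, f in zip(weights, densities))
```

Because `codebook_densities` is called once per frame, each additional state costs only one weighted sum, and each state has only its weight vector to estimate. This is one way to see why the approach suits limited training material and reduces computation.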

References

  1. X. Huang and M. Jack, “Semi-continuous Markov models for speech signals”, Computer Speech and Language, 3, pp. 239–251, 1989.

  2. C. Teixeira and B. Lindberg, “Word Rejection Experiments on the SUNSTAR Multi-lingual Speech Database”, Proc. RecPad 92 (APRP), pp. 77–83, Coimbra, March 1992.

  3. C. Teixeira and I. Trancoso, “Word Rejection Using Multiple Sink Models”, Proc. ICSLP ’92, pp. 1443–1450, Banff, October 1992.

Copyright information

© 1995 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Teixeira, C., Trancoso, I., Serralheiro, A. (1995). On the Performance of SCHMM for Isolated Word Recognition and Rejection. In: Ayuso, A.J.R., Soler, J.M.L. (eds) Speech Recognition and Coding. NATO ASI Series, vol 147. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-57745-1_16

  • DOI: https://doi.org/10.1007/978-3-642-57745-1_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-63344-7

  • Online ISBN: 978-3-642-57745-1

  • eBook Packages: Springer Book Archive
