Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech

  • Conference paper
Text, Speech and Dialogue (TSD 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3658)

Abstract

This paper describes several approaches to acoustic keyword spotting (KWS), based on Gaussian mixture model (GMM) hidden Markov models (HMM) and on phoneme posterior probabilities from FeatureNet. Both context-independent and context-dependent phoneme models are used in the GMM/HMM system. The systems were trained and evaluated on informal continuous speech. We varied the complexity of the KWS recognition network and the type of phoneme models. We study the impact of these parameters on accuracy and computational complexity, and conclude that phoneme posteriors outperform the conventional GMM/HMM system.
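
To make the idea of posterior-based keyword spotting concrete, the following is a minimal sketch, not the authors' system. It assumes frame-level phoneme log-posteriors are already available (for example from a neural network classifier) and scores a keyword as the length-normalised log ratio between the best left-to-right path through the keyword's phonemes and a free phoneme loop used as a background model. The function names `keyword_score` and `frames_from_labels`, the toy phoneme set, and the 0.7/0.1 posterior values are all hypothetical; the abstract does not specify the recognition network or the scoring rule.

```python
import math


def frames_from_labels(labels, phonemes, p_dominant=0.7):
    """Build toy log-posterior frames (hypothetical helper for illustration).

    Each frame gives p_dominant to the labelled phoneme and splits the rest
    evenly among the other phonemes.
    """
    p_other = (1.0 - p_dominant) / (len(phonemes) - 1)
    return [
        {ph: math.log(p_dominant if ph == lab else p_other) for ph in phonemes}
        for lab in labels
    ]


def keyword_score(log_post, keyword):
    """Length-normalised log ratio of keyword path vs. free phoneme loop.

    log_post: list of frames, each a dict mapping phoneme -> log posterior.
    keyword:  list of phoneme labels in pronunciation order.
    Returns a value <= 0; values near 0 mean the keyword explains the
    segment almost as well as the unconstrained background.
    """
    n_frames, n_states = len(log_post), len(keyword)
    if n_frames < n_states:
        return -math.inf  # every keyword phoneme needs at least one frame

    neg_inf = -math.inf
    # best[j]: best log score of a path that is in keyword phoneme j now
    best = [neg_inf] * n_states
    best[0] = log_post[0][keyword[0]]
    for t in range(1, n_frames):
        new = [neg_inf] * n_states
        for j in range(n_states):
            stay = best[j]                          # remain in phoneme j
            move = best[j - 1] if j > 0 else neg_inf  # advance from j-1
            prev = max(stay, move)
            if prev > neg_inf:
                new[j] = prev + log_post[t][keyword[j]]
        best = new

    keyword_ll = best[-1]  # path must end in the last keyword phoneme
    # Background: free phoneme loop without penalties, i.e. best phoneme per frame
    background_ll = sum(max(frame.values()) for frame in log_post)
    return (keyword_ll - background_ll) / n_frames


if __name__ == "__main__":
    phonemes = ["k", "ae", "t", "s"]
    frames = frames_from_labels(["k", "ae", "ae", "t"], phonemes)
    print(keyword_score(frames, ["k", "ae", "t"]))  # matching keyword
    print(keyword_score(frames, ["t", "ae", "k"]))  # mismatching keyword
```

In a sketch like this, a detection would be declared when the score exceeds a threshold tuned on held-out data; the threshold trades missed keywords against false alarms.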

This work was partially supported by the EC project Augmented Multi-party Interaction (AMI), No. 506811, and by the Grant Agency of the Czech Republic under project No. 102/05/0278. Jan Černocký was supported by a post-doctoral grant of the Grant Agency of the Czech Republic, No. GA102/02/D108.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Szöke, I., Schwarz, P., Matějka, P., Burget, L., Karafiát, M., Černocký, J. (2005). Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_39

  • DOI: https://doi.org/10.1007/11551874_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-28789-6

  • Online ISBN: 978-3-540-31817-0

  • eBook Packages: Computer Science, Computer Science (R0)
