Abstract
This paper describes several approaches to acoustic keyword spotting (KWS), based on Gaussian mixture model (GMM) hidden Markov models (HMM) and on phoneme posterior probabilities from FeatureNet. Context-independent and context-dependent phoneme models are used in the GMM/HMM system. The systems were trained and evaluated on informal continuous speech. We used KWS recognition networks of different complexity and different types of phoneme models. We study the impact of these parameters on accuracy and computational complexity, and conclude that phoneme posteriors outperform the conventional GMM/HMM system.
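The abstract does not spell out how a keyword is scored from phoneme posteriors. The following is only a minimal sketch of one common formulation, not necessarily the one used in the paper: the keyword's phoneme sequence is aligned left to right against per-frame log posteriors, and the result is compared with an unconstrained background path that picks the best phoneme in every frame. The function names, the frame-by-phoneme list layout, and the per-frame normalisation are all assumptions made for illustration.

```python
import math

def keyword_log_score(log_post, keyword):
    """Best left-to-right alignment of `keyword` (phoneme indices) to all frames.

    `log_post[t][p]` is the assumed log posterior of phoneme p at frame t.
    """
    n_states = len(keyword)
    if len(log_post) < n_states:
        return -math.inf  # every keyword phoneme needs at least one frame
    # best[s]: best log score of a path that is in keyword state s at the current frame
    best = [-math.inf] * n_states
    best[0] = log_post[0][keyword[0]]
    for frame in log_post[1:]:
        new = [-math.inf] * n_states
        for s in range(n_states):
            stay = best[s]
            advance = best[s - 1] if s > 0 else -math.inf
            prev = max(stay, advance)
            if prev > -math.inf:
                new[s] = prev + frame[keyword[s]]
        best = new
    return best[-1]  # the path must end in the last keyword phoneme

def background_log_score(log_post):
    """Unconstrained phoneme loop (assumed background model): best phoneme per frame."""
    return sum(max(frame) for frame in log_post)

def keyword_confidence(log_post, keyword):
    """Per-frame log ratio of keyword path to background path (at most 0)."""
    if not log_post or not keyword:
        raise ValueError("need at least one frame and one keyword phoneme")
    diff = keyword_log_score(log_post, keyword) - background_log_score(log_post)
    return diff / len(log_post)

# Toy example: 4 frames, 3 phonemes (values are made up for illustration).
posteriors = [[0.8, 0.1, 0.1], [0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
log_post = [[math.log(p) for p in frame] for frame in posteriors]
matching = keyword_confidence(log_post, [0, 1, 2])
mismatching = keyword_confidence(log_post, [2, 1, 0])
```

In this sketch a confidence near zero means the keyword path explains the segment about as well as the free phoneme loop, and a detection would be declared when the confidence exceeds a threshold chosen on held-out data.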
This work was partially supported by the EC project Augmented Multi-party Interaction (AMI), No. 506811, and by the Grant Agency of the Czech Republic under project No. 102/05/0278. Jan Černocký was supported by a post-doctoral grant of the Grant Agency of the Czech Republic, No. GA102/02/D108.
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Szöke, I., Schwarz, P., Matějka, P., Burget, L., Karafiát, M., Černocký, J. (2005). Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_39
DOI: https://doi.org/10.1007/11551874_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28789-6
Online ISBN: 978-3-540-31817-0