Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech

Szöke, Igor; Schwarz, Petr; Matějka, Pavel; Burget, Lukáš; Karafiát, Martin; Černocký, Jan

doi:10.1007/11551874_39

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3658)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper describes several approaches to acoustic keyword spotting (KWS), based on Gaussian mixture model (GMM) hidden Markov models (HMM) and on phoneme posterior probabilities from FeatureNet. Context-independent and context-dependent phoneme models are used in the GMM/HMM system. The systems were trained and evaluated on informal continuous speech. We used KWS recognition networks of different complexity and different types of phoneme models. We study the impact of these parameters on accuracy and computational complexity, and conclude that phoneme posteriors outperform the conventional GMM/HMM system.
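
As an illustration only, the Python sketch below shows one generic way to score a keyword from per-frame phoneme posteriors: the keyword's phoneme sequence is aligned to the frames with a left-to-right Viterbi pass, and the result is normalised by a background score that takes the best phoneme in every frame. The abstract does not describe the scoring procedure, so the alignment scheme, the background normalisation, and every function name here are assumptions, not the authors' system.

```python
import math

def keyword_path_score(log_post, phones):
    """Best left-to-right alignment of `phones` over all frames.

    log_post: list of frames, each a list of log posteriors per phoneme index.
    phones:   phoneme indices spelling the keyword (hypothetical input format).
    Every phoneme must cover at least one frame.
    """
    n_frames, n_states = len(log_post), len(phones)
    if n_states == 0 or n_frames < n_states:
        return -math.inf
    # prev[s]: best log score of a path that is in state s at the previous frame
    prev = [-math.inf] * n_states
    prev[0] = log_post[0][phones[0]]
    for t in range(1, n_frames):
        cur = [-math.inf] * n_states
        for s in range(n_states):
            stay = prev[s]
            advance = prev[s - 1] if s > 0 else -math.inf
            best = max(stay, advance)
            if best > -math.inf:
                cur[s] = best + log_post[t][phones[s]]
        prev = cur
    # The path must end in the last phoneme of the keyword.
    return prev[-1]

def background_score(log_post):
    """Assumed background model: the best phoneme in each frame."""
    return sum(max(frame) for frame in log_post)

def keyword_confidence(log_post, phones):
    """Per-frame log ratio of keyword path to background; always <= 0."""
    kw = keyword_path_score(log_post, phones)
    if kw == -math.inf:
        return -math.inf
    return (kw - background_score(log_post)) / len(log_post)
```

A detection would then be declared when `keyword_confidence` exceeds a threshold tuned on held-out data; a value near zero means the keyword's phonemes were also the most probable phonemes in nearly every frame.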

This work was partially supported by EC project Augmented Multi-party Interaction (AMI), No. 506811 and Grant Agency of Czech Republic under project No. 102/05/0278. Jan Černocký was supported by post-doctoral grant of Grant Agency of Czech Republic No. GA102/02/D108.

Author information

Authors and Affiliations

Faculty of Information Technology, Brno University of Technology, Czech Republic
Igor Szöke, Petr Schwarz, Pavel Matějka, Lukáš Burget, Martin Karafiát & Jan Černocký

Editor information

Editors and Affiliations

Department of Computer Science, University of West Bohemia in Pilsen, Univerzitni 8, 30614, Plzen, Czech Republic
Václav Matoušek, Pavel Mautner & Tomáš Pavelka

Cite this paper

Szöke, I., Schwarz, P., Matějka, P., Burget, L., Karafiát, M., Černocký, J. (2005). Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech. In: Matoušek, V., Mautner, P., Pavelka, T. (eds) Text, Speech and Dialogue. TSD 2005. Lecture Notes in Computer Science, vol 3658. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11551874_39

DOI: https://doi.org/10.1007/11551874_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28789-6
Online ISBN: 978-3-540-31817-0
eBook Packages: Computer Science, Computer Science (R0)
