Whispered speech recognition based on gammatone filterbank cepstral coefficients
This paper presents results on whispered speech recognition using gammatone filterbank cepstral coefficients in speaker-dependent mode. The isolated words used in the experiments are taken from the Whi-Spe database. Recognition is based on dynamic time warping and hidden Markov models. The experiments cover four train/test scenarios: normal speech, whispered speech, and the two mismatched combinations (normal/whispered and whispered/normal). The results show a notable improvement in recognition after applying cepstral mean subtraction, especially in the mismatched train/test scenarios.
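As a rough illustration of two techniques named in the abstract, the sketch below applies cepstral mean subtraction to a feature matrix and compares two sequences with a basic dynamic time warping cost. It is not the paper's implementation: the function names, the random 13-coefficient toy features, the Euclidean local distance, and the unconstrained DTW step pattern are all assumptions made for the example. The constant offset added to the test sequence only stands in for a stationary spectral mismatch; real differences between normal and whispered speech are not a pure offset.

```python
import numpy as np


def cepstral_mean_subtraction(features):
    """Subtract the per-coefficient mean over time from a (frames, coeffs) array."""
    features = np.asarray(features, dtype=float)
    return features - features.mean(axis=0, keepdims=True)


def dtw_distance(a, b):
    """Accumulated DTW cost between two (frames, coeffs) sequences.

    Uses a Euclidean local distance and the basic three-way step pattern
    (assumed here; the paper may use different constraints).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = local + min(cost[i - 1, j],      # insertion
                                     cost[i, j - 1],      # deletion
                                     cost[i - 1, j - 1])  # match
    return cost[n, m]


# Toy data (hypothetical): 20 frames of 13 cepstral coefficients.
rng = np.random.default_rng(0)
template = rng.normal(size=(20, 13))
offset_test = template + 3.0  # same sequence with a constant cepstral offset

raw = dtw_distance(template, offset_test)
normalized = dtw_distance(cepstral_mean_subtraction(template),
                          cepstral_mean_subtraction(offset_test))
print(f"DTW cost without CMS: {raw:.3f}")
print(f"DTW cost with CMS:    {normalized:.3e}")
```

In this toy case the offset is removed entirely by the mean subtraction, so the normalized cost drops to roughly zero while the raw cost stays large. In a real recognizer the features would come from a gammatone filterbank front end, and the test utterance would be assigned the label of the template with the lowest DTW cost.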