Abstract
This paper presents the design and implementation of a noise-resilient, speaker-independent speech recognition system for isolated words. Auditory transform (AT) based features, called Cochlear Filter Cepstral Coefficients (CFCCs), are used for feature extraction. Their robustness against noise is enhanced by a wavelet-based denoising algorithm, and their robustness against variation in vocal tract length (VTL) is enhanced by the invariant-integration method. The resulting features are called Enhanced CFCC Invariant-Integration Features (ECFCCIIFs). A feature-finding neural network (FFNN) is used as the classifier for isolated word recognition. Compared with standard CFCC features, the ECFCCIIFs maintain a high recognition rate at low signal-to-noise ratios (SNRs) under both matched and mismatched conditions, and they are also more effective at high SNRs.
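To make the invariant-integration step concrete, the following is a minimal sketch of the general idea in its common monomial form: a product of subband values is averaged over a range of shifts along the subband axis, so that a small spectral translation (a rough model of a VTL change) leaves the feature nearly unchanged. The function name, the `(subband, time offset, exponent)` tuple layout, the edge clamping, and the toy data are assumptions made for illustration; the abstract does not specify the authors' configuration.

```python
import numpy as np

def invariant_integration_feature(frames, components, window):
    """Average a monomial of subband values over subband shifts.

    frames     : array of shape (T, K), nonnegative subband magnitudes
                 (hypothetical stand-in for cochlear filter outputs).
    components : list of (k, m, l) tuples: subband index, time offset,
                 and integer exponent of each factor in the monomial.
    window     : W, so that shifts run over -W..W along the subband axis.
    """
    frames = np.asarray(frames, dtype=float)
    T, K = frames.shape
    out = np.zeros(T)
    for w in range(-window, window + 1):
        monomial = np.ones(T)
        for k, m, l in components:
            band = int(np.clip(k + w, 0, K - 1))         # clamp at band edges (assumption)
            t_idx = np.clip(np.arange(T) + m, 0, T - 1)  # clamp at utterance edges (assumption)
            monomial *= frames[t_idx, band] ** l
        out += monomial
    return out / (2 * window + 1)

# Toy check: a spectral peak moved by one subband stays inside the
# integration window, so the feature value is unchanged.
frames = np.zeros((4, 12))
frames[:, 5] = 1.0
shifted = np.roll(frames, 1, axis=1)
a = invariant_integration_feature(frames, [(5, 0, 1)], window=2)
b = invariant_integration_feature(shifted, [(5, 0, 1)], window=2)
print(np.allclose(a, b))
```

In a full system, a set of such features with different component tuples would be computed per frame and passed to the classifier; how that set is chosen is outside the scope of this sketch.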
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Umarani, S.D., Wahidabanu, R.S.D., Raviram, P. (2013). Speaker Invariant and Noise Robust Speech Recognition Using Enhanced Auditory and VTL Based Features. In: Satapathy, S., Udgata, S., Biswal, B. (eds) Proceedings of the International Conference on Frontiers of Intelligent Computing: Theory and Applications (FICTA). Advances in Intelligent Systems and Computing, vol 199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35314-7_10
DOI: https://doi.org/10.1007/978-3-642-35314-7_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35313-0
Online ISBN: 978-3-642-35314-7
eBook Packages: Engineering, Engineering (R0)