Abstract
Parametric representations based on cochlear behavior have recently been used in several studies on Automatic Speech Recognition (ASR). This paper shows how an alternative solution to Lesser and Berkley's cochlea model, reported in the state of the art, can be applied to ASR tasks. A new way of constructing the filter bank in the parametric representation used to extract MFCC is proposed, and the resulting filter distribution is used to obtain a new representation of speech in the frequency domain. MFCC parameters conventionally use the Mel scale to build the filter bank; here the Mel scale function is replaced, and the center frequencies of the filter bank are derived instead from the place theory of cochlear behavior. A performance of 98.5% was reached on a task using isolated digits spoken in Spanish by 5 different speakers and neutral recordings from the SUSAS corpus, with some advantages in comparison with MFCC.
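For orientation, the conventional Mel-scale filter bank that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration, not the authors' implementation: the Mel formula used (2595 log10(1 + f/700)) is one common variant, and the function names, the number of filters, the frequency range and the number of bins are all assumptions. The abstract does not give the cochlea-based mapping, so it is not reproduced here; any list of center frequencies could be passed to `triangular_filter_bank` in place of the Mel-derived one.

```python
import math

def hz_to_mel(f_hz):
    # One common Mel-scale formula (an assumption; the paper's variant is not stated).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_center_frequencies(n_filters, f_min, f_max):
    """Center frequencies in Hz, equally spaced on the Mel scale (band edges excluded)."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_filters + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_filters)]

def triangular_filter_bank(centers, f_min, f_max, n_bins):
    """Triangular weights evaluated on n_bins frequencies linearly spaced in [f_min, f_max].

    Each filter rises from the previous center to its own center and falls
    to the next center; the first and last filters use f_min and f_max as edges.
    """
    edges = [f_min] + list(centers) + [f_max]
    freqs = [f_min + (f_max - f_min) * k / (n_bins - 1) for k in range(n_bins)]
    bank = []
    for i in range(1, len(edges) - 1):
        lo, mid, hi = edges[i - 1], edges[i], edges[i + 1]
        row = []
        for f in freqs:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank

# Hypothetical settings: 20 filters over 0-4000 Hz, 257 spectrum bins.
centers = mel_center_frequencies(n_filters=20, f_min=0.0, f_max=4000.0)
bank = triangular_filter_bank(centers, 0.0, 4000.0, n_bins=257)
```

Keeping the center frequencies as a plain argument means only `mel_center_frequencies` would need to be swapped for a different frequency mapping; the triangular filters themselves are unchanged.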
References
Noll, A.M.: Short-Time Spectrum and Cepstrum Techniques for Vocal Pitch Detection. Journal of Acoustical Society of America 36, 296–302 (1964)
Makhoul, J.: Linear Prediction: A Tutorial Review. Proceedings of the IEEE 63(4), 561–580 (1975)
Davis, S.B., Mermelstein, P.: Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences. IEEE Transactions on Acoustics, Speech and Signal Processing ASSP-28(4) (1980)
Hermansky: Perceptual Linear Predictive (PLP) analysis of speech. Journal of Acoustical Society of America, 1738–1752 (April 1990)
Kim, D.S., Lee, S.Y., Kil, R.M.: Auditory processing of speech signals for robust speech recognition in real-world noisy environments. IEEE Trans. Speech Audio Processing 7(1), 55–69 (1999)
Geisler, C.D.: A model of the effect of outer hair cell motility on cochlear vibration. Hear. Res. 24, 125–131 (1996)
Geisler, C.D., Shan, X.: A model for cochlear vibration based on feedback from motile outer hair cells. In: Dallos, P., Geisler, C.D., Matthews, J.W., Ruggero, M.A., Steele, C.R. (eds.) The Mechanics and Biophysics of Hearing, pp. 86–95. Springer, New York (1990)
Holmberg, M., Gelbart, D., Hemmert, W.: Automatic speech recognition with an adaptation model motivated by auditory processing. IEEE Trans. Audio, Speech, Language Processing 14(1), 44–49 (2006)
Haque, S., Togneri, R.: A feature extraction method for automatic speech recognition based on the cochlear nucleus. In: 11th Annual Conference of the International Speech Communication Association, Makuhari, Chiba, Japan, September 26-30 (2010)
Harczos, T., Szepannek, G., Klefenz, F.: Towards Automatic Speech Recognition based on Cochlear Traveling Wave Delay Trajectories, Auditory signal processing in hearing-impaired listeners. In: Dau, T., Buchholz, J.M., Harte, J.M., Christiansen, T.U. (eds.) 1st International Symposium on Auditory and Audiological Research, ISAAR 2007 (2007) ISBN: 87-990013-1-4. Print: Centertryk A/S
de Boer, S.E.: Mechanics of the cochlea: modeling effects. In: Dallos, P., Fay, R.R. (eds.) The Cochlea, ch. 5. Springer, USA (1996)
Robles, L., Ruggero, M.A.: Mechanics of the Mammalian Cochlea. Physiological Reviews 81(3) (July 2001), Printed in USA
Peterson, Bogert: A dynamical theory of the cochlea. Journal of the Acoustical Society of America 22(3), 369–381 (1950)
Keener, J., Sneyd, J.: Mathematical Physiology. Springer, USA (2008)
Elliott, S.J., Ku, E.M., Lineton, B.A.: A state space model for cochlear mechanics. Journal of Acoustical Society of America 122, 2759–2771 (2007)
Elliott, S.J., Lineton, B., Ni, G.: Fluid coupling in a discrete model of cochlear mechanics. Journal of Acoustical Society of America 130, 1441–1451 (2011)
Ku, E.M., Elliott, S.J., Lineton, B.A.: Statistics of instabilities in a state space model of the human cochlea. Journal of Acoustical Society of America 124, 1068–1079 (2008)
Neely, S.T.: A model for active elements in cochlear biomechanics. Journal of Acoustical Society of America 79, 1472–1480 (1986)
Békésy: Concerning the pleasures of observing and the mechanics of the inner ear, Nobel Lecture (December 11, 1961)
Jiménez-Hernández, M., Oropeza-Rodríguez, J.L., Suárez-Guerra, S., Barrón-Fernández, R.: Computational Model of the Cochlea using Resonance Analysis. Revista Mexicana de Ingeniería Biomédica 33(2), 77–86 (2012)
Jiménez-Hernández, M.: Modelo mecánico acústico del oído interno en reconocimiento de voz. Ph.D. Thesis, Center for Computing Research-IPN (June 2013)
Lesser, M.B., Berkley, D.A.: Fluid mechanics of the cochlea. Journal of Fluid Mechanics 51(Pt. 3), 497–512 (1972)
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Oropeza-Rodríguez, J.L., Suárez-Guerra, S., Jiménez-Hernández, M. (2014). The Place Theory as an Alternative Solution in Automatic Speech Recognition Tasks. In: Bayro-Corrochano, E., Hancock, E. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2014. Lecture Notes in Computer Science, vol 8827. Springer, Cham. https://doi.org/10.1007/978-3-319-12568-8_21
DOI: https://doi.org/10.1007/978-3-319-12568-8_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-12567-1
Online ISBN: 978-3-319-12568-8
eBook Packages: Computer Science (R0)