A Speaker Recognition System Based on an Auditory Model and Neural Nets: Performance at Different Levels of Sound Pressure and of Gaussian White Noise

  • Ernesto A. Martínez–Rams
  • Vicente Garcerán–Hernández
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6687)


This paper assesses a speaker recognition system that combines an auditory model, based on a human nonlinear cochlear filter bank, with neural networks. The system's performance in speaker recognition tasks was tested at different sound pressure levels of the voice and at different levels of Gaussian white noise. The auditory model yields five psychophysical parameters, which are used to train a neural network. The test material consisted of a number of Spanish words from the 'Ahumada' database, uttered by native male speakers.
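The abstract does not describe how the different sound pressure levels and noise levels were produced. As a rough sketch only, one common way to prepare such test conditions is to rescale a signal to a target RMS level and then add zero-mean Gaussian white noise at a chosen signal-to-noise ratio. The function names, the 20 µPa reference pressure, and the choice to treat samples as pressures in pascals are assumptions made for illustration, not details taken from the paper.

```python
import math
import random

# Assumed calibration: samples are pressures in pascals, 0 dB SPL = 20 micropascals.
P_REF = 20e-6


def rms(x):
    """Root-mean-square value of a sequence of samples."""
    return math.sqrt(sum(v * v for v in x) / len(x))


def scale_to_spl(signal, level_db_spl):
    """Rescale `signal` so its RMS corresponds to `level_db_spl` (hypothetical helper)."""
    target_rms = P_REF * 10 ** (level_db_spl / 20)
    gain = target_rms / rms(signal)
    return [gain * v for v in signal]


def add_white_noise(signal, snr_db, rng=None):
    """Add zero-mean Gaussian white noise at the requested SNR in dB (hypothetical helper)."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    noise_rms = rms(signal) / 10 ** (snr_db / 20)
    return [v + rng.gauss(0.0, noise_rms) for v in signal]


# Stand-in for a speech recording: one second of a 440 Hz tone at 16 kHz.
fs = 16000
tone = [math.sin(2 * math.pi * 440 * n / fs) for n in range(fs)]

at_60 = scale_to_spl(tone, 60.0)             # presented at 60 dB SPL
noisy = add_white_noise(at_60, snr_db=10.0)  # 10 dB SNR condition

# Measure the SNR actually obtained from the added noise samples.
noise = [n - s for n, s in zip(noisy, at_60)]
measured_snr_db = 20 * math.log10(rms(at_60) / rms(noise))
```

Sweeping `level_db_spl` and `snr_db` over a grid would yield one test condition per pair; the levels actually used in the paper are not stated in this abstract.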


Keywords: Hair Cell, Speaker Recognition, Speaker Modeling, Inner Hair Cell, Auditory Model





Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Ernesto A. Martínez–Rams (1)
  • Vicente Garcerán–Hernández (2)
  1. Universidad de Oriente, Santiago de Cuba, Cuba
  2. Universidad Politécnica de Cartagena, Cartagena, Murcia, Spain
