Biologically inspired speaker verification using Spiking Self-Organising Map

  • Tariq Tashan
  • Tony Allen
  • Lars Nolle
Conference paper


This paper presents a speaker verification system that uses a self-organising map composed of spiking neurons. The architecture of the system is inspired by the biomechanics of the human auditory system, which converts speech into electrical spikes inside the cochlea. A spike-based rank order coding input feature vector is proposed, designed to be representative of the real biological spike trains found within the human auditory nerve. The Spiking Self-Organising Map (SSOM) updates its winner neuron only when that neuron's activity exceeds a specified threshold. The algorithm is evaluated using 50 speakers from the Centre for Spoken Language Understanding (CSLU2002) speaker verification database and achieves a speaker verification performance of 90.1%. This compares favourably with a previous non-spiking self-organising map that used a Discrete Fourier Transform (DFT)-based input feature vector with the same dataset.
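The two mechanisms named in the abstract, rank order coding of the input and a winner update gated by an activity threshold, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the activity function or the update rule, so the geometric rank weighting (`mod ** rank`), the learning rate `lr`, and the function names `rank_order_code`, `activity` and `train_step` are all assumptions made for illustration.

```python
from typing import List, Tuple


def rank_order_code(features: List[float]) -> List[int]:
    """Rank each input channel: 0 for the strongest (earliest spike)."""
    order = sorted(range(len(features)), key=lambda i: -features[i])
    ranks = [0] * len(features)
    for rank, channel in enumerate(order):
        ranks[channel] = rank
    return ranks


def activity(weights: List[float], ranks: List[int], mod: float) -> float:
    """Assumed activity: weights attenuated geometrically by spike rank."""
    return sum(w * mod ** r for w, r in zip(weights, ranks))


def train_step(som: List[List[float]], features: List[float],
               threshold: float, lr: float = 0.5,
               mod: float = 0.5) -> Tuple[int, bool]:
    """Find the winner; update it only if its activity exceeds threshold."""
    ranks = rank_order_code(features)
    acts = [activity(w, ranks, mod) for w in som]
    winner = max(range(len(som)), key=acts.__getitem__)
    if acts[winner] <= threshold:
        return winner, False  # winner too weakly activated: no learning
    # Assumed update rule: move the winner towards the rank-coded pattern.
    target = [mod ** r for r in ranks]
    som[winner] = [w + lr * (t - w) for w, t in zip(som[winner], target)]
    return winner, True


if __name__ == "__main__":
    som = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    print(train_step(som, [0.2, 0.9, 0.5], threshold=0.5))
    print(som)
```

In this sketch only the order of the channel magnitudes matters, not their absolute values, which is the defining property of rank order coding; the threshold simply prevents a weakly matching winner from being pulled towards an input it does not represent well.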


Keywords (added by machine, not by the authors): Hair Cell; Discrete Fourier Transform; Auditory Nerve; Basilar Membrane; Speaker Recognition





Copyright information

© Springer-Verlag London 2012

Authors and Affiliations

  1. Nottingham Trent University, Nottingham, UK
