Abstract
This paper presents a speaker verification system that uses a self organising map composed of spiking neurons. The architecture of the system is inspired by the biomechanical process in the human auditory system that converts speech into electrical spikes inside the cochlea. A spike-based rank order coding input feature vector is proposed, designed to be representative of the real biological spike trains found within the human auditory nerve. The Spiking Self Organising Map (SSOM) updates its winner neuron only when that neuron's activity exceeds a specified threshold. The algorithm is evaluated using 50 speakers from the Centre for Spoken Language Understanding (CSLU2002) speaker verification database and achieves a speaker verification performance of 90.1%. This compares favourably with a previous non-spiking self organising map that used a Discrete Fourier Transform (DFT)-based input feature vector on the same dataset.
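The abstract does not give the encoding or update equations, so the following is only a minimal sketch of the two ideas it names: a rank order code for an input vector and a winner update gated by an activity threshold. Everything specific here is an assumption made for illustration and not the authors' method: the function names, the geometric modulation factor `mod`, the dot-product activity measure, and the learning rate `lr`.

```python
def rank_order_code(intensities, mod=0.9):
    """Encode channels by assumed firing order (hypothetical scheme).

    The strongest channel is assumed to spike first and receives
    mod**0, the next strongest mod**1, and so on.
    """
    order = sorted(range(len(intensities)), key=lambda i: -intensities[i])
    code = [0.0] * len(intensities)
    for rank, channel in enumerate(order):
        code[channel] = mod ** rank
    return code


def activity(weights, code):
    """Neuron activity, assumed here to be a plain dot product."""
    return sum(w * c for w, c in zip(weights, code))


def train_step(som, code, threshold, lr=0.1):
    """Pick the most active neuron; update it only above the threshold.

    Returns (winner_index, updated), where `updated` tells whether the
    winner's weights were moved towards the input code.
    """
    acts = [activity(w, code) for w in som]
    winner = max(range(len(som)), key=acts.__getitem__)
    if acts[winner] > threshold:
        som[winner] = [w + lr * (c - w) for w, c in zip(som[winner], code)]
        return winner, True
    return winner, False


if __name__ == "__main__":
    # Toy map with two neurons and a three-channel input (made-up values).
    som = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
    code = rank_order_code([0.2, 0.9, 0.5], mod=0.5)
    print(code)
    print(train_step(som, code, threshold=0.5))
```

In this sketch a winner whose activity stays at or below `threshold` is left untouched, which is one plausible reading of the gating described in the abstract.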
Copyright information
© 2012 Springer-Verlag London
About this paper
Cite this paper
Tashan, T., Allen, T., Nolle, L. (2012). Biologically inspired speaker verification using Spiking Self-Organising Map. In: Bramer, M., Petridis, M. (eds) Research and Development in Intelligent Systems XXIX. SGAI 2012. Springer, London. https://doi.org/10.1007/978-1-4471-4739-8_1
DOI: https://doi.org/10.1007/978-1-4471-4739-8_1
Publisher Name: Springer, London
Print ISBN: 978-1-4471-4738-1
Online ISBN: 978-1-4471-4739-8
eBook Packages: Computer Science (R0)