International Journal of Speech Technology, Volume 19, Issue 3, pp 457–465

Performance of speaker identification using CSM and TM



The main objective of this paper is to develop a speaker identification system. Speaker identification is a technology that allows a computer to automatically identify the person who is speaking, based on information in the speech signal. One of the most difficult problems in speaker recognition is dealing with noise. The performance of speaker recognition using a close-speaking microphone (CSM) degrades in the presence of background noise. To overcome this problem, a throat microphone (TM) is used; its transducer is held at the throat, yielding a clean signal that is unaffected by background noise. Four acoustic features are extracted: linear prediction coefficients (LPC), linear prediction cepstral coefficients (LPCC), Mel frequency cepstral coefficients (MFCC), and relative spectral transform-perceptual linear prediction (RASTA-PLP). These features are classified using a radial basis function neural network (RBFNN) and an autoassociative neural network (AANN), and their performance is analyzed. A new method is proposed for identifying speakers in clean and noisy conditions using combined CSM and TM. The combined system achieves higher identification performance than either individual system because of the complementary nature of CSM and TM.
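The abstract does not say how the CSM and TM systems are combined. As an illustration only, the sketch below assumes one common approach, weighted score-level fusion: each system produces a per-speaker score, the scores are min-max normalized, and a weighted sum picks the speaker. The function names, the `weight` parameter, and the example scores are all hypothetical and are not taken from the paper.

```python
def normalize(scores):
    """Min-max normalize a dict of speaker -> raw score into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores equal: no speaker is preferred by this system.
        return {spk: 0.5 for spk in scores}
    return {spk: (s - lo) / (hi - lo) for spk, s in scores.items()}


def fuse_and_identify(csm_scores, tm_scores, weight=0.5):
    """Fuse per-speaker scores from two systems and return the best speaker.

    weight is the share given to the CSM system (assumed, not from the paper);
    the TM system receives 1 - weight.
    """
    if set(csm_scores) != set(tm_scores):
        raise ValueError("both systems must score the same set of speakers")
    csm = normalize(csm_scores)
    tm = normalize(tm_scores)
    fused = {spk: weight * csm[spk] + (1 - weight) * tm[spk] for spk in csm}
    best = max(fused, key=fused.get)
    return best, fused


# Hypothetical classifier scores for three enrolled speakers.
csm_scores = {"spk_a": 0.62, "spk_b": 0.60, "spk_c": 0.20}  # noisy CSM: near tie
tm_scores = {"spk_a": 0.30, "spk_b": 0.75, "spk_c": 0.25}   # TM: clear winner

best, fused = fuse_and_identify(csm_scores, tm_scores)
print(best, fused)
```

In this made-up case the CSM system alone narrowly favors `spk_a`, while the fused score favors `spk_b`, which is the kind of complementary behavior a combined system could exploit.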


Autoassociative neural network; Radial basis function neural network; Linear prediction coefficients; Linear prediction cepstral coefficients; Mel frequency cepstral coefficients; Relative spectral transform perceptual linear prediction



Copyright information

© Springer Science+Business Media New York 2016

Authors and Affiliations

  1. Department of CSE, Annamalai University, Chidambaram, India
