A Robust Wavelet Based Decomposition and Multilayer Neural Network for Speaker Identification

  • M. D. Pawar
  • Rajendra Kokate
Conference paper
Part of the Lecture Notes in Networks and Systems book series (LNNS, volume 65)


This paper addresses the problem of identifying a speaker from his or her voice. We propose a new feature extraction methodology based on the speaker's pitch, the stationary wavelet transform, and multilayered neural networks. The methodology is designed for text-dependent speaker identification. The wavelet analysis stage covers stationary, continuous, and discrete wavelet analysis. The classification module comprises an artificial neural network and a general regression neural network, with the decision formed through a majority scheme over the test/train results. Performance is tested on a recorded database in both text-dependent and text-independent settings. The stationary wavelet combined with the multilayered neural network shows better accuracy and faster identification time than the traditional MFCC, discrete wavelet transform, and continuous wavelet transform approaches.
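As a rough illustration of the kind of pipeline the abstract describes, the sketch below computes stationary (undecimated) wavelet sub-band energies for one speech frame and combines per-frame decisions by majority vote. It is a minimal sketch under stated assumptions, not the authors' implementation: the Haar filter, the circular boundary handling, the averaging normalisation (library implementations typically use orthonormal filter scaling instead), the log-energy features, and the names `haar_swt`, `subband_log_energies`, and `majority_vote` are all illustrative choices that do not come from the paper. The neural network classifier itself is omitted; it would take the feature vectors as input.

```python
import math
from collections import Counter


def haar_swt(signal, levels):
    """Undecimated Haar wavelet transform (a trous scheme, circular boundary).

    Assumption: averaging normalisation (divide by 2), so that at each level
    approximation + detail reproduces the previous approximation exactly.
    Returns (approximation, [detail_1, ..., detail_levels]); every band keeps
    the input length because nothing is downsampled.
    """
    n = len(signal)
    approx = [float(v) for v in signal]
    details = []
    for level in range(levels):
        step = 2 ** level  # filter taps are spread apart by 2**level samples
        shifted = [approx[(i + step) % n] for i in range(n)]
        detail = [(a - b) / 2.0 for a, b in zip(approx, shifted)]
        approx = [(a + b) / 2.0 for a, b in zip(approx, shifted)]
        details.append(detail)
    return approx, details


def subband_log_energies(frame, levels=3):
    """Hypothetical feature vector: log mean energy of each sub-band."""
    approx, details = haar_swt(frame, levels)
    bands = details + [approx]
    # The small constant avoids log(0) for silent bands.
    return [math.log(sum(v * v for v in band) / len(band) + 1e-12)
            for band in bands]


def majority_vote(frame_labels):
    """Return the speaker label predicted for the most frames."""
    return Counter(frame_labels).most_common(1)[0][0]


if __name__ == "__main__":
    # Synthetic 64-sample frame standing in for a windowed speech segment.
    frame = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
    print(subband_log_energies(frame, levels=3))
    # Labels as they might come from a per-frame classifier (made up here).
    print(majority_vote(["spk1", "spk2", "spk1"]))
```

Because the transform is not downsampled, each sub-band has as many samples as the input frame, which makes the features insensitive to where the frame boundary falls; this shift invariance is the usual motivation for preferring the stationary transform over the decimated discrete wavelet transform.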


Speaker identification and recognition; Stationary wavelet; Neural networks methodology



Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  1. Department of Electronics and Telecommunication Engineering, Maharashtra Institute of Technology, Aurangabad, India
  2. Department of Instrumentation Engineering, Government College of Engineering, Jalgaon, India
