Robust noise MKMFCC–SVM automatic speaker identification

International Journal of Speech Technology

Abstract

This paper proposes a noise-robust automatic speaker identification (ASI) scheme named MKMFCC–SVM. It is based on the Multiple Kernel Weighted Mel Frequency Cepstral Coefficient (MKMFCC) and the support vector machine (SVM). First, the MKMFCC extracts features from degraded audio, using multiple kernels, such as the exponential and tangential kernels, to weight the MFCCs. Second, the extracted features are classified with the SVM. A comparative study is performed between the proposed MKMFCC–SVM and the MFCC–SVM ASI schemes, using the MKMFCC and MFCCs with five schemes for extracting features from telephone-like and noise-degraded audio signals. Experimental tests show that the proposed MKMFCC–SVM ASI scheme yields a higher identification rate in the presence of noise or degradation.
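The kernel-weighting step described above can be illustrated with a minimal sketch. The abstract does not give the kernel formulas, so everything specific here is an assumption made for illustration: the exponential kernel is taken as a decaying exponential of the coefficient index, the tangential kernel as a hyperbolic tangent of the index, and the two are blended with a hypothetical mixing parameter `alpha`. The function names and the sample frame values are also invented.

```python
import math

def exponential_kernel(index, count, rate=1.0):
    """Assumed form: weight decays exponentially with the coefficient index."""
    return math.exp(-rate * index / count)

def tangential_kernel(index, count, slope=1.0):
    """Assumed form: hyperbolic-tangent weight that grows with the index."""
    return math.tanh(slope * (index + 1) / count)

def weight_mfcc(mfcc, alpha=0.5):
    """Apply a blend of the two assumed kernel weights to one MFCC frame.

    alpha is a hypothetical mixing parameter: 1.0 uses only the
    exponential kernel, 0.0 uses only the tangential kernel.
    """
    count = len(mfcc)
    weighted = []
    for index, coeff in enumerate(mfcc):
        w = (alpha * exponential_kernel(index, count)
             + (1 - alpha) * tangential_kernel(index, count))
        weighted.append(w * coeff)
    return weighted

# Made-up MFCC values for a single frame, for illustration only.
frame = [12.0, -3.5, 1.25, 0.5]
features = weight_mfcc(frame)
```

In a full pipeline, weighted vectors like `features` would be computed per frame and passed to an SVM classifier for the identification step; that stage is not shown here.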



Author information

Corresponding author

Correspondence to Osama S. Faragallah.

About this article

Cite this article

Faragallah, O.S. Robust noise MKMFCC–SVM automatic speaker identification. Int J Speech Technol 21, 185–192 (2018). https://doi.org/10.1007/s10772-018-9494-9

  • DOI: https://doi.org/10.1007/s10772-018-9494-9
