Closed-Set Text-Independent Automatic Speaker Recognition System Using VQ/GMM

  • Bidhan Barai
  • Debayan Das
  • Nibaran Das
  • Subhadip Basu
  • Mita Nasipuri
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 695)

Abstract

Automatic speaker recognition (ASR) is a form of biometric recognition of humans, known as voice biometric recognition. Among the many acoustic features available, Mel-Frequency Cepstral Coefficients (MFCCs) and Gammatone Frequency Cepstral Coefficients (GFCCs) are widely used in ASR. State-of-the-art modeling and classification techniques include Vector Quantization (VQ), Gaussian Mixture Models (GMMs), Hidden Markov Models (HMMs), Artificial Neural Networks (ANNs), and Deep Neural Networks (DNNs). In this paper, we report experimental results on three databases, namely Hyke-2011, ELSDSR, and IITG-MV SR Phase-I, using MFCCs and VQ/GMM, with the maximum log-likelihood (MLL) scoring technique used to recognize speakers. We also analyze the effect of the number of Gaussian components and of the Mel-scale filter bank’s minimum frequency. By properly adjusting the number of Gaussian components and the minimum frequency, accuracy increased by 10–20% in noisy environments.
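
As a rough illustration of the maximum log-likelihood (MLL) scoring mentioned above, the following minimal sketch scores a set of feature frames against one diagonal-covariance GMM per enrolled speaker and picks the speaker with the highest total log-likelihood. It is not the authors' implementation: the function names, the two-dimensional toy "features", and the hand-set model parameters are all hypothetical, and in practice the frames would be MFCC vectors and the GMMs would be trained (for example with VQ initialization followed by EM).

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Sum over frames of log p(x_t | model).

    gmm is a list of (weight, mean, var) tuples, one per Gaussian component.
    """
    total = 0.0
    for x in frames:
        # Log-sum-exp over mixture components for numerical stability.
        terms = [math.log(w) + log_gaussian_diag(x, mu, var) for w, mu, var in gmm]
        peak = max(terms)
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total

def identify_speaker(frames, models):
    """Closed-set decision: return the enrolled speaker with the highest score."""
    scores = {name: gmm_log_likelihood(frames, gmm) for name, gmm in models.items()}
    return max(scores, key=scores.get), scores

# Toy 2-dimensional frames and hand-set models (hypothetical values, not real MFCCs).
models = {
    "speaker_a": [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [1.0, 1.0], [1.0, 1.0])],
    "speaker_b": [(0.5, [5.0, 5.0], [1.0, 1.0]), (0.5, [6.0, 6.0], [1.0, 1.0])],
}
test_frames = [[0.2, -0.1], [0.9, 1.1], [0.4, 0.6]]
best, scores = identify_speaker(test_frames, models)
print(best)
```

Because the task is closed-set, the decision is a plain argmax over the enrolled speakers; no rejection threshold is applied. The number of components per model corresponds to the "number of Gaussian components" whose effect the abstract says was analyzed.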

Keywords

ASR, Acoustic Feature, MFCC, GFCC, VQ, GMM, MLL Score

Notes

Acknowledgements

This project is partially supported by the CMATER laboratory of the Computer Science and Engineering Department, Jadavpur University, India, and by the TEQIP-II, PURSE-II, and UPE-II projects of the Government of India. Subhadip Basu is partially supported by the Research Award (F.30-31/2016(SA-II)) from UGC, Government of India. Bidhan Barai is partially supported by the RGNF Research Award (F1-17.1/2014-15/RGNF-2014-15-SC-WES-67459/(SA-III)) from UGC, Government of India.

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Bidhan Barai (1)
  • Debayan Das (1)
  • Nibaran Das (1)
  • Subhadip Basu (1)
  • Mita Nasipuri (1)

  1. Jadavpur University, Kolkata, India
