Robust Speaker Recognition Using Improved GFCC and Adaptive Feature Selection

  • Xingyu Zhang
  • Xia Zou
  • Meng Sun
  • Penglong Wu
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 895)

Abstract

Speaker recognition systems perform well in noise-free environments, but their performance deteriorates severely in the presence of noise. At the front end of these systems, Mel-Frequency Cepstral Coefficients (MFCC), or the relatively noise-robust Gammatone Frequency Cepstral Coefficients (GFCC), are commonly used as time-frequency features. To further improve the noise robustness of GFCC, signal processing techniques such as DC removal, pre-emphasis, and Cepstral Mean and Variance Normalization (CMVN) are investigated in the extraction of GFCC. Taking into account the respective advantages and disadvantages of MFCC and GFCC, an adaptive strategy is proposed that selects the feature based on the quality of the speech. Experiments were conducted on the TIMIT dataset to evaluate the approach. Compared with ordinary GFCC and MFCC features, the method significantly reduced the equal error rate (EER) on speech data at various SNRs.
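The preprocessing steps named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the pre-emphasis coefficient of 0.97, the function names, and the SNR-threshold selection rule (including its 15 dB default and the choice of GFCC for noisier speech) are all assumptions made for illustration, since the abstract does not specify how speech quality is measured or how the feature is chosen.

```python
import math


def remove_dc(signal):
    """DC removal: subtract the mean so the waveform is centred on zero."""
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a commonly used value, assumed here (not from the paper).
    """
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def cmvn(frames, eps=1e-10):
    """Cepstral mean and variance normalization over one utterance.

    frames: list of feature vectors (one per frame), all of equal length.
    Each dimension is shifted to zero mean and scaled to unit variance.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dims)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(dims)]
        for f in frames
    ]


def select_feature(snr_db, threshold_db=15.0):
    """Hypothetical selection rule: use GFCC for noisy speech, MFCC otherwise.

    Both the threshold and the direction of the rule are assumptions.
    """
    return "MFCC" if snr_db >= threshold_db else "GFCC"
```

In this sketch, `remove_dc` and `pre_emphasis` would be applied to the raw waveform before filterbank analysis, and `cmvn` to the resulting cepstral frames of each utterance.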

Keywords

Gammatone Frequency Cepstrum Coefficients (GFCC); i-vector; Robust speaker recognition; Mel-Frequency Cepstrum Coefficient (MFCC); Adaptive feature selection

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Army Engineering University, Nanjing, China