A Survey on Automatic Speaker Recognition Systems

  • Conference paper
Signal Processing and Multimedia (MulGraB 2010, SIP 2010)

Abstract

Human listeners can identify a speaker over the telephone, or from behind an entryway and out of sight, simply by listening to the speaker’s voice. Replicating this intrinsic, human-specific capability is a major challenge for voice biometrics. Like human listeners, voice biometrics uses features of a person’s voice to ascertain the speaker’s identity. The best-known commercialized form of voice biometrics is the Speaker Recognition System (SRS). Speaker recognition is the computing task of validating a user’s claimed identity using characteristics extracted from the user’s voice. This literature survey gives a brief introduction to SRS and then discusses the general architecture of SRS, biometric standards relevant to voice and speech, typical applications of SRS, and current research in speaker recognition systems. We also survey various approaches to SRS.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Saquib, Z., Salam, N., Nair, R.P., Pandey, N., Joshi, A. (2010). A Survey on Automatic Speaker Recognition Systems. In: Kim, Th., Pal, S.K., Grosky, W.I., Pissinou, N., Shih, T.K., Ślęzak, D. (eds) Signal Processing and Multimedia. MulGraB SIP 2010 2010. Communications in Computer and Information Science, vol 123. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17641-8_18

  • DOI: https://doi.org/10.1007/978-3-642-17641-8_18

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17640-1

  • Online ISBN: 978-3-642-17641-8

  • eBook Packages: Computer Science, Computer Science (R0)
