A Survey on Automatic Speaker Recognition Systems

Saquib, Zia; Salam, Nirmala; Nair, Rekha P.; Pandey, Nipun; Joshi, Akanksha

doi:10.1007/978-3-642-17641-8_18

Part of the book series: Communications in Computer and Information Science (CCIS, volume 123)

Included in the following conference series:

International Conference on Multimedia, Computer Graphics, and Broadcasting
International Conference on Signal Processing, Image Processing, and Pattern Recognition

Abstract

Human listeners can identify a speaker, over the telephone or from an entryway out of sight, simply by listening to the speaker's voice. Replicating this intrinsically human capability is a major challenge for voice biometrics. Like human listeners, voice biometrics uses features of a person's voice to ascertain the speaker's identity. The best-known commercialized form of voice biometrics is the Speaker Recognition System (SRS). Speaker recognition is the computing task of validating a user's claimed identity using characteristics extracted from their voice. This literature survey gives a brief introduction to SRS and then discusses the general architecture of SRS, biometric standards relevant to voice/speech, typical applications of SRS, and current research in speaker recognition systems. We also survey various approaches to SRS.

Author information

Authors and Affiliations

CDAC-Mumbai, Gulmohar Cross Road No.9, Juhu, Mumbai, 400049, India
Zia Saquib, Nirmala Salam, Rekha P. Nair, Nipun Pandey & Akanksha Joshi

Editor information

Editors and Affiliations

Hannam University, Daejeon, South Korea
Tai-hoon Kim
Indian Statistical Institute, Kolkata, India
Sankar K. Pal
Department of Computer and Information Science, University of Michigan – Dearborn, 4901 Evergreen Road, 48128, Dearborn, MI, USA
William I. Grosky
Florida International University, EC 2910, 10555 W. Flagler Street, 33174, Miami, FL, USA
Niki Pissinou
National Taipei University of Education, No.134, Sec. 2, Heping E. Rd., Da-an District, 106, Taipei City, Taiwan (R.O.C.)
Timothy K. Shih
University of Warsaw & Infobright Inc., Poland
Dominik Ślęzak

About this paper

Cite this paper

Saquib, Z., Salam, N., Nair, R.P., Pandey, N., Joshi, A. (2010). A Survey on Automatic Speaker Recognition Systems. In: Kim, Th., Pal, S.K., Grosky, W.I., Pissinou, N., Shih, T.K., Ślęzak, D. (eds) Signal Processing and Multimedia. MulGraB 2010, SIP 2010. Communications in Computer and Information Science, vol 123. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17641-8_18

DOI: https://doi.org/10.1007/978-3-642-17641-8_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17640-1
Online ISBN: 978-3-642-17641-8
eBook Packages: Computer Science, Computer Science (R0)
