Abstract
Human listeners can identify a speaker by voice alone, whether over the telephone or from an entryway out of sight. Replicating this intrinsically human capability is a major challenge for voice biometrics. Like human listeners, voice biometrics uses the features of a person’s voice to ascertain the speaker’s identity. The best-known commercialized form of voice biometrics is the Speaker Recognition System (SRS). Speaker recognition is the computing task of validating a user’s claimed identity using characteristics extracted from their voice. This literature survey gives a brief introduction to SRS and then discusses the general architecture of SRS, biometric standards relevant to voice and speech, typical applications of SRS, and current research in speaker recognition systems. We also survey various approaches to SRS.
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saquib, Z., Salam, N., Nair, R.P., Pandey, N., Joshi, A. (2010). A Survey on Automatic Speaker Recognition Systems. In: Kim, Th., Pal, S.K., Grosky, W.I., Pissinou, N., Shih, T.K., Ślęzak, D. (eds) Signal Processing and Multimedia. MulGraB SIP 2010 2010. Communications in Computer and Information Science, vol 123. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17641-8_18
DOI: https://doi.org/10.1007/978-3-642-17641-8_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17640-1
Online ISBN: 978-3-642-17641-8
eBook Packages: Computer Science (R0)