Application to Speaker Recognition

Holambe, Raghunath S.; Deshpande, Mangesh S.

doi:10.1007/978-1-4614-1505-3_6

Part of the book series: SpringerBriefs in Electrical and Computer Engineering (BRIEFSSPEECHTECH)

Abstract

Speaker recognition is the task of recognizing people by their voices. It involves extracting and characterizing the speaker-specific information embedded in the speech signal. In a larger context, speaker recognition belongs to the field of biometrics, which refers to authenticating persons based on their physical and/or learned characteristics. There has long been a desire to identify a person on the basis of his or her voice. For many years, judges, lawyers, detectives and law enforcement agencies have wanted to use forensic voice authentication to investigate a suspect or to confirm a judgment of guilt or innocence.

Author information

Authors and Affiliations

Department of Instrumentation, SGGS Institute of Engineering and Technology, Vishnupuri, Nanded, 431606, India
Raghunath S. Holambe
Department of E & TC Engineering, SRES College of Engineering, Kopargaon, 423603, India
Mangesh S. Deshpande

About this chapter

Cite this chapter

Holambe, R.S., Deshpande, M.S. (2012). Application to Speaker Recognition. In: Advances in Non-Linear Modeling for Speech Processing. SpringerBriefs in Electrical and Computer Engineering. Springer, Boston, MA. https://doi.org/10.1007/978-1-4614-1505-3_6

DOI: https://doi.org/10.1007/978-1-4614-1505-3_6
Published: 21 February 2012
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4614-1504-6
Online ISBN: 978-1-4614-1505-3
eBook Packages: Engineering, Engineering (R0)