This article presents the concept of conversational biometrics: the combination of acoustic voice matching (traditional speaker verification) with other conversation-related information sources (such as knowledge) to perform identity verification. The interaction between the user and the verification system is orchestrated by a state-based policy modeled within a probabilistic framework. Verification may be carried out interactively (active validation) or as a “listen-in” background process (passive validation). In many system configurations, verification can be performed transparently to the caller.
In an evaluation of an interactive environment with uninformed impostors, very high performance is attained by combining the evidence from acoustics and question–answer pairs. In addition, the study demonstrates that the biometric system is robust against fully informed impostors, a challenge that existing widespread knowledge-only verification practices have yet to address. Our view of conversational biometrics emphasizes the importance of incorporating multiple sources of information conveyed in speech.
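As a rough illustration of how acoustic and knowledge evidence might be combined over the turns of a dialog, the sketch below sums per-turn log-likelihood ratios and stops as soon as an accept or reject threshold is crossed. It is not the chapter's actual policy or scoring model. The names `Turn`, `knowledge_llr`, and `verify`, the answer probabilities, the thresholds, and the assumption that acoustic and knowledge evidence are independent across turns are all hypothetical choices made for this example.

```python
import math
from dataclasses import dataclass


@dataclass
class Turn:
    """One question-answer exchange (hypothetical structure)."""
    acoustic_llr: float   # log-likelihood ratio from some speaker verifier (assumed given)
    answer_correct: bool  # whether the spoken answer matched the stored record


def knowledge_llr(correct: bool, p_target: float = 0.95, p_impostor: float = 0.10) -> float:
    """Log-likelihood ratio of a right/wrong answer.

    p_target and p_impostor are assumed probabilities that a genuine user
    and an uninformed impostor, respectively, answer correctly.
    """
    if correct:
        return math.log(p_target / p_impostor)
    return math.log((1.0 - p_target) / (1.0 - p_impostor))


def verify(turns, accept: float = 4.0, reject: float = -4.0):
    """Accumulate evidence turn by turn; stop at the first threshold crossing.

    Returns (decision, number_of_turns_used, accumulated_score).
    Treating the two evidence sources as independent is a simplifying assumption.
    """
    total = 0.0
    for used, turn in enumerate(turns, start=1):
        total += turn.acoustic_llr + knowledge_llr(turn.answer_correct)
        if total >= accept:
            return "accept", used, total
        if total <= reject:
            return "reject", used, total
    return "undecided", len(turns), total


if __name__ == "__main__":
    genuine = [Turn(1.0, True), Turn(1.5, True)]
    informed_impostor = [Turn(-3.0, True)] * 6  # knows the answers, wrong voice
    print(verify(genuine))
    print(verify(informed_impostor))
```

In this toy setting, an impostor who answers every question correctly is still rejected once the negative acoustic evidence outweighs the knowledge evidence, although it takes more turns than for an impostor who also answers incorrectly.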
© 2008 Springer-Verlag London Limited
About this chapter
Cite this chapter
Pelecanos, J., Navrátil, J., Ramaswamy, G.N. (2008). Conversational Biometrics: A Probabilistic View. In: Ratha, N.K., Govindaraju, V. (eds) Advances in Biometrics. Springer, London. https://doi.org/10.1007/978-1-84628-921-7_11
DOI: https://doi.org/10.1007/978-1-84628-921-7_11
Publisher Name: Springer, London
Print ISBN: 978-1-84628-920-0
Online ISBN: 978-1-84628-921-7
eBook Packages: Computer Science (R0)