Conversational Biometrics: A Probabilistic View

Pelecanos, Jason; Navrátil, Jiří; Ramaswamy, Ganesh N.

doi:10.1007/978-1-84628-921-7_11


This article presents the concept of conversational biometrics: the combination of acoustic voice matching (traditional speaker verification) with other conversation-related information sources (such as knowledge) to perform identity verification. The interaction between the user and the verification system is orchestrated by a state-based policy modeled within a probabilistic framework. Verification may be carried out interactively (active validation) or as a “listen-in” background process (passive validation). In many system configurations, verification can be performed transparently to the caller.

In an evaluation of an interactive environment with uninformed impostors, it is shown that very high performance can be attained by combining the evidence from acoustics and question–answer pairs. In addition, the study shows the system to be robust against fully informed impostors, a challenge that widespread knowledge-only verification practices have yet to address. Our view of conversational biometrics emphasizes the importance of incorporating multiple sources of information conveyed in speech.
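As a rough illustration of how acoustic and knowledge evidence might be combined turn by turn, the sketch below accumulates log-likelihood ratios and stops as soon as an accept or reject threshold is crossed. It is not the chapter's policy or model: the independence assumption between evidence sources, the answer probabilities `p_true` and `p_imp`, the thresholds `accept_at` and `reject_at`, and the function names are all hypothetical choices made for this example.

```python
import math


def qa_llr(correct: bool, p_true: float, p_imp: float) -> float:
    """Log-likelihood ratio contributed by one question-answer pair.

    p_true: assumed probability that the true user answers correctly.
    p_imp:  assumed probability that an impostor answers correctly.
    Both values are hypothetical and would have to be estimated in practice.
    """
    if correct:
        return math.log(p_true / p_imp)
    return math.log((1.0 - p_true) / (1.0 - p_imp))


def verify(acoustic_llr, answers, p_true=0.95, p_imp=0.10,
           accept_at=4.0, reject_at=-4.0):
    """Accumulate evidence and decide as soon as a threshold is crossed.

    acoustic_llr: log-likelihood ratio from the voice match (given as input).
    answers:      iterable of booleans, one per question asked so far.
    Returns (decision, total_llr, questions_used).
    Assumes the evidence sources are conditionally independent, so their
    log-likelihood ratios simply add (an assumption of this sketch).
    """
    total = acoustic_llr
    used = 0
    # The acoustic evidence alone may already be decisive (no questions asked).
    if total >= accept_at:
        return "accept", total, used
    if total <= reject_at:
        return "reject", total, used
    for correct in answers:
        used += 1
        total += qa_llr(correct, p_true, p_imp)
        if total >= accept_at:
            return "accept", total, used
        if total <= reject_at:
            return "reject", total, used
    return "undecided", total, used


if __name__ == "__main__":
    print(verify(1.0, [True, True]))
    print(verify(-5.0, [True, True]))
```

In this toy setting, setting `acoustic_llr` to 0 corresponds to a knowledge-only check, where correct answers alone decide the outcome. With the acoustic term included, a strongly negative voice match can lead to rejection regardless of how the questions are answered.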



Author information

Authors and Affiliations

Conversational Biometrics Group, IBM Thomas J. Watson Research Center, 1101 Kitchawan Road, Yorktown Heights, NY 10598, USA
Jason Pelecanos, Jiří Navrátil & Ganesh N. Ramaswamy


Editor information

Editors and Affiliations

IBM Thomas J. Watson Research Center, Hawthorne, NY, USA
Nalini K. Ratha BTech, MTech, PhD
Department of Computer Science and Engineering, University of Buffalo, NY, USA
Venu Govindaraju BTech, MS, PhD


About this chapter

Cite this chapter

Pelecanos, J., Navrátil, J., Ramaswamy, G.N. (2008). Conversational Biometrics: A Probabilistic View. In: Ratha, N.K., Govindaraju, V. (eds) Advances in Biometrics. Springer, London. https://doi.org/10.1007/978-1-84628-921-7_11


DOI: https://doi.org/10.1007/978-1-84628-921-7_11
Publisher Name: Springer, London
Print ISBN: 978-1-84628-920-0
Online ISBN: 978-1-84628-921-7
eBook Packages: Computer Science (R0)
