A Dual-Factor Authentication System Featuring Speaker Verification and Token Technology

  • Purdy Ho
  • John Armington
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2688)


This paper presents a secure voice authentication system that combines speaker verification with token technology. The dual-factor authentication system is designed specifically to counter impersonation by pre-recorded speech and by text-to-speech voice cloning (TTSVC) technology, and to compensate for inconsistent audio characteristics across different handsets. The token device generates a one-time passcode (OTP) and prompts the user with it. The spoken OTP is then forwarded simultaneously to a speaker verification module, which verifies the user’s voice, and to a speech recognition module, which converts the spoken OTP to text and validates it. Thus, the OTP protects against recorded-speech and voice-cloning attacks, while speaker verification protects against the use of a lost or stolen token device. We present preliminary results for our Support Vector Machine (SVM)-based speaker verification algorithm and our handset identification algorithm, along with the system architecture of our design.
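The dual-factor decision described above can be sketched in a few lines. This is only an illustrative sketch, not the authors' implementation: the names `authenticate`, `recognize`, `score_speaker`, and `threshold` are hypothetical, and the speech recognizer and speaker verifier are passed in as plain callables because the abstract does not specify their interfaces. The sketch assumes a numeric OTP and a verifier that returns a score where higher means a better match.

```python
import hmac
from dataclasses import dataclass
from typing import Callable


@dataclass
class AuthResult:
    otp_ok: bool      # spoken passcode matched the token's current OTP
    speaker_ok: bool  # voice matched the enrolled speaker

    @property
    def accepted(self) -> bool:
        # Both factors must pass; either one alone is insufficient.
        return self.otp_ok and self.speaker_ok


def _digits(text: str) -> str:
    # Keep ASCII digits only, so "4 8 1" and "481" compare equal.
    return "".join(ch for ch in text if ch in "0123456789")


def authenticate(
    audio: bytes,
    expected_otp: str,
    recognize: Callable[[bytes], str],        # hypothetical speech recognizer
    score_speaker: Callable[[bytes], float],  # hypothetical speaker verifier
    threshold: float = 0.0,                   # assumed decision threshold
) -> AuthResult:
    # The same utterance is sent to both modules.
    spoken = _digits(recognize(audio))
    expected = _digits(expected_otp)
    # Constant-time comparison; an empty expected OTP never validates.
    otp_ok = bool(expected) and hmac.compare_digest(spoken, expected)
    speaker_ok = score_speaker(audio) >= threshold
    return AuthResult(otp_ok=otp_ok, speaker_ok=speaker_ok)
```

Under these assumptions, a replayed recording of an earlier session fails the OTP check because the passcode has changed, and a stolen token used by another speaker fails the speaker check, which mirrors the two attack classes named in the abstract.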





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Purdy Ho (1)
  • John Armington (1)
  1. Hewlett-Packard, USA
