Voice Biometrics

  • Joaquín González-Rodríguez
  • Doroteo Torre Toledano
  • Javier Ortega-García

Keywords

Speech Signal, Speaker Recognition, Speaker Model, Linear Predictive Code, Biometric Trait

References

  1. A. G. Adami and H. Hermansky. Segmentation of speech for speaker and language recognition. In Proceedings of Interspeech, pages 841–844, 2003.
  2. C. G. G. Aitken and F. Taroni. Statistics and the Evaluation of Evidence for Forensic Scientists. John Wiley and Sons, 2nd edition, 2004.
  3. N. Brümmer and J. du Preez. Application-independent evaluation of speaker detection. Computer Speech and Language, 20:230–275, 2006.
  4. D. K. Burton. Text-dependent speaker verification using vector quantization source coding. IEEE Transactions on Acoustics, Speech and Signal Processing, 35:133–143, 1987.
  5. J. Campbell and A. Higgins. YOHO speaker verification (LDC94S16). http://www.ldc.upenn.edu.
  6. J. P. Campbell. Testing with the YOHO CD-ROM voice verification corpus. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 341–344, 1995.
  7. W. M. Campbell, J. P. Campbell, D. A. Reynolds, D. A. Jones, and T. R. Leek. High-level speaker verification with support vector machines. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pages 73–76, 2004.
  8. W. M. Campbell. Generalized linear discriminant sequence kernels for speaker recognition. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing, pages 161–164, 2002.
  9. W. M. Campbell, D. E. Sturim, and D. A. Reynolds. Support vector machines using GMM supervectors for speaker verification. IEEE Signal Processing Letters, 13:308–311, 2006.
  10. P. Carr. English Phonetics and Phonology: An Introduction. Blackwell Publishing, Incorporated, 1999.
  11. Voice Biometrics Conference. http://www.voicebiocon.com.
  12. G. Doddington. Speaker recognition based on idiolectal differences between speakers. In Proceedings of Interspeech, volume 4, pages 2517–2520, 2001.
  13. A. G. Adami et al. Modeling prosodic dynamics for speaker recognition. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, volume IV, pages 788–791, 2003.
  14. B. Yegnanarayana et al. Combining evidence from source, suprasegmental and spectral features for a fixed-text speaker verification system. IEEE Transactions on Speech and Audio Processing, 13:575–582, 2005.
  15. D. Reynolds et al. SuperSID project: Exploiting high-level information for high-accuracy speaker recognition. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, volume IV, pages 784–787, April 2003.
  16. M. Wagner et al. An evaluation of "commercial off-the-shelf" speaker verification systems. In Proceedings of IEEE Odyssey, 2006.
  17. V. Ramasubramanian et al. Text-dependent speaker-recognition systems based on one-pass dynamic programming algorithm. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pages 901–904, 2006.
  18. K. R. Farrell, R. J. Mammone, and K. T. Assaleh. Speaker recognition using neural networks and conventional classifiers. IEEE Transactions on Speech and Audio Processing, 2:194–205, 1994.
  19. J. Fierrez-Aguilar, J. Ortega-Garcia, D. T. Toledano, and J. Gonzalez-Rodriguez. BioSec baseline corpus: A multimodal biometric database. Pattern Recognition, 40:1389–1392, 2007.
  20. S. Furui. Cepstral analysis technique for automatic speaker verification. IEEE Transactions on Acoustics, Speech and Signal Processing, 29:254–272, 1981.
  21. J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallet, N. L. Dahlgren, and V. Zue. TIMIT acoustic-phonetic continuous speech corpus (LDC93S1). http://www.ldc.upenn.edu.
  22. J. Gonzalez-Rodriguez, A. Drygajlo, D. Ramos-Castro, M. Garcia-Gomar, and J. Ortega-Garcia. Robust estimation, interpretation and assessment of likelihood ratios in forensic speaker recognition. Computer Speech and Language, 20:331–335, 2006.
  23. J. Gonzalez-Rodriguez, D. Ramos-Castro, D. T. Toledano, A. Montero-Asenjo, J. Gonzalez-Dominguez, I. Lopez-Moreno, J. Fierrez-Aguilar, D. Garcia-Romero, and J. Ortega-Garcia. Speaker recognition: the ATVS-UAM system at NIST SRE 05. IEEE AES Magazine, 22:15–21, 2007.
  24. A. O. Hatch, B. Peskin, and A. Stolcke. Improved phonetic speaker recognition using lattice decoding. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pages 165–168, 2005.
  25. H. Hermansky, B. Hanson, and H. Wakita. Perceptually based linear predictive analysis of speech. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, volume 10, pages 509–512, 1985.
  26. H. Hermansky and N. Morgan. RASTA processing of speech. IEEE Transactions on Speech and Audio Processing, 2(4):578–589, October 1994.
  27. X. Huang, A. Acero, and H.-W. Hon. Spoken Language Processing: A Guide to Theory, Algorithm and System Development. Prentice Hall PTR, 2001.
  28. F. Itakura. Line spectrum representation of linear predictive coefficients of speech signals. Journal of the Acoustical Society of America, 57:S35, 1975.
  29. Sachin Kajarekar, Luciana Ferrer, Kemal Sonmez, Jing Zheng, Elizabeth Shriberg, and Andreas Stolcke. Modeling NERFs for speaker recognition. In Proceedings of IEEE Odyssey, pages 51–56, Toledo, Spain, June 2004.
  30. P. Kenny, G. Boulianne, and P. Dumouchel. Eigenvoice modeling with sparse training data. IEEE Transactions on Speech and Audio Processing, 13:345–354, 2005.
  31. P. Kenny and P. Dumouchel. Disentangling speaker and channel effects in speaker verification. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 37–40, 2004.
  32. C. J. Leggetter and P. C. Woodland. Maximum likelihood linear regression for speaker adaptation of continuous density hidden Markov models. Computer Speech and Language, 9:171–185, 1995.
  33. R. G. Leonard and G. Doddington. TIDIGITS (LDC93S10). http://www.ldc.upenn.edu.
  34. T. Matsui and S. Furui. Comparison of text-independent speaker recognition methods using VQ-distortion and discrete/continuous HMMs. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, pages 157–160, 1992.
  35. NIST speaker recognition evaluation. http://www.nist.gov/speech/tests/spk/.
  36. J. Ortega-Garcia, J. Bigun, D. Reynolds, and J. Gonzalez-Rodriguez. Authentication gets personal with biometrics. IEEE Signal Processing Magazine, 21:50–62, 2004.
  37. P. Matějka, P. Schwarz, J. Černocký, and P. Chytil. Phonotactic language identification using high quality phoneme recognition. In Proceedings of InterSpeech, pages 2237–2240, 2005.
  38. CAVE Project. CAVE: the European caller verification project. http://www.ptttelecom.nl/cave/.
  39. M. A. Przybocki, A. F. Martin, and A. N. Le. NIST speaker recognition evaluation chronicles part 2. In Proceedings of IEEE Odyssey, 2006.
  40. L. R. Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77:257–286, 1989.
  41. L. R. Rabiner and R. W. Schafer. Digital Processing of Speech Signals. Prentice Hall, 1978.
  42. D. Ramos-Castro, J. Gonzalez-Rodriguez, and J. Ortega-Garcia. Likelihood ratio calibration in a transparent and testable forensic speaker recognition framework. In Proceedings of IEEE Odyssey, 2006.
  43. D. Reynolds, T. Quatieri, and R. Dunn. Speaker verification using adapted Gaussian mixture models. Digital Signal Processing, 10:19–41, 2000.
  44. D. A. Reynolds. Channel robust speaker verification via feature mapping. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, volume 2, pages 53–56, 2003.
  45. P. Rose. Forensic Speaker Identification. CRC, 1st edition, 2002.
  46. M. J. Saks and J. J. Koehler. The coming paradigm shift in forensic identification science. Science, 309:892–895, 2005.
  47. B. Schölkopf, K.-K. Sung, C. J. C. Burges, F. Girosi, P. Niyogi, T. Poggio, and V. Vapnik. Comparing support vector machines with Gaussian kernels to radial basis function classifiers. IEEE Transactions on Signal Processing, 45:2758–2765, 1997.
  48. Elizabeth Shriberg, Luciana Ferrer, Anand Venkataraman, and Sachin Kajarekar. SVM modeling of “SNERF-Grams” for speaker recognition. In Proc. Intl. Conf. Spoken Language Systems, pages 1409–1412, Jeju, Korea, October 2004.
  49. C. Soutar, D. Roberge, A. Stoianov, R. Gilroy, and B. V. K. Vijaya Kumar. Biometric encryption. (Online) http://www.bio-scrypt.com.
  50. K. N. Stevens. Acoustic Phonetics (Current Studies in Linguistics). The MIT Press, 2000.
  51. D. T. Toledano, R. Fernandez-Pozo, A. Hernandez-Trapote, and L. Hernandez-Gomez. Usability evaluation of multi-modal biometric verification systems. Interacting With Computers, 18:1101–1122, 2006.
  52. D. T. Toledano, C. Fombella, J. Gonzalez-Rodriguez, and L. Hernandez-Gomez. On the relationship between phonetic modeling precision and phonetic speaker recognition accuracy. In Proceedings of InterSpeech, pages 1993–1996, 2005.
  53. D. T. Toledano, L. Hernandez-Gomez, and L. Villarrubia-Grande. Automatic phonetic segmentation. IEEE Transactions on Speech and Audio Processing, 11:617–625, 2003.
  54. V. Wan and W. Campbell. Support vector machines for speaker verification and identification. In Proceedings of the IEEE Workshop on Neural Networks for Signal Processing, volume 2, pages 775–784, 2000.
  55. R. Woo, A. Park, and T. J. Hazen. The MIT mobile device speaker verification corpus: data collection and preliminary experiments. In Proceedings of IEEE Odyssey, 2006.

Copyright information

© Springer Science+Business Media, LLC 2008

Authors and Affiliations

  • Joaquín González-Rodríguez
    • 1
  • Doroteo Torre Toledano
    • 1
  • Javier Ortega-García
    • 1
  1. ATVS – UAM, Escuela Politécnica Superior, Universidad Autónoma de Madrid, Madrid