Abstract
This paper analyzes the i-vector paradigm, a compact representation of spoken utterances used by most state-of-the-art speaker verification systems. The work was mainly motivated by the need to quantify the impact of each step of such systems on final performance, and especially each step's ability to model data according to a theoretical Gaussian framework. These investigations highlight the key points of the approach, in particular a core conditioning procedure, that lead to the success of the i-vector paradigm.
Keywords
- Linear Discriminant Analysis
- Conditioning Procedure
- Equal Error Rate
- Speaker Recognition
- Speech Utterance
These keywords were added by machine and not by the authors.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bousquet, PM., Bonastre, JF., Matrouf, D. (2013). Identify the Benefits of the Different Steps in an i-Vector Based Speaker Verification System. In: Ruiz-Shulcloper, J., Sanniti di Baja, G. (eds) Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications. CIARP 2013. Lecture Notes in Computer Science, vol 8259. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41827-3_35
DOI: https://doi.org/10.1007/978-3-642-41827-3_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-41826-6
Online ISBN: 978-3-642-41827-3
eBook Packages: Computer Science, Computer Science (R0)