Abstract
Probabilistic Linear Discriminant Analysis (PLDA) and i-vectors are state-of-the-art methods in speaker recognition. Both are based on Factor Analysis, in which a data covariance matrix is decomposed to find a low-dimensional representation of the given feature vectors. More precisely, Factor Analysis based methods seek directions or subspaces in which the projected (overall/between/within) variance is highest. To train the models of the individual methods, development speech corpora covering various acoustic conditions are used. The larger the variation in some of these acoustic conditions, the more strongly the model tends to reflect it. Strong data variations in some development corpora may therefore suppress conditions present in other corpora, which can lead to poor recognition when the acoustic variations in the test conditions differ significantly from those the model reflects. This paper investigates techniques that alleviate such effects. The idea is to use several background and i-vector models, each related to a different part of the development data, so that several i-vectors are extracted, processed, and passed to PLDA modelling. The PLDA model then utilizes all the extracted information to provide the verification result.
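The suppression effect described above can be illustrated with a toy sketch. It is not the paper's method: it uses the top eigenvector of a 2-D sample covariance as a stand-in for a Factor Analysis subspace, and the two synthetic "corpora" (`corpus_a`, `corpus_b`) and all helper names are hypothetical. Corpus A varies strongly along one axis and corpus B varies weakly along the other, so a single model trained on the pooled data should follow A's direction and largely ignore B's, whereas separate per-corpus models each keep their own direction.

```python
import math
import random


def top_direction(points):
    """Unit vector along which the 2-D sample variance is largest."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Principal-axis angle of the symmetric 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (math.cos(theta), math.sin(theta))


def alignment(u, v):
    """Absolute cosine between two unit vectors (1 = same axis, 0 = orthogonal)."""
    return abs(u[0] * v[0] + u[1] * v[1])


rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical corpora: A has large variance along axis 0,
# B has much smaller variance, concentrated along axis 1.
corpus_a = [(rng.gauss(0, 10.0), rng.gauss(0, 0.1)) for _ in range(500)]
corpus_b = [(rng.gauss(0, 0.1), rng.gauss(0, 1.0)) for _ in range(500)]

dir_pooled = top_direction(corpus_a + corpus_b)  # one model for all data
dir_a = top_direction(corpus_a)                  # per-corpus model for A
dir_b = top_direction(corpus_b)                  # per-corpus model for B

print("pooled vs A:", round(alignment(dir_pooled, dir_a), 3))
print("pooled vs B:", round(alignment(dir_pooled, dir_b), 3))
```

In this toy setting the pooled direction is nearly parallel to A's and nearly orthogonal to B's, which is the motivation for keeping one model per part of the development data and combining their outputs afterwards.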
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Machlica, L. (2013). Dealing with Diverse Data Variances in Factor Analysis Based Methods. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_14
DOI: https://doi.org/10.1007/978-3-319-01931-4_14
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01930-7
Online ISBN: 978-3-319-01931-4
eBook Packages: Computer Science, Computer Science (R0)