Dealing with Diverse Data Variances in Factor Analysis Based Methods

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8113)

Abstract

Probabilistic Linear Discriminant Analysis (PLDA) and i-vectors are state-of-the-art methods in speaker recognition. Both are based on Factor Analysis, in which a data covariance matrix is decomposed in order to find a low-dimensional representation of the given feature vectors. More precisely, Factor Analysis based methods seek the directions or subspaces in which the projected (overall/between/within) variance is highest. To train the models of the individual methods, development speech corpora covering various acoustic conditions are used. The higher the variation in some of these acoustic conditions, the more the model tends to reflect it. Strong data variations in some of the development corpora may therefore suppress conditions present in other corpora, which can lead to poor recognition when the acoustic variations in the test conditions differ significantly from those the model reflects. This paper investigates techniques that alleviate such effects. The idea is to use several background and i-vector models, each related to a different part of the development data, so that several i-vectors are extracted, processed, and handed over to PLDA modelling. The PLDA model then utilizes all the extracted information and provides the verification result.
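The suppression effect described in the abstract can be illustrated with a toy sketch. It is not the paper's method: it uses a plain highest-variance direction (a PCA-style stand-in for Factor Analysis) on synthetic 2-D data, and the two corpora, their variances, and all function names are assumptions made for illustration only. It compares how much of one corpus's variance survives projection onto a direction estimated from pooled data versus a direction estimated from that corpus alone.

```python
import math
import random


def covariance_2d(points):
    """Return (cxx, cxy, cyy), the covariance entries of a list of 2-D points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    cxx = sum((p[0] - mx) ** 2 for p in points) / n
    cyy = sum((p[1] - my) ** 2 for p in points) / n
    cxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return cxx, cxy, cyy


def top_direction(points):
    """Unit vector with the highest projected variance (closed form for 2-D)."""
    cxx, cxy, cyy = covariance_2d(points)
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return math.cos(theta), math.sin(theta)


def captured_fraction(points, direction):
    """Fraction of the total variance of `points` kept after projecting onto `direction`."""
    cxx, cxy, cyy = covariance_2d(points)
    dx, dy = direction
    projected = dx * dx * cxx + 2.0 * dx * dy * cxy + dy * dy * cyy
    return projected / (cxx + cyy)


rng = random.Random(0)
# Hypothetical corpora: A varies strongly along x, B varies mildly along y.
corpus_a = [(rng.gauss(0, 10.0), rng.gauss(0, 0.1)) for _ in range(2000)]
corpus_b = [(rng.gauss(0, 0.1), rng.gauss(0, 1.0)) for _ in range(2000)]

# One model trained on everything versus a model trained on corpus B only.
pooled_dir = top_direction(corpus_a + corpus_b)
own_dir_b = top_direction(corpus_b)

pooled_on_b = captured_fraction(corpus_b, pooled_dir)
own_on_b = captured_fraction(corpus_b, own_dir_b)
print(f"B variance kept by pooled direction: {pooled_on_b:.3f}")
print(f"B variance kept by B's own direction: {own_on_b:.3f}")
```

In this toy setting the pooled direction follows corpus A almost entirely, so nearly all of corpus B's variation is lost, whereas a model fitted to corpus B alone retains it. This is the kind of motivation that could lead to training separate models on different parts of the development data.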

Copyright information

© 2013 Springer International Publishing Switzerland

About this paper

Cite this paper

Machlica, L. (2013). Dealing with Diverse Data Variances in Factor Analysis Based Methods. In: Železný, M., Habernal, I., Ronzhin, A. (eds) Speech and Computer. SPECOM 2013. Lecture Notes in Computer Science, vol 8113. Springer, Cham. https://doi.org/10.1007/978-3-319-01931-4_14

  • DOI: https://doi.org/10.1007/978-3-319-01931-4_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-01930-7

  • Online ISBN: 978-3-319-01931-4

  • eBook Packages: Computer Science, Computer Science (R0)
