PLDA Speaker Verification with Limited Speech Data

  • Conference paper
Speech and Computer (SPECOM 2015)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9319)

Abstract

In some speaker verification applications, the amount of data available for enrolment and verification can be limited. The first aim of this paper is to study how the volume of enrolment and verification data affects system performance. The second aim is to improve speaker verification using PLDA. PLDA is generally used to model speaker and channel variability in the i-vector space using data from several recording sessions. In our experiment, only data from a single session per speaker was available. Therefore, we divided the development recordings into shorter segments and treated these segments as if they had been recorded in different sessions. This approach models neither the inter-session speaker variability nor the channel variability. However, we assumed that statistical modelling of the intra-session speaker variability could improve the verification results. Different segmentation granularities were studied for various amounts of enrolment and verification data.
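The segmentation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the frame-count segment length, and the choice to drop a trailing short segment are all assumptions made for illustration. Each single-session recording is cut into consecutive, non-overlapping segments, and every segment receives its own pseudo-session index so that it can be handed to PLDA training as if it came from a separate session.

```python
from typing import Dict, List, Sequence, Tuple


def split_into_pseudo_sessions(
    frames: Sequence, segment_len: int, drop_short: bool = True
) -> List[Sequence]:
    """Cut one recording into consecutive, non-overlapping segments.

    segment_len is a hypothetical granularity parameter (in frames);
    the abstract does not state the values that were used.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    segments = [
        frames[i:i + segment_len] for i in range(0, len(frames), segment_len)
    ]
    # Assumption: a trailing segment shorter than segment_len is discarded.
    if drop_short and segments and len(segments[-1]) < segment_len:
        segments.pop()
    return segments


def build_plda_training_list(
    recordings: Dict[str, Sequence], segment_len: int
) -> List[Tuple[str, int, Sequence]]:
    """Return (speaker_id, pseudo_session_index, segment) triples.

    recordings maps each speaker to the frames of a single session.
    """
    items = []
    for speaker_id, frames in recordings.items():
        segments = split_into_pseudo_sessions(frames, segment_len)
        for session_idx, segment in enumerate(segments):
            items.append((speaker_id, session_idx, segment))
    return items


if __name__ == "__main__":
    # Toy data: integers stand in for feature frames.
    toy = {"spk_a": list(range(10)), "spk_b": list(range(7))}
    for speaker, session, segment in build_plda_training_list(toy, 3):
        print(speaker, session, segment)
```

In a real system, an i-vector would be extracted from each segment before PLDA training; that step is omitted here. A smaller `segment_len` yields more pseudo-sessions per speaker but less speech per segment, which is the granularity trade-off the paper studies.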

Notes

  1. Active speech data of the remaining 112 speakers was not used due to its insufficient length.

Acknowledgments

This research was supported by VEGA grant, number 2/0197/15.

Author information

Correspondence to Andrej Ridzik.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Ridzik, A., Rusko, M. (2015). PLDA Speaker Verification with Limited Speech Data. In: Ronzhin, A., Potapova, R., Fakotakis, N. (eds) Speech and Computer. SPECOM 2015. Lecture Notes in Computer Science, vol 9319. Springer, Cham. https://doi.org/10.1007/978-3-319-23132-7_40

  • DOI: https://doi.org/10.1007/978-3-319-23132-7_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-23131-0

  • Online ISBN: 978-3-319-23132-7

  • eBook Packages: Computer Science, Computer Science (R0)
