PLDA Speaker Verification with Limited Speech Data

Ridzik, Andrej; Rusko, Milan

doi:10.1007/978-3-319-23132-7_40

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9319)

Included in the following conference series:

International Conference on Speech and Computer

Abstract

In some speaker verification applications, the amount of data available for enrolment and verification is limited. The first aim of this paper is to study how the volume of enrolment and verification data affects the performance of the system. The second aim is to improve speaker verification using probabilistic linear discriminant analysis (PLDA). PLDA is generally used to model speaker and channel variability in the i-vector space using data from several recording sessions. In our experiment, only data from a single session per speaker was available. Therefore, we divided the development recordings into shorter segments and treated these segments as if they had been recorded in different sessions. This approach models neither the inter-session speaker variability nor the channel variability. However, we assumed that statistical modelling of the intra-session speaker variability could improve the verification results. Different segmentation granularities were studied for various amounts of enrolment and verification data.
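
The segmentation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed segment length, and the choice to drop a short trailing segment are all assumptions made for illustration, and the integer lists stand in for sequences of active-speech feature frames. I-vector extraction and PLDA training on the resulting segments are not shown.

```python
from typing import Dict, List, Sequence, Tuple


def split_into_pseudo_sessions(frames: Sequence, segment_len: int) -> List[Sequence]:
    """Cut one recording into consecutive, non-overlapping segments.

    Assumption: a trailing segment shorter than segment_len is dropped,
    so every pseudo-session has the same length.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    segments = [frames[i:i + segment_len] for i in range(0, len(frames), segment_len)]
    if segments and len(segments[-1]) < segment_len:
        segments.pop()
    return segments


def build_training_list(
    recordings: Dict[str, Sequence], segment_len: int
) -> List[Tuple[str, int, Sequence]]:
    """Label each segment with its speaker and a pseudo-session index.

    Each (speaker, pseudo_session, frames) triple would then be treated
    as if it came from a separate recording session of that speaker.
    """
    training_list = []
    for speaker, frames in recordings.items():
        for index, segment in enumerate(split_into_pseudo_sessions(frames, segment_len)):
            training_list.append((speaker, index, segment))
    return training_list


# Hypothetical single-session recordings: one frame sequence per speaker.
recordings = {"spk_a": list(range(10)), "spk_b": list(range(7))}
training_list = build_training_list(recordings, segment_len=3)

for speaker, index, segment in training_list:
    print(speaker, index, len(segment))
```

Varying `segment_len` corresponds to changing the granularity of segmentation: shorter segments give more pseudo-sessions per speaker, each with less speech.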

Notes

1. Active speech data of the remaining 112 speakers was not used due to its insufficient length.

Acknowledgments

This research was supported by VEGA grant number 2/0197/15.

Author information

Authors and Affiliations

Institute of Informatics, Slovak Academy of Sciences, Dúbravská Cesta 9, 845 07, Bratislava, Slovakia
Andrej Ridzik & Milan Rusko

Corresponding author

Correspondence to Andrej Ridzik.

Editor information

Editors and Affiliations

SPIIRAS, Saint-Petersburg, Russia
Andrey Ronzhin
Moscow State Linguistic University, Moscow, Russia
Rodmonga Potapova
University of Patras, Patras, Greece
Nikos Fakotakis

About this paper

Cite this paper

Ridzik, A., Rusko, M. (2015). PLDA Speaker Verification with Limited Speech Data. In: Ronzhin, A., Potapova, R., Fakotakis, N. (eds) Speech and Computer. SPECOM 2015. Lecture Notes in Computer Science, vol 9319. Springer, Cham. https://doi.org/10.1007/978-3-319-23132-7_40

DOI: https://doi.org/10.1007/978-3-319-23132-7_40
Published: 04 September 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23131-0
Online ISBN: 978-3-319-23132-7
eBook Packages: Computer Science, Computer Science (R0)
