Abstract
In some speaker verification applications, the amount of data available for enrolment and verification is limited. The first aim of this paper is to study how the volume of enrolment and verification data affects system performance. The second aim is to improve speaker verification using probabilistic linear discriminant analysis (PLDA). PLDA is generally used to model speaker and channel variability in the i-vector space using data from several recording sessions. In our experiments, only data from a single session per speaker was available. We therefore divided the development recordings into shorter segments and treated these segments as if they had been recorded in different sessions. This approach models neither inter-session speaker variability nor channel variability. However, we assumed that statistical modelling of intra-session speaker variability could improve the verification results. Different segmentation granularities were studied for various amounts of enrolment and verification data.
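The pseudo-session idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the segment length expressed in frames, and the choice to drop a trailing remainder are all assumptions made for illustration. In the setting of the paper, an i-vector would be extracted from each segment before PLDA training; that step is omitted here.

```python
from typing import Dict, List, Sequence, Tuple


def split_into_pseudo_sessions(
    frames: Sequence[float], segment_len: int
) -> List[Sequence[float]]:
    """Cut one recording into consecutive, non-overlapping segments.

    A trailing remainder shorter than segment_len is dropped. This is an
    assumption made for simplicity, not something the paper specifies.
    """
    if segment_len <= 0:
        raise ValueError("segment_len must be positive")
    n_full = len(frames) // segment_len
    return [frames[i * segment_len:(i + 1) * segment_len] for i in range(n_full)]


def build_plda_training_list(
    recordings: Dict[str, Sequence[float]], segment_len: int
) -> List[Tuple[str, int, Sequence[float]]]:
    """Label every segment as (speaker_id, pseudo_session_index, segment).

    Each segment of a single-session recording is treated as if it came
    from a separate session of the same speaker (hypothetical helper).
    """
    items = []
    for speaker_id, frames in recordings.items():
        for idx, seg in enumerate(split_into_pseudo_sessions(frames, segment_len)):
            items.append((speaker_id, idx, seg))
    return items


# Toy data: two speakers, one recording each (lists stand in for feature frames).
recordings = {"spk_a": list(range(10)), "spk_b": list(range(7))}
training = build_plda_training_list(recordings, segment_len=3)
```

Varying `segment_len` in this sketch corresponds to changing the segmentation granularity: shorter segments give more pseudo-sessions per speaker, but each one is built from less speech.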
Notes
1. Active speech data of the remaining 112 speakers was not used due to its insufficient length.
Acknowledgments
This research was supported by VEGA grant No. 2/0197/15.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Ridzik, A., Rusko, M. (2015). PLDA Speaker Verification with Limited Speech Data. In: Ronzhin, A., Potapova, R., Fakotakis, N. (eds) Speech and Computer. SPECOM 2015. Lecture Notes in Computer Science, vol 9319. Springer, Cham. https://doi.org/10.1007/978-3-319-23132-7_40
DOI: https://doi.org/10.1007/978-3-319-23132-7_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23131-0
Online ISBN: 978-3-319-23132-7
eBook Packages: Computer Science, Computer Science (R0)