Robust Speaker Identification in a Meeting with Short Audio Segments
This paper proposes a speaker identification scheme for meeting scenarios that answers two questions: “Is somebody currently talking?” and, if so, “Who is it?” The system is designed to identify the current speaker during a meeting conversation from a set of pre-trained speaker models. Experimental results on two databases show that the approach is robust to overlapping speech and that the algorithm can correctly identify a speaker from short audio segments.
Keywords: Speaker identification, Meeting conversation, Speaker diarization, Overlapping speech
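The two questions posed in the abstract can be read as a two-stage decision: first decide whether the segment contains speech, then pick the best-matching pre-trained speaker model. The sketch below illustrates that reading only; it is not the paper's method. The energy threshold, the single diagonal-Gaussian speaker models, and the names `frame_energy`, `log_likelihood`, and `identify` are all assumptions made for illustration.

```python
import math


def frame_energy(frame):
    """Mean squared amplitude of one frame of audio samples."""
    return sum(x * x for x in frame) / len(frame)


def log_likelihood(features, model):
    """Average per-frame log-likelihood under a diagonal Gaussian.

    `model` is a (means, variances) pair. This is an assumed stand-in
    for whatever speaker model a real system would pre-train.
    """
    means, variances = model
    total = 0.0
    for vec in features:
        for x, mu, var in zip(vec, means, variances):
            total += -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
    return total / len(features)


def identify(frames, features, models, energy_threshold=1e-3):
    """Return None if nobody seems to be talking, else the best speaker name.

    `frames` are raw sample frames used for the speech/non-speech gate;
    `features` are per-frame feature vectors scored against each model.
    The threshold value is hypothetical.
    """
    mean_energy = sum(frame_energy(f) for f in frames) / len(frames)
    if mean_energy < energy_threshold:
        return None  # "Is somebody currently talking?" -> no
    # "Who is it?" -> the model with the highest average log-likelihood
    return max(models, key=lambda name: log_likelihood(features, models[name]))


# Toy usage with two hypothetical speakers and a very short segment.
models = {
    "alice": ([0.0, 0.0], [1.0, 1.0]),
    "bob": ([3.0, 3.0], [1.0, 1.0]),
}
loud_frames = [[0.5, -0.5, 0.5, -0.5]]
silent_frames = [[0.0, 0.0, 0.0, 0.0]]
segment_features = [[2.8, 3.1], [3.2, 2.9]]

speaker = identify(loud_frames, segment_features, models)
nobody = identify(silent_frames, segment_features, models)
```

Averaging the log-likelihood over frames keeps scores comparable across segments of different lengths, which matters when segments are short. A simple energy gate like this one does nothing to handle overlapping speech.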
Open Access This chapter is licensed under the terms of the Creative Commons Attribution-NonCommercial 2.5 International License (http://creativecommons.org/licenses/by-nc/2.5/), which permits any noncommercial use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.