Abstract
Information drawn from conversational speech can enable intelligent interaction between humans and computers. Speaker information can be obtained from speech signals through speaker segmentation. This paper presents a speaker segmentation method that addresses the challenge of identifying speakers even when utterances are very short (0.5 s). The method, which uses feature vectors selectively, reduced error rates in experiments by a relative 27–42% for groups of 2 to 16 speakers, compared with the conventional approach to speaker segmentation. The new approach thus offers a way to significantly improve speech-data classification and retrieval systems.
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kwon, S. (2007). Speaker Segmentation for Intelligent Responsive Space. In: Jacko, J.A. (ed.) Human-Computer Interaction. HCI Intelligent Multimodal Interaction Environments. HCI 2007. Lecture Notes in Computer Science, vol 4552. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73110-8_41
DOI: https://doi.org/10.1007/978-3-540-73110-8_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73108-5
Online ISBN: 978-3-540-73110-8
eBook Packages: Computer Science (R0)