Automatic Motherese Detection for Face-to-Face Interaction Analysis
This paper addresses emotional speech detection in home movies. We focus on infant-directed speech, also called “motherese”, which is characterized by higher pitch, slower tempo, and exaggerated intonation. We assess the robustness of approaches to automatic discrimination between infant-directed speech and normal directed speech. Specifically, we estimate the generalization capability of two feature extraction schemes based on supra-segmental and segmental information. Two machine learning approaches are considered: k-nearest neighbors (k-NN) and Gaussian mixture models (GMM). Evaluations are carried out on real-life databases: home movies covering the first year of an infant's life.
Keywords: motherese detection, feature and classifier fusion
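
To make the classifier pairing named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes two toy features per utterance (mean pitch in hundreds of Hz and speaking rate in syllables per second), invented training values, a single diagonal Gaussian per class as a simplified stand-in for a full GMM, equal class priors, and a weighted-sum fusion rule. All function names and the weight `w` are hypothetical.

```python
import math

# Toy, invented data: (mean pitch in hundreds of Hz, syllables per second).
# Label 1 = infant-directed speech (motherese), 0 = other speech.
train = [(3.2, 3.0), (3.5, 2.8), (3.0, 3.2), (3.4, 2.5),
         (2.0, 5.0), (2.2, 4.8), (1.9, 5.5), (2.1, 5.2)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def knn_posterior(points, point_labels, x, k=3):
    """Fraction of the k nearest training points that are labelled 1."""
    nearest = sorted((math.dist(p, x), y) for p, y in zip(points, point_labels))
    return sum(y for _, y in nearest[:k]) / k


def fit_gaussian(points):
    """Fit a diagonal Gaussian (a one-component stand-in for a GMM)."""
    n, dims = len(points), len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    # Floor the variance so the log-likelihood stays finite.
    variances = [max(sum((p[d] - means[d]) ** 2 for p in points) / n, 1e-6)
                 for d in range(dims)]
    return means, variances


def log_likelihood(model, x):
    """Log density of x under a diagonal Gaussian model."""
    means, variances = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, means, variances))


def gaussian_posterior(model_pos, model_neg, x):
    """Posterior of class 1 under equal priors (assumption)."""
    diff = log_likelihood(model_neg, x) - log_likelihood(model_pos, x)
    diff = max(min(diff, 700.0), -700.0)  # avoid overflow in exp
    return 1.0 / (1.0 + math.exp(diff))


model_pos = fit_gaussian([p for p, y in zip(train, labels) if y == 1])
model_neg = fit_gaussian([p for p, y in zip(train, labels) if y == 0])


def fused_score(x, w=0.5, k=3):
    """Weighted sum of the two posteriors; w is a hypothetical fusion weight."""
    p_knn = knn_posterior(train, labels, x, k)
    p_gauss = gaussian_posterior(model_pos, model_neg, x)
    return w * p_knn + (1.0 - w) * p_gauss


if __name__ == "__main__":
    for utterance in [(3.3, 2.9), (2.0, 5.1)]:
        score = fused_score(utterance)
        print(utterance, "motherese" if score > 0.5 else "other")
```

A score above 0.5 is read as infant-directed speech. In practice the features, the number of mixture components, and the fusion weight would all need to be tuned on held-out data.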