Abstract
This paper presents an approach to improving emotion recognition from spontaneous speech. We used a wrapper method to reduce a set of acoustic features and feature-level fusion to merge them with a set of linguistic ones. The proposed system was evaluated on the FAU Aibo Corpus, using the same emotion set proposed in the Interspeech 2009 Emotion Challenge. The main contribution of this work is that the reduced feature set improves on the results obtained in that Challenge, including the combination of its best submissions. We built this set by selecting 28 acoustic and 5 linguistic features from an original set of 389 parameters and concatenating their feature vectors.
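The abstract compresses two steps: wrapper-based selection of the acoustic features and feature-level (early) fusion with the linguistic ones. The sketch below is only illustrative and is not the authors' implementation: it uses scikit-learn's SequentialFeatureSelector as a stand-in wrapper method and a Naive Bayes classifier as the evaluation model, and the arrays, dimensions, and random labels are hypothetical placeholders.

```python
# Illustrative sketch (assumed setup, not the paper's code): wrapper-based
# acoustic feature selection followed by feature-level fusion with linguistic features.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.feature_selection import SequentialFeatureSelector

rng = np.random.default_rng(0)
X_acoustic = rng.random((200, 389))   # hypothetical full acoustic set (389 parameters)
X_linguistic = rng.random((200, 5))   # hypothetical linguistic features
y = rng.integers(0, 5, size=200)      # hypothetical 5-class emotion labels

# Wrapper method: greedily keep the acoustic features that most improve the
# cross-validated performance of a classifier.
selector = SequentialFeatureSelector(
    GaussianNB(), n_features_to_select=28, direction="forward", cv=3
)
selector.fit(X_acoustic, y)
X_acoustic_reduced = selector.transform(X_acoustic)  # 28 selected acoustic features

# Feature-level (early) fusion: concatenate acoustic and linguistic vectors
# before training a single classifier on the merged representation.
X_fused = np.hstack([X_acoustic_reduced, X_linguistic])  # 28 + 5 = 33 features
clf = GaussianNB().fit(X_fused, y)
```

The key design point mirrored here is that fusion happens at the feature level (one classifier over the concatenated vectors) rather than by combining the decisions of separate acoustic and linguistic classifiers.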
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
Cite this paper
Planet, S., Iriondo, I. (2011). Improving Spontaneous Children's Emotion Recognition by Acoustic Feature Selection and Feature-Level Fusion of Acoustic and Linguistic Parameters. In: Travieso-González, C.M., Alonso-Hernández, J.B. (eds.) Advances in Nonlinear Speech Processing. NOLISP 2011. Lecture Notes in Computer Science, vol. 7015. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25020-0_12
DOI: https://doi.org/10.1007/978-3-642-25020-0_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25019-4
Online ISBN: 978-3-642-25020-0