Abstract
We present EmoVoice, a framework for creating emotional speech corpora and classifiers, and for both offline and real-time online speech emotion recognition. The framework is intended for non-experts and therefore comes with an interface for creating one's own personal or application-specific emotion recogniser. Furthermore, we describe several applications and prototypes that already use our framework to track online emotional user states from voice information.
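The pipeline the abstract describes can be illustrated with a minimal, hypothetical sketch: frame-level acoustic measures (here just RMS energy and a crude autocorrelation pitch cue) are reduced to utterance-level statistics, which a classifier would then consume. This is not EmoVoice's actual implementation; all function names and parameters are illustrative, and a real system would use a much richer feature set and a trained classifier such as an SVM.

```python
import math

def frame_features(samples, frame_len=256):
    """Per-frame RMS energy and a crude autocorrelation pitch cue
    (illustrative stand-ins for a real acoustic feature set)."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = math.sqrt(sum(x * x for x in frame) / frame_len)
        # pick the lag with the highest autocorrelation as a rough pitch cue
        best_lag, best_r = 0, 0.0
        for lag in range(20, frame_len // 2):
            r = sum(frame[i] * frame[i + lag] for i in range(frame_len - lag))
            if r > best_r:
                best_r, best_lag = r, lag
        feats.append((energy, best_lag))
    return feats

def utterance_vector(samples):
    """Reduce the frame-level series to global utterance-level statistics,
    the usual input representation for an emotion classifier."""
    feats = frame_features(samples)
    energies = [e for e, _ in feats]
    lags = [l for _, l in feats]
    n = len(feats)
    return [sum(energies) / n, max(energies), sum(lags) / n, max(lags)]
```

In a real-time setting, `utterance_vector` would run on each segmented chunk of incoming audio, and the resulting vector would be passed to a pre-trained classifier to yield an emotion label per chunk.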
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Vogt, T., André, E., Bee, N. (2008). EmoVoice — A Framework for Online Recognition of Emotions from Voice. In: André, E., Dybkjær, L., Minker, W., Neumann, H., Pieraccini, R., Weber, M. (eds) Perception in Multimodal Dialogue Systems. PIT 2008. Lecture Notes in Computer Science(), vol 5078. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69369-7_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69368-0
Online ISBN: 978-3-540-69369-7
eBook Packages: Computer Science (R0)