Abstract
This paper introduces the integration of the face emotion recognition and voice emotion recognition components of our FILTWAM framework, which uses webcams and microphones. The framework enables real-time multimodal emotion recognition of learners during game-based learning in order to trigger feedback for improved learning. The main goal of this study is to validate the integration of webcam and microphone data for a real-time and adequate interpretation of facial and vocal expressions into emotional states, with the software modules calibrated on end users. This integration aims to deliver more timely and relevant feedback, which is expected to increase learners’ awareness of their own behavior. Twelve test persons received the same computer-based tasks, in which they were asked to mimic specific facial and vocal expressions. Each test person mimicked 80 emotions, yielding a dataset of 960 emotions. All sessions were recorded on video. Comparing the requested emotions, expert ratings, and the recognized emotions, the overall Kappa value is 0.61; the face emotion recognition software reaches 0.76 and the voice emotion recognition software 0.58. Multimodal fusion of the two software modules can increase the accuracy to 78%. In contrast with existing software, our modules allow real-time, continuous, and unobtrusive monitoring of learners’ facial expressions and voice intonations and convert these into emotional states. Including learners’ emotional states paves the way for more effective, efficient, and enjoyable game-based learning.
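The multimodal fusion mentioned above can be sketched as decision-level (late) fusion, in which the per-emotion probability outputs of the face and voice modules are combined into a single prediction. The emotion labels, probability values, and weights below are illustrative assumptions, not values or code from the paper:

```python
# Minimal sketch of decision-level (late) fusion of two emotion
# classifiers, one common way to combine face and voice outputs.
# Labels, example probabilities, and weights are hypothetical.

EMOTIONS = ["happy", "sad", "surprise", "fear", "disgust", "anger", "neutral"]

def fuse(face_probs, voice_probs, w_face=0.6, w_voice=0.4):
    """Weighted sum of per-emotion probabilities from both modalities;
    returns the winning label and the fused probability vector."""
    fused = [w_face * f + w_voice * v for f, v in zip(face_probs, voice_probs)]
    return EMOTIONS[fused.index(max(fused))], fused

face = [0.55, 0.05, 0.10, 0.05, 0.05, 0.05, 0.15]   # face-module output
voice = [0.30, 0.10, 0.05, 0.05, 0.05, 0.25, 0.20]  # voice-module output
label, _ = fuse(face, voice)
print(label)  # prints "happy": both modalities favor it, so fusion does too
```

A weighted sum lets the more reliable modality (here the face module, which achieved the higher Kappa) dominate without discarding the other channel entirely.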
Acknowledgments
We thank our colleagues at the Welten Institute of the Open University of the Netherlands who participated in this study on the integration of face and voice emotion recognition. We likewise thank the two raters who helped us rate the recorded streams. We also thank the Netherlands Laboratory for Lifelong Learning (NELLL) of the Open University of the Netherlands, which sponsors this research.
© 2015 Springer International Publishing Switzerland
Cite this paper
Bahreini, K., Nadolski, R., Westera, W. (2015). Improved Multimodal Emotion Recognition for Better Game-Based Learning. In: De Gloria, A. (ed.) Games and Learning Alliance. GALA 2014. Lecture Notes in Computer Science, vol. 9221. Springer, Cham. https://doi.org/10.1007/978-3-319-22960-7_11
Print ISBN: 978-3-319-22959-1
Online ISBN: 978-3-319-22960-7