Abstract
This paper describes an intelligent system, named Gaze-X, that we developed to support affective multimodal human-computer interaction (AMM-HCI): the user's actions and emotions are modeled and then used to adapt the interaction and support the user in his or her activity. Gaze-X is based on sensing and interpreting the human part of the computer's context, known as W5+ (who, where, what, when, why, how). It integrates a number of natural human communicative modalities, including speech, eye-gaze direction, face, and facial expression, with standard HCI modalities such as keystrokes, mouse movements, and active-software identification. These inputs feed decision-making processes that adapt the interaction to support the user's activity according to his or her preferences. A usability study conducted with a number of users in an office scenario indicates that Gaze-X is perceived as effective, easy to use, useful, and affectively qualitative.
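To make the W5+ context notion concrete, the minimal Python sketch below renders the who/where/what/when/why/how tuple as a record and shows one rule-based adaptation step driven by it. All field names, affective-state labels, preference keys, and rules here are illustrative assumptions, not the actual Gaze-X implementation.

```python
import time
from dataclasses import dataclass

# Hypothetical rendering of the W5+ context described in the abstract.
@dataclass
class W5PlusContext:
    who: str     # identified user (e.g., from face recognition)
    where: str   # gaze target / screen region being attended
    what: str    # active application and current user action
    when: float  # timestamp of the observation
    why: str     # inferred affective state, e.g. "frustrated"
    how: str     # input modality in use: speech, keyboard, mouse

def adapt_interaction(ctx: W5PlusContext, prefs: dict) -> str:
    """Choose a support action from the sensed context and stored user preferences."""
    if ctx.why == "frustrated" and prefs.get("offer_help", True):
        return f"offer context-sensitive help for the {ctx.what}"
    if ctx.why == "fatigued" and prefs.get("suggest_breaks", True):
        return "suggest taking a short break"
    return "no adaptation"

# Example: a frustrated user working in a text editor.
ctx = W5PlusContext(who="user-1", where="toolbar", what="text editor",
                    when=time.time(), why="frustrated", how="mouse")
print(adapt_interaction(ctx, {"offer_help": True}))
# -> offer context-sensitive help for the text editor
```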
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Maat, L., Pantic, M. (2007). Gaze-X: Adaptive, Affective, Multimodal Interface for Single-User Office Scenarios. In: Huang, T.S., Nijholt, A., Pantic, M., Pentland, A. (eds.) Artificial Intelligence for Human Computing. Lecture Notes in Computer Science, vol. 4451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72348-6_13
DOI: https://doi.org/10.1007/978-3-540-72348-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72346-2
Online ISBN: 978-3-540-72348-6