Abstract
This paper describes an intelligent system, named Gaze-X, that we developed to support affective multimodal human-computer interaction (AMM-HCI): the user's actions and emotions are modeled and then used to adapt the interaction and support the user in his or her activity. Gaze-X is based on sensing and interpreting the human part of the computer's context, known as W5+ (who, where, what, when, why, how). It integrates a number of natural human communicative modalities, including speech, eye-gaze direction, face, and facial expression, with standard HCI modalities such as keystrokes, mouse movements, and active-software identification. These inputs feed decision-making processes that adapt the interaction to support the user's activity according to his or her preferences. A usability study conducted with a number of users in an office scenario indicates that Gaze-X is perceived as effective, easy to use, useful, and affectively qualitative.
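To make the W5+ context notion concrete, the minimal Python sketch below renders the who/where/what/when/why/how tuple as a record and shows one rule-based adaptation step driven by it. All field names, affective-state labels, preference keys, and rules here are illustrative assumptions, not the actual Gaze-X implementation.

```python
import time
from dataclasses import dataclass

# Hypothetical rendering of the W5+ context described in the abstract.
@dataclass
class W5PlusContext:
    who: str     # identified user (e.g., from face recognition)
    where: str   # gaze target / screen region being attended
    what: str    # active application and current user action
    when: float  # timestamp of the observation
    why: str     # inferred affective state, e.g. "frustrated"
    how: str     # input modality in use: speech, keyboard, mouse

def adapt_interaction(ctx: W5PlusContext, prefs: dict) -> str:
    """Choose a support action from the sensed context and stored user preferences."""
    if ctx.why == "frustrated" and prefs.get("offer_help", True):
        return f"offer context-sensitive help for the {ctx.what}"
    if ctx.why == "fatigued" and prefs.get("suggest_breaks", True):
        return "suggest taking a short break"
    return "no adaptation"

# Example: a frustrated user working in a text editor.
ctx = W5PlusContext(who="user-1", where="toolbar", what="text editor",
                    when=time.time(), why="frustrated", how="mouse")
print(adapt_interaction(ctx, {"offer_help": True}))
# -> offer context-sensitive help for the text editor
```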
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Maat, L., Pantic, M. (2007). Gaze-X: Adaptive, Affective, Multimodal Interface for Single-User Office Scenarios. In: Huang, T.S., Nijholt, A., Pantic, M., Pentland, A. (eds.) Artificial Intelligence for Human Computing. Lecture Notes in Computer Science, vol. 4451. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72348-6_13
DOI: https://doi.org/10.1007/978-3-540-72348-6_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72346-2
Online ISBN: 978-3-540-72348-6