Face-Responsive Interfaces: From Direct Manipulation to Perceptive Presence

  • Trevor Darrell
  • Konrad Tollmar
  • Frank Bentley
  • Neal Checka
  • Louis-Philippe Morency
  • Ali Rahimi
  • Alice Oh
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2498)


Systems for tracking faces using computer vision have recently become practical for human-computer interface applications. We are developing prototype systems for face-responsive interaction, exploring three different interface paradigms: direct manipulation, gaze-mediated agent dialog, and perceptually-driven remote presence. We consider the characteristics of these types of interactions, and assess the performance of our system on each application. We have found that face pose tracking is a potentially accurate means of cursor control and selection, is seen by users as a natural way to guide agent dialog interaction, and can be used to create perceptually-driven presence artefacts which convey real-time awareness of a remote space.





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Trevor Darrell (1)
  • Konrad Tollmar (1)
  • Frank Bentley (1)
  • Neal Checka (1)
  • Louis-Philippe Morency (1)
  • Ali Rahimi (1)
  • Alice Oh (1)

  1. MIT AI Lab, Cambridge
