A Computer Vision Based Human-Robot Interface

  • José M. Buenaposada
  • Luis Baumela
Part of the Studies in Fuzziness and Soft Computing book series (STUDFUZZ, volume 116)


This chapter focuses on the real-time location and tracking of human faces in video sequences. The tracking is based on the cooperation of two low-level trackers, one driven by colour and the other by template information. The colour-based tracker is fast and robust, but it can only compute the 2D location of the face in the image. The template-based tracker, although more sensitive to environmental variations and more time consuming, can compute the position and orientation of a human face in 3D space. From the coordination of these two trackers emerges a robust real-time tracker that accurately computes face position and orientation under varying environmental conditions.
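The abstract describes the coordination only at a high level, so the following Python sketch illustrates one plausible way the two trackers could cooperate: the cheap colour tracker runs on every frame and seeds the expensive template tracker, which either refines the estimate to a 3D pose or reports failure, in which case the system degrades gracefully to 2D-only output. All class and method names here (`ColourTracker`, `TemplateTracker`, `track_face`) are hypothetical and not taken from the chapter.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Pose3D:
    position: Tuple[float, float, float]      # 3D translation
    orientation: Tuple[float, float, float]   # e.g. Euler angles

class ColourTracker:
    """Fast and robust: returns only the 2D image location of the face."""
    def locate(self, frame) -> Tuple[int, int]:
        # Placeholder: a real implementation would return the centroid
        # of skin-coloured pixels in the frame.
        return (160, 120)

class TemplateTracker:
    """Slower and more brittle: estimates full 3D position and orientation."""
    def track(self, frame, seed_2d: Tuple[int, int]) -> Optional[Pose3D]:
        # Placeholder: template registration seeded at seed_2d; a real
        # implementation would return None when the residual is too large
        # (tracking lost).
        return Pose3D(position=(0.0, 0.0, 1.2), orientation=(0.0, 0.0, 0.0))

def track_face(frame, colour: ColourTracker, template: TemplateTracker) -> dict:
    """Coordinate the two trackers: the colour tracker always supplies a 2D
    estimate; the template tracker refines it to a 3D pose when it can."""
    xy = colour.locate(frame)          # cheap, runs on every frame
    pose = template.track(frame, xy)   # expensive, may fail
    return {"2d": xy, "3d": pose}      # "3d" is None when tracking is lost
```

The key design point suggested by the abstract is that neither tracker is trusted alone: the colour tracker bounds the search region (keeping the system real-time), while the template tracker supplies the 3D information the colour cue cannot.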


Keywords: Colour Constancy · Template Image · Face Tracking · Dynamic Extension · Planar Patch





Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • José M. Buenaposada (1)
  • Luis Baumela (1)
  1. Departamento de Inteligencia Artificial, Universidad Politécnica de Madrid, Campus de Montegancedo s/n, Madrid, Spain
