Eye Gaze Correction with Stereovision for Video-Teleconferencing

  • Ruigang Yang
  • Zhengyou Zhang
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2351)


The lack of eye contact in desktop video teleconferencing substantially reduces the effectiveness of video content. While expensive and bulky hardware is available on the market to correct eye gaze, researchers have been trying to provide a practical software-based solution to bring video-teleconferencing one step closer to the mass market. This paper presents a novel approach that is based on stereo analysis combined with rich domain knowledge (a personalized face model). This marriage is mutually beneficial. The personalized face model greatly improves the accuracy and robustness of the stereo analysis by substantially reducing the search range; the stereo techniques, using both feature matching and template matching, allow us to extract 3D information about objects other than the face and to determine the head pose much more reliably than is possible with only one camera. Thus we enjoy the versatility of stereo techniques without suffering from their vulnerability. By concentrating the 3D description of the scene on the face region, we synthesize virtual views that maintain eye contact using graphics hardware. Our current system is able to generate an eye-gaze corrected video stream at about 5 frames per second on a commodity PC.
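
The claim that a narrower search range makes stereo matching more robust can be illustrated with a small sketch. This is not the authors' implementation: it assumes rectified images, a sum-of-squared-differences (SSD) template match along one scanline, and a synthetic periodic texture, and the names `ssd` and `match_disparity` as well as the prior range are hypothetical choices made for illustration. The prior range stands in for the kind of bound a personalized face model might supply.

```python
def ssd(left, right, row, col_l, col_r, half):
    """Sum of squared differences between two square patches of side 2*half+1."""
    total = 0.0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            diff = left[row + dr][col_l + dc] - right[row + dr][col_r + dc]
            total += diff * diff
    return total


def match_disparity(left, right, row, col, d_min, d_max, half=1):
    """Return the disparity in [d_min, d_max] with the lowest SSD cost.

    Assumes rectified images: the match for left pixel (row, col)
    lies on the same row of the right image, at column col - d.
    Ties are resolved in favour of the smallest disparity.
    """
    width = len(right[0])
    best_d, best_cost = None, float("inf")
    for d in range(d_min, d_max + 1):
        c = col - d
        if c - half < 0 or c + half >= width:
            continue  # patch would fall outside the right image
        cost = ssd(left, right, row, col, c, half)
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Synthetic pair (illustrative only): a texture that repeats every 4 columns,
# shifted by a true disparity of 5 between the two views.
TRUE_DISPARITY = 5
HEIGHT, WIDTH = 5, 20
right = [[(c % 4) * 10 + r for c in range(WIDTH)] for r in range(HEIGHT)]
left = [[((c - TRUE_DISPARITY) % 4) * 10 + r for c in range(WIDTH)]
        for r in range(HEIGHT)]

# A wide search hits an equally good but wrong match on the repeating texture.
wide = match_disparity(left, right, row=2, col=10, d_min=0, d_max=8)

# A prior that narrows the range to [4, 6] excludes the false match.
narrow = match_disparity(left, right, row=2, col=10, d_min=4, d_max=6)

print(wide, narrow)
```

On this synthetic input the wide search settles on a disparity that is off by one texture period, while the narrowed search recovers the true shift. Real images would also need the sub-pixel refinement, occlusion handling, and feature matching that the sketch leaves out.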


Keywords: Stereoscopic vision; Eye-gaze correction; Structure from motion





Copyright information

© Springer-Verlag Berlin Heidelberg 2002

Authors and Affiliations

  • Ruigang Yang (1)
  • Zhengyou Zhang (2)
  1. Dept. of Computer Science, University of North Carolina at Chapel Hill, USA
  2. Microsoft Research, Redmond, USA
