Abstract
Facial image computing has been an extensively studied topic owing to its wide applications in human-centered computing, human-avatar interaction, virtual reality, and multimedia communication. Successful systems are equipped with realistic face models, efficient compression algorithms, reliable animation techniques, and user-friendly interaction schemes. In this chapter, we focus on techniques, algorithms, models, applications, and real-world systems. We present a comprehensive summary of the state of the art together with our experiences and contributions in the field, in particular several prototype systems developed in our group: the online interactive gaming system hMouse, a humanoid emotive audio-visual avatar, and 3D face/head tracking based video compression. The performance of these three systems is also illustrated through standard evaluations.
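The core idea behind a head-tracking-driven virtual mouse such as hMouse is mapping the tracked face position in the camera frame to cursor coordinates on screen. The sketch below is illustrative only and not the chapter's actual algorithm; the function name and the simple linear mapping are assumptions, and a real system would add temporal smoothing, dead zones, and gesture-based click detection.

```python
def head_to_cursor(face_cx, face_cy, frame_w, frame_h, screen_w, screen_h):
    """Map a tracked face center (camera-frame pixels) to screen cursor
    coordinates by normalization and scaling. Illustrative sketch only."""
    # Normalize to [0, 1]; mirror horizontally so that moving the head
    # to the user's right moves the cursor right (webcam view is mirrored).
    nx = 1.0 - face_cx / frame_w
    ny = face_cy / frame_h
    # Scale to screen resolution, clamping to the valid pixel range.
    x = min(max(int(nx * screen_w), 0), screen_w - 1)
    y = min(max(int(ny * screen_h), 0), screen_h - 1)
    return x, y

# A face centered in a 640x480 camera frame maps to the screen center.
print(head_to_cursor(320, 240, 640, 480, 1920, 1080))
```

In practice, the tracker (e.g., a detector-plus-tracker pipeline) supplies `face_cx, face_cy` every frame, and clicks are triggered by separate cues such as dwell time or mouth/eye gestures.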
© 2010 Springer-Verlag Berlin Heidelberg
Cite this chapter
Fu, Y., Tang, H., Tu, J., Tao, H., Huang, T.S. (2010). Human-Centered Face Computing in Multimedia Interaction and Communication. In: Chen, C.W., Li, Z., Lian, S. (eds) Intelligent Multimedia Communication: Techniques and Applications. Studies in Computational Intelligence, vol 280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11686-5_16
DOI: https://doi.org/10.1007/978-3-642-11686-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11685-8
Online ISBN: 978-3-642-11686-5