Abstract
Facial image computing has been an extensively studied topic owing to its wide applications in human-centered computing, human-avatar interaction, virtual reality, and multimedia communication. Successful systems are equipped with realistic face models, efficient compression algorithms, reliable animation techniques, and user-friendly interaction schemes. In this chapter, we focus on techniques, algorithms, models, applications, and real-world systems. We present a comprehensive summary of the state of the art together with our experiences and contributions in the field, in particular several prototype systems developed in our group: the online interactive gaming system hMouse, a humanoid emotive audio-visual avatar, and 3D face/head tracking based video compression. The performance of these three systems is also illustrated through standard evaluations.
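The core idea behind a head-tracking-driven virtual mouse such as hMouse is mapping the tracked face position in the camera frame to cursor coordinates on screen. The sketch below is illustrative only and not the chapter's actual algorithm; the function name and the simple linear mapping are assumptions, and a real system would add temporal smoothing, dead zones, and gesture-based click detection.

```python
def head_to_cursor(face_cx, face_cy, frame_w, frame_h, screen_w, screen_h):
    """Map a tracked face center (camera-frame pixels) to screen cursor
    coordinates by normalization and scaling. Illustrative sketch only."""
    # Normalize to [0, 1]; mirror horizontally so that moving the head
    # to the user's right moves the cursor right (webcam view is mirrored).
    nx = 1.0 - face_cx / frame_w
    ny = face_cy / frame_h
    # Scale to screen resolution, clamping to the valid pixel range.
    x = min(max(int(nx * screen_w), 0), screen_w - 1)
    y = min(max(int(ny * screen_h), 0), screen_h - 1)
    return x, y

# A face centered in a 640x480 camera frame maps to the screen center.
print(head_to_cursor(320, 240, 640, 480, 1920, 1080))
```

In practice, the tracker (e.g., a detector-plus-tracker pipeline) supplies `face_cx, face_cy` every frame, and clicks are triggered by separate cues such as dwell time or mouth/eye gestures.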
© 2010 Springer-Verlag Berlin Heidelberg
Cite this chapter
Fu, Y., Tang, H., Tu, J., Tao, H., Huang, T.S. (2010). Human-Centered Face Computing in Multimedia Interaction and Communication. In: Chen, C.W., Li, Z., Lian, S. (eds) Intelligent Multimedia Communication: Techniques and Applications. Studies in Computational Intelligence, vol 280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11686-5_16
DOI: https://doi.org/10.1007/978-3-642-11686-5_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-11685-8
Online ISBN: 978-3-642-11686-5