
Human-Centered Face Computing in Multimedia Interaction and Communication

  • Chapter
Intelligent Multimedia Communication: Techniques and Applications

Part of the book series: Studies in Computational Intelligence ((SCI,volume 280))

Abstract

Facial image computing has been an extensively studied topic owing to its wide applications in human-centered computing, human-avatar interaction, virtual reality, and multimedia communication. Successful systems are equipped with realistic face models, efficient compression algorithms, reliable animation techniques, and user-friendly interaction schemes. In this chapter, we focus on techniques, algorithms, models, applications, and real-world systems. We present a comprehensive summary of state-of-the-art work, together with our own experiences and contributions in the field, in particular several prototype systems developed in our group: the online interactive gaming system hMouse, a humanoid emotive audio-visual avatar, and 3D face/head tracking based video compression. The performance of these three systems is also illustrated using standard evaluations.
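The core idea behind a head-tracking-driven virtual mouse such as hMouse is mapping inter-frame face motion to cursor displacement. The sketch below is a minimal illustration of that mapping, not the authors' implementation: it assumes some face tracker (e.g. a Haar-cascade detector) supplies a bounding box per frame, and the gain and dead-zone values are purely illustrative.

```python
# Hypothetical mapping from tracked face motion to cursor motion,
# in the spirit of a head-tracking virtual mouse. A real system would
# obtain the bounding boxes from a per-frame face tracker.

def face_center(box):
    """Center (x, y) of a face bounding box given as (x, y, w, h)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)

def cursor_delta(prev_box, curr_box, gain=4.0, dead_zone=2.0):
    """Map inter-frame face motion to a cursor displacement.

    The dead zone suppresses jitter from tracking noise; the gain
    amplifies deliberate head motion. Both values are illustrative.
    """
    px, py = face_center(prev_box)
    cx, cy = face_center(curr_box)
    dx, dy = cx - px, cy - py
    if abs(dx) < dead_zone:
        dx = 0.0
    if abs(dy) < dead_zone:
        dy = 0.0
    return (gain * dx, gain * dy)

# Example: the face moved 5 px right and 1 px down between frames;
# the 1 px vertical jitter falls inside the dead zone.
print(cursor_delta((100, 100, 80, 80), (105, 101, 80, 80)))  # (20.0, 0.0)
```

In practice such a mapping is combined with gesture detection (e.g. dwell time or eye blinks) to generate click events, which is where systems in this vein differ most.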




Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Fu, Y., Tang, H., Tu, J., Tao, H., Huang, T.S. (2010). Human-Centered Face Computing in Multimedia Interaction and Communication. In: Chen, C.W., Li, Z., Lian, S. (eds) Intelligent Multimedia Communication: Techniques and Applications. Studies in Computational Intelligence, vol 280. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-11686-5_16


  • DOI: https://doi.org/10.1007/978-3-642-11686-5_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-11685-8

  • Online ISBN: 978-3-642-11686-5

  • eBook Packages: Engineering (R0)
