Multimedia Tools and Applications, Volume 47, Issue 1, pp 163–187

A nonparametric regression model for virtual humans generation

  • Yun-Feng Chou
  • Zen-Chung Shih


In this paper, we propose a novel nonparametric regression model that generates virtual humans from still images for applications in next-generation (NG) environments. The model automatically synthesizes deformed character shapes by combining kernel regression with elliptic radial basis functions (ERBFs) and locally weighted regression (LOESS). Kernel regression with ERBFs represents the deformed character shapes and creates lively animated talking faces. To preserve patterns within the shapes, LOESS fits the details with local control. The results show that our method effectively simulates plausible movements for character animation, including body movement simulation, novel view synthesis, and expressive facial animation synchronized with input speech. The proposed model is therefore especially suitable for intelligent multimedia applications in virtual human generation.
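The two building blocks named in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' formulation, so everything below is an assumption made for illustration: an axis-aligned Gaussian stands in for the elliptic basis function, a Nadaraya-Watson weighted average stands in for the kernel regression, and the LOESS step is a textbook one-dimensional local linear fit with tricube weights. The names `elliptic_rbf`, `kernel_regress`, and `loess_1d`, and the parameters `sx`, `sy`, and `span`, are hypothetical.

```python
import math


def elliptic_rbf(p, c, sx, sy):
    """Gaussian basis with a separate width per axis (assumed axis-aligned ellipse)."""
    dx = (p[0] - c[0]) / sx
    dy = (p[1] - c[1]) / sy
    return math.exp(-0.5 * (dx * dx + dy * dy))


def kernel_regress(p, centers, displacements, sx, sy):
    """Nadaraya-Watson estimate of a 2D displacement at point p.

    centers: control points; displacements: their known 2D offsets.
    """
    weights = [elliptic_rbf(p, c, sx, sy) for c in centers]
    total = sum(weights)
    if total == 0.0:
        # Every weight underflowed: p is far from all control points.
        return (0.0, 0.0)
    ux = sum(w * d[0] for w, d in zip(weights, displacements)) / total
    uy = sum(w * d[1] for w, d in zip(weights, displacements)) / total
    return (ux, uy)


def loess_1d(x0, xs, ys, span):
    """Locally weighted linear fit at x0 using tricube weights.

    span: fraction of the samples that defines the local neighbourhood.
    """
    k = min(len(xs), max(2, math.ceil(span * len(xs))))
    dists = sorted(abs(x - x0) for x in xs)
    # Slightly inflate the bandwidth so the k-th neighbour keeps a nonzero weight.
    h = dists[k - 1] * 1.001 + 1e-12
    w = [(1.0 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0.0 else 0.0
    return my + slope * (x0 - mx)


# Usage: blend the offsets of two control points, then smooth a 1D profile.
centers = [(-1.0, 0.0), (1.0, 0.0)]
offsets = [(0.0, 0.0), (2.0, 0.0)]
midpoint_offset = kernel_regress((0.0, 0.0), centers, offsets, sx=2.0, sy=1.0)

xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]
fitted = loess_1d(2.5, xs, ys, span=0.5)
```

In this sketch the unequal widths `sx` and `sy` let a control point influence the shape further along one axis than the other, which is the intuition behind using an elliptic rather than a circular basis. The local linear fit reproduces straight-line data exactly, so it smooths without flattening linear trends.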


  • Image deformation
  • Nonparametric regression
  • Elliptic radial basis functions
  • Functional approximation
  • Locally weighted regression



This work was supported in part by the National Science Council, Republic of China, under grant NSC 98-2221-E-009-123-MY3. We would like to thank Prof. Sang-Soo Yeo and the reviewers for their helpful suggestions.



Copyright information

© Springer Science+Business Media, LLC 2009

Authors and Affiliations

  1. Department of Computer Science, National Chiao Tung University, Hsinchu City, Taiwan
