Modeling and Synthesis of Facial Motion Driven by Speech

Saisan, Payam; Bissacco, Alessandro; Chiuso, Alessandro; Soatto, Stefano

doi:10.1007/978-3-540-24672-5_36

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 3023)

Abstract

We introduce a novel approach to modeling the dynamics of human facial motion induced by speech, for the purpose of synthesis. We represent the trajectories of a number of salient features on the human face as the output of a dynamical system made up of two subsystems, one driven by the deterministic speech input and the other driven by an unknown stochastic input. Inference of the model (learning) is performed automatically and involves an extension of independent component analysis to time-dependent data. Using a shape-texture decompositional representation for the face, we generate facial image sequences reconstructed from synthesized feature point positions.
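
The two-subsystem structure described above can be illustrated with a minimal sketch. It assumes linear, time-invariant subsystems whose outputs add, which the abstract does not state; the function name `simulate`, the matrix names (`A_d`, `B_d`, `C_d`, `A_s`, `K_s`, `C_s`) and the Gaussian placeholder noise are all hypothetical choices made for illustration, not the authors' model or learning procedure.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def vadd(a, b):
    """Add two vectors element-wise."""
    return [x + y for x, y in zip(a, b)]


def simulate(u_seq, A_d, B_d, C_d, A_s, K_s, C_s, seed=0):
    """Simulate feature trajectories y_t = C_d x_d + C_s x_s.

    x_d is the state of the subsystem driven by the known input u_t
    (standing in for speech features); x_s is the state of the subsystem
    driven by an unknown stochastic input e_t. Linearity and additivity
    are assumptions of this sketch.
    """
    rng = random.Random(seed)
    x_d = [0.0] * len(A_d)
    x_s = [0.0] * len(A_s)
    outputs = []
    for u in u_seq:
        # Placeholder noise: the true distribution of e_t is unknown.
        e = [rng.gauss(0.0, 1.0) for _ in range(len(K_s[0]))]
        # Output is the sum of the two subsystems' contributions.
        outputs.append(vadd(matvec(C_d, x_d), matvec(C_s, x_s)))
        # Deterministic subsystem: driven by the input u_t.
        x_d = vadd(matvec(A_d, x_d), matvec(B_d, u))
        # Stochastic subsystem: driven by the noise e_t.
        x_s = vadd(matvec(A_s, x_s), matvec(K_s, e))
    return outputs


# Usage: one feature coordinate, one input channel, three time steps.
trajectory = simulate(
    u_seq=[[1.0], [0.0], [0.0]],
    A_d=[[0.5]], B_d=[[1.0]], C_d=[[2.0]],
    A_s=[[0.9]], K_s=[[0.3]], C_s=[[1.0]],
)
```

Setting `K_s` to zero switches off the stochastic subsystem, so the output reduces to the input-driven response alone; this makes the split between the two contributions easy to inspect.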

Author information

Authors and Affiliations

University of California, Los Angeles, CA, 90095, USA
Payam Saisan, Alessandro Bissacco & Stefano Soatto
University of Padova, Italy, 35131
Alessandro Chiuso

Editor information

Editors and Affiliations

Center for Machine Perception, Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University, Prague 6, Czech Republic
Tomáš Pajdla
Center for Machine Perception, Dept. of Cybernetics, Faculty of Elec. Eng., Czech Technical University in Prague, Karlovo nám. 13, 121 35, Prague, Czech Rep
Jiří Matas

About this paper

Cite this paper

Saisan, P., Bissacco, A., Chiuso, A., Soatto, S. (2004). Modeling and Synthesis of Facial Motion Driven by Speech. In: Pajdla, T., Matas, J. (eds) Computer Vision - ECCV 2004. ECCV 2004. Lecture Notes in Computer Science, vol 3023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24672-5_36

DOI: https://doi.org/10.1007/978-3-540-24672-5_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21982-8
Online ISBN: 978-3-540-24672-5
eBook Packages: Springer Book Archive
