A Study on 2D Photo-Realistic Facial Animation Generation Using 3D Facial Feature Points and Deep Neural Networks
This paper proposes a technique for generating 2D photo-realistic facial animation from input text. The technique is based on a mapping from 3D facial feature points using deep neural networks (DNNs). Our previous approach operated only in 2D space, using hidden Markov models (HMMs) and DNNs. However, that approach has the disadvantage that the generated 2D facial pixels are sensitive to rotation of the face in the training data. In this study, we alleviate the problem by using 3D facial feature points obtained with Kinect. The shape and color information of the face is parameterized by the 3D facial feature points. In model training, the relation between the labels derived from the text and the face-model parameters is modeled by DNNs. In a preliminary experiment, we show that the proposed technique can generate 2D facial animation from arbitrary input text.
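To illustrate the kind of regression described above, in which a DNN relates text-derived labels to face-model parameters, the following is a minimal sketch and not the network used in this study. The layer sizes, the tanh activation, the mean squared error loss, the learning rate, and the synthetic data are all assumptions made for illustration; the names `init_mlp`, `forward`, and `train_step` are hypothetical.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Create (weight, bias) pairs for a fully connected network."""
    return [(rng.normal(0.0, np.sqrt(1.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return the activations of every layer (tanh hidden, linear output)."""
    acts = [x]
    for i, (w, b) in enumerate(params):
        z = acts[-1] @ w + b
        acts.append(z if i == len(params) - 1 else np.tanh(z))
    return acts

def train_step(params, x, y, lr):
    """One full-batch gradient-descent step on the mean squared error."""
    acts = forward(params, x)
    err = acts[-1] - y
    loss = float(np.mean(err ** 2))
    grad = 2.0 * err / err.size          # gradient w.r.t. the linear output
    updated = []
    for i in reversed(range(len(params))):
        w, b = params[i]
        gw = acts[i].T @ grad
        gb = grad.sum(axis=0)
        if i > 0:
            # Back-propagate through the tanh of the previous layer.
            grad = (grad @ w.T) * (1.0 - acts[i] ** 2)
        updated.append((w - lr * gw, b - lr * gb))
    return updated[::-1], loss

# Synthetic stand-ins (assumed shapes): one row per frame.
rng = np.random.default_rng(0)
n_frames, label_dim, param_dim = 256, 20, 8
labels = rng.normal(size=(n_frames, label_dim))          # text-derived labels
true_map = rng.normal(size=(label_dim, param_dim))
face_params = np.tanh(0.3 * labels @ true_map)           # face-model parameters

params = init_mlp([label_dim, 32, param_dim], rng)
losses = []
for _ in range(200):
    params, loss = train_step(params, labels, face_params, lr=0.1)
    losses.append(loss)

predicted = forward(params, labels)[-1]                  # shape (n_frames, param_dim)
```

In a real system the inputs would be frame-level labels derived from the text and the targets would be the parameters extracted from the 3D facial feature points; here both are random placeholders, so only the training mechanics carry over.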
Keywords: Photo-realistic facial animation, Face image synthesis, Deep neural network, Kinect
Part of this work was supported by JSPS KAKENHI Grant Numbers JP15H02720 and JP26280055.