Abstract
This paper proposes a system that generates speech-synchronized tongue animation from text or speech. First, an anatomically accurate physiological tongue model is built and used to generate a large set of tongue deformation samples from randomly sampled muscle activations. Second, these input and output samples are used to train a neural network that models the relationship between muscle activation and tongue contour deformation. Third, after the rigid tongue movement is removed, the neural network is used to estimate the non-rigid tongue movement parameters, namely the tongue muscle activations, from a collected X-ray image database of tongue movement for Mandarin Chinese phonemes. The estimates are then used to construct a database of tongue physemes (sequences of tongue muscle activations and rigid movements) corresponding to the Mandarin Chinese phoneme database. Finally, the physemes corresponding to the phonemes extracted from the input text or speech are blended according to the phoneme durations and drive the physiological tongue model to produce speech-synchronized tongue animation. Simulation results show that the synthesized tongue animations are visually realistic and closely approximate the medical tongue data.
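The final step, blending physemes according to phoneme durations, can be illustrated with a minimal Python sketch. The abstract does not specify the blending function, so everything here is an assumption made for illustration: a physeme is represented as a list of frames (each frame a list of muscle activations and rigid-movement parameters), each physeme is linearly time-warped to its phoneme duration, and a short linear ramp smooths each phoneme boundary. The names `resample`, `blend_physemes`, `fps`, and `ramp` are hypothetical and do not come from the paper.

```python
from typing import List

Frame = List[float]  # assumed layout: muscle activations followed by rigid parameters


def lerp(a: Frame, b: Frame, t: float) -> Frame:
    """Linear interpolation between two parameter frames."""
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def resample(physeme: List[Frame], n_frames: int) -> List[Frame]:
    """Linearly time-warp a physeme to exactly n_frames frames."""
    if n_frames < 1 or not physeme:
        raise ValueError("need a non-empty physeme and n_frames >= 1")
    if len(physeme) == 1 or n_frames == 1:
        return [list(physeme[0]) for _ in range(n_frames)]
    last = len(physeme) - 1
    out = []
    for i in range(n_frames):
        pos = i * last / (n_frames - 1)   # fractional index into the source
        lo = min(int(pos), last - 1)
        out.append(lerp(physeme[lo], physeme[lo + 1], pos - lo))
    return out


def blend_physemes(physemes: List[List[Frame]], durations: List[float],
                   fps: float = 25.0, ramp: int = 2) -> List[Frame]:
    """Concatenate duration-warped physemes and smooth each boundary.

    Assumption: a linear ramp over 2 * ramp frames replaces the jump at
    each phoneme boundary. The total frame count is preserved, so the
    track stays aligned with the phoneme durations.
    """
    segments = [resample(p, max(1, round(d * fps)))
                for p, d in zip(physemes, durations)]
    track = [frame for seg in segments for frame in seg]
    boundary = 0
    for prev, nxt in zip(segments, segments[1:]):
        boundary += len(prev)
        w = min(ramp, len(prev), len(nxt))
        if w < 1:
            continue
        left = list(track[boundary - w])
        right = list(track[boundary + w - 1])
        for i in range(2 * w):
            t = i / (2 * w - 1)
            track[boundary - w + i] = lerp(left, right, t)
    return track


# Two toy one-parameter physemes, 0.2 s each at 10 frames per second.
demo = blend_physemes([[[0.0], [0.0]], [[1.0], [1.0]]], [0.2, 0.2], fps=10.0)
```

Each frame of the resulting track would then be passed to the physiological tongue model as its input for that animation frame. A boundary ramp is only one possible choice; a coarticulation-aware weighting could be substituted without changing the overall structure.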
Acknowledgement
This work is supported by the National Natural Science Foundation of China (No. 61572450, No. 61303150), the Open Project Program of the State Key Lab of CAD&CG, Zhejiang University (No. A1501), the Fundamental Research Funds for the Central Universities (WK2350000002), the Open Funding Project of State Key Laboratory of Virtual Reality Technology and Systems, Beihang University (No. BUAA-VR-16KF-12), and the Open Funding Project of State Key Laboratory of Novel Software Technology, Nanjing University (No. KFKT2016B08).
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Yu, J. (2017). Speech Synchronized Tongue Animation by Combining Physiology Modeling and X-ray Image Fitting. In: Amsaleg, L., Guðmundsson, G., Gurrin, C., Jónsson, B., Satoh, S. (eds) MultiMedia Modeling. MMM 2017. Lecture Notes in Computer Science, vol 10132. Springer, Cham. https://doi.org/10.1007/978-3-319-51811-4_59
DOI: https://doi.org/10.1007/978-3-319-51811-4_59
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-51810-7
Online ISBN: 978-3-319-51811-4
eBook Packages: Computer Science (R0)