Copyright information
© 2008 Springer-Verlag London Limited
About this chapter
Cite this chapter
Busso, C., Deng, Z., Neumann, U., Narayanan, S. (2008). Learning Expressive Human-Like Head Motion Sequences from Speech. In: Deng, Z., Neumann, U. (eds) Data-Driven 3D Facial Animation. Springer, London. https://doi.org/10.1007/978-1-84628-907-1_6
Print ISBN: 978-1-84628-906-4
Online ISBN: 978-1-84628-907-1
eBook Packages: Computer Science, Computer Science (R0)