Synthesizing cooperative conversation

Pelachaud, Catherine; Cassell, Justine; Badler, Norman; Steedman, Mark; Prevost, Scott; Stone, Matthew

doi:10.1007/BFb0052313


Conference paper
First Online: 01 January 2006

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1374)

Abstract

We describe an implemented system which automatically generates and animates conversations between multiple human-like agents with appropriate and synchronized speech, intonation, facial expressions, and hand gestures. Conversations are created by a dialogue planner that produces the text as well as the intonation of the utterances. The speaker/listener relationship, the text, and the intonation in turn drive facial expressions, lip motions, eye gaze, head motion, and arm gesture generators.
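The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Utterance`, `Event`, `animate`, and the three toy generators are hypothetical, intonation is reduced to a per-word accent flag, word indices stand in for time, and the rules inside each generator are placeholders chosen only to show how one planned utterance could drive several synchronized channels.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utterance:
    """Hypothetical planner output: text plus a simplified intonation marking."""
    speaker: str
    listener: str
    words: List[str]
    accented: List[bool]  # assumption: True where a word carries a pitch accent


@dataclass
class Event:
    """One animation event on one channel; word index stands in for time."""
    time: int
    agent: str
    channel: str
    action: str


def lip_generator(u: Utterance) -> List[Event]:
    # Lip motion follows the text: one event per spoken word.
    return [Event(i, u.speaker, "lips", f"articulate:{w}")
            for i, w in enumerate(u.words)]


def gaze_generator(u: Utterance) -> List[Event]:
    # Placeholder rule driven by the speaker/listener relationship.
    return [
        Event(0, u.listener, "gaze", f"look_at:{u.speaker}"),
        Event(len(u.words) - 1, u.speaker, "gaze", f"look_at:{u.listener}"),
    ]


def head_generator(u: Utterance) -> List[Event]:
    # Placeholder rule driven by intonation: nod on accented words.
    return [Event(i, u.speaker, "head", "nod")
            for i, is_accented in enumerate(u.accented) if is_accented]


GENERATORS: List[Callable[[Utterance], List[Event]]] = [
    lip_generator, gaze_generator, head_generator,
]


def animate(u: Utterance, generators=GENERATORS) -> List[Event]:
    """Run every generator on the same utterance and merge events by time."""
    events = [e for g in generators for e in g(u)]
    return sorted(events, key=lambda e: e.time)  # stable sort keeps channel order


if __name__ == "__main__":
    utterance = Utterance("A", "B", ["hello", "there"], [True, False])
    for event in animate(utterance):
        print(event)
```

Because every generator reads the same utterance and shares one time base, the merged event list stays synchronized by construction, which is the property the abstract emphasizes.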

The original version of the paper was written while this author was working at the Università di Roma “La Sapienza”.



Author information

Authors and Affiliations

Department of Computer and Information Science, University of Pennsylvania, 200 S. 33rd street, 19104, Philadelphia, PA
Catherine Pelachaud, Norman Badler, Mark Steedman, Scott Prevost & Matthew Stone
M.I.T. Media Lab, 20 Ames Street, 02139, Cambridge, MA
Justine Cassell


Editor information

Harry Bunt, Robbert-Jan Beun, Tijn Borghuis


Cite this paper

Pelachaud, C., Cassell, J., Badler, N., Steedman, M., Prevost, S., Stone, M. (1998). Synthesizing cooperative conversation. In: Bunt, H., Beun, RJ., Borghuis, T. (eds) Multimodal Human-Computer Communication. CMC 1995. Lecture Notes in Computer Science, vol 1374. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0052313


DOI: https://doi.org/10.1007/BFb0052313
Published: 17 May 2006
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64380-7
Online ISBN: 978-3-540-69764-0
eBook Packages: Springer Book Archive
