Conversational Multimedia Interaction

Maybury, M. T.

doi:10.1007/978-1-4757-5687-6_5

Part of the book series: The Springer International Series in Engineering and Computer Science (SECS, volume 511)

Abstract

More effective, efficient, and natural human-computer or computer-mediated human-human interaction will require both automated understanding and automated generation of multimedia. Fluent conversational interaction demands explicit models of the user, discourse, task, and context. It will also require a richer understanding of media (i.e., text, audio, video), both as used in the interface to support interaction with the user and as used by the user to access content during a session. Multimedia dialogue prototypes have been developed in several application domains, including CUBRICON (mission planning) (Neal and Shapiro, 1991), XTRA (tax-form preparation) (Wahlster, 1991), AIMI (air mission planning) (Burger and Marshall, 1993), and AlFresco (art history information exploration) (Stock et al., 1993). Typically, these systems parse mixed (usually asynchronous) multimedia input and generate coordinated multimedia output. They also attempt to maintain coherence, cohesion, and consistency across both multimedia input and output. For example, these systems often support integrated language and deixis for both input and output. They extend research in discourse and user modeling (Kobsa and Wahlster, 1989) by incorporating representations of media to enable media (cross-)reference and reuse over the course of a session with a user. These enhanced representations support the exploitation of user perceptual abilities and media preferences as well as the resolution of multimedia references (e.g., “Send this plane there” articulated with synchronous gestures on a map).
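
The multimedia reference example in the abstract ("Send this plane there" with synchronous map gestures) can be illustrated with a minimal sketch. It is not the method of CUBRICON, XTRA, AIMI, or AlFresco; it only assumes that spoken words and pointing gestures carry timestamps and that each deictic word binds to the nearest unused gesture within a small time window. The names `Word`, `Gesture`, `resolve_deixis`, the `max_gap` parameter, and the target labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    time: float  # seconds from the start of the utterance (assumed available)


@dataclass
class Gesture:
    target: str  # object or location the user pointed at on the map
    time: float  # seconds from the start of the utterance


# Hypothetical, deliberately tiny set of deictic expressions.
DEICTIC = {"this", "that", "here", "there"}


def resolve_deixis(words, gestures, max_gap=1.0):
    """Bind each deictic word to the nearest unused gesture in time.

    A word stays unresolved (None) when no unused gesture lies
    within max_gap seconds of it.
    """
    unused = list(gestures)
    bindings = []
    for word in words:
        if word.text.lower() not in DEICTIC:
            continue
        nearest = min(unused, key=lambda g: abs(g.time - word.time), default=None)
        if nearest is not None and abs(nearest.time - word.time) <= max_gap:
            unused.remove(nearest)
            bindings.append((word.text, nearest.target))
        else:
            bindings.append((word.text, None))
    return bindings


words = [Word("Send", 0.0), Word("this", 0.4), Word("plane", 0.7), Word("there", 1.5)]
gestures = [Gesture("aircraft-7", 0.5), Gesture("airbase-north", 1.6)]
print(resolve_deixis(words, gestures))
# [('this', 'aircraft-7'), ('there', 'airbase-north')]
```

Nearest-in-time matching is the simplest possible policy. A fuller treatment would also consult the discourse and user models the abstract mentions, for instance to check that the object bound to "this plane" is in fact an aircraft.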

References

Aberdeen, J., Burger, J., Day, D., Hirschman, L., Robinson, P. and Vilain, M. 1995. Description of the Alembic System Used for MUC-6, Proceedings of the Sixth Message Understanding Conference. Advanced Research Projects Agency Information Technology Office, Columbia, MD.
Brill, E. 1995. Transformation-based Error-Driven Learning and Natural Language Processing: A Case Study in Part of Speech Tagging. Computational Linguistics, 21 (4).
Brown, M.G., Foote, J.T., Jones, G.J.F., Sparck-Jones, K. and Young, S.J. 1995. Automatic Content-Based Retrieval of Broadcast News, Proceedings of ACM Multimedia. San Francisco, CA, pp. 35–44.
Dubner, B. 1996. Automatic Scene Detector and Videotape logging system, User Guide, Dubner International, Inc., Copyright 1995.
Grosz, B. J. and Sidner, C. 1986. Attention, Intentions, and the Structure of Discourse. Computational Linguistics 12 (3): 175–204.
Hearst, M. A. 1994. Multi-Paragraph Segmentation of Expository Text, ACL-94, Las Cruces, New Mexico.
Kobsa, A. and Wahlster, W. (eds.) 1989. User Models in Dialog Systems. Berlin: Springer-Verlag.
Mani, I. 1995. Very Large Scale Text Summarization, Technical Note, MITRE Corporation.
Mani, I., House, D., Maybury, M. and Green, M. 1997. Towards Content-based Browsing of Broadcast News Video. In Maybury, M. (ed.) Intelligent Multimedia Information Retrieval, AAAI/MIT Press, 241–258.
Maybury, M. T. (ed.) 1993. Intelligent Multimedia Interfaces. Menlo Park: AAAI/MIT Press. (http://www.aaai.org/Publications/Press/Catalog/maybury.html)
Maybury, M. T. 1995. Generating Summaries from Event Data. International Journal of Information Processing and Management: Special Issue on Text Summarization 31 (5): 735–751.
Maybury, M. T. (ed.) 1997. Intelligent Multimedia Information Retrieval. Menlo Park: AAAI/MIT Press. (http://www.aaai.org:80/Press/Books/Maybury-2/)
Maybury, M., Merlino, A. and Morey, D. 1997. Broadcast News Navigation using Story Segments, Proceedings of the ACM International Multimedia Conference, Seattle, WA, November 8–14, 381–391.
Michell, R. 1996. Forager for Information on the Super Highway (FISH). Unpublished Manuscript.
Proceedings of the Sixth Message Understanding Conference. Advanced Research Projects Agency Information Technology Office, Columbia, MD, 6–8 November, 1995.
Pelachaud, C. 1992. Functional Decomposition of Facial Expressions for an Animation System. In Catarci, T., Costabile, M. F. and Levialdi, S. (eds). Advanced Visual Interfaces: Proceedings of the International Workshop, Singapore: World Scientific Series in Computer Science, Vol 36: 26–49.
Reiter, E., Mellish, C. and Levine, J. 1995. Automatic Generation of Technical Documentation. Applied Artificial Intelligence 9 (3): 259–287.
Shahraray, B. and Gibbon, D. 1995. Automated Authoring of Hypermedia Documents of Video Programs. Proceedings of ACM Multimedia. San Francisco, CA, pp. 401–409.
Smotroff, I., Hirschman, L. and Bayer, S. 1995. Integrating Natural Language with Large Dataspace Visualization, to appear in Adam, N. and Bhargava, B. (eds), Advances in Digital Libraries, Lecture Notes in Computer Science, Springer-Verlag.
Stevens et al., 1994. Informedia–Improving Access to Digital Video, Interactions, October, pp. 67–71.
Stock, O. and the ALFRESCO Project Team. 1993. ALFRESCO: Enjoying the Combination of Natural Language Processing and Hypermedia for Information Exploration. In Intelligent Multimedia Interfaces, ed. M. Maybury, 197–224. Menlo Park: AAAI/MIT Press.
Wahlster, W. 1991. User and Discourse Models for Multimodal Communication. In Sullivan, J. W. and Tyler, S. W. (eds). Intelligent User Interfaces. Frontier Series. New York: ACM Press, 45–67.

Authors

M. T. Maybury

Editor information

Editors and Affiliations

University of Sheffield, UK
Yorick Wilks

About this chapter

Cite this chapter

Maybury, M.T. (1999). Conversational Multimedia Interaction. In: Wilks, Y. (ed.) Machine Conversations. The Springer International Series in Engineering and Computer Science, vol 511. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-5687-6_5

DOI: https://doi.org/10.1007/978-1-4757-5687-6_5
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-5092-5
Online ISBN: 978-1-4757-5687-6
eBook Packages: Springer Book Archive
