Conversational Multimedia Interaction

  • Chapter
Machine Conversations

Part of the book series: The Springer International Series in Engineering and Computer Science ((SECS,volume 511))

Abstract

More effective, efficient and natural human-computer or computer-mediated human-human interaction will require both automated understanding and automated generation of multimedia. Fluent conversational interaction demands explicit models of the user, discourse, task and context. It will also require a richer understanding of media (i.e., text, audio, video), both as used in the interface to support interaction with the user and as content that the user accesses during a session. Multimedia dialogue prototypes have been developed in several application domains, including CUBRICON (mission planning) (Neal and Shapiro, 1991), XTRA (tax-form preparation) (Wahlster, 1991), AIMI (air mission planning) (Burger and Marshall, 1993), and AlFresco (art history information exploration) (Stock et al., 1993). These systems typically parse mixed (usually asynchronous) multimedia input and generate coordinated multimedia output. They also attempt to maintain coherence, cohesion, and consistency across both multimedia input and output. For example, they often support integrated language and deixis for both input and output. They extend research in discourse and user modeling (Kobsa and Wahlster, 1989) by incorporating representations of media, which enable media (cross) reference and reuse over the course of a session with a user. These enhanced representations support the exploitation of user perceptual abilities and media preferences as well as the resolution of multimedia references (e.g. “Send this plane there” articulated with synchronous gestures on a map).
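To make the idea of resolving a multimedia reference such as “Send this plane there” concrete, the following is a minimal illustrative sketch, not the method of any system named in the abstract. It assumes that speech and pointing gestures arrive as timestamped events and pairs each deictic word with the nearest unused gesture in time. The names `Word`, `Gesture`, `resolve_deixis`, the `max_gap` threshold, and the object identifiers are all hypothetical.

```python
from dataclasses import dataclass

# Illustrative (not exhaustive) set of deictic words; an assumption of this sketch.
DEICTIC_WORDS = {"this", "that", "here", "there"}


@dataclass
class Word:
    text: str
    time: float  # seconds from the start of the utterance


@dataclass
class Gesture:
    target: str  # hypothetical identifier of the map object or location pointed at
    time: float  # seconds from the start of the utterance


def resolve_deixis(words, gestures, max_gap=1.0):
    """Pair each deictic word with the closest unused gesture within max_gap seconds."""
    unused = list(gestures)
    resolved = []
    for word in words:
        # Non-deictic words, or deictic words with no gestures left, stay unresolved.
        if word.text.lower() not in DEICTIC_WORDS or not unused:
            resolved.append((word.text, None))
            continue
        nearest = min(unused, key=lambda g: abs(g.time - word.time))
        if abs(nearest.time - word.time) <= max_gap:
            unused.remove(nearest)  # each gesture resolves at most one word
            resolved.append((word.text, nearest.target))
        else:
            resolved.append((word.text, None))
    return resolved


utterance = [Word("Send", 0.0), Word("this", 0.3), Word("plane", 0.6), Word("there", 1.2)]
pointing = [Gesture("aircraft-7", 0.35), Gesture("airbase-north", 1.25)]
result = resolve_deixis(utterance, pointing)
```

Here `result` pairs “this” with `aircraft-7` and “there” with `airbase-north`, and leaves the other words unresolved. A real system would also draw on the discourse and user models mentioned above rather than on timing alone.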

References

  1. Aberdeen, J., Burger, J., Day, D., Hirschman, L., Robinson, P. and Vilain, M. 1995. Description of the Alembic System Used for MUC-6. Proceedings of the Sixth Message Understanding Conference. Advanced Research Projects Agency Information Technology Office, Columbia, MD.

  2. Brill, E. 1995. Transformation-based Error-Driven Learning and Natural Language Processing: A Case Study in Part of Speech Tagging. Computational Linguistics 21 (4).

  3. Brown, M.G., Foote, J.T., Jones, G.J.F., Sparck-Jones, K. and Young, S.J. 1995. Automatic Content-Based Retrieval of Broadcast News. Proceedings of ACM Multimedia. San Francisco, CA, pp. 35–44.

  4. Dubner, B. 1996. Automatic Scene Detector and Videotape Logging System, User Guide, Dubner International, Inc., Copyright 1995.

  5. Grosz, B. J. and Sidner, C. 1986. Attention, Intentions, and the Structure of Discourse. Computational Linguistics 12 (3): 175–204.

  6. Hearst, M. A. 1994. Multi-Paragraph Segmentation of Expository Text. ACL-94, Las Cruces, New Mexico.

  7. Kobsa, A. and Wahlster, W. (eds.) 1989. User Models in Dialog Systems. Berlin: Springer-Verlag.

  8. Mani, I. 1995. Very Large Scale Text Summarization. Technical Note, MITRE Corporation.

  9. Mani, I., House, D., Maybury, M. and Green, M. 1997. Towards Content-based Browsing of Broadcast News Video. In Maybury, M. (ed.) Intelligent Multimedia Information Retrieval, AAAI/MIT Press, 241–258.

  10. Maybury, M. T. (ed.) 1993. Intelligent Multimedia Interfaces. Menlo Park: AAAI/MIT Press. (http://www.aaai.org/Publications/Press/Catalog/maybury.html)

  11. Maybury, M. T. 1995. Generating Summaries from Event Data. International Journal of Information Processing and Management: Special Issue on Text Summarization 31 (5): 735–751.

  12. Maybury, M. T. (ed.) 1997. Intelligent Multimedia Information Retrieval. Menlo Park: AAAI/MIT Press. (http://www.aaai.org:80/Press/Books/Maybury-2/)

  13. Maybury, M., Merlino, A. and Morey, D. 1997. Broadcast News Navigation using Story Segments. Proceedings of the ACM International Multimedia Conference, Seattle, WA, November 8–14, 381–391.

  14. Michell, R. 1996. Forager for Information on the Super Highway (FISH). Unpublished Manuscript.

  15. Proceedings of the Sixth Message Understanding Conference. Advanced Research Projects Agency Information Technology Office, Columbia, MD, 6–8 November, 1995.

  16. Pelachaud, C. 1992. Functional Decomposition of Facial Expressions for an Animation System. In Catarci, T., Costabile, M. F. and Levialdi, S. (eds). Advanced Visual Interfaces: Proceedings of the International Workshop, Singapore: World Scientific Series in Computer Science, Vol 36: 26–49.

  17. Reiter, E., Mellish, C. and Levine, J. 1995. Automatic Generation of Technical Documentation. Applied Artificial Intelligence 9 (3): 259–287.

  18. Shahraray, B. and Gibbon, D. 1995. Automated Authoring of Hypermedia Documents of Video Programs. Proceedings of ACM Multimedia. San Francisco, CA, pp. 401–409.

  19. Smotroff, I., Hirschman, L. and Bayer, S. 1995. Integrating Natural Language with Large Dataspace Visualization, to appear in Adam, N. and Bhargava, B. (eds), Advances in Digital Libraries, Lecture Notes in Computer Science, Springer-Verlag.

  20. Stevens et al., 1994. Informedia–Improving Access to Digital Video, Interactions, October, pp. 67–71.

  21. Stock, O. and the ALFRESCO Project Team. 1993. ALFRESCO: Enjoying the Combination of Natural Language Processing and Hypermedia for Information Exploration. In Intelligent Multimedia Interfaces, ed. M. Maybury, 197–224. Menlo Park: AAAI/MIT Press.

  22. Wahlster, W. 1991. User and Discourse Models for Multimodal Communication. In Sullivan, J. W. and Tyler, S. W. (eds). Intelligent User Interfaces. Frontier Series. New York: ACM Press, 45–67.

Copyright information

© 1999 Springer Science+Business Media New York

About this chapter

Cite this chapter

Maybury, M.T. (1999). Conversational Multimedia Interaction. In: Wilks, Y. (ed.) Machine Conversations. The Springer International Series in Engineering and Computer Science, vol 511. Springer, Boston, MA. https://doi.org/10.1007/978-1-4757-5687-6_5

  • DOI: https://doi.org/10.1007/978-1-4757-5687-6_5

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-5092-5

  • Online ISBN: 978-1-4757-5687-6

  • eBook Packages: Springer Book Archive
