Incremental Natural Language Description of Dynamic Imagery

  • Conference paper
Wissensbasierte Systeme

Part of the book series: Informatik-Fachberichte ((INFORMATIK,volume 227))

Abstract

Although image understanding and natural language processing constitute two major areas of AI, they have mostly been studied independently of each other. Only a few attempts have been concerned with the integration of computer vision and the generation of natural language expressions for the description of image sequences.

The aim of our joint efforts at combining a vision system and a natural language access system is the automatic simultaneous description of dynamic imagery, i.e., we are interested in image interpretation and language processing on an incremental basis. In this contribution we sketch an approach towards the integration of the Karlsruhe vision system called ACTIONS and the natural language component VITRA developed in Saarbrücken. The steps toward realization, based on available components, are outlined and the capabilities of the current system are demonstrated.

Summary

Although image understanding and natural language processing are two of the core areas of AI, they have so far been studied almost independently of each other. Only very few approaches have dealt with the integration of machine vision and the generation of natural language utterances for describing image sequences.

The goal of our collaboration in coupling an image understanding system with a natural language access system is the automatic simultaneous description of time-varying scenes, i.e., we are interested in image sequence interpretation and language processing on an incremental basis. In this contribution we describe an approach to integrating the Karlsruhe image sequence analysis system Actions and the natural language component Vitra, which is being developed in Saarbrücken. The steps toward realization, based on components that are already available, are presented, and the capabilities of the current system are demonstrated.

© 1989 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Herzog, G. et al. (1989). Incremental Natural Language Description of Dynamic Imagery. In: Brauer, W., Freksa, C. (eds) Wissensbasierte Systeme. Informatik-Fachberichte, vol 227. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-75182-0_15

  • DOI: https://doi.org/10.1007/978-3-642-75182-0_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-51838-9

  • Online ISBN: 978-3-642-75182-0

  • eBook Packages: Springer Book Archive
