Incremental Natural Language Description of Dynamic Imagery

Herzog, G.; Sung, C.-K.; André, E.; Enkelmann, W.; Nagel, H.-H.; Rist, T.; Wahlster, W.; Zimmermann, G.

doi:10.1007/978-3-642-75182-0_15

Part of the book series: Informatik-Fachberichte (INFORMATIK, volume 227)

Abstract

Although image understanding and natural language processing constitute two major areas of AI, they have mostly been studied independently of each other. Only a few attempts have addressed the integration of computer vision with the generation of natural language expressions that describe image sequences.

The aim of our joint effort to combine a vision system with a natural language access system is the automatic simultaneous description of dynamic imagery; that is, we are interested in image interpretation and language processing on an incremental basis. In this contribution we sketch an approach towards integrating the Karlsruhe vision system ACTIONS with the natural language component VITRA developed in Saarbrücken. The steps toward realization, based on available components, are outlined, and the capabilities of the current system are demonstrated.

Summary

Although image understanding and natural language processing are two of the core areas of AI, they have so far been studied almost independently of each other. Only very few approaches have addressed the integration of machine vision with the generation of natural language utterances for describing image sequences.

The goal of our collaboration in coupling an image understanding system with a natural language access system is the automatic simultaneous description of time-varying scenes, i.e., we are interested in image sequence interpretation and language processing on an incremental basis. In this contribution we describe an approach to integrating the Karlsruhe image sequence analysis system ACTIONS with the natural language component VITRA, which is being developed in Saarbrücken. The steps toward realization, based on components that are already available, are presented, and the capabilities of the currently existing system are demonstrated.

Author information

Authors and Affiliations

Fachbereich Informatik, Universität des Saarlandes, Im Stadtwald 15, D-6600, Saarbrücken 11, FR of Germany
G. Herzog & W. Wahlster
Fraunhofer-Institut für Informations- und Datenverarbeitung (IITB), Fraunhoferstr. 1, D-7500, Karlsruhe 1, FR of Germany
C.-K. Sung, W. Enkelmann, H.-H. Nagel (Fakultät für Informatik der Universität Karlsruhe (TH)) & G. Zimmermann
Deutsches Forschungszentrum für Künstliche Intelligenz GmbH, Stuhlsatzenhausweg 3, D-6600, Saarbrücken 11, FR of Germany
E. André, T. Rist & W. Wahlster
Fakultät für Informatik, Universität Karlsruhe (TH), Germany
H.-H. Nagel

Editor information

Editors and Affiliations

Institut für Informatik, Technische Universität München, Postfach 202420, D-8000, München 2, Germany
W. Brauer & C. Freksa

Cite this paper

Herzog, G. et al. (1989). Incremental Natural Language Description of Dynamic Imagery. In: Brauer, W., Freksa, C. (eds) Wissensbasierte Systeme. Informatik-Fachberichte, vol 227. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-75182-0_15

DOI: https://doi.org/10.1007/978-3-642-75182-0_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-51838-9
Online ISBN: 978-3-642-75182-0
eBook Packages: Springer Book Archive
