Disambiguation of Utterances by Visual Context Information

Kronenberg, Susanne; Wachsmuth, Sven; Kummert, Franz; Sagerer, Gerhard

doi:10.1007/978-3-642-60243-6_39

Part of the book series: Informatik aktuell (INFORMAT)

Abstract

Extrapositions are discontinuous constructions of spontaneous speech in which constituents are extraposed into the Nachfeld. Parallelism constraints between the source sentence and the extraposed constituents are claimed to govern all possible interpretations induced by an extraposed constituent. The aim of this article is to integrate visual context information into the parsing model in order to avoid overgeneralizations that might arise from contradictory information given by the source sentence and the extraposed constituent. Because the system has to deal with erroneous, vague, and incomplete data from vision and speech recognition, uncertainty must be handled carefully. Bayesian networks provide an adequate framework for modeling these uncertainties. The conditional probabilities of the network are estimated from psycholinguistic experiments.
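The role that a Bayesian network can play in this kind of disambiguation can be illustrated with a minimal sketch. It is not the authors' network: the two hypotheses, the conditional-independence assumption between the speech and vision evidence, and every probability value below are hypothetical and chosen only for illustration (in the paper, the conditional probabilities are estimated from psycholinguistic experiments).

```python
def posterior(prior, likelihoods):
    """Posterior over hypotheses given several evidence sources.

    prior: dict mapping hypothesis -> P(h)
    likelihoods: list of dicts, each mapping hypothesis -> P(evidence_i | h)

    Assumes (for this sketch only) that the evidence sources are
    conditionally independent given the hypothesis.
    """
    unnormalized = {}
    for hypothesis, p in prior.items():
        for likelihood in likelihoods:
            p *= likelihood[hypothesis]
        unnormalized[hypothesis] = p

    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence is impossible under every hypothesis")
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical example: two readings of an extraposed constituent.
# All numbers are invented, not taken from the paper.
prior = {"refers_to_object_A": 0.5, "refers_to_object_B": 0.5}

# P(recognized words | reading): speech alone slightly favors A.
speech_likelihood = {"refers_to_object_A": 0.6, "refers_to_object_B": 0.4}

# P(observed scene | reading): the visual context strongly favors B.
vision_likelihood = {"refers_to_object_A": 0.2, "refers_to_object_B": 0.9}

result = posterior(prior, [speech_likelihood, vision_likelihood])
for reading, probability in result.items():
    print(f"{reading}: {probability:.2f}")
```

In this toy setting the visual evidence overrides the weak preference from speech, which is the kind of effect the abstract describes: contradictory or uncertain information from one modality is weighed against the other rather than accepted outright.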

This work has been supported by the German Research Foundation (DFG) within SFB 360.

Author information

Authors and Affiliations

Technische Fakultät, AG Angewandte Informatik, Universität Bielefeld, Postfach 100131, 33501, Bielefeld, Germany
Susanne Kronenberg, Sven Wachsmuth, Franz Kummert & Gerhard Sagerer

Editor information

Editors and Affiliations

Institut für Photogrammetrie, Universität Bonn, Nußallee 15, D-53115, Bonn, Germany
Wolfgang Förstner, Annett Faber & Petko Faber
Institut für Informatik III, Universität Bonn, Römerstrasse 164, D-53117, Bonn, Germany
Joachim M. Buhmann

About this paper

Cite this paper

Kronenberg, S., Wachsmuth, S., Kummert, F., Sagerer, G. (1999). Disambiguation of Utterances by Visual Context Information. In: Förstner, W., Buhmann, J.M., Faber, A., Faber, P. (eds) Mustererkennung 1999. Informatik aktuell. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-60243-6_39

DOI: https://doi.org/10.1007/978-3-642-60243-6_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66381-2
Online ISBN: 978-3-642-60243-6
eBook Packages: Springer Book Archive