Disambiguation of Utterances by Visual Context Information

  • Conference paper
Mustererkennung 1999

Part of the book series: Informatik aktuell ((INFORMAT))

Abstract

Extrapositions are discontinuous constructions of spontaneous speech where constituents are extraposed into the Nachfeld. Parallelism constraints between the source sentence and the extraposed constituents are claimed to govern all possible interpretations induced by an extraposed constituent. The aim of this article is to integrate visual context information into the parsing model to avoid possible overgeneralizations which might arise from contradictory information given by the source sentence and the extraposed constituent. Therefore, a careful approach to uncertainty is needed because the system is concerned with erroneous, vague, and incomplete data from vision and speech recognition. Bayesian Networks provide an adequate environment to model these uncertainties. The conditional probabilities of the Network are estimated from psycholinguistic experiments.

This work has been supported by the German Research Foundation (DFG) within SFB 360.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kronenberg, S., Wachsmuth, S., Kummert, F., Sagerer, G. (1999). Disambiguation of Utterances by Visual Context Information. In: Förstner, W., Buhmann, J.M., Faber, A., Faber, P. (eds) Mustererkennung 1999. Informatik aktuell. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-60243-6_39

  • DOI: https://doi.org/10.1007/978-3-642-60243-6_39

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-66381-2

  • Online ISBN: 978-3-642-60243-6

  • eBook Packages: Springer Book Archive
