Part of the book series: NATO ASI Series (NATO ASI F, volume 169)

Summary

This chapter describes the knowledge sources required to handle human-machine multimodal interaction efficiently: the task, user, dialogue, environment and system models. The first part of the chapter discusses the content of these models, emphasising the problems that occur when speech is combined with other modalities. The second part focuses on spoken language characteristics, describes different parsing methods (rule-based and stochastic) that use a task model, and briefly presents the integration of the rule-based method in an end-to-end information retrieval system.

Copyright information

© 1999 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Néel, F.D., Minker, W.M. (1999). Multimodal Speech Systems. In: Ponting, K. (eds) Computational Models of Speech Pattern Processing. NATO ASI Series, vol 169. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-60087-6_34

  • DOI: https://doi.org/10.1007/978-3-642-60087-6_34

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-64250-0

  • Online ISBN: 978-3-642-60087-6

  • eBook Packages: Springer Book Archive
