Multimodal Speech Systems

Néel, Françoise D.; Minker, Wolfgang M.

doi:10.1007/978-3-642-60087-6_34

Part of the book series: NATO ASI Series (NATO ASI F, volume 169)

Summary

This chapter describes the knowledge sources required to handle human-machine multimodal interaction efficiently: the task, user, dialogue, environment and system models. The first part of the chapter discusses the content of these models, emphasising the problems that arise when speech is combined with other modalities. The second part focuses on the characteristics of spoken language, describes two parsing methods (rule-based and stochastic) that rely on a task model, and briefly presents the integration of the rule-based method into an end-to-end information retrieval system.

Author information

Authors and Affiliations

LIMSI-CNRS, B.P. 133, 91403, Orsay Cedex, France
Françoise D. Néel & Wolfgang M. Minker

Editor information

Editors and Affiliations

Speech Research Unit, DERA Malvern, St. Andrew’s Road, WR14 4DT, Great Malvern, Worcs, UK
Keith Ponting

About this chapter

Cite this chapter

Néel, F.D., Minker, W.M. (1999). Multimodal Speech Systems. In: Ponting, K. (ed.) Computational Models of Speech Pattern Processing. NATO ASI Series, vol 169. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-60087-6_34

DOI: https://doi.org/10.1007/978-3-642-60087-6_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-64250-0
Online ISBN: 978-3-642-60087-6
eBook Packages: Springer Book Archive
