Spoken Language Dialogue Models



Spoken language interactive systems range from speech-enabled command interfaces to dialogue systems that conduct spoken conversations with the user. In the first case, spoken language serves as an alternative input and output modality: commands that the user could type or select from a menu may also be uttered, and system responses can be given as spoken utterances instead of written text or graphics on the screen, so the whole interaction can be conducted in speech. Spoken dialogue systems, by contrast, are built on models of spoken conversation between participants and thus allow more flexible interaction. Although the interactions are limited in their topics, turn-taking principles and conversational strategies, the systems aim at natural human–computer interaction in which the user can interact with the system intuitively. Moreover, by combining insights into the processes that underlie typical human interactions, spoken dialogue modelling also seeks to advance our knowledge and understanding of the principles that govern communicative situations in general.


Keywords: Dialogue System, Conversation Analysis, Dialogue Modelling, Dialogue Management, Dialogue Research



Copyright information

© Springer Science+Business Media, LLC 2010

Authors and Affiliations

  1. University of Helsinki, Helsinki, Finland
  2. University of Tartu, Tartu, Estonia
