Abstract
We propose a new approach to evaluating spoken dialog systems. The method is novel in combining domain-specific knowledge with deterministic measurement of dialog system performance on a set of individual tasks within the domain. The proposed methodology thus attempts to answer questions such as: “How well is my dialog system performing on a specific domain?”, “How much has my dialog system improved since the previous version?”, and “How much better or worse is my dialog system than other dialog systems operating in that domain?”
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kleindienst, J., Cuřín, J., Labský, M. (2009). ADiEU: Toward Domain-Based Evaluation of Spoken Dialog Systems. In: Jacko, J.A. (eds) Human-Computer Interaction. New Trends. HCI 2009. Lecture Notes in Computer Science, vol 5610. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-02574-7_32
DOI: https://doi.org/10.1007/978-3-642-02574-7_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-02573-0
Online ISBN: 978-3-642-02574-7
eBook Packages: Computer Science, Computer Science (R0)