Abstract

Conversational systems play an important role in scenarios without a keyboard, e.g., when talking to a robot. Communication in human-robot interaction (HRI) ultimately involves a combination of verbal and non-verbal inputs and outputs. HRI systems must process verbal and non-verbal observations and execute verbal and non-verbal actions in parallel in order to interpret and produce synchronized behaviours. Developing such systems involves integrating potentially many components and ensuring complex interaction and synchronization among them. Most work on spoken dialogue system development uses pipeline architectures. Some exceptions are [1, 17], which execute system components in parallel (weakly-coupled or tightly-coupled architectures). The latter are more promising for building adaptive systems, which is one of the goals of contemporary research.
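To make the contrast with a pipeline concrete, the sketch below shows one way a parallel, event-based design can be organized. It is a minimal illustration only, not the authors' implementation: the EventBus and run_component helpers, the handlers, and topic names such as "speech.recognized" are invented for exposition. Components run in their own threads and communicate solely through published events, so verbal and non-verbal observations are processed concurrently rather than in a fixed sequence.

```python
import queue
import threading
import time
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Event:
    topic: str      # e.g. "speech.recognized", "gesture.detected" (invented names)
    payload: object

class EventBus:
    """Minimal publish-subscribe bus: components exchange events
    instead of calling each other directly."""
    def __init__(self):
        self._subscribers = defaultdict(list)
        self._lock = threading.Lock()

    def subscribe(self, topic, q):
        with self._lock:
            self._subscribers[topic].append(q)

    def publish(self, event):
        with self._lock:
            targets = list(self._subscribers[event.topic])
        for q in targets:
            q.put(event)

def run_component(bus, name, in_topics, handler):
    """Start one component in its own thread; it consumes events on
    its input topics and may publish follow-up events."""
    q = queue.Queue()
    for topic in in_topics:
        bus.subscribe(topic, q)

    def loop():
        while True:
            handler(bus, q.get())

    threading.Thread(target=loop, name=name, daemon=True).start()

# Hypothetical handlers standing in for real ASR/vision/dialogue modules.
def dialogue_manager(bus, event):
    # Fuses verbal and non-verbal observations; here it just reacts to each.
    if event.topic == "speech.recognized":
        bus.publish(Event("tts.say", f"You said: {event.payload}"))
    elif event.topic == "gesture.detected":
        bus.publish(Event("motion.play", event.payload))

def output_stub(bus, event):
    print(f"[{event.topic}] {event.payload}")

if __name__ == "__main__":
    bus = EventBus()
    run_component(bus, "dm", ["speech.recognized", "gesture.detected"], dialogue_manager)
    run_component(bus, "out", ["tts.say", "motion.play"], output_stub)
    # Simulated perception events arriving concurrently from two channels.
    bus.publish(Event("speech.recognized", "hello robot"))
    bus.publish(Event("gesture.detected", "wave"))
    time.sleep(0.5)  # let the worker threads drain their queues
```

Because components interact only through the bus, a module (e.g., a gesture recognizer) can be added or swapped without changing the others, which is what makes loosely-coupled architectures attractive for adaptive systems.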

Parts of the research reported on in this paper were performed in the context of the EU-FP7 project ALIZ-E (ICT-248116), which develops embodied cognitive robots for believable any-depth affective interactions with young users over an extended and possibly discontinuous period [2].


References

1. Acapela website. http://www.acapela-group.com/index.html

2. ALIZ-E website. http://aliz-e.org/

3. Mary TTS website. http://mary.dfki.de/

4. OpenCCG website. http://openccg.sourceforge.net/

5. OpenCV library website. http://opencv.willowgarage.com

6. Baillie, J.: URBI: Towards a universal robotic low-level programming language. In: 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 3219–3224. IEEE (2005)

7. Baldridge, J., Kruijff, G.J.: Multi-modal combinatory categorial grammar. In: Proceedings of the 10th Annual Meeting of the European Association for Computational Linguistics (2003)

8. Beck, A., Cañamero, L., Bard, K.: Towards an affect space for robots to display emotional body language. In: Proceedings of the 19th IEEE International Symposium on Robot and Human Interactive Communication, Ro-Man 2010, pp. 464–469. IEEE (2010)

9. Beck, A., Hiolle, A., Mazel, A., Cañamero, L.: Interpretation of emotional body language displayed by robots. In: Proceedings of the 3rd International Workshop on Affective Interaction in Natural Environments, AFFINE '10, pp. 37–42. ACM, New York, NY, USA (2010). DOI 10.1145/1877826.1877837

10. Bradski, G., Davis, J.: Motion segmentation and pose recognition with motion history gradients. Machine Vision and Applications 13, 174–184 (2002)

11. Cuayáhuitl, H., Renals, S., Lemon, O., Shimodaira, H.: Evaluation of a hierarchical reinforcement learning spoken dialogue system. Computer Speech and Language 24(2), 395–429 (2010). DOI 10.1016/j.csl.2009.07.001

12. Dekens, T., Verhelst, W.: On the noise robustness of voice activity detection algorithms. In: Proc. of InterSpeech, Florence, Italy (2011)

13. Gerosa, M., Giuliani, D., Brugnara, F.: Acoustic variability and automatic recognition of children's speech. Speech Communication 49, 847–860 (2007)

14. Hawes, N., Wyatt, J.L., Sloman, A., Sridharan, M., Dearden, R., Jacobsson, H., Kruijff, G.: Architecture and representations. In: H.I. Christensen, A. Sloman, G. Kruijff, J. Wyatt (eds.) Cognitive Systems, pp. 53–95. Published online at http://www.cognitivesystems.org/cosybook/ (2009)

15. Jones, M.J., Rehg, J.M.: Statistical color models with application to skin detection. Int. J. Comput. Vision 46, 81–96 (2002)

16. Larsson, S., Traum, D.: Information state and dialogue management in the TRINDI dialogue move engine toolkit. Natural Language Engineering 5(3–4), 323–340 (2000)

17. Lemon, O., Bracy, A., Gruenstein, A., Peters, S.: The WITAS multi-modal dialogue system I. In: EUROSPEECH, pp. 1559–1562. Aalborg, Denmark (2001)

18. Nicolao, M., Cosi, P.: Comparing SPHINX vs. SONIC Italian children speech recognition systems. In: 7th Conference of the Italian Association of Speech Sciences (2011). Unpublished draft version

19. Schröder, M.: The SEMAINE API: Towards a standards-based framework for building emotion-oriented systems. Advances in Human-Computer Interaction 2010(319406) (2010). DOI 10.1155/2010/319406

20. Stiefelhagen, R., Ekenel, H., Fugen, C., Gieselmann, P., Holzapfel, H., Kraft, F., Nickel, K., Voit, M., Waibel, A.: Enabling multimodal human-robot interaction for the Karlsruhe humanoid robot, pp. 840–851 (2007)

21. Tesser, F., Zovato, E., Nicolao, M., Cosi, P.: Two vocoder techniques for neutral to emotional timbre conversion. In: Y. Sagisaka, K. Tokuda (eds.) 7th Speech Synthesis Workshop (SSW), pp. 130–135. ISCA, Kyoto, Japan (2010)

22. Zen, H., Nose, T., Yamagishi, J., Sako, S., Masuko, T., Black, A., Tokuda, K.: The HMM-based speech synthesis system (HTS) version 2.0. In: Proc. of ISCA SSW6, pp. 294–299 (2007)


Author information


Corresponding author

Correspondence to Ivana Kruijff-Korbayová.



Copyright information

© 2011 Springer Science+Business Media, LLC

About this paper

Cite this paper

Kruijff-Korbayová, I. et al. (2011). An Event-Based Conversational System for the Nao Robot. In: Delgado, R.C., Kobayashi, T. (eds) Proceedings of the Paralinguistic Information and its Integration in Spoken Dialogue Systems Workshop. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-1335-6_14


  • DOI: https://doi.org/10.1007/978-1-4614-1335-6_14


  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-1334-9

  • Online ISBN: 978-1-4614-1335-6

  • eBook Packages: Engineering, Engineering (R0)
