Expressive Speech Recognition and Synthesis as Enabling Technologies for Affective Robot-Child Communication

Yilmazyildiz, Selma; Mattheyses, Wesley; Patsis, Yorgos; Verhelst, Werner

doi:10.1007/11922162_1

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 4261)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

This paper presents our recent and ongoing work on expressive speech synthesis and recognition as enabling technologies for affective robot-child interaction. We show that current expression recognition systems could be used to discriminate between several archetypical emotions, but also that the old adage "there's no data like more data" is more valid than ever in this field. We developed a new speech synthesizer capable of high-quality concatenative synthesis. The robot will use this system to synthesize expressive nonsense speech by means of prosody transplantation and a recorded database of expressive speech examples. With these enabling components in place, we are preparing experiments toward what we hope will be effective child-machine communication of affect and emotion.

Author information

Authors and Affiliations

dept. ETRO-DSSP, Vrije Universiteit Brussel, Pleinlaan 2, B-1050, Brussels, Belgium
Selma Yilmazyildiz, Wesley Mattheyses, Yorgos Patsis & Werner Verhelst

Editor information

Editors and Affiliations

College of Computer Science, Zhejiang University, China
Yueting Zhuang
Department of Computer Science and Technology, Tsinghua University, P.R. China
Shi-Qiang Yang
Microsoft Corporation, Microsoft China R&D Group, 49 Zhichun Road, 100080, Beijing, China
Yong Rui
College of Computer Science and Technology, Zhejiang University, 310027, Hangzhou, Zhejiang Province, China
Qinming He

About this paper

Cite this paper

Yilmazyildiz, S., Mattheyses, W., Patsis, Y., Verhelst, W. (2006). Expressive Speech Recognition and Synthesis as Enabling Technologies for Affective Robot-Child Communication. In: Zhuang, Y., Yang, SQ., Rui, Y., He, Q. (eds) Advances in Multimedia Information Processing - PCM 2006. PCM 2006. Lecture Notes in Computer Science, vol 4261. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11922162_1

DOI: https://doi.org/10.1007/11922162_1
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-48766-1
Online ISBN: 978-3-540-48769-2
eBook Packages: Computer Science, Computer Science (R0)
