Abstract
Speech synthesis is not necessarily synonymous with text-to-speech. This paper describes a prototype talking machine that produces synthesised speech from a combination of speaker, language, speaking-style, and content information, using icon-based input. The paper addresses the problems of specifying the text-content and output realisation of a conversational utterance from a combination of conceptual icons, in conjunction with language and speaker information. It concludes that in order to specify the speech content (i.e., both text details and speaking-style) adequately, selection options for speaker-commitment and speaker-listener relations will be required. The paper closes with a description of a constraint-based method for selection of affect-marked speech samples for concatenative speech synthesis.
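The abstract only names, and does not detail, the constraint-based selection method; as a loose illustration of the general idea of filtering affect-marked samples for concatenative synthesis, the sketch below uses hypothetical names and fields (`Sample`, `join_cost`, the constraint keys) that are not from the paper:

```python
from dataclasses import dataclass

@dataclass
class Sample:
    """One recorded speech unit in a hypothetical corpus."""
    text: str
    speaker: str
    language: str
    affect: str        # e.g. "happy", "neutral"
    join_cost: float   # illustrative concatenation cost (lower is better)

def select_samples(corpus, constraints):
    """Return samples satisfying every hard constraint, cheapest join first."""
    matches = [s for s in corpus
               if all(getattr(s, key) == value
                      for key, value in constraints.items())]
    return sorted(matches, key=lambda s: s.join_cost)

corpus = [
    Sample("hello", "A", "en", "happy",   0.20),
    Sample("hello", "A", "en", "neutral", 0.10),
    Sample("hello", "B", "en", "happy",   0.05),
]

# Constrain by speaker and affect; candidates are ranked by join cost.
best = select_samples(corpus, {"speaker": "A", "affect": "happy"})
```

In a real unit-selection system the constraints would also cover speaking style and speaker-listener relation, and the cost would combine target and join components rather than a single number; this sketch shows only the filter-then-rank structure.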
© 2004 Springer-Verlag Berlin Heidelberg
Cite this paper
Campbell, N. (2004). Specifying Affect and Emotion for Expressive Speech Synthesis. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2004. Lecture Notes in Computer Science, vol 2945. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24630-5_47
Print ISBN: 978-3-540-21006-1
Online ISBN: 978-3-540-24630-5