Synthetic Speech in Therapy of Auditory Hallucinations

Sorokosz, Kamil; Stefaniak, Izabela; Janicki, Artur

doi:10.1007/978-3-319-64206-2_10

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10415)

Included in the following conference series:

International Conference on Text, Speech, and Dialogue

Abstract

In this article we propose using speech synthesis in the therapy of auditory verbal hallucinations, which are sometimes called “voices”. During a therapeutic session a patient converses with an avatar, which is controlled by a therapist. The avatar, based on the XFace model and commercial text-to-speech systems, uses a high-quality synthetic voice synchronized with lip movements. A proof of concept is demonstrated, together with the results of preliminary experiments with six patients. The initial results are highly encouraging: all the patients claimed that the therapy helped them, and they also rated the quality of the avatar’s speech and its synchronization with the animations highly.

Author information

Authors and Affiliations

Institute of Telecommunications, Warsaw University of Technology, Nowowiejska 15/19, 00-665, Warsaw, Poland
Kamil Sorokosz & Artur Janicki
Institute of Psychiatry and Neurology, Sobieskiego 9, 02-957, Warsaw, Poland
Izabela Stefaniak

Corresponding author

Correspondence to Artur Janicki.

Editor information

Editors and Affiliations

University of West Bohemia, Pilsen, Czech Republic
Kamil Ekštein
University of West Bohemia, Pilsen, Czech Republic
Václav Matoušek

Cite this paper

Sorokosz, K., Stefaniak, I., Janicki, A. (2017). Synthetic Speech in Therapy of Auditory Hallucinations. In: Ekštein, K., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2017. Lecture Notes in Computer Science, vol 10415. Springer, Cham. https://doi.org/10.1007/978-3-319-64206-2_10

DOI: https://doi.org/10.1007/978-3-319-64206-2_10
Published: 29 July 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64205-5
Online ISBN: 978-3-319-64206-2
eBook Packages: Computer Science, Computer Science (R0)
