Abstract
In this article we propose using speech synthesis in the therapy of auditory verbal hallucinations, sometimes called “voices”. During a therapeutic session, a patient converses with an avatar controlled by a therapist. The avatar, based on the XFace model and commercial text-to-speech systems, uses a high-quality synthetic voice synchronized with lip movements. We demonstrate a proof of concept and report the results of preliminary experiments with six patients. The initial results are highly encouraging: all the patients claimed that the therapy helped them, and they also rated highly the quality of the avatar’s speech and its synchronization with the animations.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Sorokosz, K., Stefaniak, I., Janicki, A. (2017). Synthetic Speech in Therapy of Auditory Hallucinations. In: Ekštein, K., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2017. Lecture Notes in Computer Science, vol 10415. Springer, Cham. https://doi.org/10.1007/978-3-319-64206-2_10
DOI: https://doi.org/10.1007/978-3-319-64206-2_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64205-5
Online ISBN: 978-3-319-64206-2
eBook Packages: Computer Science (R0)