“You’re as Sick as You Sound”: Using Computational Approaches for Modeling Speaker State to Gauge Illness and Recovery

Chapter in Advances in Speech Recognition

Abstract

Recently, researchers in computer science and engineering have begun to explore the possibility of finding speech-based correlates of various medical conditions using automatic, computational methods. If such speech and language cues can be identified and quantified automatically, this information can be used to support diagnosis and treatment of medical conditions in clinical settings and to further fundamental research in understanding cognition. This chapter reviews computational approaches that explore communicative patterns of patients who suffer from medical conditions such as depression, autism spectrum disorders, schizophrenia, and cancer. We discuss two main approaches: research that explores features extracted from the acoustic signal and research that focuses on lexical and semantic features. We also present some applied research that uses computational methods to develop assistive technologies. In the final sections we discuss open issues in, and the future of, this emerging field of research.
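To make the two approaches concrete, the sketch below (ours, not the chapter authors'; illustrative only) shows one common shape such a pipeline can take: utterance-level acoustic statistics on one side, and a crude transcript-based semantic coherence score on the other. It assumes the librosa and scikit-learn libraries; the recording name and the example sentences are hypothetical placeholders.

```python
# Illustrative sketch only (not from the chapter): the two feature families
# the chapter surveys. Assumes librosa and scikit-learn; "patient.wav" and
# the example sentences are hypothetical placeholders.
import numpy as np
import librosa
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# --- Acoustic features: summarize an utterance as pitch/energy/spectral stats.
y, sr = librosa.load("patient.wav", sr=16000)       # hypothetical recording
f0 = librosa.yin(y, fmin=75, fmax=400, sr=sr)       # F0 track (YIN estimator)
rms = librosa.feature.rms(y=y)[0]                   # frame-level energy
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)  # spectral envelope
acoustic = np.concatenate([
    [f0.mean(), f0.std(), rms.mean(), rms.std()],   # prosodic statistics
    mfcc.mean(axis=1), mfcc.std(axis=1),            # spectral statistics
])

# --- Lexical/semantic features: score transcript coherence as the mean
# cosine similarity between adjacent sentences (a crude LSA-style measure).
sentences = [
    "I have been feeling tired lately.",            # placeholder transcript
    "Sleeping is difficult most nights.",
    "My appetite has changed too.",
]
tfidf = TfidfVectorizer().fit_transform(sentences)
coherence = cosine_similarity(tfidf[:-1], tfidf[1:]).diagonal().mean()

# Either representation can then be fed to an off-the-shelf classifier
# (e.g., logistic regression) trained on labeled patient/control speech.
print(acoustic.shape, coherence)
```

Published work in this area is, of course, richer on both sides: the acoustic studies typically use Praat-derived prosodic and voice-quality measures, and the lexical/semantic studies use latent semantic analysis or validated word-category lexicons such as LIWC. The sketch is only meant to show the overall shape of the pipeline.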

Author information

Correspondence to Julia Hirschberg.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Hirschberg, J., Hjalmarsson, A., Elhadad, N. (2010). “You’re as Sick as You Sound”: Using Computational Approaches for Modeling Speaker State to Gauge Illness and Recovery. In: Neustein, A. (ed.) Advances in Speech Recognition. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-5951-5_13

  • DOI: https://doi.org/10.1007/978-1-4419-5951-5_13

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-5950-8

  • Online ISBN: 978-1-4419-5951-5

  • eBook Packages: Engineering, Engineering (R0)
