Extracting Emotions and Communication Styles from Prosody

  • Licia Sbattella
  • Luca Colombo (email author)
  • Carlo Rinaldi
  • Roberto Tedesco
  • Matteo Matteucci
  • Alessandro Trivilini
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8908)


According to many psychological and social studies, vocal messages contain two distinct channels: an explicit, linguistic channel and an implicit, paralinguistic channel. The latter carries information about the emotional state of the speaker, providing clues about the implicit meaning of the message. Such information can improve applications that require human-machine interaction (for example, Automatic Speech Recognition systems or Conversational Agents) and can support the analysis of human-human interaction (for example, in clinical or forensic applications). PrEmA, the tool we present in this work, recognizes and classifies both the emotions and the communication style of the speaker, relying on prosodic features. Recognition of communication styles is, to our knowledge, new, and could be used to infer useful clues about the state of the interaction. PrEmA uses two LDA-based classifiers, which rely on two sets of prosodic features. In experiments with Italian speakers, we obtained \(Ac=71\,\%\) for emotions and \(Ac=86\,\%\) for communication styles.


Natural language processing, Communication style recognition, Emotion recognition



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Licia Sbattella (1)
  • Luca Colombo (1, email author)
  • Carlo Rinaldi (1)
  • Roberto Tedesco (1)
  • Matteo Matteucci (1)
  • Alessandro Trivilini (1)
  1. Dip. di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Milano, Italy
