
Introducing AmuS: The Amused Speech Database

  • Kevin El Haddad
  • Ilaria Torre
  • Emer Gilmartin
  • Hüseyin Çakmak
  • Stéphane Dupont
  • Thierry Dutoit
  • Nick Campbell
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10583)

Abstract

In this paper we present AmuS, a database of approximately three hours of amused speech recorded from two male subjects and one female subject, containing data in two languages, French and English. We review previous work on smiled speech and speech-laughs. We describe an acoustic analysis of part of the database and a perception test comparing speech-laughs with smiled and neutral speech. We demonstrate the suitability of the AmuS data for amused speech synthesis by training HMM-based models of neutral and smiled speech for each voice and comparing them in an on-line CMOS test.
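As a rough illustration of the evaluation step described above, the following is a minimal sketch of how an on-line CMOS (Comparison Mean Opinion Score) test comparing the smiled and neutral HMM voices might be scored. The rating scale, the file name cmos_ratings.csv, and the column name "score" are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of scoring a CMOS test: each listener rates a stimulus pair
# (A = neutral voice, B = smiled voice) on a -3 ("A much better") to
# +3 ("B much better") scale; we report the mean score and a 95% CI.
import csv
import math
from statistics import mean, stdev

def cmos_summary(ratings):
    """Return the mean CMOS and a 95% confidence interval (normal approximation)."""
    m = mean(ratings)
    half_width = 1.96 * stdev(ratings) / math.sqrt(len(ratings))
    return m, (m - half_width, m + half_width)

if __name__ == "__main__":
    # Hypothetical results file with one integer rating per row in column "score".
    with open("cmos_ratings.csv", newline="") as f:
        scores = [int(row["score"]) for row in csv.DictReader(f)]
    m, (lo, hi) = cmos_summary(scores)
    print(f"CMOS = {m:+.2f}  (95% CI: {lo:+.2f} .. {hi:+.2f})")
    # A CMOS significantly above 0 indicates a listener preference for voice B.
```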

Keywords

Corpora and language resources · Amused speech · Laugh · Smile · Speech synthesis · Speech processing · HMM · Affective computing · Machine learning


Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Kevin El Haddad (1) (corresponding author)
  • Ilaria Torre (2)
  • Emer Gilmartin (3)
  • Hüseyin Çakmak (1)
  • Stéphane Dupont (1)
  • Thierry Dutoit (1)
  • Nick Campbell (3)

  1. University of Mons, Mons, Belgium
  2. Plymouth University, Plymouth, UK
  3. Trinity College Dublin, Dublin, Ireland
