Automatic Motherese Detection for Face-to-Face Interaction Analysis

  • Ammar Mahdhaoui
  • Mohamed Chetouani
  • Cong Zong
  • Raquel Sofia Cassel
  • Catherine Saint-Georges
  • Marie-Christine Laznik
  • Sandra Maestro
  • Fabio Apicella
  • Filippo Muratori
  • David Cohen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5398)

Abstract

This paper deals with emotional speech detection in home movies. In this study, we focus on infant-directed speech, also called “motherese”, which is characterized by higher pitch, slower tempo, and exaggerated intonation. We assess the robustness of approaches to the automatic discrimination between infant-directed speech and normal directed speech. Specifically, we estimate the generalization capability of two feature extraction schemes, one based on supra-segmental information and one on segmental information. In addition, two machine learning approaches are considered: k-nearest neighbors (k-NN) and Gaussian mixture models (GMM). Evaluations are carried out on real-life databases: home movies covering an infant's first year.
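As a rough illustration of the two classifiers mentioned above, the sketch below trains one Gaussian mixture model per speech class and a k-NN classifier on per-utterance feature vectors, then labels a new utterance by maximum GMM log-likelihood or by nearest-neighbor vote. The feature vectors are synthetic placeholders (the paper's actual supra-segmental and segmental features are not reproduced here), and scikit-learn is used for convenience; this is a sketch under those assumptions, not the authors' implementation.

    import numpy as np
    from sklearn.mixture import GaussianMixture
    from sklearn.neighbors import KNeighborsClassifier

    # Synthetic stand-ins for per-utterance feature vectors (e.g. pitch/energy
    # statistics); the paper's real features are extracted from the speech signal.
    rng = np.random.default_rng(0)
    X_motherese = rng.normal(loc=1.0, scale=0.5, size=(200, 6))
    X_other = rng.normal(loc=0.0, scale=0.5, size=(200, 6))
    X_train = np.vstack([X_motherese, X_other])
    y_train = np.array([1] * 200 + [0] * 200)  # 1 = motherese, 0 = other speech

    # GMM approach: fit one mixture per class, decide by higher log-likelihood.
    gmm_mot = GaussianMixture(n_components=4, covariance_type="diag",
                              random_state=0).fit(X_motherese)
    gmm_oth = GaussianMixture(n_components=4, covariance_type="diag",
                              random_state=0).fit(X_other)

    def gmm_decide(x):
        """Return 1 (motherese) if the motherese GMM scores the utterance higher."""
        x = x.reshape(1, -1)
        return int(gmm_mot.score_samples(x)[0] > gmm_oth.score_samples(x)[0])

    # k-NN approach: majority vote among the k closest training utterances.
    knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)

    x_new = rng.normal(loc=0.9, scale=0.5, size=6)
    print("GMM decision:", gmm_decide(x_new))
    print("k-NN decision:", int(knn.predict(x_new.reshape(1, -1))[0]))

The point here is only the decision rule each classifier implements; the paper evaluates both on real home-movie recordings.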

Keywords

motherese detection; feature and classifier fusion

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Ammar Mahdhaoui (1)
  • Mohamed Chetouani (1)
  • Cong Zong (1)
  • Raquel Sofia Cassel (2, 3)
  • Catherine Saint-Georges (2, 3)
  • Marie-Christine Laznik (4)
  • Sandra Maestro (5)
  • Fabio Apicella (5)
  • Filippo Muratori (5)
  • David Cohen (2, 3)
  1. Institut des Systèmes Intelligents et de Robotique, CNRS FRE 2507, Université Pierre et Marie Curie, Paris, France
  2. Department of Child and Adolescent Psychiatry, AP-HP, Groupe Hospitalier Pitié-Salpêtrière, Université Pierre et Marie Curie, Paris, France
  3. Laboratoire Psychologie et Neurosciences Cognitives, CNRS UMR 8189, Paris, France
  4. Department of Child and Adolescent Psychiatry, Association Santé Mentale du 13ème, Paris, France
  5. Scientific Institute Stella Maris, University of Pisa, Italy