The Composite Sensing of Affect

  • Gordon McIntyre
  • Roland Göcke
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4868)


This paper describes some of the issues faced by typical emotion recognition systems and the need to deal with emotions in a natural setting. Studies tend to ignore the dynamic, versatile and personalised nature of affective expression and the influence that social setting, context and culture have on its display rules. Affective cues can be present in multiple modalities and can manifest in differing temporal order, so fusing the feature sets is challenging. We present a composite approach to affective sensing. The term "composite" reflects the blending of information from multiple modalities with the available semantic evidence to enhance the emotion recognition process.


Keywords: Facial Expression; Emotion Recognition; Facial Expression Recognition; Visual Speech; Emotional Speech
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Gordon McIntyre (1)
  • Roland Göcke (1, 2)

  1. Research School of Information Sciences and Engineering, Australian National University, Canberra, Australia
  2. Seeing Machines, Canberra, Australia
