Audio-Visual Emotion Analysis Using Semi-Supervised Temporal Clustering with Constraint Propagation

  • Conference paper
  • Conference series: Image Analysis and Recognition (ICIAR 2014)
  • Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8815)

Abstract

In this paper, we investigate applying semi-supervised clustering to audio-visual emotion analysis, a complex problem that is traditionally solved using supervised methods. We propose an extension to the semi-supervised aligned cluster analysis algorithm (SSACA), a temporal clustering algorithm that incorporates pairwise constraints in the form of must-link and cannot-link constraints. We incorporate an exhaustive constraint propagation mechanism to further improve the clustering process. To validate the proposed method, we apply it to emotion analysis on a multimodal naturalistic emotion database. Results show substantial improvements over the original aligned cluster analysis algorithm (ACA) and over our previously proposed semi-supervised approach.
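The abstract only names the building blocks, so a small sketch may help make the constraint-propagation idea concrete. The snippet below is a minimal, illustrative Python sketch of how sparse must-link/cannot-link pairs can be spread over an affinity graph before clustering, written in the style of the closed-form exhaustive constraint propagation of Lu and Ip (ECCV 2010). The function name propagate_constraints, the Gaussian affinity, and the parameter alpha are illustrative assumptions, not the paper's actual SSACA formulation.

    import numpy as np

    def propagate_constraints(W, must_links, cannot_links, alpha=0.5):
        """Spread sparse pairwise constraints over an affinity graph.

        W            : (n, n) symmetric affinity matrix between temporal segments.
        must_links   : list of (i, j) index pairs expected to share a cluster.
        cannot_links : list of (i, j) index pairs expected to be separated.
        alpha        : propagation trade-off in (0, 1); illustrative default.

        Returns an (n, n) matrix of propagated scores: positive entries act as
        soft must-links, negative entries as soft cannot-links.
        """
        n = W.shape[0]

        # Initial constraint matrix: +1 for must-link, -1 for cannot-link, 0 elsewhere.
        Z = np.zeros((n, n))
        for i, j in must_links:
            Z[i, j] = Z[j, i] = 1.0
        for i, j in cannot_links:
            Z[i, j] = Z[j, i] = -1.0

        # Symmetrically normalized affinity, as in label-propagation methods.
        d = W.sum(axis=1)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
        W_bar = D_inv_sqrt @ W @ D_inv_sqrt

        # Closed-form propagation along rows and columns, in the style of
        # exhaustive constraint propagation:
        #   F = (1 - alpha)^2 (I - alpha * W_bar)^(-1) Z (I - alpha * W_bar)^(-1)
        P = np.linalg.inv(np.eye(n) - alpha * W_bar)
        return (1.0 - alpha) ** 2 * (P @ Z @ P)

    # Toy usage: six random "segments", one must-link and one cannot-link pair.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 4))
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    W = np.exp(-dists ** 2)          # Gaussian affinity (illustrative choice)
    np.fill_diagonal(W, 0.0)
    F = propagate_constraints(W, must_links=[(0, 1)], cannot_links=[(0, 5)])
    print(np.round(F, 3))

In a pipeline like the one described in the abstract, such a propagated constraint matrix would typically be used to adjust the segment-level affinities before the temporal clustering step; the exact way SSACA incorporates it is described in the paper itself.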

Author information

Correspondence to Rodrigo Araujo.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Araujo, R., Kamel, M.S. (2014). Audio-Visual Emotion Analysis Using Semi-Supervised Temporal Clustering with Constraint Propagation. In: Campilho, A., Kamel, M. (eds.) Image Analysis and Recognition. ICIAR 2014. Lecture Notes in Computer Science, vol. 8815. Springer, Cham. https://doi.org/10.1007/978-3-319-11755-3_1

  • DOI: https://doi.org/10.1007/978-3-319-11755-3_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11754-6

  • Online ISBN: 978-3-319-11755-3

  • eBook Packages: Computer Science, Computer Science (R0)
