Audio-Visual Emotion Analysis Using Semi-Supervised Temporal Clustering with Constraint Propagation

  • Conference paper
  • Conference series: Image Analysis and Recognition (ICIAR 2014)
  • Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8815)

Abstract

In this paper, we investigate applying semi-supervised clustering to audio-visual emotion analysis, a complex problem that is traditionally solved using supervised methods. We propose an extension to the semi-supervised aligned cluster analysis algorithm (SSACA), a temporal clustering algorithm that incorporates pairwise constraints in the form of must-link and cannot-link constraints. We incorporate an exhaustive constraint propagation mechanism to further improve the clustering process. To validate the proposed method, we apply it to emotion analysis on a multimodal naturalistic emotion database. Results show substantial improvements over the original aligned cluster analysis algorithm (ACA) and over our previously proposed semi-supervised approach.
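The abstract only names the building blocks, so a small sketch may help make the constraint-propagation idea concrete. The snippet below is a minimal, illustrative Python sketch of how sparse must-link/cannot-link pairs can be spread over an affinity graph before clustering, written in the style of the closed-form exhaustive constraint propagation of Lu and Ip (ECCV 2010). The function name propagate_constraints, the Gaussian affinity, and the parameter alpha are illustrative assumptions, not the paper's actual SSACA formulation.

    import numpy as np

    def propagate_constraints(W, must_links, cannot_links, alpha=0.5):
        """Spread sparse pairwise constraints over an affinity graph.

        W            : (n, n) symmetric affinity matrix between temporal segments.
        must_links   : list of (i, j) index pairs expected to share a cluster.
        cannot_links : list of (i, j) index pairs expected to be separated.
        alpha        : propagation trade-off in (0, 1); illustrative default.

        Returns an (n, n) matrix of propagated scores: positive entries act as
        soft must-links, negative entries as soft cannot-links.
        """
        n = W.shape[0]

        # Initial constraint matrix: +1 for must-link, -1 for cannot-link, 0 elsewhere.
        Z = np.zeros((n, n))
        for i, j in must_links:
            Z[i, j] = Z[j, i] = 1.0
        for i, j in cannot_links:
            Z[i, j] = Z[j, i] = -1.0

        # Symmetrically normalized affinity, as in label-propagation methods.
        d = W.sum(axis=1)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
        W_bar = D_inv_sqrt @ W @ D_inv_sqrt

        # Closed-form propagation along rows and columns, in the style of
        # exhaustive constraint propagation:
        #   F = (1 - alpha)^2 (I - alpha * W_bar)^(-1) Z (I - alpha * W_bar)^(-1)
        P = np.linalg.inv(np.eye(n) - alpha * W_bar)
        return (1.0 - alpha) ** 2 * (P @ Z @ P)

    # Toy usage: six random "segments", one must-link and one cannot-link pair.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(6, 4))
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    W = np.exp(-dists ** 2)          # Gaussian affinity (illustrative choice)
    np.fill_diagonal(W, 0.0)
    F = propagate_constraints(W, must_links=[(0, 1)], cannot_links=[(0, 5)])
    print(np.round(F, 3))

In a pipeline like the one described in the abstract, such a propagated constraint matrix would typically be used to adjust the segment-level affinities before the temporal clustering step; the exact way SSACA incorporates it is described in the paper itself.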

Author information

Correspondence to Rodrigo Araujo.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Araujo, R., Kamel, M.S. (2014). Audio-Visual Emotion Analysis Using Semi-Supervised Temporal Clustering with Constraint Propagation. In: Campilho, A., Kamel, M. (eds.) Image Analysis and Recognition. ICIAR 2014. Lecture Notes in Computer Science, vol. 8815. Springer, Cham. https://doi.org/10.1007/978-3-319-11755-3_1

  • DOI: https://doi.org/10.1007/978-3-319-11755-3_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-11754-6

  • Online ISBN: 978-3-319-11755-3

  • eBook Packages: Computer Science, Computer Science (R0)
