Unsupervised Temporal Segmentation of Talking Faces Using Visual Cues to Improve Emotion Recognition

Conference paper
Affective Computing and Intelligent Interaction (ACII 2011)

Abstract

The mouth region of the human face carries highly discriminative information about facial expressions. Facial expression analysis for inferring a user's emotional state becomes very challenging when the user talks, because many of the mouth movements made while uttering words resemble the mouth shapes that express emotions. We introduce a novel unsupervised method to temporally segment talking faces from faces displaying only emotions, and we use the resulting talking-face segments to improve emotion recognition. The proposed method represents mouth features with an integrated gradient histogram of local binary patterns and identifies temporal segments of talking faces online by estimating the uncertainty of mouth movements over time. The algorithm accurately identifies talking-face segments on a real-world database in which talking and emotional expression occur naturally, and the emotion recognition system, using the talking-face cues, shows a considerable improvement in recognition accuracy.
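To make the pipeline concrete, below is a minimal sketch in Python, not the authors' implementation: it substitutes plain LBP histograms for the paper's integrated gradient histogram of LBPs, and approximates the "uncertainty of mouth movements" with the mean chi-square distance between consecutive histograms in a sliding temporal window. The window size, threshold, and distance measure are illustrative assumptions.

```python
# Hedged sketch of the abstract's idea: describe mouth crops with local
# binary pattern (LBP) histograms, then flag "talking" frames where the
# mouth texture changes rapidly over a sliding window. Plain LBP and
# chi-square distance stand in for the paper's integrated gradient
# histogram and uncertainty estimate; all parameters are assumptions.
import numpy as np

def lbp_histogram(gray_mouth: np.ndarray, bins: int = 256) -> np.ndarray:
    """Normalized 8-neighbour LBP histogram of a grayscale mouth crop."""
    img = gray_mouth.astype(np.int32)
    center = img[1:-1, 1:-1]
    code = np.zeros_like(center)
    # Offsets of the 8 neighbours, clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:img.shape[0] - 1 + dy,
                    1 + dx:img.shape[1] - 1 + dx]
        code |= (neigh >= center).astype(np.int32) << bit
    hist, _ = np.histogram(code, bins=bins, range=(0, bins))
    return hist / max(hist.sum(), 1)

def talking_mask(mouth_frames, window: int = 15, threshold: float = 0.05):
    """Return a boolean per-frame mask: True where the mouth is 'talking'.

    Talking mouths change texture rapidly from frame to frame, while
    emotional holds change slowly, so the mean inter-frame histogram
    distance within the window serves as the uncertainty proxy here.
    """
    hists = [lbp_histogram(f) for f in mouth_frames]
    dists = [0.5 * np.sum((h1 - h2) ** 2 / (h1 + h2 + 1e-9))
             for h1, h2 in zip(hists[:-1], hists[1:])]
    dists = np.array([0.0] + dists)  # pad so dists aligns with frames
    mask = np.zeros(len(mouth_frames), dtype=bool)
    for t in range(len(mask)):
        lo = max(0, t - window // 2)
        hi = min(len(mask), t + window // 2 + 1)
        mask[t] = dists[lo:hi].mean() > threshold
    return mask
```

In use, `mouth_frames` would be grayscale mouth crops extracted per frame by any face-landmark tracker; frames flagged True can then be excluded or down-weighted by the emotion classifier, which is the role the talking-face cue plays in the paper.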

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Velusamy, S., Gopalakrishnan, V., Navathe, B., Kannan, H., Anand, B., Sharma, A. (2011). Unsupervised Temporal Segmentation of Talking Faces Using Visual Cues to Improve Emotion Recognition. In: D’Mello, S., Graesser, A., Schuller, B., Martin, JC. (eds) Affective Computing and Intelligent Interaction. ACII 2011. Lecture Notes in Computer Science, vol 6974. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24600-5_45

  • DOI: https://doi.org/10.1007/978-3-642-24600-5_45

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-24599-2

  • Online ISBN: 978-3-642-24600-5

  • eBook Packages: Computer Science; Computer Science (R0)
