Expression Recognition in Videos Using a Weighted Component-Based Feature Descriptor

  • Xiaohua Huang
  • Guoying Zhao
  • Matti Pietikäinen
  • Wenming Zheng
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6688)


In this paper, we propose a weighted component-based feature descriptor for expression recognition in video sequences. First, we extract texture features and structural shape features from three facial regions of each face image: the mouth, cheeks, and eyes. We then combine the extracted feature sets using a confidence-level fusion strategy. Since different facial components contribute differently to expression recognition, we propose a method that automatically learns a weight for each component via multiple kernel learning. Experimental results on the Extended Cohn-Kanade database show that our approach, combining a component-based spatiotemporal feature descriptor with the weight-learning strategy, achieves better recognition performance than state-of-the-art methods.
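The core idea of the weighting step can be sketched as combining one base kernel per facial component into a single weighted kernel, which is what multiple kernel learning (MKL) optimizes. The sketch below is illustrative only: the feature arrays, the histogram-intersection kernel choice, and the fixed weights are assumptions standing in for the paper's LBP-TOP/EdgeMap features and the weights that MKL would actually learn.

```python
import numpy as np

# Hypothetical per-component feature histograms (eyes, cheeks, mouth); in the
# paper these would be LBP-TOP texture and EdgeMap shape descriptors per region.
rng = np.random.default_rng(0)
n_samples = 6
components = {
    "eyes": rng.random((n_samples, 8)),
    "cheeks": rng.random((n_samples, 8)),
    "mouth": rng.random((n_samples, 8)),
}

def histogram_intersection_kernel(X, Y):
    """Histogram intersection kernel, a common choice for LBP-style histograms."""
    return np.array([[np.minimum(x, y).sum() for y in Y] for x in X])

# One base kernel matrix per facial component.
kernels = {name: histogram_intersection_kernel(F, F)
           for name, F in components.items()}

# MKL learns a convex combination K = sum_m beta_m * K_m. Here we only show
# the combination step with illustrative weights (mouth weighted highest, as
# one might expect for expressions); real weights come from the MKL solver.
beta = {"eyes": 0.3, "cheeks": 0.1, "mouth": 0.6}
K = sum(beta[name] * kernels[name] for name in kernels)

print(K.shape)  # a 6x6 Gram matrix, usable as a precomputed kernel in an SVM
```

The combined matrix `K` can then be fed to any kernel classifier that accepts precomputed kernels (e.g. LIBSVM with the precomputed-kernel option), which is how MKL-weighted component kernels are typically used at classification time.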


Spatiotemporal features · LBP-TOP · EdgeMap · Information fusion · Multiple kernel learning · Facial expression recognition



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Xiaohua Huang (1, 2)
  • Guoying Zhao (1)
  • Matti Pietikäinen (1)
  • Wenming Zheng (2)
  1. Machine Vision Group, Department of Electrical and Information Engineering, University of Oulu, Finland
  2. Research Center for Learning Science, Southeast University, China