Automatic Human Action Recognition in Videos by Graph Embedding

  • Ehsan Zare Borzeshi
  • Richard Xu
  • Massimo Piccardi
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6979)


The problem of human action recognition has received increasing attention in recent years because of its importance in many applications. Yet the main limitation of current approaches is that they do not capture well the spatial relationships within the subject performing the action. This paper presents an initial study that uses graphs to represent the actor’s shape and graph embedding to convert each graph into a suitable feature vector. In this way, we can benefit from the wide range of statistical classifiers while retaining the strong representational power of graphs. The paper shows that, although the proposed method does not yet achieve accuracy comparable to that of the best existing approaches, the embedded graphs are capable of describing the deformable human shape and its evolution over time. This supports the rationale of the approach and its potential for future performance improvements.
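The idea of turning a graph into a feature vector can be illustrated with a minimal sketch. The abstract does not state which embedding is used, so the code below assumes one common form, a dissimilarity embedding in which a graph is described by its distances to a few prototype graphs. The `Graph` type, `toy_distance`, `embed` and the example graphs are all hypothetical names introduced here, and `toy_distance` is a deliberately crude stand-in for a graph edit distance, not the distance used in the paper.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Graph:
    """Tiny undirected graph: integer node ids and unordered edges."""
    nodes: FrozenSet[int]
    edges: FrozenSet[Tuple[int, ...]]


def make_graph(nodes: Iterable[int], edges: Iterable[Tuple[int, int]]) -> Graph:
    # Sort each edge's endpoints so (1, 2) and (2, 1) compare equal.
    return Graph(frozenset(nodes), frozenset(tuple(sorted(e)) for e in edges))


def toy_distance(a: Graph, b: Graph) -> float:
    """Assumed stand-in dissimilarity (NOT a true graph edit distance).

    Counts node and edge insertions/deletions under the simplifying
    assumption that node ids are already aligned between the two graphs.
    """
    return float(len(a.nodes ^ b.nodes) + len(a.edges ^ b.edges))


def embed(g: Graph, prototypes: List[Graph]) -> List[float]:
    """Map a graph to a vector: one distance per prototype graph."""
    return [toy_distance(g, p) for p in prototypes]


# Two hypothetical prototype graphs: a 3-node chain and a triangle.
prototypes = [
    make_graph([0, 1, 2], [(0, 1), (1, 2)]),
    make_graph([0, 1, 2], [(0, 1), (1, 2), (0, 2)]),
]

# A hypothetical graph extracted from one video frame: a 4-node chain.
frame_graph = make_graph([0, 1, 2, 3], [(0, 1), (1, 2), (2, 3)])

vector = embed(frame_graph, prototypes)
print(vector)  # [2.0, 3.0]
```

The resulting fixed-length vector can be passed to any statistical classifier, which is the benefit the abstract describes. In practice the distance function and the choice of prototypes would need far more care than this sketch suggests.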


Keywords: Graph edit distance; Graph embedding; Object classification



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Ehsan Zare Borzeshi (1)
  • Richard Xu (1)
  • Massimo Piccardi (1)
  1. School of Computing and Communications, Faculty of Engineering and IT, University of Technology, Sydney (UTS), Sydney, Australia
