Recognition of Human Actions from RGB-D Videos Using a Reject Option

  • Vincenzo Carletti
  • Pasquale Foggia
  • Gennaro Percannella
  • Alessia Saggese
  • Mario Vento
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8158)


In this paper we propose a method for recognizing human actions from depth images acquired with a Kinect sensor. Each depth image is represented by combining three sets of well-known features, based respectively on Hu moments, depth variations, and the \(\mathfrak{R}\) transform, an enhanced version of the Radon transform. A Gaussian mixture model (GMM) classifier is adopted, and a reject option is then introduced to improve the overall reliability of the system. The proposed approach has been tested on two datasets, Mivia and MHAD, showing very promising results.
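To make the last stage of the pipeline concrete, the following is a minimal sketch of a GMM classifier with a reject option. It is not the authors' implementation: the abstract does not state the reject criterion, so the rule used here (reject when the largest class posterior falls below a threshold), the equal class priors, the diagonal covariances, the function names, and the toy model parameters are all assumptions made for illustration.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log-density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_gmm(x, components):
    """Log-likelihood of x under a mixture of (weight, mean, var) tuples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in components]
    top = max(terms)
    # Log-sum-exp keeps the sum numerically stable.
    return top + math.log(sum(math.exp(t - top) for t in terms))


def classify_with_reject(x, class_models, threshold=0.9):
    """Return (label, posterior); label is None when the sample is rejected.

    Assumption: equal class priors, and a hypothetical reject rule that
    discards the decision when the best posterior is below `threshold`.
    """
    log_liks = {label: log_gmm(x, comps) for label, comps in class_models.items()}
    top = max(log_liks.values())
    unnorm = {label: math.exp(ll - top) for label, ll in log_liks.items()}
    total = sum(unnorm.values())
    best = max(unnorm, key=unnorm.get)
    posterior = unnorm[best] / total
    return (best if posterior >= threshold else None), posterior


# Toy 2-D models with made-up parameters (not taken from the paper).
MODELS = {
    "wave": [(0.5, (0.0, 0.0), (1.0, 1.0)), (0.5, (1.0, 1.0), (1.0, 1.0))],
    "jump": [(1.0, (5.0, 5.0), (1.0, 1.0))],
}

if __name__ == "__main__":
    print(classify_with_reject((0.2, 0.1), MODELS))  # close to the "wave" model
    print(classify_with_reject((3.0, 3.0), MODELS))  # between the two models
```

In this sketch a feature vector that lies close to one class model is accepted, while one that falls between two models gets a low best posterior and is rejected. Raising `threshold` trades a higher reject rate for fewer misclassifications among the accepted samples.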


Gaussian Mixture Model, Depth Image, Kinect Sensor, Motion History Image, Gaussian Mixture Model Classifier
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Vincenzo Carletti (1)
  • Pasquale Foggia (1)
  • Gennaro Percannella (1)
  • Alessia Saggese (1)
  • Mario Vento (1)
  1. Dept. of Information Eng., Electrical Eng. and Applied Mathematics (DIEM), University of Salerno, Italy
