Multimedia Tools and Applications

Volume 78, Issue 22, pp 31319–31345

A weighting scheme for mining key skeletal joints for human action recognition

  • Elham Shabaninia (Email author)
  • Ahmad Reza Naghsh-Nilchi
  • Shohreh Kasaei


A novel class-dependent joint weighting method is proposed to mine the key skeletal joints for human action recognition. Existing methods, whether based on deep learning or on hand-crafted features, may not adequately capture which joints matter most for recognizing each action. In the proposed method, for each class of human actions, each joint is weighted according to its temporal variations and its inherent capacity for extension or flexion. These weights can serve as prior knowledge in skeletal joint-based methods. A novel human action recognition algorithm is also proposed that uses these weights in two ways. First, for each frame of a skeletal sequence, the histogram of 3D joints is weighted according to the contribution of each joint to the corresponding class of human action. Second, a weighted motion energy function is defined to dynamically divide the temporal pyramid of actions. Experimental results on three benchmark datasets show the efficiency of the proposed weighting method, especially when occlusion occurs.
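The abstract gives no formulas, so the following is only a rough sketch of the two ideas it names, under stated assumptions, and not the authors' formulation. It assumes that a joint's "temporal variation" can be approximated by the distance it travels, that the extension/flexion ability can be stood in for by an optional per-joint multiplier (the hypothetical `mobility` argument), and that the temporal pyramid is divided where cumulative weighted motion energy crosses equal fractions of the total. All function names are illustrative.

```python
import math
from typing import List, Optional, Sequence

Joint = Sequence[float]   # (x, y, z)
Frame = Sequence[Joint]   # one entry per joint
Clip = Sequence[Frame]    # one entry per frame


def joint_path_lengths(clip: Clip) -> List[float]:
    """Total distance travelled by each joint over one clip."""
    lengths = [0.0] * len(clip[0])
    for prev, cur in zip(clip, clip[1:]):
        for j in range(len(lengths)):
            lengths[j] += math.dist(prev[j], cur[j])
    return lengths


def class_joint_weights(clips: Sequence[Clip],
                        mobility: Optional[Sequence[float]] = None) -> List[float]:
    """Per-class joint weights, normalised to sum to 1.

    `mobility` is a hypothetical per-joint prior standing in for the
    extension/flexion term mentioned in the abstract.
    """
    n_joints = len(clips[0][0])
    totals = [0.0] * n_joints
    for clip in clips:
        for j, length in enumerate(joint_path_lengths(clip)):
            totals[j] += length
    if mobility is not None:
        totals = [t * m for t, m in zip(totals, mobility)]
    s = sum(totals)
    if s == 0:
        return [1.0 / n_joints] * n_joints  # no motion: fall back to uniform
    return [t / s for t in totals]


def weighted_motion_energy(clip: Clip, weights: Sequence[float]) -> List[float]:
    """Energy per frame transition: weighted sum of squared joint displacements."""
    return [sum(w * math.dist(p, c) ** 2 for w, p, c in zip(weights, prev, cur))
            for prev, cur in zip(clip, clip[1:])]


def split_by_energy(energy: Sequence[float], n_segments: int) -> List[int]:
    """Frame indices where cumulative energy first reaches k/n of the total."""
    total = sum(energy)
    if total == 0:
        # Assumption: with no motion, fall back to evenly spaced cuts.
        return [round(len(energy) * k / n_segments) for k in range(1, n_segments)]
    cuts, running, k = [], 0.0, 1
    for t, e in enumerate(energy):
        running += e
        while k < n_segments and running >= total * k / n_segments:
            cuts.append(t + 1)
            k += 1
    return cuts


# Toy clip: joint 0 is static, joint 1 only moves in the last two transitions.
clip = [
    [(0, 0, 0), (0, 0, 0)],
    [(0, 0, 0), (0, 0, 0)],
    [(0, 0, 0), (0, 0, 0)],
    [(0, 0, 0), (1, 0, 0)],
    [(0, 0, 0), (2, 0, 0)],
]
weights = class_joint_weights([clip])
energy = weighted_motion_energy(clip, weights)
cuts = split_by_energy(energy, n_segments=2)
```

In this toy case the static joint receives zero weight and the single cut lands at frame 3 rather than at the temporal midpoint, which illustrates how an energy-driven split can follow where the motion actually happens.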


Class-dependent joint weight, Human activity recognition, Kinect, Hierarchical extended histogram (HEH)




Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  • Elham Shabaninia (1), Email author
  • Ahmad Reza Naghsh-Nilchi (1)
  • Shohreh Kasaei (2)
  1. Department of Artificial Intelligence, Faculty of Computer Engineering, University of Isfahan, Isfahan, Iran
  2. Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
