Abstract
Action recognition is a leading research topic in computer vision. This paper proposes an effective method for the action recognition task based on skeleton data. Four features are derived from joint differences in 3D skeleton data. From the differences of the 3D coordinates of corresponding joints in successive frames, three maps are extracted for the x, y, and z coordinates respectively, and these maps are encoded into 2D color images, named Joint Difference Maps (JDMs). The fourth JDM is formed by mapping the individual x, y, and z difference maps to red, green, and blue channels. Hence, the 3D action recognition problem is converted into a 2D image classification problem, which enables us to fine-tune CNNs to learn informative features for 3D action recognition. The proposed method achieves a 79.30% recognition rate on the UTD-MHAD dataset.
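The encoding described above can be sketched as follows. This is a minimal illustration, not the authors' exact implementation: the function name `joint_difference_maps`, the min-max normalization, and the joints-by-frames layout are assumptions; the paper's actual color-coding scheme may differ.

```python
import numpy as np

def joint_difference_maps(skeleton):
    """skeleton: array of shape (T, J, 3) -- T frames, J joints, xyz coords.

    Returns three per-axis maps and one RGB map, as a hypothetical sketch
    of the JDM idea: rows index joints, columns index frame transitions.
    """
    # differences of corresponding joints in successive frames: (T-1, J, 3)
    diff = skeleton[1:] - skeleton[:-1]

    def normalize(m):
        # min-max scale to [0, 1] so the map can be rendered as an image
        rng = m.max() - m.min()
        return (m - m.min()) / rng if rng > 0 else np.zeros_like(m)

    # one 2D map per coordinate axis, each of shape (J, T-1)
    axis_maps = [normalize(diff[..., a]).T for a in range(3)]

    # fourth map: stack the x/y/z difference maps into R/G/B channels
    rgb_map = np.stack(axis_maps, axis=-1)  # (J, T-1, 3)
    return axis_maps, rgb_map

# toy usage: a 10-frame sequence with 20 joints
seq = np.random.rand(10, 20, 3)
maps, rgb = joint_difference_maps(seq)
```

The resulting images can then be fed to a standard 2D CNN, which is what turns the 3D recognition task into an image classification task.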
© 2019 Springer Nature Singapore Pte Ltd.
Naveenkumar, M., Domnic, S. (2019). Skeleton Joint Difference Maps for 3D Action Recognition with Convolutional Neural Networks. In: Santosh, K., Hegadi, R. (eds) Recent Trends in Image Processing and Pattern Recognition. RTIP2R 2018. Communications in Computer and Information Science, vol 1035. Springer, Singapore. https://doi.org/10.1007/978-981-13-9181-1_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-9180-4
Online ISBN: 978-981-13-9181-1