Human Activity Recognition by Fusion of RGB, Depth, and Skeletal Data
Research on human activity recognition has increased significantly in recent years, owing to the availability of low-cost RGB-D sensors and advances in deep learning algorithms. In this paper, we extend our previous work on human activity recognition (Imran et al., IEEE international conference on advances in computing, communications, and informatics (ICACCI), 2016) by incorporating skeletal data into the fusion. Three main approaches are used to fuse skeletal data with RGB and depth data, and their results are compared with each other. The challenging UTD-MHAD activity recognition dataset, which exhibits intraclass variations and comprises twenty-seven activities, is used for testing and experimentation. The proposed fusion achieves an accuracy of 95.38% (nearly a 4% improvement over our previous work), which also supports the observation that recognition improves as the number of evidences in support increases.
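As a minimal sketch of the multi-modality idea described above (not the paper's exact pipeline), a common way to combine evidences from RGB, depth, and skeleton streams is decision-level fusion: each stream's classifier produces a per-class softmax score vector, and the vectors are averaged before taking the arg-max. The function name, weighting scheme, and score values below are illustrative assumptions:

```python
import numpy as np

def fuse_scores(score_lists, weights=None):
    """Decision-level fusion: weighted average of per-modality
    class-probability vectors (e.g., softmax outputs of each CNN stream)."""
    scores = np.asarray(score_lists, dtype=float)  # shape: (modalities, classes)
    if weights is None:
        # Equal trust in every modality unless told otherwise.
        weights = np.ones(len(scores)) / len(scores)
    fused = np.average(scores, axis=0, weights=weights)
    return fused, int(np.argmax(fused))

# Hypothetical softmax outputs over 4 of the 27 UTD-MHAD classes.
rgb   = [0.10, 0.60, 0.20, 0.10]   # RGB stream (e.g., motion history image)
depth = [0.05, 0.30, 0.55, 0.10]   # depth stream (e.g., depth motion map)
skel  = [0.05, 0.55, 0.30, 0.10]   # skeleton stream

fused, label = fuse_scores([rgb, depth, skel])
print(label)  # → 1 (class favored by two of the three modalities)
```

Averaging tends to reward classes that several modalities agree on, which is one intuition behind the claim that adding a third evidence stream (skeleton) improves recognition.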
Keywords: Convolutional neural networks · Deep learning · Depth motion map · RGB-D sensors · Skeleton · UTD-MHAD · Motion history image · Fusion
This research was supported by the Science and Engineering Research Board (SERB) under project no. ECR/2016/000387, in cooperation with the Department of Science & Technology (DST), Government of India. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of DST-SERB or the Government of India. DST-SERB and the Government of India are authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation thereon.
- 1. Imran, J., Kumar, P.: Human action recognition using RGB-D sensor and deep convolutional neural networks. In: IEEE International Conference on Advances in Computing, Communications and Informatics (ICACCI), Jaipur, India, pp. 144–148 (2016)
- 2. Chen, C., Jafari, R., Kehtarnavaz, N.: UTD-MHAD: a multimodal dataset for human action recognition utilizing a depth camera and a wearable inertial sensor. In: IEEE International Conference on Image Processing (ICIP), pp. 168–172 (2015)
- 3. Li, W., Zhang, Z., Liu, Z.: Action recognition based on a bag of 3D points. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, San Francisco, CA, USA, pp. 9–14 (2010)
- 4. Ofli, F., Chaudhry, R., Kurillo, G., Vidal, R., Bajcsy, R.: Berkeley MHAD: a comprehensive multimodal human action database. In: Proceedings of the IEEE Workshop on Applications of Computer Vision, pp. 53–60 (2013)
- 5. Sung, J., Ponce, C., Selman, B., Saxena, A.: Unstructured human activity detection from RGBD images. In: IEEE International Conference on Robotics and Automation, Saint Paul, Minnesota, USA, pp. 842–849 (2012)
- 6. Wang, P., Li, W., Gao, Z., Zhang, J., Tang, C., Ogunbona, P.O.: Action recognition from depth maps using deep convolutional neural networks. IEEE Transactions on Human-Machine Systems 46(4), 498–509 (2016)
- 7. Chen, C., Liu, M., Zhang, B., Han, J., Jiang, J., Liu, H.: 3D action recognition using multi-temporal depth motion maps and Fisher vector. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI), pp. 3331–3337 (2016)
- 8. Xia, L., Chen, C.C., Aggarwal, J.K.: View invariant human action recognition using histograms of 3D joints. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Providence, RI, USA, pp. 20–27 (2012)
- 9. Gaglio, S., Lo Re, G., Morana, M.: Human activity recognition process using 3-D posture data. IEEE Transactions on Human-Machine Systems 45(5), 586–597 (2015)
- 10. Cippitelli, E., Gasparrini, S., Gambi, E., Spinsante, S.: A human activity recognition system using skeleton data from RGBD sensors. Computational Intelligence and Neuroscience 2016, Article ID 4351435, 1–14 (2016)
- 11. Farhad, M.B., Jiang, Y., Ma, J.: DMMs-based multiple features fusion for human action recognition. International Journal of Multimedia Data Engineering and Management (IJMDEM) 6(4), 23–39 (2015)
- 12. Aggarwal, J.K., Xia, L.: Human activity recognition from 3D data: a review. Pattern Recognition Letters 48, 70–80 (2014)