Abstract
Human interaction recognition methods based on motion co-occurrence are an efficient solution because of their reasonable representation and simple operation. However, such methods have relatively low recognition accuracy. To improve accuracy, this paper proposes a method based on the co-occurring visual matrix sequence, which combines the strengths of the co-occurring visual matrix and the probabilistic graphical model. In the individual segmentation framework, the region of interest (ROI) is first extracted using frame differencing and an analysis of the distance between the two interacting persons, and is then segmented into the two separate persons using prior knowledge such as color and body outline. Next, the k-means algorithm is used to build a bag of visual words (BOVW) from the HOG features of all the training videos; each frame in a video is described by a co-occurring visual matrix over the BOVW, and the video is represented by the resulting co-occurring visual matrix sequence. Finally, a hidden Markov model (HMM) is used to model and recognize the human interactions. Experimental results on the UT-Interaction dataset show that the method achieves better recognition performance with a simple implementation.
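The codebook and per-frame matrix steps might be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes one HOG descriptor per person per frame, uses random vectors in place of real HOG features, and defines the co-occurring matrix as a K x K array with a single 1 at the pair of visual words shown by the two persons. The function names, the codebook size of 8, and the descriptor length of 36 are all hypothetical. The segmentation and HMM stages are not covered.

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns a (k, d) array of cluster centres."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each descriptor to its nearest centre (squared Euclidean distance).
        d2 = ((X[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centre if a cluster goes empty
                centres[j] = members.mean(axis=0)
    return centres

def quantize(desc, centres):
    """Index of the visual word (centre) nearest to one descriptor."""
    return int(((centres - desc) ** 2).sum(axis=1).argmin())

def cooccurrence_matrix(desc_a, desc_b, centres):
    """Assumed form: K x K matrix with a 1 at (word of person A, word of person B)."""
    k = len(centres)
    m = np.zeros((k, k), dtype=np.int64)
    m[quantize(desc_a, centres), quantize(desc_b, centres)] = 1
    return m

def video_to_sequence(frames_a, frames_b, centres):
    """One co-occurring matrix per frame, in temporal order."""
    return [cooccurrence_matrix(a, b, centres) for a, b in zip(frames_a, frames_b)]

# Synthetic stand-ins for HOG descriptors (assumption: 36-dimensional vectors).
rng = np.random.default_rng(42)
train = rng.normal(size=(200, 36))    # descriptors pooled from "training videos"
codebook = kmeans(train, k=8)         # the bag of visual words
person_a = rng.normal(size=(5, 36))   # 5 frames, first person
person_b = rng.normal(size=(5, 36))   # 5 frames, second person
sequence = video_to_sequence(person_a, person_b, codebook)
print(len(sequence), sequence[0].shape)
```

Each matrix in `sequence` could then be mapped to a discrete observation symbol (for example, the flat index of its non-zero entry) before being passed to a per-class HMM; that mapping is likewise an assumption here.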
Acknowledgements
This work is supported by the Program for Science Research Local Project of the Education Department of Liaoning Province (No. L201708) and the Scientific Research Youth Project of the Education Department of Liaoning Province, China (No. L201745).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Ji, X., Qin, L., Zuo, X. (2019). Human Interaction Recognition Based on the Co-occurring Visual Matrix Sequence. In: Yu, H., Liu, J., Liu, L., Ju, Z., Liu, Y., Zhou, D. (eds) Intelligent Robotics and Applications. ICIRA 2019. Lecture Notes in Computer Science, vol 11744. Springer, Cham. https://doi.org/10.1007/978-3-030-27541-9_40
DOI: https://doi.org/10.1007/978-3-030-27541-9_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27540-2
Online ISBN: 978-3-030-27541-9
eBook Packages: Computer Science, Computer Science (R0)