Advancing Multi-actor Graph Convolutions for Skeleton-Based Action Recognition

Zhang, Yiqun; Qin, Zhenyu; Liu, Yang; Gedeon, Tom; Song, Wu

doi:10.1007/978-3-031-55722-4_7

Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 560)

Included in the following conference series:

International Conference on Intelligent Technologies for Interactive Entertainment

Abstract

Human skeleton motion recognition, notable for its lightweight, interference-resistant, and resource-saving properties, plays a crucial role in human motion recognition and has found widespread applications. The common approach to capturing motion features from human skeleton videos involves extracting skeleton features temporally or spatially using Graph Convolutional Networks (GCNs) or their improved variants. Nevertheless, existing extraction methods encounter two primary limitations: the number of actors involved in an action varies, and the subgraphs representing multiple actors’ actions are disconnected, resulting in a loss of inter-subgraph features. To overcome these challenges, we propose Human Mirror and Human Link strategies, which replicate diverse human data to fill and interlink multiple subgraphs. Empirically, our proposed methods applied to the NTU RGB+D 120 dataset significantly enhanced the performance of the base model MSG3D, demonstrating the effectiveness of our approach in handling multi-actor scenarios.
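
The abstract gives no implementation details, so the following is only an illustrative sketch of one possible reading of the two ideas, not the authors' method. It assumes that a missing actor is stored as an all-zero skeleton, that "filling" means copying an existing actor into the empty slot, and that "linking" means adding an edge between the same joint of each pair of actors. The names `mirror_fill`, `link_actors`, and `link_joint` are hypothetical.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]   # (x, y, z) coordinates of one joint
Skeleton = List[Joint]               # all joints of one actor in one frame


def is_empty(skeleton: Skeleton) -> bool:
    """Assumption: a missing actor is encoded as an all-zero skeleton."""
    return all(c == 0.0 for joint in skeleton for c in joint)


def mirror_fill(actors: List[Skeleton]) -> List[Skeleton]:
    """Fill empty actor slots with a copy of the first non-empty actor.

    Hypothetical reading of 'Human Mirror': every sample ends up with the
    same number of populated actor slots.
    """
    source = next((a for a in actors if not is_empty(a)), None)
    if source is None:               # nothing to copy from; return unchanged copies
        return [list(a) for a in actors]
    return [list(source) if is_empty(a) else list(a) for a in actors]


def link_actors(bones: List[Tuple[int, int]], num_joints: int,
                num_actors: int, link_joint: int = 0) -> List[List[int]]:
    """Build one adjacency matrix covering all actors.

    Each actor contributes a block with its own bones. Hypothetical reading
    of 'Human Link': the same joint (link_joint) of every pair of actors is
    connected, so the per-actor subgraphs are no longer disconnected.
    """
    n = num_joints * num_actors
    adj = [[0] * n for _ in range(n)]
    for a in range(num_actors):
        offset = a * num_joints
        for i, j in bones:           # intra-actor bone edges (symmetric)
            adj[offset + i][offset + j] = 1
            adj[offset + j][offset + i] = 1
    for a in range(num_actors):
        for b in range(a + 1, num_actors):
            u = a * num_joints + link_joint
            v = b * num_joints + link_joint
            adj[u][v] = adj[v][u] = 1  # inter-actor link edge
    return adj


# Toy example: a 3-joint chain, two actor slots, the second one empty.
toy_bones = [(0, 1), (1, 2)]
toy_actors = [[(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)],
              [(0.0, 0.0, 0.0)] * 3]
filled = mirror_fill(toy_actors)
adjacency = link_actors(toy_bones, num_joints=3, num_actors=2)
```

In this toy setup the second slot receives a copy of the first actor, and the combined 6-node graph gains a single edge between joint 0 of each actor, which is what lets a graph convolution pass information between the two bodies.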

Author information

Authors and Affiliations

Australian National University, Canberra, Australia
Yiqun Zhang
Seeing Machines, Canberra, Australia
Zhenyu Qin & Yang Liu
Curtin University, Perth, Australia
Tom Gedeon
Rongcheng Cloud-Intelligence Co., Ltd., Rongcheng, China
Wu Song

Corresponding author

Correspondence to Yiqun Zhang.

Editor information

Editors and Affiliations

Durham University, Durham, UK
Martin Clayton
University of Milano Bicocca, Milan, Italy
Mauro Passacantando
University of Genova, Genova, Italy
Marcello Sanguineti

About this paper

Cite this paper

Zhang, Y., Qin, Z., Liu, Y., Gedeon, T., Song, W. (2024). Advancing Multi-actor Graph Convolutions for Skeleton-Based Action Recognition. In: Clayton, M., Passacantando, M., Sanguineti, M. (eds) Intelligent Technologies for Interactive Entertainment. INTETAIN 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 560. Springer, Cham. https://doi.org/10.1007/978-3-031-55722-4_7

DOI: https://doi.org/10.1007/978-3-031-55722-4_7
Published: 23 March 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-55721-7
Online ISBN: 978-3-031-55722-4
eBook Packages: Computer Science (R0)
