Abstract
We propose a novel action recognition framework based on trajectory features with human-aware spatial segmentation. Our insight is that the features critical for recognition appear in partial regions of the human body, so we segment each video frame into spatial regions based on human body parts to enhance the feature representation. We use an object detector and a pose estimator to segment four regions: full body, left arm, right arm, and upper body. From these regions, we extract dense trajectory features and feed them into a shallow RNN to effectively model long-term relationships. Evaluation shows that our framework outperforms previous approaches on two standard benchmarks, J-HMDB and MPII Cooking Activities.
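To make the human-aware segmentation concrete, the following is a minimal sketch of how the four body-part regions could be derived as bounding boxes from 2D pose keypoints. The joint names, the padding value, and the box construction are illustrative assumptions, not the authors' actual implementation.

```python
# Hypothetical sketch: derive the four body-part regions (full body,
# left arm, right arm, upper body) as padded bounding boxes from 2D
# pose-estimator keypoints. Joint names and padding are assumptions.

def bbox(points, pad=10):
    """Axis-aligned bounding box around (x, y) points, padded by `pad` pixels."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs) - pad, min(ys) - pad, max(xs) + pad, max(ys) + pad)

def body_part_regions(kp, pad=10):
    """kp: dict mapping joint name -> (x, y). Returns the four region boxes."""
    return {
        "full_body": bbox(list(kp.values()), pad),
        "left_arm": bbox([kp["l_shoulder"], kp["l_elbow"], kp["l_wrist"]], pad),
        "right_arm": bbox([kp["r_shoulder"], kp["r_elbow"], kp["r_wrist"]], pad),
        "upper_body": bbox([kp["head"], kp["l_shoulder"], kp["r_shoulder"],
                            kp["l_hip"], kp["r_hip"]], pad),
    }

# Example pose in pixel coordinates (hypothetical values):
kp = {
    "head": (100, 40), "l_shoulder": (80, 80), "r_shoulder": (120, 80),
    "l_elbow": (70, 120), "r_elbow": (130, 120),
    "l_wrist": (65, 160), "r_wrist": (135, 160),
    "l_hip": (90, 170), "r_hip": (110, 170),
    "l_ankle": (88, 260), "r_ankle": (112, 260),
}
regions = body_part_regions(kp)
```

Dense trajectory descriptors would then be extracted only from trajectories whose points fall inside each region, giving one feature set per body part before the RNN stage.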
Acknowledgments
This work was partially supported by Aoyama Gakuin University-Supported Program “Early Eagle Program”.
Copyright information
© 2019 Springer Nature Switzerland AG
Cite this paper
Yamada, K., Ito, S., Kaneko, N., Sumi, K. (2019). Human Action Recognition via Body Part Region Segmented Dense Trajectories. In: Carneiro, G., You, S. (eds) Computer Vision – ACCV 2018 Workshops. ACCV 2018. Lecture Notes in Computer Science(), vol 11367. Springer, Cham. https://doi.org/10.1007/978-3-030-21074-8_6
Print ISBN: 978-3-030-21073-1
Online ISBN: 978-3-030-21074-8