
Cross-view Action Recognition via Dual-Codebook and Hierarchical Transfer Framework

  • Conference paper
Computer Vision -- ACCV 2014 (ACCV 2014)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9007)

Abstract

In this paper, we focus on the challenging problem of cross-view action recognition. The key to this problem is finding the correspondence between the source and target views, which we establish in two stages. In the first stage, we construct a Dual-Codebook composed of two codebooks, one for the source view and one for the target view. Each codeword in one codebook has a corresponding codeword in the other, unlike traditional methods that build independent codebooks for the two views. We propose an effective co-clustering algorithm based on semi-nonnegative matrix factorization to derive the Dual-Codebook. With the Dual-Codebook, an action can be represented as a Bag-of-Dual-Codes (BoDC) regardless of whether it is observed in the source view or the target view. The Dual-Codebook therefore establishes a codebook-to-codebook correspondence, which is the foundation for the second stage. In the second stage, we observe that, although the appearance of action samples changes significantly with viewpoint, the temporal relationship between atom actions within an action should be stable across views. We therefore propose a hierarchical transfer framework to obtain feature-to-feature correspondence at the atom level between the source and target views. The framework is based on a temporal structure that effectively captures the temporal relationship between atom actions within an action. It performs transfer at the atom level over multiple timescales, whereas most existing methods perform only video-level transfer. We carry out a series of experiments on the IXMAS dataset. The results demonstrate that our method achieves superior performance compared to state-of-the-art approaches.
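The abstract does not spell out the co-clustering algorithm, so the following is only a rough sketch of how paired codebooks could be obtained with semi-nonnegative matrix factorization, not the authors' formulation. It assumes that column i of `Xs` and column i of `Xt` describe the same sample seen from the source and target views, stacks the two views, and factors the result with the standard semi-NMF multiplicative updates so that both codebooks share one nonnegative assignment matrix `G`. The function names (`semi_nmf`, `dual_codebook`, `bag_of_dual_codes`) and the hard nearest-codeword assignment used for the histogram are illustrative assumptions.

```python
import numpy as np

def semi_nmf(X, k, n_iter=100, eps=1e-9, seed=0):
    """Factor X (d x n) as F @ G.T with G >= 0; F (d x k) is unconstrained."""
    rng = np.random.default_rng(seed)
    G = rng.random((X.shape[1], k)) + 0.1       # nonnegative soft assignments
    pos = lambda A: (np.abs(A) + A) / 2         # elementwise positive part
    neg = lambda A: (np.abs(A) - A) / 2         # elementwise negative part
    for _ in range(n_iter):
        # Least-squares codewords for the current assignments.
        F = X @ G @ np.linalg.pinv(G.T @ G)
        XtF, FtF = X.T @ F, F.T @ F
        # Multiplicative update that keeps G nonnegative.
        num = pos(XtF) + G @ neg(FtF)
        den = neg(XtF) + G @ pos(FtF) + eps
        G = G * np.sqrt(num / den)
    F = X @ G @ np.linalg.pinv(G.T @ G)         # final least-squares codewords
    return F, G

def dual_codebook(Xs, Xt, k, **kw):
    """Assumption: Xs (d_s x n) and Xt (d_t x n) hold paired samples column-wise.

    Stacking the views forces a shared G, so codeword j of the source
    codebook is paired with codeword j of the target codebook.
    """
    F, G = semi_nmf(np.vstack([Xs, Xt]), k, **kw)
    ds = Xs.shape[0]
    return F[:ds], F[ds:], G

def bag_of_dual_codes(X_view, F_view):
    """Normalized histogram of nearest-codeword indices (hard assignment assumed)."""
    d2 = ((X_view[:, :, None] - F_view[:, None, :]) ** 2).sum(axis=0)  # n x k
    idx = d2.argmin(axis=1)
    hist = np.bincount(idx, minlength=F_view.shape[1]).astype(float)
    return hist / hist.sum()
```

Because the codeword indices are shared, a histogram computed from source-view features with the source codebook and one computed from target-view features with the target codebook live in the same k-dimensional space, which is the property the Bag-of-Dual-Codes representation relies on.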



Acknowledgement

This work is supported by the National Natural Science Foundation of China (No. 61172141), the Key Projects in the National Science & Technology Pillar Program during the 12th Five-Year Plan Period (No. 2012BAK16B06), and the Science and Technology Program of Guangzhou, China (No. 2014J4100092).

Author information

Corresponding author

Correspondence to Huicheng Zheng.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Zhang, C., Zheng, H., Lai, J. (2015). Cross-view Action Recognition via Dual-Codebook and Hierarchical Transfer Framework. In: Cremers, D., Reid, I., Saito, H., Yang, MH. (eds) Computer Vision -- ACCV 2014. ACCV 2014. Lecture Notes in Computer Science, vol 9007. Springer, Cham. https://doi.org/10.1007/978-3-319-16814-2_38

  • DOI: https://doi.org/10.1007/978-3-319-16814-2_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16813-5

  • Online ISBN: 978-3-319-16814-2

  • eBook Packages: Computer Science, Computer Science (R0)
