International Journal of Computer Vision, Volume 100, Issue 1, pp 16-37

Coupled Action Recognition and Pose Estimation from Multiple Views

  • Angela Yao (Computer Vision Laboratory, ETH Zurich)
  • Juergen Gall (Computer Vision Laboratory, ETH Zurich; Max Planck Institute for Intelligent Systems)
  • Luc Van Gool (Computer Vision Laboratory, ETH Zurich; Department of Electrical Engineering/IBBT, K.U. Leuven)


Action recognition and pose estimation are two closely related topics in understanding human body movements; information from one task can be leveraged to assist the other, yet the two are often treated separately. We present here a framework for coupled action recognition and pose estimation by formulating pose estimation as an optimization over a set of action-specific manifolds. The framework allows for integration of a 2D appearance-based action recognition system as a prior for 3D pose estimation and for refinement of the action labels using relational pose features based on the extracted 3D poses. Our experiments show that our pose estimation system is able to estimate body poses with high degrees of freedom using very few particles and can achieve state-of-the-art results on the HumanEva-II benchmark. We also thoroughly investigate the impact of pose estimation and action recognition accuracy on each other on the challenging TUM kitchen dataset. We demonstrate not only the feasibility of using extracted 3D poses for action recognition, but also improved performance in comparison to action recognition using low-level appearance features.


Keywords: Human pose estimation; Human action recognition; Tracking; Stochastic optimization; Hough transform