
FlowCap: 2D Human Pose from Optical Flow

  • Conference paper
  • Published in: Pattern Recognition (DAGM 2015)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 9358)

Abstract

We estimate 2D human pose from video using only optical flow. The key insight is that dense optical flow can provide information about 2D body pose. Like range data, flow is largely invariant to appearance but unlike depth it can be directly computed from monocular video. We demonstrate that body parts can be detected from dense flow using the same random forest approach used by the Microsoft Kinect. Unlike range data, however, when people stop moving, there is no optical flow and they effectively disappear. To address this, our FlowCap method uses a Kalman filter to propagate body part positions and velocities over time and a regression method to predict 2D body pose from part centers. No range sensor is required and FlowCap estimates 2D human pose from monocular video sources containing human motion. Such sources include hand-held phone cameras and archival television video. We demonstrate 2D body pose estimation in a range of scenarios and show that the method works with real-time optical flow. The results suggest that optical flow shares invariances with range data that, when complemented with tracking, make it valuable for pose estimation.
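The tracking step described above — propagating part positions and velocities through frames where the person stops moving and the flow (and hence the detection) vanishes — can be sketched as a constant-velocity Kalman filter per part center. This is an illustrative sketch only: the paper's actual state layout and noise parameters are not given here, so the matrices below are assumed values.

```python
import numpy as np

dt = 1.0  # one frame between updates

# State: [x, y, vx, vy]; measurement: a detected part centre [x, y].
F = np.array([[1, 0, dt, 0],
              [0, 1, 0, dt],
              [0, 0, 1, 0],
              [0, 0, 0, 1]], dtype=float)   # constant-velocity motion model
H = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0]], dtype=float)   # we observe position only
Q = 0.01 * np.eye(4)                        # process noise (assumed)
R = 1.0 * np.eye(2)                         # measurement noise (assumed)

def predict(x, P):
    """Propagate the state and covariance one frame ahead."""
    return F @ x, F @ P @ F.T + Q

def update(x, P, z):
    """Fuse a detected part centre z = [x, y] into the state."""
    y = z - H @ x                       # innovation
    S = H @ P @ H.T + R                 # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)      # Kalman gain
    x = x + K @ y
    P = (np.eye(4) - K @ H) @ P
    return x, P

# When there is no flow, there is no detection: running predict() alone
# lets the filter coast on its last velocity estimate instead of losing
# the part entirely.
x, P = np.array([0., 0., 2., 1.]), np.eye(4)
x, P = predict(x, P)                                  # frame with no detection
x, P = update(*predict(x, P), np.array([4.1, 2.0]))   # frame with a detection
```

With the hypothetical parameters above, the filter drifts to roughly (4, 2) during the missed frame and then snaps onto the nearby detection; in practice the noise covariances Q and R would be tuned to the reliability of the flow-based part detector.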


Notes

  1. http://ps.is.tuebingen.mpg.de/project/FlowCap.



Author information

Correspondence to Javier Romero.


Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Romero, J., Loper, M., Black, M.J. (2015). FlowCap: 2D Human Pose from Optical Flow. In: Gall, J., Gehler, P., Leibe, B. (eds.) Pattern Recognition. DAGM 2015. Lecture Notes in Computer Science, vol 9358. Springer, Cham. https://doi.org/10.1007/978-3-319-24947-6_34


  • DOI: https://doi.org/10.1007/978-3-319-24947-6_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-24946-9

  • Online ISBN: 978-3-319-24947-6

  • eBook Packages: Computer Science (R0)
