Single Camera Hand Pose Estimation from Bottom-Up and Top-Down Processes

Periquito, Davide; Nascimento, Jacinto C.; Bernardino, Alexandre; Sequeira, João

doi:10.1007/978-3-662-44911-0_14

Part of the book series: Communications in Computer and Information Science (CCIS, volume 458)

Included in the following conference series:

International Conference on Computer Vision, Imaging and Computer Graphics

Abstract

In this paper we present a methodology for hand pose estimation from a single image that combines bottom-up and top-down processes. A fast bottom-up algorithm uses coarse visual cues to generate hypotheses about the possible locations and postures of hands in the image. The best-ranked hypotheses are then analysed by a precise, but slower, top-down process. Because the two processes are complementary in computational speed and precision, pose estimation algorithms can be designed to fit the constraints on the available computational resources. We analyse the trade-off between precision and speed in a series of simulations and qualitatively illustrate the performance of the method on real imagery.
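
As a rough illustration of the two-stage idea described in the abstract, the following Python sketch ranks all candidates with a cheap score and re-evaluates only the best-ranked ones with an expensive score. It is not the authors' implementation: the function `two_stage_estimate`, the budget parameter `top_k`, the toy one-dimensional "pose" (an integer angle in degrees) and both scoring functions are assumptions made purely for illustration.

```python
import heapq
from typing import Callable, Iterable, Tuple

# Hypothetical ground truth for the toy problem (not from the paper).
TRUE_ANGLE = 37


def coarse_score(angle: int) -> int:
    """Cheap bottom-up cue: only knows which 10-degree bin the pose is in."""
    return -abs(angle // 10 - TRUE_ANGLE // 10)


def fine_score(angle: int) -> int:
    """Precise top-down check: exact angular error (assumed to be costly)."""
    return -abs(angle - TRUE_ANGLE)


def two_stage_estimate(
    candidates: Iterable[int],
    coarse: Callable[[int], int],
    fine: Callable[[int], int],
    top_k: int,
) -> Tuple[int, int, int]:
    """Return (best_pose, best_fine_score, number_of_fine_evaluations).

    Stage 1 ranks every candidate with the cheap score and keeps top_k.
    Stage 2 applies the expensive score to that shortlist only.
    """
    if top_k < 1:
        raise ValueError("top_k must be at least 1")
    shortlist = heapq.nlargest(top_k, candidates, key=coarse)
    if not shortlist:
        raise ValueError("no candidates supplied")
    scored = [(fine(pose), pose) for pose in shortlist]
    best_score, best_pose = max(scored, key=lambda pair: pair[0])
    return best_pose, best_score, len(shortlist)


if __name__ == "__main__":
    for budget in (3, 10):
        pose, score, evaluations = two_stage_estimate(
            range(181), coarse_score, fine_score, top_k=budget
        )
        print(f"top_k={budget}: pose={pose}, fine score={score}, "
              f"fine evaluations={evaluations}")
```

In this toy setup, `top_k` plays the role of the computational budget: a larger value spends more top-down evaluations and can only improve the final score, while a smaller value is faster but may miss the best hypothesis.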

This work was supported by the European Commission project POETICON++ (FP7-ICT-288382) and the Portuguese FCT projects [PEst-OE/EEI/LA0009/2011] and VISTA (PTDC/EIA-EIA/105062/2008).

Notes

1.
The timing results shown in Table 1 (left) were obtained with non-optimised MATLAB code. These times could be drastically reduced by reimplementing the method in C++, by optimising the algorithm to take advantage of the GPU, and/or by using multi-core computation.

Author information

Authors and Affiliations

Instituto de Sistemas e Robótica, Instituto Superior Técnico, Lisboa, Portugal
Davide Periquito, Jacinto C. Nascimento, Alexandre Bernardino & João Sequeira

Corresponding author

Correspondence to Davide Periquito.

Editor information

Editors and Affiliations

Università di Catania, Catania, Catania, Italy
Sebastiano Battiato
Inria/ZIRST, Saint Ismier, France
Sabine Coquillart
Swansea University, Swansea, United Kingdom
Robert S. Laramee
Linnaeus University, Växjö, Sweden
Andreas Kerren
Escola Superior de Tecnologia do IPS, Setúbal, Portugal
José Braz

Cite this paper

Periquito, D., Nascimento, J.C., Bernardino, A., Sequeira, J. (2014). Single Camera Hand Pose Estimation from Bottom-Up and Top-Down Processes. In: Battiato, S., Coquillart, S., Laramee, R., Kerren, A., Braz, J. (eds) Computer Vision, Imaging and Computer Graphics -- Theory and Applications. VISIGRAPP 2013. Communications in Computer and Information Science, vol 458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44911-0_14

DOI: https://doi.org/10.1007/978-3-662-44911-0_14
Published: 30 September 2014
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44910-3
Online ISBN: 978-3-662-44911-0
eBook Packages: Computer Science, Computer Science (R0)