Abstract
We present an augmented reality solution that lets users manipulate and inspect 3D virtual objects freely with their bare hands on wearable devices. To this end, we use a head-mounted depth camera to capture RGB-D hand images from an egocentric view, and propose a unified framework that jointly recovers the 6D palm pose and recognizes the hand gesture from the depth images. A random forest simultaneously regresses the palm pose and classifies the hand gesture within a spatial-voting framework. Trained on an annotated real-world dataset, the proposed method is shown to predict the palm pose and gesture accurately. The forest output is used to render the 3D virtual objects, which are overlaid onto the hand region of the input RGB images using the camera calibration parameters, providing seamless synthesis of the virtual and real scenes.
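The overlay step described in the abstract can be illustrated with a standard pinhole projection. The sketch below is not the authors' implementation: it assumes the 6D palm pose is given as three Euler angles plus a translation already expressed in the RGB camera frame, and that the calibration reduces to four intrinsics (`fx`, `fy`, `cx`, `cy`) with no lens distortion. The function names and the Euler-angle convention are hypothetical choices made for illustration.

```python
import math


def rotation_from_euler(yaw, pitch, roll):
    """Build a 3x3 rotation matrix R = Rz(yaw) * Ry(pitch) * Rx(roll).

    The angle convention is an assumption; the paper's parameterisation
    of the palm orientation may differ.
    """
    cy, sy = math.cos(yaw), math.sin(yaw)
    cp, sp = math.cos(pitch), math.sin(pitch)
    cr, sr = math.cos(roll), math.sin(roll)
    return [
        [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr],
        [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr],
        [-sp, cp * sr, cp * cr],
    ]


def project_vertices(vertices, rotation, translation, fx, fy, cx, cy):
    """Project object-frame vertices into pixel coordinates.

    Each vertex is rotated and translated by the palm pose into the
    camera frame, then projected with the pinhole model. Vertices at or
    behind the camera plane (z <= 0) are returned as None.
    """
    pixels = []
    for vx, vy, vz in vertices:
        # Object frame -> camera frame: p_cam = R * p_obj + t
        x = rotation[0][0] * vx + rotation[0][1] * vy + rotation[0][2] * vz + translation[0]
        y = rotation[1][0] * vx + rotation[1][1] * vy + rotation[1][2] * vz + translation[1]
        z = rotation[2][0] * vx + rotation[2][1] * vy + rotation[2][2] * vz + translation[2]
        if z <= 0:
            pixels.append(None)
            continue
        # Pinhole projection with the assumed intrinsics
        pixels.append((fx * x / z + cx, fy * y / z + cy))
    return pixels


if __name__ == "__main__":
    # Hypothetical pose: palm 0.5 m in front of the camera, no rotation.
    R = rotation_from_euler(0.0, 0.0, 0.0)
    t = (0.0, 0.0, 0.5)
    cube_corner = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
    print(project_vertices(cube_corner, R, t, fx=500.0, fy=500.0, cx=320.0, cy=240.0))
```

In a real system the depth and RGB sensors generally have different intrinsics and a relative extrinsic transform, so a pose estimated from the depth image would first need to be mapped into the RGB camera frame before this projection applies.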
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Thalmann, D., Liang, H., Yuan, J. (2016). First-Person Palm Pose Tracking and Gesture Recognition in Augmented Reality. In: Braz, J., et al. Computer Vision, Imaging and Computer Graphics Theory and Applications. VISIGRAPP 2015. Communications in Computer and Information Science, vol 598. Springer, Cham. https://doi.org/10.1007/978-3-319-29971-6_1
DOI: https://doi.org/10.1007/978-3-319-29971-6_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-29970-9
Online ISBN: 978-3-319-29971-6
eBook Packages: Computer Science, Computer Science (R0)