Abstract
Human pose estimation in the operating room (OR) can benefit many applications, such as surgical activity recognition, radiation exposure monitoring, and performance assessment. However, the OR is a very challenging environment for computer vision systems because of limited camera positioning possibilities, severe illumination changes, and the similar colors of clothing and equipment. This paper tackles the problem of human pose estimation in the OR using RGB-D images, hypothesizing that combining depth and color information will improve pose estimation results in such a difficult environment. We propose an approach based on pictorial structures (PS) that makes use of both channels of the RGB-D camera, and we introduce a new feature descriptor for depth images, called histogram of depth differences (HDD), that captures local depth level changes. To quantitatively evaluate the proposed approach, we generate a novel dataset by manually annotating images recorded from different camera views during several days of live surgeries. Our experiments show that the PS approach applied to depth images using HDD outperforms the state-of-the-art PS approach applied to the corresponding color images by over 11%. Furthermore, the proposed HDD descriptor outperforms two other classical descriptors applied to depth images. Finally, the appearance models generated from the depth images perform better than those generated from the color images, and the combination of both improves the overall results. We therefore conclude that depth information is highly beneficial both in the pictorial structures model and for human pose estimation in operating rooms.
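To make the idea of a descriptor that "captures local depth level changes" more concrete, the following is a minimal illustrative sketch in Python. It is not the HDD definition from the paper, which this abstract does not specify. The function name `depth_difference_histogram`, the use of horizontal and vertical neighbor differences, the bin count `n_bins`, the clipping range `max_diff`, and the L1 normalization are all assumptions chosen only for illustration.

```python
import numpy as np


def depth_difference_histogram(patch, n_bins=8, max_diff=0.5):
    """Histogram of depth differences between adjacent pixels in a patch.

    Hypothetical sketch: the neighborhood, binning, and normalization are
    assumptions, not the descriptor defined in the paper.
    """
    patch = np.asarray(patch, dtype=np.float64)

    # Differences between horizontally and vertically adjacent pixels.
    dx = np.diff(patch, axis=1).ravel()
    dy = np.diff(patch, axis=0).ravel()

    # Clip large jumps (e.g. object boundaries) into the outermost bins.
    diffs = np.clip(np.concatenate([dx, dy]), -max_diff, max_diff)

    hist, _ = np.histogram(diffs, bins=n_bins, range=(-max_diff, max_diff))

    # L1-normalize so patches of different sizes are comparable.
    total = hist.sum()
    return hist / total if total > 0 else hist.astype(np.float64)


# A flat surface puts all mass in one bin; a depth step adds mass elsewhere.
flat = np.full((4, 4), 1.5)
step = np.hstack([np.full((4, 2), 1.0), np.full((4, 2), 2.0)])
flat_hist = depth_difference_histogram(flat)
step_hist = depth_difference_histogram(step)
```

This sketch ignores invalid depth readings (commonly reported as zeros by RGB-D sensors); a practical version would need to mask them before computing differences.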
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Kadkhodamohammadi, A., Gangi, A., de Mathelin, M., Padoy, N. (2015). Pictorial Structures on RGB-D Images for Human Pose Estimation in the Operating Room. In: Navab, N., Hornegger, J., Wells, W., Frangi, A. (eds) Medical Image Computing and Computer-Assisted Intervention -- MICCAI 2015. Lecture Notes in Computer Science, vol 9349. Springer, Cham. https://doi.org/10.1007/978-3-319-24553-9_45
DOI: https://doi.org/10.1007/978-3-319-24553-9_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24552-2
Online ISBN: 978-3-319-24553-9
eBook Packages: Computer Science (R0)