Regularized Landmark Detection with CAEs for Human Pose Estimation in the Operating Room
Robust estimation of the human pose is a critical requirement for the development of context-aware assistance and monitoring systems in clinical settings. Environments such as operating rooms or intensive care units pose a range of visual challenges for human pose estimation, including frequent occlusions, clutter and difficult lighting conditions. Moreover, privacy concerns play a major role in health care applications and make it necessary to use non-identifiable data, e.g. blurred RGB images or depth frames. As a result, much less data is available than for human pose estimation in common scenarios, so pose priors could serve as regularization when training robust estimation models. In this work, we investigate to what extent existing pose estimation methods are suited to the challenges of clinical environments and propose a regularization method based on convolutional autoencoders (CAEs) that corrects anatomically implausible pose estimates. We show that our models, trained solely on depth images, reach results on the MVOR dataset [1] similar to those of RGB-based pose estimators while being intrinsically non-identifiable. In further experiments, we show that our CAE regularization can cope with several pose perturbations, e.g. missing parts or left-right flips of joints.
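The two perturbations named in the abstract, missing parts and left-right flips of joints, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the joint names and their order, the dictionary representation of a pose, and the helper names `flip_left_right`, `drop_joints` and `perturb` are hypothetical and not taken from the paper.

```python
import random

# Hypothetical upper-body joint layout; names and order are assumptions
# for illustration only, not the layout used by the authors.
JOINTS = ["head", "neck", "l_shoulder", "r_shoulder", "l_hip", "r_hip",
          "l_elbow", "r_elbow", "l_wrist", "r_wrist"]
LEFT_RIGHT_PAIRS = [("l_shoulder", "r_shoulder"), ("l_hip", "r_hip"),
                    ("l_elbow", "r_elbow"), ("l_wrist", "r_wrist")]


def flip_left_right(pose, pairs=LEFT_RIGHT_PAIRS):
    """Swap the coordinates of each listed left/right joint pair."""
    flipped = dict(pose)
    for left, right in pairs:
        flipped[left], flipped[right] = pose[right], pose[left]
    return flipped


def drop_joints(pose, names):
    """Mark the given joints as missing (None), e.g. to mimic occlusion."""
    return {name: (None if name in names else xy) for name, xy in pose.items()}


def perturb(pose, rng, p_flip=0.5, max_missing=2):
    """Flip one random left/right pair with probability p_flip, then
    remove between 0 and max_missing randomly chosen joints."""
    out = dict(pose)
    if rng.random() < p_flip:
        out = flip_left_right(out, [rng.choice(LEFT_RIGHT_PAIRS)])
    missing = rng.sample(JOINTS, rng.randint(0, max_missing))
    return drop_joints(out, missing)
```

Applied to a clean pose given as a mapping from joint name to `(x, y)`, `perturb(pose, random.Random(0))` returns a corrupted copy with the same keys. Pairs of corrupted and clean poses of this kind are one plausible way to exercise a pose-correction model; whether the authors generate their perturbations this way is not stated in the abstract.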
- 1. Srivastav V, Issenhuth T, Kadkhodamohammadi A, et al. MVOR: a multiview RGB-D operating room dataset for 2D and 3D human pose estimation. arXiv:1808.08180. 2018.
- 2. Andriluka M, Pishchulin L, Gehler P, et al. 2D human pose estimation: new benchmark and state of the art analysis. Proc CVPR. 2014; p. 3686-3693.
- 3. Felzenszwalb P, McAllester D, Ramanan D. A discriminatively trained, multiscale, deformable part model. Proc CVPR. 2008; p. 1-8.
- 4. Pishchulin L, Andriluka M, Gehler P, et al. Strong appearance and expressive spatial models for human pose estimation. Proc ICCV. 2013; p. 3487-3494.
- 5. Toshev A, Szegedy C. Deeppose: human pose estimation via deep neural networks. Proc CVPR. 2014; p. 1653-1660.
- 6. Wei SE, Ramakrishna V, Kanade T, et al. Convolutional pose machines. Proc CVPR. 2016; p. 4724-4732.
- 7. Newell A, Yang K, Deng J. Stacked hourglass networks for human pose estimation. Proc ECCV. 2016; p. 483-499.
- 8. Masci J, Meier U, Cireşan D, et al. Stacked convolutional auto-encoders for hierarchical feature extraction. Int Conf Artif Neural Netw. 2011; p. 52-59.
- 10. Tekin B, Katircioglu I, Salzmann M, et al. Structured prediction of 3D human pose with deep neural networks. arXiv:1605.05180. 2016.
- 11. Cao Z, Simon T, Wei SE, et al. Realtime multi-person 2D pose estimation using part affinity fields. Proc CVPR. 2017; p. 7291-7299.