Learning Camera Pose from Optical Colonoscopy Frames Through Deep Convolutional Neural Network (CNN)

Armin, Mohammad Ali; Barnes, Nick; Alvarez, Jose; Li, Hongdong; Grimpen, Florian; Salvado, Olivier

doi:10.1007/978-3-319-67543-5_5

Conference paper
First Online: 08 September 2017

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10550)

Abstract

Optical colonoscopy is performed by inserting a long flexible colonoscope into the colon. Estimating the position of the colonoscope tip with respect to the colon surface is important because it would help localize cancerous polyps for subsequent surgery and facilitate navigation. Knowing the camera pose is also essential for automatic 3D scene reconstruction, which could support clinicians in inspecting the whole colon surface, thereby reducing missed polyps. This paper presents a method to estimate the pose of the colonoscope camera with six degrees of freedom (DoF) using a deep convolutional neural network (CNN). Because obtaining ground truth to train the CNN for camera pose from actual colonoscopy videos is extremely challenging, we trained the CNN on realistic synthetic videos generated with a colonoscopy simulator, which provides the exact camera pose parameters. We validated the trained CNN on unseen simulated video datasets and on actual colonoscopy videos from 10 patients. Our results showed that the colonoscopy camera pose could be estimated with higher accuracy and speed than with feature-based computer vision methods such as the classical structure from motion (SfM) pipeline. This paper demonstrates that transfer learning from surgical simulation to actual endoscopy-based surgery is a possible approach for deep learning technologies.
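The abstract does not say how pose accuracy was measured, so the following is only an illustrative sketch of one common way to score a 6-DoF pose estimate against simulator ground truth. It assumes (this is not stated in the abstract) that a pose is stored as a 3D translation plus a rotation quaternion in (w, x, y, z) order; the function names are hypothetical.

```python
import math


def normalize(q):
    """Return q scaled to unit length (q is a 4-tuple quaternion, w first)."""
    n = math.sqrt(sum(c * c for c in q))
    if n == 0.0:
        raise ValueError("zero-norm quaternion")
    return tuple(c / n for c in q)


def translation_error(t_pred, t_true):
    """Euclidean distance between predicted and true camera positions."""
    return math.dist(t_pred, t_true)


def rotation_error_deg(q_pred, q_true):
    """Angle in degrees of the rotation that separates the two orientations.

    q and -q encode the same rotation, so the absolute value of the
    dot product is used before taking the arc cosine.
    """
    qp, qt = normalize(q_pred), normalize(q_true)
    dot = abs(sum(a * b for a, b in zip(qp, qt)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


if __name__ == "__main__":
    # Hypothetical example: identity orientation vs. a quarter turn about z.
    quarter_turn_z = (math.sqrt(0.5), 0.0, 0.0, math.sqrt(0.5))
    print(translation_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0)))
    print(rotation_error_deg((1.0, 0.0, 0.0, 0.0), quarter_turn_z))
```

Reporting translation and rotation errors separately keeps the units (distance and degrees) interpretable; whether the paper combines or weights them is not stated in the abstract.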

Author information

Authors and Affiliations

CSIRO (Data61), Canberra, Australia
Mohammad Ali Armin, Nick Barnes & Jose Alvarez
Biomedical Informatics Group, Brisbane, Australia
Mohammad Ali Armin & Olivier Salvado
Department of Gastroenterology and Hepatology, Royal Brisbane and Women’s Hospital, Brisbane, Australia
Florian Grimpen
College of Engineering and Computer Science (ANU), Canberra, Australia
Nick Barnes & Hongdong Li

Corresponding author

Correspondence to Mohammad Ali Armin.

Editor information

Editors and Affiliations

University College London, London, United Kingdom
M. Jorge Cardoso
McGill University, Montreal, Québec, Canada
Tal Arbel
Xiamen University, Xiamen, China
Xiongbiao Luo
Fraunhofer IGD, Darmstadt, Hessen, Germany
Stefan Wesarg
KUKA Laboratories GmbH, Augsburg, Germany
Tobias Reichl
ICREA - Universitat Pompeu Fabra, Barcelona, Spain
Miguel Ángel González Ballester
University of Western Ontario, London, Ontario, Canada
Jonathan McLeod
Fraunhofer IGD, Darmstadt, Hessen, Germany
Klaus Drechsler
University of Western Ontario, London, Ontario, Canada
Terry Peters
Fraunhofer, Singapore, Singapore
Marius Erdt
Nagoya University, Nagoya, Japan
Kensaku Mori
Children's National Health System, Washington, DC, USA
Marius George Linguraru
University of Salzburg, Salzburg, Austria
Andreas Uhl
Fraunhofer IGD, Darmstadt, Germany
Cristina Oyarzun Laura
Children's National Health System, Washington, DC, USA
Raj Shekhar

1 Electronic Supplementary Material

Supplementary material 1 (AVI 16369 kb)

About this paper

Cite this paper

Armin, M.A., Barnes, N., Alvarez, J., Li, H., Grimpen, F., Salvado, O. (2017). Learning Camera Pose from Optical Colonoscopy Frames Through Deep Convolutional Neural Network (CNN). In: Cardoso, M., et al. Computer Assisted and Robotic Endoscopy and Clinical Image-Based Procedures. CARE 2017, CLIP 2017. Lecture Notes in Computer Science, vol 10550. Springer, Cham. https://doi.org/10.1007/978-3-319-67543-5_5

DOI: https://doi.org/10.1007/978-3-319-67543-5_5
Published: 08 September 2017
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67542-8
Online ISBN: 978-3-319-67543-5
eBook Packages: Computer Science, Computer Science (R0)
