
AGNet: Attention-Guided Network for Surgical Tool Presence Detection

  • Conference paper
Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10553)

Abstract

We propose a novel approach to automatically recognize the presence of surgical tools in surgical videos, which is quite challenging due to the large variation and partial appearance of surgical tools, the complexity of surgical scenes, and the co-occurrence of multiple tools in the same frame. Inspired by the human visual attention mechanism, which first orients toward and selects important visual cues and then carefully analyzes these foci of attention, we propose to first leverage a global prediction network to obtain a set of visual attention maps and a global prediction for each tool, and then harness a local prediction network to predict the presence of tools based on these attention maps. We apply a gate function to obtain the final prediction results by balancing the global and the local predictions. The proposed attention-guided network (AGNet) achieves state-of-the-art performance on the m2cai16-tool dataset and surpasses the 2016 challenge winner by a significant margin.
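The fusion step described above — balancing the global and the local per-tool predictions with a gate — can be sketched as follows. This is a minimal illustration only: the exact form of the paper's gate function is not given here, so the learnable `gate_logit` parameter and the sigmoid-weighted combination are assumptions, not the authors' definitive implementation.

```python
import math


def sigmoid(x: float) -> float:
    """Standard logistic function, mapping a logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def fuse_predictions(global_logit: float, local_logit: float,
                     gate_logit: float) -> float:
    """Combine the global- and local-network logits for one tool.

    `gate_logit` is a hypothetical learnable parameter: its sigmoid gives
    a gate weight g in (0, 1), and the fused logit is a convex combination
    g * global + (1 - g) * local, illustrating the balancing idea.
    """
    g = sigmoid(gate_logit)                       # gate weight in (0, 1)
    fused = g * global_logit + (1.0 - g) * local_logit
    return sigmoid(fused)                         # presence probability
```

With `gate_logit = 0` the gate weight is 0.5 and the two branches contribute equally; a strongly positive (or negative) gate logit lets the network lean almost entirely on the global (or local) prediction for that tool.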



Acknowledgements

The work described in this paper was supported by the following grants from the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CUHK 14202514 and CUHK 14203115).

Corresponding author

Correspondence to Xiaowei Hu.



Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Hu, X., Yu, L., Chen, H., Qin, J., Heng, P.A. (2017). AGNet: Attention-Guided Network for Surgical Tool Presence Detection. In: Cardoso, M., et al. (eds.) Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. DLMIA ML-CDS 2017. Lecture Notes in Computer Science, vol 10553. Springer, Cham. https://doi.org/10.1007/978-3-319-67558-9_22

  • DOI: https://doi.org/10.1007/978-3-319-67558-9_22

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-67557-2

  • Online ISBN: 978-3-319-67558-9

  • eBook Packages: Computer Science (R0)
