
Assisted phase and step annotation for surgical videos




Annotation of surgical videos is a time-consuming task that requires specialized knowledge. In this paper, we present and evaluate a deep learning-based method that pre-annotates the phases and steps in surgical videos and assists the user during the annotation process.


We propose a classification function that automatically detects errors and infers temporal coherence in the predictions made by a convolutional neural network. To assess the method, we first trained three different neural network architectures on two surgical procedures: cholecystectomy and cataract surgery. The proposed method was then implemented in annotation software to test its ability to assist surgical video annotation. To validate our approach, we conducted a user study in which participants annotated the phases and steps of a cataract surgery video. The annotations and completion times were recorded.
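The abstract does not specify the classification function, so the following is only an illustrative sketch of one common way to impose temporal coherence on per-frame CNN predictions: a sliding-window majority vote that also flags frames whose raw prediction disagrees with its neighbourhood. The function name `smooth_predictions`, the window size, and the integer phase labels are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import List, Tuple


def smooth_predictions(labels: List[int], window: int = 5) -> Tuple[List[int], List[int]]:
    """Majority-vote smoothing over a centered sliding window (hypothetical helper).

    Returns the smoothed label sequence and the indices of frames whose
    raw label was overridden, i.e. candidate prediction errors.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed: List[int] = []
    flagged: List[int] = []
    for i, raw in enumerate(labels):
        # The window is truncated at the start and end of the video.
        start, stop = max(0, i - half), min(len(labels), i + half + 1)
        counts = Counter(labels[start:stop])
        best = max(counts.values())
        # Keep the raw label on ties so that only clear outliers are overridden.
        winner = raw if counts[raw] == best else counts.most_common(1)[0][0]
        smoothed.append(winner)
        if winner != raw:
            flagged.append(i)
    return smoothed, flagged


# Toy per-frame predictions: phase 0 followed by phase 1, with two noisy frames.
raw_labels = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]
smoothed_labels, suspicious_frames = smooth_predictions(raw_labels, window=5)
```

In an assisted annotation tool, the smoothed sequence could serve as the pre-annotation shown to the user, while the flagged frames could be highlighted for review. A larger window removes more isolated errors but blurs the exact position of phase transitions.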


The participants who used the assistance system were 7% more accurate on the step annotation and 10 min faster than the participants who used the manual system. The results of the questionnaire showed that the assistance system did not disturb the participants and did not complicate the task.


Annotation is a difficult and time-consuming task that is essential for training deep learning algorithms. In this publication, we propose a method to assist the annotation of surgical workflows, which was validated through a user study. The proposed assistance system significantly reduced annotation time and improved annotation accuracy.






This study was supported by French state funds managed by ANR under the reference ANR-10-AIRT-07.

Author information

Correspondence to Gurvan Lecuyer.

Ethics declarations

Conflict of interest

Gurvan Lecuyer, Martin Ragot, Nicolas Martin, Laurent Launay and Pierre Jannin declare that they have no conflict of interest.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Lecuyer, G., Ragot, M., Martin, N. et al. Assisted phase and step annotation for surgical videos. Int J CARS (2020).



  • Assisted annotation
  • Surgical workflow
  • Phase recognition
  • Step recognition
  • Deep learning