Annotation of surgical videos is a time-consuming task that requires specific knowledge. In this paper, we present and evaluate a deep learning-based method that includes pre-annotation of the phases and steps in surgical videos and user assistance in the annotation process.
We propose a classification function that automatically detects errors and infers temporal coherence in predictions made by a convolutional neural network. First, we trained three different neural network architectures to assess the method on two surgical procedures: cholecystectomy and cataract surgery. The proposed method was then implemented in annotation software to test its ability to assist surgical video annotation. A user study was conducted to validate our approach, in which participants had to annotate the phases and the steps of a cataract surgery video. The annotations and the completion times were recorded.
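The abstract does not specify how the classification function works, so the following is only a minimal sketch of one common way to impose temporal coherence on frame-wise predictions: a sliding-window majority vote that also flags frames whose raw label was overruled. The function name `smooth_predictions`, the `window` parameter, and the integer label encoding are assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter
from typing import List, Tuple


def smooth_predictions(labels: List[int], window: int = 5) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: majority vote over a centered window of frame labels.

    Returns the smoothed label sequence and the indices of frames whose raw
    prediction disagreed with the vote (suspected errors).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed: List[int] = []
    flagged: List[int] = []
    for i, raw in enumerate(labels):
        # The window is clipped at the start and end of the video.
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])
        best = max(counts.values())
        # Keep the raw label when it ties for the most frequent one,
        # so that ties never trigger a spurious correction.
        winner = raw if counts[raw] == best else counts.most_common(1)[0][0]
        smoothed.append(winner)
        if winner != raw:
            flagged.append(i)
    return smoothed, flagged


# Assumed example: phase 0 followed by phase 1, with one isolated outlier.
raw_labels = [0, 0, 0, 2, 0, 0, 1, 1, 1, 1]
smoothed_labels, suspected_errors = smooth_predictions(raw_labels, window=5)
```

In an assisted annotation tool, the flagged indices could be shown to the annotator as frames to review, while the smoothed sequence serves as the pre-annotation. Whether the published method works this way is not stated in the abstract.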
The participants who used the assistance system were 7% more accurate on the step annotation and 10 min faster than the participants who used the manual system. The results of the questionnaire showed that the assistance system did not disturb the participants and did not complicate the task.
The annotation process is a difficult and time-consuming task that is essential for training deep learning algorithms. In this publication, we propose a method to assist the annotation of surgical workflows, which was validated through a user study. The proposed assistance system significantly reduced annotation time and improved accuracy.
This study was supported by French state funds managed by ANR under the reference ANR-10-AIRT-07.
Conflict of interest
Gurvan Lecuyer, Martin Ragot, Nicolas Martin, Laurent Launay and Pierre Jannin declare that they have no conflict of interest.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.
Informed consent was obtained from all individual participants included in the study.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Lecuyer, G., Ragot, M., Martin, N. et al. Assisted phase and step annotation for surgical videos. Int J CARS (2020). https://doi.org/10.1007/s11548-019-02108-8
- Assisted annotation
- Surgical workflow
- Phase recognition
- Step recognition
- Deep learning