Objective assessment of intraoperative technical skill in capsulorhexis using videos of cataract surgery
Objective assessment of intraoperative technical skill is necessary for technology-enabled improvements in surgical training and, ultimately, patient care. Our objective in this study was to develop and validate deep learning techniques for technical skill assessment using videos of the surgical field.
We used a data set of 99 videos of capsulorhexis, a critical step in cataract surgery. One expert surgeon annotated each video for technical skill using a standard structured rating scale, the International Council of Ophthalmology's Ophthalmology Surgical Competency Assessment Rubric: phacoemulsification (ICO-OSCAR:phaco). Using the two capsulorhexis items in this scale (commencement of flap and follow-through; formation and completion), we labeled a performance as expert when at least one of the two item scores was 5 and the other was at least 4, and as novice otherwise. In addition, we used the scores for capsulorhexis commencement and capsulorhexis formation as separate ground truths (Likert scale of 2 to 5; analyzed as 2/3, 4, and 5). We crowdsourced annotations of instrument tips. We separately modeled instrument trajectories and optical flow using temporal convolutional neural networks to predict the skill class (expert/novice) and the score on each capsulorhexis item in ICO-OSCAR:phaco. We evaluated the algorithms using five-fold cross-validation and computed accuracy and area under the receiver operating characteristic curve (AUC).
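The expert/novice labeling rule described above can be expressed directly in code. This is a minimal sketch; the function name `skill_label` is ours, but the rule itself (one item scored 5 and the other at least 4 implies expert) is as stated in the methods:

```python
def skill_label(commencement: int, formation: int) -> str:
    """Map the two ICO-OSCAR:phaco capsulorhexis item scores
    (each on a 2-5 Likert scale) to the binary skill class:
    'expert' if at least one item is 5 and the other is at
    least 4, 'novice' otherwise."""
    hi, lo = max(commencement, formation), min(commencement, formation)
    return "expert" if hi == 5 and lo >= 4 else "novice"
```

For example, scores of (5, 4) or (5, 5) yield "expert", while (4, 4) or (5, 3) yield "novice".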
The accuracy and AUC were 0.848 and 0.863, respectively, for instrument tip velocities, and 0.634 and 0.803 for optical flow fields.
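The AUC reported above can be computed without an explicit ROC sweep, using its equivalence to the Wilcoxon-Mann-Whitney statistic: the probability that a randomly chosen expert-labeled case receives a higher predicted score than a randomly chosen novice-labeled case. A minimal sketch (the function name `auc` and its inputs are illustrative, not the study's evaluation code):

```python
def auc(scores, labels):
    """AUC as the Wilcoxon-Mann-Whitney statistic: fraction of
    positive/negative pairs where the positive (label 1) scores
    higher than the negative (label 0), counting ties as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

For example, `auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0])` returns 1.0, since every expert case outscores every novice case.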
Deep neural networks can effectively model surgical technical skill in capsulorhexis given structured representations of intraoperative data, such as optical flow fields extracted from video or crowdsourced tool localization.
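The basic building block of the temporal convolutional networks used here is a 1-D convolution applied along the time axis of a structured signal (e.g., instrument tip velocities). The following sketch shows that operation in isolation on a scalar time series; it is not the study's architecture, just an illustration of the mechanism, with hypothetical inputs:

```python
def temporal_conv1d(seq, kernel, bias=0.0):
    """Valid 1-D convolution over a scalar time series: slide the
    kernel along time and emit one weighted sum per fully
    overlapping window."""
    k = len(kernel)
    return [sum(w * x for w, x in zip(kernel, seq[t:t + k])) + bias
            for t in range(len(seq) - k + 1)]
```

With the differencing kernel `[1, 0, -1]`, for instance, each output responds to the local change in the signal over a three-frame window; a TCN stacks many such learned filters with nonlinearities and pooling to map a whole trajectory to a skill prediction.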
Keywords: Surgical skill assessment · Neural networks · Deep learning · Capsulorhexis · Cataract surgery · Tool trajectories · Crowdsourcing
Dr. Anand Malpani advised on crowdsourcing annotation of instruments, and Adit Murali assisted with data cleaning.
This study was supported by funds from the Wilmer Eye Institute Pooled Professor's Fund (PI: Dr. Sikder), an unrestricted research grant to the Wilmer Eye Institute from Research to Prevent Blindness, and a research grant from The Mitchell Jr. Trust (PI: Dr. Sikder).
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.
All procedures performed in studies involving human participants were in accordance with the ethical standards of the Institutional Review Board and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.
Informed consent was obtained from all individual participants included in the study.