Objective assessment of intraoperative technical skill in capsulorhexis using videos of cataract surgery

  • Tae Soo Kim
  • Molly O’Brien
  • Sidra Zafar
  • Gregory D. Hager
  • Shameema Sikder
  • S. Swaroop Vedula
Original Article



Purpose

Objective assessment of intraoperative technical skill is necessary for technology to improve patient care through surgical training. Our objective in this study was to develop and validate deep learning techniques for technical skill assessment using videos of the surgical field.


Methods

We used a data set of 99 videos of capsulorhexis, a critical step in cataract surgery. One expert surgeon annotated each video for technical skill using a standard structured rating scale, the International Council of Ophthalmology's Ophthalmology Surgical Competency Assessment Rubric: phacoemulsification (ICO-OSCAR:phaco). Using the two capsulorhexis indices in this scale (commencement of flap and follow-through; formation and completion), we labeled a performance as expert when at least one index was 5 and the other was at least 4, and as novice otherwise. In addition, we used the scores for capsulorhexis commencement and capsulorhexis formation as separate ground truths (Likert scale of 2 to 5; analyzed as 2/3, 4, and 5). We crowdsourced annotations of instrument tips. We separately modeled instrument trajectories and optical flow using temporal convolutional neural networks to predict a skill class (expert/novice) and a score on each capsulorhexis item in ICO-OSCAR:phaco. We evaluated the algorithms with five-fold cross-validation and computed accuracy and area under the receiver operating characteristic curve (AUC).
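The modeling approach described above — a temporal convolutional network over per-frame features (instrument tip trajectories or optical flow summaries), pooled over time into a binary expert/novice prediction — can be sketched schematically. The sketch below uses NumPy purely for illustration; the layer sizes, kernel widths, and two-instrument input are hypothetical, and the authors' actual models were trained neural networks, not random weights.

```python
import numpy as np

def temporal_conv1d(x, kernels, bias):
    """Valid 1D convolution over the time axis, followed by ReLU.
    x: (T, C_in) sequence; kernels: (C_out, k, C_in); bias: (C_out,)."""
    c_out, k, c_in = kernels.shape
    T = x.shape[0]
    out = np.empty((T - k + 1, c_out))
    for t in range(T - k + 1):
        window = x[t:t + k]  # (k, C_in) slice of the sequence
        out[t] = np.tensordot(kernels, window, axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0)

def tcn_classify(sequence, layers, w_out, b_out):
    """Stack temporal conv layers, global-average-pool over time,
    then apply a linear layer producing expert/novice logits."""
    h = sequence
    for kernels, bias in layers:
        h = temporal_conv1d(h, kernels, bias)
    pooled = h.mean(axis=0)        # collapse the time axis: (C,)
    return pooled @ w_out + b_out  # (2,) class logits

rng = np.random.default_rng(0)
# Hypothetical input: 120 frames of (x, y) tip positions for two instruments.
traj = rng.normal(size=(120, 4))
layers = [(rng.normal(size=(8, 5, 4)) * 0.1, np.zeros(8)),
          (rng.normal(size=(16, 5, 8)) * 0.1, np.zeros(16))]
logits = tcn_classify(traj, layers, rng.normal(size=(16, 2)) * 0.1, np.zeros(2))
print(logits.shape)  # (2,)
```

The same forward structure applies whether the per-frame features are crowdsourced tip coordinates or pooled optical flow fields; only the input channel count changes.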


Results

The accuracy and AUC were 0.848 and 0.863 for instrument tip velocities, and 0.634 and 0.803 for optical flow fields, respectively.
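The two evaluation metrics reported above are standard and can be computed directly; the sketch below implements accuracy and AUC (via the rank-based Mann–Whitney formulation, where AUC is the probability that a randomly chosen expert performance is scored above a randomly chosen novice performance). The toy labels and scores are illustrative only, not the study's data.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of videos whose predicted skill class matches the label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def auc(y_true, scores):
    """Area under the ROC curve via pairwise comparisons:
    count expert-vs-novice score pairs won by the expert (ties count half)."""
    y = np.asarray(y_true, dtype=bool)
    s = np.asarray(scores, dtype=float)
    pos, neg = s[y], s[~y]
    wins = (pos[:, None] > neg[None, :]).sum() \
        + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return float(wins / (len(pos) * len(neg)))

# Hypothetical labels (1 = expert) and model scores for six videos.
y_true = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.4, 0.2, 0.3, 0.6]
preds = [int(s >= 0.5) for s in scores]
print(round(accuracy(y_true, preds), 3), round(auc(y_true, scores), 3))
# → 0.667 0.778
```

Unlike accuracy, AUC is threshold-free, which is why the optical flow model can show a large gap between the two (0.634 vs. 0.803): its scores rank performances well even where the 0.5 decision threshold is poorly calibrated.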


Conclusion

Deep neural networks effectively model surgical technical skill in capsulorhexis given structured representations of intraoperative data, such as optical flow fields extracted from video or crowdsourced instrument localization.


Keywords: Surgical skill assessment · Neural networks · Deep learning · Capsulorhexis · Cataract surgery · Tool trajectories · Crowdsourcing



Acknowledgements

Dr. Anand Malpani advised on crowdsourcing the instrument annotations, and Adit Murali assisted with cleaning the data.


This study was supported by funds from the Wilmer Eye Institute Pooled Professor's Fund (PI: Dr. Sikder), an unrestricted research grant to the Wilmer Eye Institute from Research to Prevent Blindness, and a research grant from The Mitchell Jr. Trust (PI: Dr. Sikder).

Compliance with ethical standards

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the Institutional Review Board and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards.

Informed consent

Informed consent was obtained from all individual participants included in the study.



Copyright information

© CARS 2019

Authors and Affiliations

  • Tae Soo Kim (1)
  • Molly O’Brien (1)
  • Sidra Zafar (2)
  • Gregory D. Hager (1)
  • Shameema Sikder (2)
  • S. Swaroop Vedula (1)

  1. Johns Hopkins University, Baltimore, USA
  2. Wilmer Eye Institute, Johns Hopkins University, Baltimore, USA
