Multi-task Learning for Fine-Grained Eye Disease Prediction

  • Sahil Chelaramani
  • Manish Gupta
  • Vipul Agarwal
  • Prashant Gupta
  • Ranya Habash
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 12047)


Recently, deep learning techniques have been widely used for medical image analysis. While there is some work on deep learning for ophthalmology, there is little work on multi-disease prediction from retinal fundus images, and most of it is based on small datasets. In this work, given a fundus image, we focus on three tasks related to eye disease prediction: (1) predicting one of four broad disease categories (diabetic retinopathy, age-related macular degeneration, glaucoma, and melanoma), (2) predicting one of 320 fine-grained disease sub-categories, and (3) generating a textual diagnosis. We model these three tasks in a multi-task learning setup using ResNet, a popular deep convolutional neural network architecture. Our experiments on a large dataset of 40,658 images from 3,502 patients yield ~86% accuracy for task 1, ~67% top-5 accuracy for task 2, and ~32 BLEU for the diagnosis captioning task.
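To make the multi-task setup and the top-5 metric concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes the two classification tasks are trained with softmax cross-entropy and combined as a weighted sum, and the function names and the weights `w_coarse` and `w_fine` are hypothetical. The abstract does not state how the task losses are combined, and the captioning task is left out for brevity.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


def multitask_loss(coarse_logits, coarse_target, fine_logits, fine_target,
                   w_coarse=1.0, w_fine=1.0):
    """Weighted sum of the two per-task losses.

    The weights are illustrative assumptions, not values from the paper.
    """
    return (w_coarse * cross_entropy(coarse_logits, coarse_target)
            + w_fine * cross_entropy(fine_logits, fine_target))


def top_k_accuracy(batch_logits, targets, k=5):
    """Fraction of examples whose true class is among the k highest scores."""
    hits = 0
    for logits, target in zip(batch_logits, targets):
        ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
        hits += target in ranked[:k]
    return hits / len(targets)


# Toy usage: one image scored by a 4-way coarse head and a 320-way fine head.
coarse_logits = [0.0] * 4     # stand-in outputs of the coarse head
fine_logits = [0.0] * 320     # stand-in outputs of the fine head
loss = multitask_loss(coarse_logits, 1, fine_logits, 42)
```

With uniform scores, the loss of each head reduces to the logarithm of its number of classes, which is a convenient sanity check for an untrained model. In a shared-backbone network, both sets of logits would come from heads attached to the same image features, so the summed loss updates the shared layers for both tasks at once.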


Retinal imaging · Deep learning · Multi-task learning · Convolutional Neural Networks · Ophthalmology · Diagnosis caption generation



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Microsoft, Hyderabad, India
  2. Bascom Palmer Eye Institute, Miami, USA
