Comparison of Active Learning Strategies Applied to Lung Nodule Segmentation in CT Scans
Abstract
Supervised machine learning techniques require large amounts of annotated training data to attain good performance. Active learning aims to ease the data collection process by automatically identifying which instances an expert should annotate so that a model can be trained as quickly and effectively as possible. Such strategies have previously been reported for medical imaging, but for tasks other than focal pathologies, where class imbalance is high and background appearance is heterogeneous. In this study we evaluate different data selection approaches (random, uncertain, and representative sampling) and a semi-supervised model training procedure (pseudo-labelling) in the context of lung nodule segmentation in CT volumes from the publicly available LIDC-IDRI dataset. We find that active learning strategies allow us to train a model with equal performance using less than half of the annotation effort. Data selection by uncertainty sampling yields the largest gain, and incorporating representativeness or adding pseudo-labelling gives further small improvements. We conclude that active learning is a valuable tool and that further development of these strategies can play a key role in making diagnostic algorithms viable.
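As a rough illustration of how uncertainty sampling and pseudo-labelling can share one scoring step, the sketch below ranks an unlabelled pool by mean per-pixel binary entropy of the predicted foreground probabilities. The abstract does not state which uncertainty measure, thresholds, or batch sizes were used, so the entropy score, the 0.5 mask threshold, and the names `mean_entropy` and `select_samples` are all assumptions made for illustration, not the study's implementation.

```python
import numpy as np

def mean_entropy(probs, eps=1e-7):
    """Mean per-pixel binary entropy per image; probs has shape (n, H, W).

    Assumption: entropy stands in for whatever uncertainty measure is used.
    """
    p = np.clip(probs, eps, 1.0 - eps)
    ent = -(p * np.log(p) + (1.0 - p) * np.log(1.0 - p))
    return ent.reshape(len(probs), -1).mean(axis=1)

def select_samples(probs, n_query, n_pseudo, threshold=0.5):
    """Split an unlabelled pool into expert queries and pseudo-labelled samples.

    Hypothetical helper: the most uncertain images go to the expert, and the
    most confident ones receive thresholded predictions as pseudo-labels.
    """
    scores = mean_entropy(probs)
    order = np.argsort(scores)            # ascending: most confident first
    query_idx = order[::-1][:n_query]     # most uncertain images
    pseudo_idx = order[:n_pseudo]         # most confident images
    pseudo_masks = (probs[pseudo_idx] > threshold).astype(np.uint8)
    return query_idx, pseudo_idx, pseudo_masks

# Toy pool of four 2x2 "predictions" with constant foreground probability.
pool = np.stack([np.full((2, 2), v) for v in (0.5, 0.99, 0.7, 0.02)])
query_idx, pseudo_idx, pseudo_masks = select_samples(pool, n_query=2, n_pseudo=1)
```

In this toy pool the images with probabilities nearest 0.5 are queried, while the most confident image is pseudo-labelled. In practice `n_query + n_pseudo` should not exceed the pool size, otherwise the two index sets overlap.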
Keywords
Active learning; Lung nodule segmentation; Pseudo-labelling