Abstract
Supervised machine learning techniques require large amounts of annotated training data to attain good performance. Active learning aims to ease the data collection process by automatically detecting which instances an expert should annotate in order to train a model as quickly and effectively as possible. Such strategies have previously been reported for medical imaging, but not for focal pathologies, where class imbalance is high and background appearance is heterogeneous. In this study we evaluate different data selection approaches (random, uncertainty, and representative sampling) and a semi-supervised model training procedure (pseudo-labelling) in the context of lung nodule segmentation in CT volumes from the publicly available LIDC-IDRI dataset. We find that active learning strategies allow us to train a model with equal performance using less than half of the annotation effort; data selection by uncertainty sampling offers the largest gain, with the incorporation of representativeness or the addition of pseudo-labelling giving further small improvements. We conclude that active learning is a valuable tool and that further development of these strategies can play a key role in making diagnostic algorithms viable.
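As a rough illustration of the uncertainty sampling idea named in the abstract, the sketch below ranks unlabelled samples by the mean per-pixel entropy of a model's predicted foreground probabilities and picks the top k for annotation. The abstract does not state which uncertainty measure the authors used, so the entropy criterion, the function names, and the toy probability values are all assumptions made for this example.

```python
import math


def binary_entropy(p):
    """Entropy (in nats) of a Bernoulli variable with foreground probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1.0 - p) * math.log(1.0 - p))


def uncertainty_score(prob_map):
    """Mean per-pixel entropy of one flattened probability map (assumed criterion)."""
    return sum(binary_entropy(p) for p in prob_map) / len(prob_map)


def select_most_uncertain(unlabelled, k):
    """Return the ids of the k samples whose predictions are most uncertain.

    unlabelled: dict mapping a sample id to a flat list of predicted
    foreground probabilities (hypothetical data layout).
    """
    ranked = sorted(
        unlabelled,
        key=lambda sid: uncertainty_score(unlabelled[sid]),
        reverse=True,
    )
    return ranked[:k]


# Toy, made-up predictions for three image patches.
predictions = {
    "patch_a": [0.01, 0.02, 0.99, 0.98],  # confident everywhere
    "patch_b": [0.45, 0.55, 0.50, 0.60],  # close to 0.5 everywhere
    "patch_c": [0.10, 0.90, 0.20, 0.80],  # moderately confident
}
to_annotate = select_most_uncertain(predictions, k=2)
```

In an active learning loop, the selected samples would be sent to the expert, added to the labelled pool, and the model retrained before the next selection round. Averaging entropy over all pixels is only one possible aggregation; with small focal targets such as nodules, the score may be dominated by background pixels.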
D. Zotova and A. Lisowska—Equal contribution
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Zotova, D., Lisowska, A., Anderson, O., Dilys, V., O’Neil, A. (2019). Comparison of Active Learning Strategies Applied to Lung Nodule Segmentation in CT Scans. In: Zhou, L., et al. Large-Scale Annotation of Biomedical Data and Expert Label Synthesis and Hardware Aware Learning for Medical Imaging and Computer Assisted Intervention. LABELS 2019, HAL-MICCAI 2019, CuRIOUS 2019. Lecture Notes in Computer Science, vol 11851. Springer, Cham. https://doi.org/10.1007/978-3-030-33642-4_1
DOI: https://doi.org/10.1007/978-3-030-33642-4_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33641-7
Online ISBN: 978-3-030-33642-4
eBook Packages: Computer Science, Computer Science (R0)