Accurate Identification of Tomograms of Lung Nodules Using CNN: Influence of the Optimizer, Preprocessing and Segmentation

Loeza Mejía, Cecilia Irene; Biswal, R. R.; Rodriguez-Tello, Eduardo; Ochoa-Ruiz, Gilberto

doi:10.1007/978-3-030-49076-8_23

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 12088)

Included in the following conference series:

Mexican Conference on Pattern Recognition

Abstract

The diagnosis of pulmonary nodules plays an important role in the treatment of lung cancer, so improving diagnosis is a primary concern. This article compares results in the identification of computed tomography scans with pulmonary nodules using a CNN with different optimizers (Adam and Nadam); the effect of preprocessing and segmentation techniques is also explored. The publicly available Lung TIME dataset was used. When neither preprocessing nor segmentation was applied, training accuracy above 90.24% and test accuracy above 86.8% were obtained. When segmentation was applied without preprocessing, training accuracy above 97.19% and test accuracy above 95.07% were reached. When both preprocessing and segmentation were applied, training accuracy above 96.41% and test accuracy above 94.71% were achieved. On average, the Adam optimizer scored a training accuracy of 96.17% and a test accuracy of 95.23%, whereas the Nadam optimizer obtained 96.25% and 95.2%, respectively. It is concluded that the CNN performs well even when working with noisy images. The performance of the network with preprocessing and segmentation was similar to its performance with segmentation alone. It can also be inferred that applying preprocessing and segmentation is an excellent option when the accuracy of a CNN needs to be improved.

1 Introduction

There has been remarkable growth in the use of machine learning techniques in medical research, mainly applied to genetics [1], disease detection, and biomedical image segmentation [2, 3] and classification, showing the efficacy of machine learning in clinical decision and monitoring systems [4]. The use of convolutional neural networks (CNNs) in deep learning has helped in the automatic detection of various diseases, particularly through the processing of biomedical images and clinical data. Recently, CNN research related to lung cancer has focused on the automatic diagnosis of cancer [5, 6], lung segmentation [7, 8, 9], segmentation of pulmonary nodules [10, 11, 12, 13], lung nodule detection [14, 15], cancer classification [16], nodule categorization [17] and nodule malignancy assessment [18, 19, 20, 21, 22, 23, 24, 25, 26].

Various investigations related to lung nodules report the influence of data augmentation [14, 24, 26], the number of input channels [20] and the use of dropout [8, 14, 18, 20, 21, 24, 26] on improving the accuracy of the network and avoiding overfitting. Other researchers report the influence of the number of parameters [20, 23] and the training time [20]. Nonetheless, the use of preprocessing and segmentation has been little explored; the same applies to the effect of the available optimizers.

The main goal of this investigation is to evaluate the influence of the optimizer (Adam [27] and Nadam [28]), preprocessing and segmentation in a CNN for the precise identification of tomograms with pulmonary nodules. The evaluation considered both accuracy and training time. The experiments were carried out on the publicly available Lung TIME [29] dataset. The remainder of the paper is organized as follows: Sect. 2 describes the materials and methods used, Sect. 3 discusses the results obtained, and Sect. 4 presents the conclusions and future work.

2 Materials and Methods

Three scenarios were considered for identifying tomograms with lung nodules using a convolutional neural network (see Fig. 1, where yellow indicates the first scenario, blue the second and green the third): (i) rescale the tomograms to \(96\times 96\) pixels and pass them as input to the CNN; (ii) segment the tomograms to obtain the pulmonary regions, rescale them, and pass them as input to the CNN; (iii) preprocess the tomograms by applying filters (median and Gaussian), binarize the preprocessed image, rescale it, and pass it as input to the CNN. The tomograms were downsampled to decrease the training time.
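The rescaling step shared by all three scenarios can be sketched as follows. This is only an illustration: the source does not state the original slice size or the interpolation method, so the \(512\times 512\) input and the nearest-neighbour sampling used here are assumptions, and `downsample` is a hypothetical helper name.

```python
import numpy as np

TARGET_SIZE = 96  # side length used in the paper


def downsample(tomogram, size=TARGET_SIZE):
    """Nearest-neighbour downsampling of a 2D slice to size x size pixels.

    Assumption: the paper does not name its interpolation method; nearest
    neighbour is chosen here only to keep the sketch dependency-free.
    """
    rows = (np.arange(size) * tomogram.shape[0] / size).astype(int)
    cols = (np.arange(size) * tomogram.shape[1] / size).astype(int)
    return tomogram[np.ix_(rows, cols)]


# Assumed 512 x 512 input slice (typical for CT, not stated in the source).
example_slice = np.arange(512 * 512, dtype=float).reshape(512, 512)
small = downsample(example_slice)
reduction = example_slice.size / small.size  # ratio of pixel counts
```

Under the assumed input size, each image shrinks from 262144 to 9216 pixels, which is where the reduction in training time would come from.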

2.1 Dataset Used in the Study

In this study, CT thorax scans in DICOM format from Lung TIME [29], with annotations of the pulmonary nodules in XML format, were used. 62 CT thorax scans were chosen, comprising 2003 tomograms with nodules and 12934 without nodules. To validate the results, 70% of the tomograms were randomly selected for training and the rest were used for testing.
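A random 70/30 split of this kind can be sketched as below. The counts come from the text; the seed, the helper name `split_indices` and the choice to split over individual tomograms (rather than over scans, and without stratification) are assumptions, since the source gives no further detail.

```python
import numpy as np

N_WITH_NODULES = 2003      # from the text
N_WITHOUT_NODULES = 12934  # from the text


def split_indices(n_samples, train_fraction=0.7, seed=0):
    """Return shuffled (train, test) index arrays for a random split."""
    rng = np.random.default_rng(seed)  # seed is an arbitrary assumption
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]


# Label 1 = tomogram with nodules, label 0 = tomogram without nodules.
labels = np.concatenate([
    np.ones(N_WITH_NODULES, dtype=int),
    np.zeros(N_WITHOUT_NODULES, dtype=int),
])
train_idx, test_idx = split_indices(labels.size)
nodule_fraction = labels.mean()  # share of tomograms that contain nodules
```

If all 14937 tomograms are used, this gives 10456 training and 4481 test tomograms, with about 13.4% of the tomograms containing nodules.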

2.2 Preprocessing

To improve the quality of the tomograms, a median filter followed by a Gaussian filter was applied, as discussed in [31], to eliminate salt-and-pepper noise and mottled noise from the image. The median filter mask was \(5\times 5\) pixels, and the standard deviation of the Gaussian kernel was 2.
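A minimal sketch of this two-stage filter using `scipy.ndimage` is given below. The mask size and standard deviation are those stated above; the function name `preprocess`, the float conversion and the default boundary handling are assumptions rather than details from the source.

```python
import numpy as np
from scipy import ndimage


def preprocess(tomogram, median_size=5, gaussian_sigma=2.0):
    """Apply a 5x5 median filter, then a Gaussian filter with sigma = 2."""
    image = np.asarray(tomogram, dtype=float)
    # The median filter removes isolated impulses (salt-and-pepper noise).
    denoised = ndimage.median_filter(image, size=median_size)
    # The Gaussian filter then smooths the remaining fine-grained noise.
    return ndimage.gaussian_filter(denoised, sigma=gaussian_sigma)


# A flat image with a single bright impulse standing in for "salt" noise.
noisy = np.full((32, 32), 100.0)
noisy[16, 16] = 1000.0
filtered = preprocess(noisy)
```

In this toy case the impulse is a single outlier inside every \(5\times 5\) window that covers it, so the median filter removes it completely before the Gaussian smoothing is applied.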

2.3 Segmentation

Thresholding was chosen to perform the segmentation. Thresholding is a simple and efficient technique for partitioning an image into foreground and background [30]. According to Alakwaa et al. [16], it produces better lung segmentation than clustering techniques (K-means and Mean Shift) and Watershed. Binarization was performed with a threshold of −350 HU, as suggested by Pulagam et al. [32], to separate the pulmonary region in the tomogram. Finally, the components connected to the edge of the binarized image were removed.
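The two segmentation steps can be sketched as follows. Several points are assumptions: that the slice is already expressed in Hounsfield units, that pixels below the threshold (air-filled regions) form the foreground, and that 4-connectivity is used. The authors used scikit-image, which provides `skimage.segmentation.clear_border` for the second step; the plain flood fill below is a stand-in chosen to keep the sketch dependency-free, and `segment_lungs` is a hypothetical name.

```python
from collections import deque

import numpy as np

LUNG_THRESHOLD_HU = -350  # threshold stated in the text


def segment_lungs(hu_slice, threshold=LUNG_THRESHOLD_HU):
    """Binarize a slice and drop components that touch the image border."""
    mask = hu_slice < threshold  # assumption: low-HU pixels are foreground
    rows, cols = mask.shape
    keep = mask.copy()
    # Seed a flood fill with every foreground pixel lying on the border.
    seeds = [
        (r, c)
        for r in range(rows)
        for c in range(cols)
        if (r in (0, rows - 1) or c in (0, cols - 1)) and mask[r, c]
    ]
    for r, c in seeds:
        keep[r, c] = False
    queue = deque(seeds)
    # Remove everything 4-connected to the border seeds.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and keep[nr, nc]:
                keep[nr, nc] = False
                queue.append((nr, nc))
    return keep


# Synthetic slice: air outside the body, soft tissue, and a small "lung".
hu = np.full((20, 20), -1000.0)
hu[3:17, 3:17] = 0.0
hu[6:10, 6:10] = -800.0
lung_mask = segment_lungs(hu)
```

In the synthetic slice, the air surrounding the body passes the threshold but touches the border, so only the enclosed low-density region survives.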

Table 1. CNN architecture

2.4 CNN

The layers of the CNN architecture are described in Table 1. The network consists of multiple convolutional layers with ReLU activation, max pooling, flatten and dense layers, and a final fully connected softmax layer that classifies tomograms as with or without nodules. Table 2 shows the CNN architecture with a Dropout layer, which randomly ignores neurons during training [33]. Both architectures were tested with the Adam [27] and Nadam [28] optimizers. A batch size of 32, 5 epochs and a sparse categorical crossentropy loss function [34] were used.
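To make the difference between the two optimizers concrete, the sketch below implements one parameter update for each in NumPy. It uses a commonly cited simplified form of Nadam, in which the bias-corrected momentum is combined with the current gradient as a Nesterov-style look-ahead; library implementations may differ in detail. The hyperparameter defaults and the toy objective \(f(x) = x^2\) are assumptions for illustration and are not settings reported in the paper.

```python
import numpy as np


def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; t is the 1-based step counter."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v


def nadam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Nadam update (simplified form): Adam with a Nesterov look-ahead."""
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Blend the corrected momentum with the current gradient.
    lookahead = beta1 * m_hat + (1 - beta1) * grad / (1 - beta1 ** t)
    theta = theta - lr * lookahead / (np.sqrt(v_hat) + eps)
    return theta, m, v


def minimize(step, x0=5.0, steps=500, lr=0.1):
    """Minimize the toy objective f(x) = x**2 with the given update rule."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * x  # derivative of x**2
        x, m, v = step(x, grad, m, v, t, lr=lr)
    return x


adam_result = minimize(adam_step)
nadam_result = minimize(nadam_step)
```

The only difference between the two functions is the look-ahead term, so any difference in accuracy or runtime between the optimizers in the experiments stems from how that term changes the update direction.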

Table 2. CNN architecture with Dropout

2.5 Computer Equipment

To implement the CNN, TensorFlow 2.0 was used with Python 3.7. The imageio [35] library was employed to read the DICOM images. The SciPy [36] library was used for preprocessing, and the scikit-image [37] library for segmentation of the tomograms. The equipment on which the tests were performed has the following characteristics:

Operating System: Windows 10 Home 64-bit (Build 18362)
Processor: Intel(R) Core(TM) i3-5015U CPU @ 2.10 GHz
Memory: 6 GB

3 Experimental Results and Analysis

Figure 2 shows an example of the application of the filters (first the median filter and then the Gaussian filter) to the tomograms. Preprocessing significantly improves image quality by reducing both salt-and-pepper noise and mottled noise in the images.

Figure 3 shows examples of binarization of the tomograms. Segmentation isolated the pulmonary region, which improved the performance of the CNN.

Table 3. Results obtained from the comparison experiments without using Dropout

Table 3 summarizes the experiments performed without Dropout, while Table 4 reports the experiments carried out with a Dropout rate of 0.0002. Tests were performed with and without preprocessing, with and without segmentation, and with different numbers of tomograms, and performance was compared between the Adam and Nadam optimizers. Segmentation yielded better results, although the execution time increased. In most tests without preprocessing (both with and without the Dropout layer), the Nadam optimizer gave better results and a shorter runtime. When Dropout was not applied and preprocessing was performed, the Nadam optimizer in some cases increased the runtime compared to the Adam optimizer. Therefore, when the Dropout layer is not used, the Nadam optimizer is recommended for images that have not been preprocessed, and the Adam optimizer for images that have been preprocessed.

Table 4. Results obtained from the comparison experiments using Dropout

Figure 4 shows the average training and test accuracy in the experiments performed. On average, the Adam optimizer obtained a training accuracy of 96.17%, a test accuracy of 95.23% and a training time of 31.95 min on \(96\times 96\) pixel images. The Nadam optimizer obtained 96.25%, 95.2% and 33.23 min, respectively. Nadam thus gave a slightly higher average training accuracy than Adam, while the average test accuracies were nearly identical and Nadam's average training time was slightly longer. In addition, accuracy using only segmentation is better than when segmentation is combined with preprocessing.

4 Conclusions

An experimental analysis of the influence of preprocessing, segmentation and the optimizer was performed on images from the Lung TIME dataset resized to \(96\times 96\) pixels. It is concluded that convolutional neural networks have excellent performance in the identification of tomograms with nodules, obtaining training accuracy above 90.24% and test accuracy above 86.8% even when working with noisy images. When working with CT thorax scans, it is suggested that segmentation be performed without preprocessing, since better results were observed in this case (training accuracy above 97.19% and test accuracy above 95.07%) than when applying both preprocessing and segmentation (training accuracy above 96.41% and test accuracy above 94.71%). In addition, the use of preprocessing significantly increases runtime. On average, the Adam optimizer obtained a training accuracy of 96.17%, a test accuracy of 95.23% and a training time of 31.95 min. In contrast, the Nadam optimizer obtained 96.25%, 95.2% and 33.23 min, respectively. When Dropout is not applied and preprocessing is performed, the Adam optimizer is recommended; conversely, the Nadam optimizer is recommended when no preprocessing is performed on the tomogram. Applying segmentation is an excellent option when accurate results are required. The model obtained can be used as part of a computer-assisted diagnostic system in lung cancer research. As future work, locating the nodules in the tomograms identified is proposed. In addition, it would be interesting to compare the performance of different preprocessing techniques.

References

Holder, L.B., Haque, M.M., Skinner, M.K.: Machine learning for epigenetics and future medical applications. Epigenetics 12(7), 505–514 (2017)
Lenchik, L., et al.: Automated segmentation of tissues using CT and MRI: a systematic review. Acad. Radiol. 26(12), 1695–706 (2019)
Rizwan-i-Haque, I., Neubert, J.: Deep learning approaches to biomedical image segmentation. Inf. Med. Unlocked 18, 100297 (2020)
Zhang, Z., Sejdić, E.: Radiological images and machine learning: trends, perspectives, and prospects. Comput. Biol. Med. 108, 354–370 (2019)
Polat, H., Mehr, H.: Classification of pulmonary CT images by using hybrid 3D-Deep convolutional neural network architecture. Appl. Sci. 9(5), 940 (2019)
Simie, E., Kaur, M.: Lung cancer detection using convolutional neural network (CNN). Int. J. Adv. Res. Ideas Innov. Technol. 5(4), 284–292 (2019)
Zhu, J., Zhang, J., Qiu, B., Liu, Y., Liu, X., Chen, L.: Comparison of the automatic segmentation of multiple organs at risk in CT images of lung cancer between deep convolutional neural network based and atlas-based techniques. Acta Oncol. 58(2), 257–264 (2019)
Abdullah-Al-Zubaer, I., Hatamizadeh, A., Ananth, S.P., Ding, X., Tajbakhsh, N., Terzopoulos, D.: Fast and automatic segmentation of pulmonary lobes from chest CT using a progressive dense V-network. Comput. Methods Biomech. Biomed. Eng. Imaging Vis., 1–10 (2019)
Geng, L., Zhang, S., Tong, J., Xiao, Z.: Lung segmentation method with dilated convolution based on VGG-16 network. Comput. Assist. Surg. 24(S2), 27–33 (2019)
Hamidian, S., Sahiner, B., Petrick, N., Pezeshk, A.: 3D convolutional neural network for automatic detection of lung nodules in chest CT. In: Proceedings SPIE International Society for Optical Engineering (2017)
Dey, R., Lu, Z., Hong, Y.: Diagnostic classification of lung nodules using 3D neural networks. In: IEEE International Symposium on Biomedical Imaging (2018)
Tong, G., Li, Y., Chen, H., Zhang, Q., Jiang, H.: Improved U-NET network for pulmonary nodules segmentation. Optik - Int. J. Light Electron Opt. 174, 460–469 (2018)
Huang, X., Sun, W., Tseng, T., Li, C., Qian, W.: Fast and fully-automated detection and segmentation of pulmonary nodules in thoracic CT scans using deep convolutional neural networks. Comput. Med. Imaging Graph. 74, 25–36 (2019)
Setio, A., et al.: Pulmonary nodule detection in CT images: false positive reduction using multi-view convolutional networks. IEEE Trans. Med. Imaging 35(5), 1160–1169 (2016)
Xie, H., Yang, D., Sun, N., Chen, Z., Zhang, Y.: Automated pulmonary nodule detection in CT images using deep convolutional neural networks. Pattern Recogn. 85, 109–119 (2019)
Alakwaa, W., Nassef, M., Badr, A.: Lung cancer detection and classification with 3D convolutional neural network (3D-CNN). Int. J. Adv. Comput. Sci. Appl. (IJACSA) 8(8), 99–110 (2017)
Tu, X., et al.: Automatic categorization and scoring of solid, part-solid and non-solid pulmonary nodules in CT images with convolutional neural network. Nature 7(1–10), 8533 (2017)
Tajbakhsh, N., Suzuki, K.: Comparing two classes of end-to-end machine-learning models in lung nodule detection and classification: MTANNs vs CNNs. Pattern Recogn. 63(2017), 476–486 (2017)
Yan, X., et al.: Classification of lung nodule malignancy risk on computed tomography images using convolutional neural network: a comparison between 2D and 3D strategies. In: ACCV 2016. LNCS, vol. 10118, pp. 91–101. Springer, Heidelberg (2017). https://doi.org/10.1007/978-3-319-54526-4_7
Kang, G., Liu, K., Hou, B., Zhang, N.: 3D multi-view convolutional neural networks for lung nodule classification. PLoS ONE 12(11), e0188290 (2017)
Zhao, X., Liu, L., Qi, S., Teng, Y., Li, J., Qian, W.: Agile convolutional neural network for pulmonary nodule classification using CT images. Int. J. Comput. Assist. Radiol. Surg. 13(4), 585–595 (2018). https://doi.org/10.1007/s11548-017-1696-0
Causey, J., et al.: Highly accurate model for prediction of lung nodule malignancy with CT scans. Nature 8(1–12), 9286 (2018)
Liu, Y., Hao, P., Zhang, P., Xu, X., Wu, J., Chen, W.: Dense convolutional binary-tree networks for lung nodule classification. IEEE Access 30(6), 49080–49088 (2018)
Gruetzemacher, R., Gupta, A., Paradice, D.: 3D deep learning for detecting pulmonary nodules in CT scans. J. Am. Med. Inf. Assoc. 25(10), 1301–1310 (2018)
Zia, M.B., Juan, Z.J., Rehman, Z.U., Javed, K., Rauf, S.A., Khan, A.: The utilization of consignable multi-model in detection and classification of pulmonary nodules. Int. J. Comput. Appl. 177(27), 0975–8887 (2019)
Onishi, Y., et al.: Automated pulmonary nodule classification in computed tomography images using a deep convolutional neural network trained by generative adversarial networks. J. 2(5), 99–110 (2019)
Kingma, D., Ba, J.: Adam: A method for stochastic optimization. In: 3rd International Conference for Learning Representations, San Diego (2015)
Dozat, T.: Incorporating Nesterov momentum into Adam. In: International Conference on Learning Representations (2016)
Dolejsi, M., Kybic, J., Polovincak, M., Tuma, S.: The Lung TIME: annotated lung nodule dataset and nodule detection framework. In: Proceedings SPIE 7260, Medical Imaging 2009: Computer-Aided Diagnosis, vol. 7260 (2009)
Khan, S., Hussain, S., Yang, S., Iqbal, K.: Effective and reliable framework for lung nodules detection from CT scan images. Nature 9, 1–4 (2019)
Makaju, S., Prasad, P., Alsadoon, A., Singh, A., Elchouemi, A.: Lung cancer detection using CT scan images. Proc. Comput. Sci. 125(2018), 107–114 (2018)
Pulagam, A., Rao, V., Inampudi, R.: Automated pulmonary lung nodule detection using an optimal manifold statistical based feature descriptor and SVM classifier. Biomedical & Pharmacology Journal 10(3), 1311–1324 (2017)
Srivastava, N., et al.: Dropout: a simple way to prevent neural networks from overfitting. Journal of Machine Learning Research 15(2014), 1929–1958 (2014)
Sparse categorical crossentropy. https://www.tensorflow.org/api_docs/python/tf/keras/losses/sparse_categorical_crossentropy. Accessed 21 Feb 2019
Imageio. https://imageio.readthedocs.io/en/stable/. Accessed 21 Feb 2019
Virtanen, P., et al.: SciPy 1.0: fundamental algorithms for scientific computing in Python. Nat. Methods 17(3), 261–272 (2020)
Van der Walt, S., et al.: scikit-image: image processing in Python. PeerJ 2, e453 (2014)

Author information

Authors and Affiliations

Departamento de Posgrado, Instituto Tecnológico Superior de Misantla, Misantla, Veracruz, Mexico
Cecilia Irene Loeza Mejía
Tecnologico de Monterrey, Escuela de Ingenieria y Ciencias, Zapopan, Jalisco, Mexico
R. R. Biswal, Eduardo Rodriguez-Tello & Gilberto Ochoa-Ruiz

Corresponding author

Correspondence to R. R. Biswal.

Editor information

Editors and Affiliations

Facultad de Ciencias Físico Matemáticas, Universidad Michoacana de San Nicolás de Hidalgo, Morelia, Mexico
Karina Mariela Figueroa Mora
Facultad de Ingeniería Eléctrica, Universidad Michoacana de San Nicolás de Hidalgo, Morelia, Mexico
Juan Anzurez Marín
Facultad de Ingeniería Eléctrica, Universidad Michoacana de San Nicolás de Hidalgo, Morelia, Mexico
Jaime Cerda
Computer Science, Instituto Nacional de Astrofísica, Óptica y Electrónica, Sta. Maria Tonantzintla, Mexico
Jesús Ariel Carrasco-Ochoa
Computer Science, Instituto Nacional de Astrofísica, Óptica y Electrónica, Sta. Maria Tonantzintla, Mexico
José Francisco Martínez-Trinidad
Faculty of Computer Science, Autonomous University of Puebla, Puebla, Mexico
José Arturo Olvera-López

About this paper

Cite this paper

Loeza Mejía, C.I., Biswal, R.R., Rodriguez-Tello, E., Ochoa-Ruiz, G. (2020). Accurate Identification of Tomograms of Lung Nodules Using CNN: Influence of the Optimizer, Preprocessing and Segmentation. In: Figueroa Mora, K., Anzurez Marín, J., Cerda, J., Carrasco-Ochoa, J., Martínez-Trinidad, J., Olvera-López, J. (eds) Pattern Recognition. MCPR 2020. Lecture Notes in Computer Science(), vol 12088. Springer, Cham. https://doi.org/10.1007/978-3-030-49076-8_23

DOI: https://doi.org/10.1007/978-3-030-49076-8_23
Published: 17 June 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-49075-1
Online ISBN: 978-3-030-49076-8
eBook Packages: Computer Science, Computer Science (R0)
