1 Introduction

In recent years, the use of machine learning techniques in medical research has grown rapidly, mainly in genetics [1], disease detection, and biomedical image segmentation [2, 3] and classification, demonstrating the efficacy of machine learning in clinical decision-making and monitoring systems [4]. In deep learning, convolutional neural networks (CNNs) have helped in the automatic detection of various diseases, particularly through the processing of biomedical images and clinical data. Recent CNN research related to lung cancer has focused on the automatic diagnosis of cancer [5, 6], lung segmentation [7,8,9], segmentation of pulmonary nodules [10,11,12,13], lung nodule detection [14, 15], cancer classification [16], nodule categorization [17] and nodule malignancy assessment [18,19,20,21,22,23,24,25,26]. Several investigations related to lung nodules report the influence of data augmentation [14, 24, 26], the number of input channels [20] and the use of dropout [8, 14, 18, 20, 21, 24, 26] on improving the accuracy of the network and avoiding overfitting. Likewise, other researchers report the influence of the number of parameters [20, 23] and training time [20]. Nonetheless, the effect of preprocessing and segmentation has been little explored; the same applies to the effect of the various available optimizers.

The main goal of this investigation is to evaluate the influence of the optimizer (Adam [27] and Nadam [28]), preprocessing and segmentation on a CNN for the precise identification of tomograms with pulmonary nodules. The evaluation considered both precision and training time. The experiments were carried out on the publicly available Lung TIME [29] dataset. The remainder of the paper is organized as follows: Sect. 2 describes the materials and methods, Sect. 3 discusses the results obtained, and Sect. 4 presents the conclusions and future work.

Fig. 1. Pipeline of the methodology (Color figure online)

2 Materials and Methods

Three scenarios were considered for identifying tomograms with lung nodules using a convolutional neural network (see Fig. 1, where yellow indicates the first scenario, blue the second and green the third): (i) the tomograms are rescaled to \(96\times 96\) pixels and passed as input to the CNN; (ii) the tomograms are segmented to obtain the pulmonary regions, rescaled, and then passed as input to the CNN; (iii) the tomograms are preprocessed by applying filters (median and Gaussian), the preprocessed images are binarized, and the result is rescaled and passed as input to the CNN. The tomograms were downsampled in order to decrease the training time.
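The rescaling step shared by the three scenarios can be illustrated with a minimal sketch. The text does not state which library or interpolation was used for rescaling, so the use of `scipy.ndimage.zoom` with bilinear interpolation, the helper name `rescale_slice` and the \(512\times 512\) input size are assumptions made only for illustration.

```python
import numpy as np
from scipy import ndimage

TARGET_SIZE = 96  # side length reported in the text


def rescale_slice(slice_2d, target=TARGET_SIZE):
    """Downsample a 2-D tomogram to roughly target x target pixels.

    Bilinear interpolation (order=1) is an assumption, not the authors' choice.
    """
    slice_2d = np.asarray(slice_2d, dtype=np.float32)
    factors = (target / slice_2d.shape[0], target / slice_2d.shape[1])
    return ndimage.zoom(slice_2d, factors, order=1)


# Synthetic 512 x 512 slice standing in for a real CT tomogram (assumed size).
demo_slice = np.random.default_rng(0).normal(size=(512, 512))
small_slice = rescale_slice(demo_slice)
print(small_slice.shape)
```

Because `zoom` rounds the output size, input sizes that do not divide evenly may need an explicit check of the resulting shape.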

2.1 Dataset Used in the Study

In this study, CT thorax scans in DICOM format from the Lung TIME [29] dataset, with annotations of the pulmonary nodules in XML format, were used. A total of 62 CT thorax scans were chosen, comprising 2003 tomograms with nodules and 12934 without nodules. To validate the results, 70% of the tomograms were randomly selected for training and the rest were used for testing.
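A random 70/30 split over the reported tomogram counts could look like the following sketch. The seed, the helper name `random_split` and the absence of stratification are assumptions; the text only states that 70% of the tomograms were selected at random. Note that, with these counts, tomograms with nodules make up about 13.4% of the 14937 tomograms.

```python
import numpy as np

N_WITH_NODULES = 2003      # tomograms with nodules (from the text)
N_WITHOUT_NODULES = 12934  # tomograms without nodules (from the text)
TRAIN_FRACTION = 0.70      # fraction used for training (from the text)


def random_split(n_samples, train_fraction=TRAIN_FRACTION, seed=0):
    """Return disjoint index arrays for training and testing.

    The fixed seed is an assumption added for reproducibility.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(round(train_fraction * n_samples))
    return order[:n_train], order[n_train:]


# Label 1 = tomogram with nodules, label 0 = tomogram without nodules.
labels = np.concatenate([
    np.ones(N_WITH_NODULES, dtype=np.int64),
    np.zeros(N_WITHOUT_NODULES, dtype=np.int64),
])
train_idx, test_idx = random_split(labels.size)
print(len(train_idx), len(test_idx))
```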

2.2 Preprocessing

To improve the quality of the tomograms, a median filter followed by a Gaussian filter was applied, as discussed in [31], to eliminate salt-and-pepper noise and mottled noise from the image. The median filter mask was \(5\times 5\) pixels, and the standard deviation of the Gaussian kernel was 2.
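The two filtering steps can be sketched with `scipy.ndimage`, using the reported \(5\times 5\) median mask and a Gaussian standard deviation of 2. The helper name `denoise_slice` and the synthetic noisy image are illustrative assumptions; the exact SciPy calls used in the study are not given in the text.

```python
import numpy as np
from scipy import ndimage


def denoise_slice(slice_2d, median_size=5, gaussian_sigma=2.0):
    """Apply a median filter and then a Gaussian filter to a 2-D slice."""
    slice_2d = np.asarray(slice_2d, dtype=np.float32)
    filtered = ndimage.median_filter(slice_2d, size=median_size)  # 5 x 5 mask
    return ndimage.gaussian_filter(filtered, sigma=gaussian_sigma)


# Synthetic flat image corrupted with salt-and-pepper noise (illustration only).
rng = np.random.default_rng(0)
noisy = np.full((64, 64), 100.0, dtype=np.float32)
noise = rng.random(noisy.shape)
noisy[noise < 0.02] = 0.0    # "pepper" pixels
noisy[noise > 0.98] = 255.0  # "salt" pixels

clean = denoise_slice(noisy)
print(noisy.std(), clean.std())
```

The median filter removes isolated outlier pixels, and the Gaussian filter then smooths the remaining fine-grained variation.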

2.3 Segmentation

Thresholding was chosen to perform the segmentation. It is a simple and efficient technique for partitioning an image into foreground and background [30]. According to Alakwaa et al. [16], it produces better lung segmentation than clustering techniques (K-means and Mean Shift) and Watershed. Binarization was performed with a threshold of −350 HU, as suggested by Pulagam et al. [32], to separate the pulmonary region in the tomogram. Finally, the components connected to the edge of the binarized image were removed.
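The thresholding and border-cleaning steps can be sketched as follows. This is a minimal illustration under stated assumptions: pixels below −350 HU are taken as foreground (the text does not say which side of the threshold is kept), and connected components are labeled with `scipy.ndimage.label` as a stand-in for the scikit-image routines used in the study (scikit-image offers `skimage.segmentation.clear_border` for the same purpose). The helper name `segment_lungs` and the synthetic slice are hypothetical.

```python
import numpy as np
from scipy import ndimage

HU_THRESHOLD = -350  # threshold reported in the text


def segment_lungs(slice_hu, threshold=HU_THRESHOLD):
    """Return a boolean mask of low-density regions not touching the border."""
    # Assumption: air-filled regions (below the threshold) are the foreground.
    binary = np.asarray(slice_hu) < threshold
    labeled, _ = ndimage.label(binary)  # 4-connected components in 2-D
    # Labels of the components that touch any edge of the image.
    border_labels = np.unique(np.concatenate([
        labeled[0, :], labeled[-1, :], labeled[:, 0], labeled[:, -1],
    ]))
    border_labels = border_labels[border_labels != 0]
    return binary & ~np.isin(labeled, border_labels)


# Synthetic slice: air around the body, soft tissue, and two lung regions.
demo = np.full((64, 64), -1000.0)  # surrounding air
demo[8:56, 8:56] = 40.0            # soft tissue
demo[20:44, 14:28] = -800.0        # first lung region
demo[20:44, 36:50] = -800.0        # second lung region

lung_mask = segment_lungs(demo)
print(int(lung_mask.sum()))
```

In the synthetic slice, the surrounding air is connected to the image edge and is therefore discarded, leaving only the two enclosed low-density regions.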

Table 1. CNN architecture

2.4 CNN

The layers of the CNN architecture are described in Table 1. The network consists of multiple convolutional layers with ReLU activation, max-pooling, flatten and dense layers, and a final fully connected softmax layer that classifies tomograms as with or without nodules. Table 2 shows the CNN architecture with the Dropout layer, which randomly ignores neurons during training [33]. Both architectures were tested with the Adam [27] and Nadam [28] optimizers. A batch size of 32, 5 epochs and a sparse categorical cross-entropy loss function [34] were used.
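A network of this general shape can be sketched with the Keras API of TensorFlow. The number of convolutional blocks, the filter counts, the kernel sizes and the width of the dense layer below are hypothetical placeholders, since the contents of Tables 1 and 2 are not reproduced here; only the layer types, the two optimizers, the loss function and the Dropout rate of 0.0002 reported in the experiments come from the text. The helper name `build_cnn` is also an assumption.

```python
import tensorflow as tf


def build_cnn(optimizer_name="adam", dropout_rate=None, input_shape=(96, 96, 1)):
    """Build and compile a small CNN for two-class tomogram classification.

    Layer sizes are hypothetical; they do not reproduce Table 1 or Table 2.
    """
    layers = tf.keras.layers
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=input_shape))
    for filters in (32, 64):  # assumed filter counts
        model.add(layers.Conv2D(filters, 3, activation="relu"))
        model.add(layers.MaxPooling2D(2))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation="relu"))  # assumed width
    if dropout_rate is not None:
        model.add(layers.Dropout(dropout_rate))     # variant with Dropout
    model.add(layers.Dense(2, activation="softmax"))  # nodule / no nodule

    optimizers = {
        "adam": tf.keras.optimizers.Adam,
        "nadam": tf.keras.optimizers.Nadam,
    }
    model.compile(
        optimizer=optimizers[optimizer_name](),
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_cnn("nadam", dropout_rate=0.0002)
model.summary()
```

Training with the reported settings would then be a call such as `model.fit(x_train, y_train, batch_size=32, epochs=5)`, where `x_train` and `y_train` are hypothetical arrays of rescaled tomograms and integer labels.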

Table 2. CNN architecture with Dropout

2.5 Computer Equipment

The CNN was implemented with TensorFlow 2.0 in Python 3.7. The imageio [35] library was employed to read the DICOM images. The SciPy [36] library was used for preprocessing, and the scikit-image [37] library for segmentation of the tomograms. The tests were performed on equipment with the following characteristics:

  • Operating System: Windows 10 Home 64-bit (Build 18362)

  • Processor: Intel(R) Core(TM) i3-5015U CPU @ 2.10 GHz

  • Memory: 6 GB

Fig. 2. (a, b) Original images of slices, and (c, d) images obtained after application of filters

3 Experimental Results and Analysis

Figure 2 shows an example of the application of the filters (first the median filter and then the Gaussian filter) to the tomograms. Preprocessing significantly increases image quality by reducing both salt-and-pepper noise and mottled noise in the images.

Figure 3 shows examples of binarized tomograms. Segmentation isolates the pulmonary region, which improved the performance of the CNN.

Fig. 3. (a, b) Original images of slices, and (c, d) binarized slices

Table 3. Results obtained from the comparison experiments without using Dropout

Table 3 summarizes the experiments performed without Dropout, while Table 4 reports the experiments carried out with a Dropout rate of 0.0002. Tests were also performed with and without preprocessing, with and without segmentation, and with different numbers of tomograms. Performance was compared between the Adam and Nadam optimizers. Segmentation gave better results, although the execution time increased. In most tests without preprocessing (both with and without the Dropout layer), the Nadam optimizer gave better results and a shorter runtime. When Dropout was not applied and preprocessing was performed, the Nadam optimizer in some cases increased the runtime compared to the Adam optimizer. Therefore, when the Dropout layer is not used, the Nadam optimizer is recommended for images that have not been preprocessed, whereas the Adam optimizer is suggested for preprocessed images.

Table 4. Results obtained from the comparison experiments using Dropout

Fig. 4. Influence of the optimizer, preprocessing and segmentation in the accurate identification of tomograms of lung nodules

Figure 4 shows the average training and test accuracy in the experiments performed. On average, the Adam optimizer obtained a training accuracy of 96.17%, a test accuracy of 95.23% and a training time of 31.95 min on \(96\times 96\) pixel images. In contrast, the Nadam optimizer obtained 96.25%, 95.2% and 33.23 min, respectively. The Nadam optimizer therefore yields slightly better average training accuracy than Adam, whereas its average test accuracy is marginally lower and its training time slightly longer. In addition, accuracy using only segmentation is better than when segmentation is combined with preprocessing.
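The size of the gap between the two optimizers can be checked directly from the averages quoted above. The short sketch below only restates that arithmetic; the variable names are illustrative.

```python
# Average results quoted in the text (accuracy in %, time in minutes).
adam = {"train_acc": 96.17, "test_acc": 95.23, "minutes": 31.95}
nadam = {"train_acc": 96.25, "test_acc": 95.2, "minutes": 33.23}

# Difference Nadam minus Adam for each quantity, rounded to two decimals.
difference = {key: round(nadam[key] - adam[key], 2) for key in adam}

# Relative increase in training time of Nadam over Adam, in percent.
relative_time_increase = round(100 * difference["minutes"] / adam["minutes"], 1)

print(difference)
print(relative_time_increase)
```

On these averages, Nadam is 0.08 percentage points higher in training accuracy, 0.03 percentage points lower in test accuracy, and 1.28 min (about 4%) slower in training.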

4 Conclusions

An experimental analysis of the influence of preprocessing, segmentation and the optimizer was performed on images from the Lung TIME dataset resized to \(96\times 96\) pixels. It is concluded that convolutional neural networks perform very well in the identification of tomograms with nodules, obtaining training accuracy above 90.24% and test accuracy above 86.8%, even when working with noisy images. When working with CT thorax scans, it is suggested that only segmentation be applied, without preprocessing, since this gave better results (training accuracy above 97.19% and test accuracy above 95.07%) than applying both preprocessing and segmentation (training accuracy above 96.41% and test accuracy above 94.71%). In addition, the use of preprocessing significantly increases runtime. On average, the Adam optimizer obtained a training accuracy of 96.17%, a test accuracy of 95.23% and a training time of 31.95 min. In contrast, the Nadam optimizer obtained 96.25%, 95.2% and 33.23 min, respectively. When Dropout is not applied and preprocessing is performed, the Adam optimizer is recommended. Conversely, the Nadam optimizer is recommended when no preprocessing is performed on the tomograms. Applying segmentation is an excellent option when accurate results are required. The model obtained can be used as part of a computer-assisted diagnostic system in lung cancer research. As future work, locating the nodules in the tomograms identified is proposed. In addition, it would be interesting to compare the performance of different preprocessing techniques.