1 Introduction

Glioma is the most frequent primary brain tumor. It originates from glial cells and is classified as High Grade or Low Grade depending on its aggressiveness. Gliomas may have different degrees of aggressiveness, variable prognosis and several heterogeneous histological sub-regions. These are described by varying intensity profiles across different Magnetic Resonance Imaging (MRI) modalities, which reflect diverse tumor biological properties [9]. Despite recent advances in automated algorithms for brain tumor segmentation in multimodal MRI scans, the problem remains a challenging task in medical image analysis [6, 7, 12].

Prior to the BraTS challenge, researchers tested their proposed algorithms on local datasets, and no gold standard was available for fair global evaluation of methods. The BraTS challenge provided a global platform for researchers to evaluate their algorithms on a publicly available dataset with a leaderboard. This year the BraTS challenge was divided into two parts: (1) segmentation of the brain tumor and its intra-tumor parts, and (2) prediction of the overall survival (OS) of patients, in number of days, based on imaging features.

Over the decades, many computational methods based on texture analysis, probabilistic models, active contours and random forests have been proposed for tumor segmentation [10]. Several advances were made in active contours, where either an initial seed point was specified and grown out to the tumor boundaries, or a bounding box was drawn around the abnormal region and then shrunk to the tumor boundaries. Figure 1 shows FLAIR, T1, T1ce and T2 images with the intra-tumor parts: edema, enhancing tumor and tumor core (see the figure caption for the color coding). Researchers have also proposed methods based on Non-Negative Matrix Factorization, where a data matrix generated from the MR data acted as a feature representation; this data matrix was then clustered with the Fuzzy C-means algorithm for brain tumor segmentation [14].

Fig. 1. MRI modalities with intra-tumor parts. Edema in yellow, enhancing tumor in blue and necrotic tumor in red. (Color figure online)

Deep Learning algorithms have outperformed all other state-of-the-art methods for segmentation, classification and detection applications. Researchers have proposed application-specific models in which the Convolutional Neural Network (CNN) is the basic building block of the architecture. An advantage of CNNs is that they are computationally cheaper than Fully Convolutional Networks (FCN). The success of CNNs owes much to advances in the computational power of machines, which enable the design of deep neural network models that extract rich features from an image. Researchers have proposed several pixel-classification approaches to segmentation, in which a window is considered around a pixel and the class predicted for the window is assigned to its center pixel. U-Net based models have outperformed traditional machine learning methods in biomedical image segmentation [11]. Recently, 3D CNNs have grown in popularity: they are effective for segmentation, at the expense of additional computational complexity compared to other state-of-the-art algorithms [5].

2 Method

We developed a patch-based 3D U-Net model for tumor segmentation and evaluated the efficiency of radiomic features for OS prediction; the combined pipeline is named the ‘Deep Learning Radiomics Algorithm for Gliomas (DRAG) Model’. There was a high class imbalance in the BraTS dataset between tumor voxels and the remaining normal brain voxels. This led to biased training of the model, as the loss function was dominated by normal brain voxels rather than tumor voxels, and the problem became even more challenging for intra-tumor segmentation. To overcome this issue, we adopted a patch-based training approach: fixed-size 3D patches were extracted from the BraTS dataset and used to train the network. Details of our approach are given in the sections below.

2.1 Dataset

The proposed method was trained and validated on the BraTS 2018 training and validation datasets [1,2,3]. The training dataset included 210 High Grade Glioma (HGG) cases and 75 Low Grade Glioma (LGG) cases, while the validation set consisted of 66 cases. For each case, there were four MRI sequences, viz. T1-weighted (T1), T1 with gadolinium enhancing contrast (T1ce), T2-weighted (T2) and FLAIR. All cases had been segmented manually by four raters into intra-tumor parts (tumor core, enhancing tumor and edema), and the annotations were approved by experienced neuro-radiologists. The MRI data was collected from various institutions and acquired with different protocols, magnetic field strengths and MRI scanners. Furthermore, to underline the clinical relevance of this segmentation task, BraTS 2018 also focused on the prediction of patient overall survival via analysis of radiomic features. For this purpose, survival data (in days) was provided for 163 cases in the training set and 54 cases in the validation set. The reference segmentations and OS for the validation set were hidden, and evaluation was carried out via an online evaluation portal.

2.2 Pre-processing

The MRI data in the BraTS challenge dataset was already pre-processed: it was skull-stripped, co-registered and re-sampled to 1 mm × 1 mm × 1 mm resolution. The dimensions of each volume were 240 × 240 × 155. Intensity inhomogeneity was addressed with the N4ITK tool [13]. In addition, all four MRI channels were normalized to zero mean and unit variance.
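As a concrete illustration, the per-channel zero-mean, unit-variance normalization can be written as below. Whether the statistics were computed over the full volume or only over brain voxels is not stated in the paper, so the optional `mask` argument is an assumption that covers both cases.

```python
import numpy as np

def zscore_normalize(channel, mask=None):
    """Normalize one MRI channel to zero mean and unit variance.

    channel: 3D array (240 x 240 x 155) for one modality.
    mask:    optional boolean array; if given, the mean and standard
             deviation are computed over brain voxels only (an assumption,
             since the paper does not state which voxels were used).
    """
    voxels = channel[mask] if mask is not None else channel
    return (channel - voxels.mean()) / (voxels.std() + 1e-8)
```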

2.3 Patch Extraction and Training

The proposed model is a modified version of the 3D U-Net, with three down-sampling and three up-sampling stages, each consisting of two back-to-back convolution layers with kernel size 3. 3D patches of size 64 × 64 × 64 were extracted randomly from the training data and given as input to the first layer of the model. The four patches extracted from FLAIR, T1, T2 and T1ce were concatenated into a 64 × 64 × 64 × 4 input and fed to the input layer along with the corresponding ground truth. Patch extraction was challenging because of the high class imbalance between the tumor area and normal brain tissue; care was taken to include significant tumor area in each patch, to avoid bias towards background and non-tumor voxels. This was done for all four modalities as well as the ground truth. Each convolution layer was followed by ReLU activation and Batch Normalization. No data augmentation was performed during training.
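A minimal sketch of this tumor-aware patch sampling is given below. The retry loop and the minimum tumor fraction are illustrative assumptions, since the paper does not state the exact inclusion criterion used.

```python
import numpy as np

def sample_tumor_patch(volume, labels, patch=64, min_tumor_frac=0.01,
                       max_tries=100, rng=None):
    """Randomly sample a 64^3 patch that contains significant tumor area.

    volume: (H, W, D, 4) array of the four normalized modalities
            (FLAIR, T1, T2, T1ce concatenated along the last axis).
    labels: (H, W, D) integer ground-truth map (0 = background).
    min_tumor_frac and max_tries are illustrative assumptions.
    """
    rng = rng or np.random.default_rng()
    H, W, D = labels.shape
    for _ in range(max_tries):
        x = rng.integers(0, H - patch + 1)
        y = rng.integers(0, W - patch + 1)
        z = rng.integers(0, D - patch + 1)
        lab = labels[x:x + patch, y:y + patch, z:z + patch]
        if (lab > 0).mean() >= min_tumor_frac:
            break  # enough tumor voxels in this candidate patch
    img = volume[x:x + patch, y:y + patch, z:z + patch, :]
    return img, lab
```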

At the output, four probability maps were generated, for necrosis, edema, enhancing tumor and background (including non-tumor brain voxels); each voxel was assigned the label of the map with the highest probability.
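A minimal Keras sketch of this architecture is given below (TensorFlow is the framework named in Sect. 3). The filter counts, the use of max pooling for down-sampling and transposed convolutions for up-sampling are assumptions, since the paper does not report these details.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_block(x, filters):
    # Two back-to-back 3x3x3 convolutions; each is followed by ReLU
    # activation and Batch Normalization, in the order stated in the paper.
    for _ in range(2):
        x = layers.Conv3D(filters, 3, padding="same")(x)
        x = layers.Activation("relu")(x)
        x = layers.BatchNormalization()(x)
    return x

def build_drag_unet(num_classes=4, base_filters=16):
    # base_filters is an assumption; the paper does not report filter counts.
    inputs = tf.keras.Input(shape=(64, 64, 64, 4))  # FLAIR, T1, T2, T1ce
    skips, x = [], inputs
    for level in range(3):                          # 3 down-sampling stages
        x = conv_block(x, base_filters * 2 ** level)
        skips.append(x)
        x = layers.MaxPooling3D(pool_size=2)(x)
    x = conv_block(x, base_filters * 8)             # bottleneck
    for level in reversed(range(3)):                # 3 up-sampling stages
        x = layers.Conv3DTranspose(base_filters * 2 ** level, kernel_size=2,
                                   strides=2, padding="same")(x)
        x = layers.Concatenate()([x, skips[level]])
        x = conv_block(x, base_filters * 2 ** level)
    # Four probability maps (necrosis, edema, enhancing tumor, background),
    # from which the final label is taken as the per-voxel argmax.
    outputs = layers.Conv3D(num_classes, 1, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)
```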

It was observed that some false positives were present in the segmentation output. 3D connected component analysis was therefore performed to identify all components present in the segmented volume; a threshold in terms of number of voxels was chosen, and insignificant small components, which were false positives, were reassigned to the background label. This reduced false positives significantly. Similarly, to reduce over-segmentation in certain cases, a binary brain mask was generated from the brain volume and a logical AND operation was performed with the segmentation output. This improved the accuracy of the segmentation significantly.
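The post-processing step can be sketched as follows, e.g. with scipy. The component-size threshold below is an illustrative value; the paper chose such a threshold but does not report it.

```python
import numpy as np
from scipy import ndimage

def postprocess(seg, brain_volume, min_size=1000):
    """Post-process a predicted label volume.

    seg:          (H, W, D) integer label map from the network (per-voxel
                  argmax over the four probability maps).
    brain_volume: any MRI channel, used to derive the binary brain mask.
    min_size:     component-size threshold in voxels (assumed value).
    """
    # Logical AND with the binary brain mask to suppress over-segmentation.
    seg = np.where(brain_volume > 0, seg, 0)
    # 3D connected component analysis over all tumor voxels.
    components, n = ndimage.label(seg > 0)
    sizes = np.bincount(components.ravel())
    # Reassign insignificant small components (false positives) to background.
    for comp_id in range(1, n + 1):
        if sizes[comp_id] < min_size:
            seg[components == comp_id] = 0
    return seg
```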

2.4 Radiomic Feature Extraction and Training

After segmentation of the intra-tumor parts, the next task in BraTS 2018 was to predict the overall survival of patients in number of days. For this task, the organizers provided only the age and OS in days, which made the task challenging. In the last few years, researchers have been working actively on radiomic feature extraction for tumor analysis and survival prediction [4]. In our approach, we computed radiomic features on the FLAIR and T1ce volumes for different combinations of intra-tumor parts (see Figs. 1 and 2):

Fig. 2. Sample segmentation results. Each row represents one case. Columns from left to right: FLAIR, T1, T2, T1ce, GT and output. Segmentation labels: yellow for edema, blue for enhancing tumor and red for tumor core. (Color figure online)

  • Edema, enhancing tumor and tumor core, i.e. the whole tumor (WT)

  • Tumor core and enhancing tumor (TC+ET)

  • Enhancing tumor (ET)

We computed first-order statistics, shape features, Gray Level Co-occurrence Matrix (GLCM) and Gray Level Run Length Matrix (GLRLM) features. We computed 468 features for edema, tumor core and enhancing tumor. These features were used to train the regression model for the survival prediction task. We started with 679 variables: 678 radiomic variables (113 from each of the tumor-part combinations, for both FLAIR and T1ce images) and age. Radiomic variables with near-perfect correlation with each other (Spearman’s correlation coefficient 0.95 or higher, p-value 0.05 or lower) were excluded, with only one variable from each correlated set being retained (N = 117). Age and all radiomic variables without such near-perfect inter-correlations (N = 117) were then assessed for their relationship with survival.
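The paper does not name the feature-extraction implementation; the sketch below uses the open-source pyradiomics package as one plausible choice, and realizes the correlation filter with a simple greedy pass (the exact de-duplication procedure is not described, so this ordering-dependent variant is an assumption).

```python
import pandas as pd
from scipy.stats import spearmanr
from radiomics import featureextractor  # pyradiomics package

# Enable only the feature classes named in the paper: first-order
# statistics, shape, GLCM and GLRLM.
extractor = featureextractor.RadiomicsFeatureExtractor()
extractor.disableAllFeatures()
for feature_class in ("firstorder", "shape", "glcm", "glrlm"):
    extractor.enableFeatureClassByName(feature_class)

def extract_region_features(image_path, mask_path):
    """Radiomic features for one modality / region-mask pair
    (e.g. FLAIR with the WT, TC+ET or ET mask)."""
    result = extractor.execute(image_path, mask_path)
    # Drop pyradiomics' diagnostic entries, keep only feature values.
    return {k: float(v) for k, v in result.items()
            if not k.startswith("diagnostics")}

def drop_correlated(df, threshold=0.95):
    """Greedy filter keeping one variable per near-perfectly correlated
    set (Spearman's rho >= 0.95, as in the paper)."""
    kept = []
    for col in df.columns:
        if all(abs(spearmanr(df[col], df[k])[0]) < threshold for k in kept):
            kept.append(col)
    return df[kept]
```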

A Multi-Layer Perceptron (MLP) was used as the neural network. Variables that had a statistically significant correlation with survival (N = 56, including age) were included for training. Results were made reproducible by setting a random seed. To assess the efficacy of the network and to detect over-training, if any, we divided the BraTS 2018 training dataset (N = 163) randomly, using Bernoulli variates, into training (51.5%), validation (14.7%) and testing (33.7%) subsets. The network had two hidden layers, with the number of units per layer set to automatic, and sigmoid activation functions for both the hidden and output layers. The variables were re-scaled using adjusted normalization with a correction of 0.2. The network was designed to predict survival in days as well as two broad categories, viz. survival < 300 days and survival >= 300 days. All statistical procedures for survival prediction were performed using SPSS for Windows v24 on a standard computer running Windows 10.
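The model itself was built in SPSS; since the paper ships no code, the sketch below only approximates the same setup with scikit-learn. The hidden-layer sizes (SPSS chose them automatically), the split implementation and the rescaling are assumptions, and scikit-learn's MLPRegressor uses an identity output layer rather than the sigmoid output used in SPSS.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

def train_survival_mlp(X, y, seed=0):
    """X: (163, 56) matrix of the selected variables (including age);
    y: overall survival in days."""
    # Roughly the paper's 51.5% / 14.7% / 33.7% train/validation/test split.
    X_tr, X_rest, y_tr, y_rest = train_test_split(
        X, y, train_size=0.515, random_state=seed)
    X_val, X_te, y_val, y_te = train_test_split(
        X_rest, y_rest, train_size=0.303, random_state=seed)
    # Rough stand-in for SPSS's "adjusted normalization" (values mapped
    # into (-1, 1); the 0.2 correction shrinks the range slightly).
    scaler = MinMaxScaler(feature_range=(-1, 1)).fit(X_tr)
    # Two hidden layers with sigmoid activations, as in the SPSS model;
    # the layer sizes below are assumptions.
    mlp = MLPRegressor(hidden_layer_sizes=(20, 10), activation="logistic",
                       max_iter=5000, random_state=seed)
    mlp.fit(scaler.transform(X_tr), y_tr)
    # Evaluate both the regression output and the two broad survival
    # categories (< 300 vs >= 300 days) on the held-out test subset.
    pred = mlp.predict(scaler.transform(X_te))
    category_acc = float(((pred >= 300) == (y_te >= 300)).mean())
    return mlp, category_acc
```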

3 Results and Discussion

The performance of the proposed method was evaluated on the BraTS 2018 training data (285 cases) and validated on 66 cases for segmentation. The validation leaderboard gave interesting information about the performance of the different teams’ algorithms. The average performance of the proposed method on the training and validation data is given in Tables 1 and 2, respectively, in terms of Dice similarity coefficient and sensitivity. The model was trained for 50 epochs and needed 48 h of training on an NVIDIA P100 GPU with 128 GB of system RAM. The framework was developed in TensorFlow [8].

Table 1. Performance of proposed method on BraTS 2018 training dataset for segmentation.
Table 2. Performance of proposed method on BraTS 2018 validation dataset for segmentation.

Overall, our approach reached a superior result in the whole tumor segmentation task, with an average Dice coefficient of 93% on the training dataset and 87% on the validation dataset. Sample segmentation results for intra-tumor parts are given in Fig. 2. The performance of the proposed approach on the test dataset is given in Table 3 in terms of Dice coefficient and Hausdorff95 distance.

Table 3. Performance of proposed method on BraTS 2018 test dataset for segmentation.
Table 4. Performance of Multi-layer perceptron for OS prediction on validation dataset.

For the prediction of survival categories, the neural network demonstrated an accuracy of 70.2% in the training subset, and 62.5% and 63.6% in the validation and testing subsets, respectively. The accuracy was 69.5% for the entire training dataset, with an Area Under the Curve (AUC) of 0.799 (Figs. 3 and 4). For prediction of survival in days, the proposed model performed better for values in the middle of the range, with lower performance at both extremes. The relative error was 0.842 for the training subset, 0.774 for the validation subset and 0.910 for the testing subset.

Fig. 3. ROC curve depicting the accuracy of the model for categorizing survival into < 300 and >= 300 days.

Fig. 4. Residual vs. predicted scatter plot showing the good fit of the model for survival values in the middle of the range, between 200 and 350 days.

The performance of the proposed OS prediction approach on the test dataset (77 cases) is given in Table 5. The proposed approach stood third in the overall survival prediction task of the BraTS 2018 challenge (Table 4).

Table 5. Performance for OS prediction on test dataset.

Individual variable importance analysis revealed that age is one of the most significant variables in this neural network. The other variables are shown in Table 6.

Table 6. Importance of the independent variables in descending order (F = FLAIR)

4 Conclusion

In this study, we proposed the Deep Learning Radiomics Algorithm for Gliomas (DRAG) Model, based on a 3D U-Net, for brain tumor segmentation. 3D patches were extracted from multi-channel MRI data to train the proposed model. Radiomic features were extracted from the FLAIR and T1ce channels for the OS prediction task, and an MLP was trained with these features to predict OS in days. The proposed approach achieved third place in the OS prediction task of the BraTS 2018 challenge [15].

The difference between the mean and the median in Table 2 indicates that for some cases our proposed approach achieved poor accuracy, very close to zero; more analysis is required on this. Predicting survival without additional clinical data and treatment information is challenging, and this is reflected in the accuracies of the participants on the leaderboard. As the number of cases for OS prediction is small, there is a need to develop an efficient feature selection algorithm that selects potential features for accurate OS prediction. Our future goal is to design a radiomic feature extraction pipeline with deep neural networks.