1 Introduction

Histological image analysis evaluates, through various computer-assisted methods, whether microscopic structures at the sub-cellular, cellular, tissue and organ level are affected by disease. Tissue image analysis can be used to quantify the cancer cells in a biopsy of a tumour taken from a patient; it can significantly reduce uncertainty in characterizing tumours compared to evaluations done by histologists, and it can improve the prediction of the recurrence rate of some cancers. Image analysis involves complex algorithms which identify and characterize the colour, shape and quantity of the cells in a tissue sample using image pattern recognition technology. In [1] global features are used to automatically discriminate lymphoma, in [2] wavelet features are used for the detection of tumours in endoscopic images and in [3] image texture information is used to automatically discriminate polyps in colonoscopy images. Over the past few years moment functions have been used in medical image analysis with promising performance. They are statistical measures used to obtain the relevant information of an object. Since the introduction of invariant moments in image analysis [4], moment functions have been widely used as discriminative descriptors in image processing and pattern classification applications, for example the geometric moments for texture classification [5] or the complex moments for texture segmentation [6]. However, both geometric and complex moments contain redundant information and are sensitive to noise, because their kernel polynomials are not orthogonal. For these reasons many different moments have been proposed, such as the discrete Tchebichef moments [7], the discrete Krawtchouk moments [8] or orthogonal moments like the Legendre and Zernike moments [9].

Orthogonal moments have been shown to be less sensitive to noise and to represent features efficiently with minimum redundancy. Zernike moments have been widely used in different types of applications, such as shape-based image retrieval [10, 11] and pattern recognition [12] tasks. In medical image analysis the orthogonal moments have been used to reconstruct noisy CT, MRI and X-ray medical images [13], to describe the texture of a CT liver image [14] or prostate ultrasound [15], to detect tumours in brain images [16] or in mammography images [17], and to recognize parasites [18] and spermatogonium [19].

In this work we propose a comparative study between different descriptors based on texture information for histological image classification. In particular, our study is focused on measuring the accuracy of moments (Hu, Legendre, Zernike), Local Binary Patterns (LBPs) and co-occurrence matrices in classifying histopathological images. The experimental results show that the combination of orthogonal moments with co-occurrence matrices reaches a very high level of accuracy on all the tested datasets, outperforming the most commonly used descriptors. The rest of the paper is organized as follows. Section 2 presents the definitions of the texture descriptors used throughout this work. Section 3 describes the datasets used. Section 4 presents the experimental evaluation, showing the experimental measures, the implementation details of each descriptor, and the results achieved for the specific collections. Finally, in Sect. 5 we give the conclusions.

2 Texture Descriptors

In this section we describe three important classes of texture descriptors: geometric and orthogonal image moments, co-occurrence matrices and local binary patterns.

2.1 Image Moments

Moments are widely used in many applications for feature extraction due to their invariance to changes in scale, rotation and reflection. The use of moments for image analysis and pattern recognition was inspired by Hu [4]. The Hu, Legendre and Zernike moments are the most common.

Hu Moments. Hu moments are derived and calculated from geometric moments. The two-dimensional geometric moments of order \((p + q)\) of an image of \(M \times N\) pixels with intensity function \(f(x,\,y)\) are defined as:

$$\begin{aligned} m_{pq} = \sum _{x=0}^{M-1}\sum _{y=0}^{N-1} x^{p}y^{q}f(x,y), \end{aligned}$$
(1)

where \(p, q = 0, 1, 2, \ldots \). The set of moments up to order n consists of all \(m_{pq}\) with \(p + q \le n\). The corresponding central moments are defined as:

$$\begin{aligned} \mu _{pq} = \sum _{x=0}^{M-1}\sum _{y=0}^{N-1} (x-\overline{x})^{p}(y-\overline{y})^{q}f(x,y), \end{aligned}$$
(2)

where \(\overline{x}=m_{10}/m_{00}\) and \(\overline{y}=m_{01}/m_{00}\) are the coordinates of the centre of mass of the image. The central moments \(\mu _{pq}\) defined in Eq. 2 are invariant under translation of the coordinates. They can be normalized to make them invariant under scaling as well. For \(p + q = 2,3,\ldots \) the normalized central moments of an image are given by:

$$\begin{aligned} \eta _{pq}= \frac{\mu _{pq}}{\mu ^{\gamma }_{00}}\qquad \text {with}\qquad \gamma = \frac{p+q}{2} + 1. \end{aligned}$$
(3)

From the normalized central moments up to order three, Hu defined seven functions that are invariant to scale, translation and rotation changes [4].
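As a minimal sketch of Eqs. 1-3, assuming NumPy and a grey-level image stored as a 2-D array indexed as f[x, y], the moments can be computed directly from their definitions. The function names are illustrative and do not come from the original work, and only the first two of Hu's seven invariants are shown.

```python
import numpy as np

def geometric_moment(f, p, q):
    """Geometric moment m_pq of Eq. 1 for a 2-D array f[x, y]."""
    M, N = f.shape
    x = np.arange(M, dtype=float)[:, None]  # x coordinates as a column
    y = np.arange(N, dtype=float)[None, :]  # y coordinates as a row
    return float(np.sum((x ** p) * (y ** q) * f))

def central_moment(f, p, q):
    """Central moment mu_pq of Eq. 2, taken about the centre of mass."""
    M, N = f.shape
    m00 = geometric_moment(f, 0, 0)
    xc = geometric_moment(f, 1, 0) / m00
    yc = geometric_moment(f, 0, 1) / m00
    x = np.arange(M, dtype=float)[:, None] - xc
    y = np.arange(N, dtype=float)[None, :] - yc
    return float(np.sum((x ** p) * (y ** q) * f))

def normalized_central_moment(f, p, q):
    """Normalized central moment eta_pq of Eq. 3 (intended for p + q >= 2)."""
    gamma = (p + q) / 2 + 1
    return central_moment(f, p, q) / central_moment(f, 0, 0) ** gamma

def hu_first_two(f):
    """First two Hu invariants, built from the second-order eta_pq."""
    e20 = normalized_central_moment(f, 2, 0)
    e02 = normalized_central_moment(f, 0, 2)
    e11 = normalized_central_moment(f, 1, 1)
    phi1 = e20 + e02
    phi2 = (e20 - e02) ** 2 + 4 * e11 ** 2
    return phi1, phi2
```

For an object that stays inside the frame, the central moments of the image and of a translated copy agree, and the two invariants are unchanged when the image is rotated by 90 degrees with np.rot90.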

Legendre Moments. Legendre moments are orthogonal moments first introduced by Teague [9]. They have been used in several pattern recognition [4] tasks. The Legendre moment of order \((p\,+\,q)\) of an image of \(M\times N\) pixels with intensity function \(f(x,\,y)\) is defined on the square \([-1,+1]\times [-1,+1]\) by:

$$\begin{aligned} L_{pq}=\frac{(2p+1)(2q+1)}{M\times N}\sum _{i=0}^{M-1}\sum _{j=0}^{N-1}P_{p}(x_{i})P_{q}(y_{j})f(x_{i},y_{j}) \end{aligned}$$
(4)

where \(x_{i}\) and \(y_{j}\) denote the normalized pixel coordinates in the range of \([-1,+1]\), which are given by:

$$\begin{aligned} x_{i}=\frac{2i-(M-1)}{M-1}, \qquad y_{j}=\frac{2j-(N-1)}{N-1} \end{aligned}$$
(5)

and

$$\begin{aligned} P_{p}(x_{i})=\sum _{k=0}^{p}\frac{(-1)^{\frac{p-k}{2}} x_{i}^{k}(p+k)!}{2^{p}k!\big (\frac{p-k}{2}\big )!\big (\frac{p+k}{2}\big )!} \end{aligned}$$
(6)

with \(p-k\) even.
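The following sketch, assuming NumPy and an image array indexed as f[i, j], evaluates the Legendre polynomial with the explicit sum of Eq. 6 and the moment with Eqs. 4 and 5. The function names are hypothetical; the explicit sum is adequate for the low orders considered here, while a recurrence would be a more usual choice for high orders.

```python
import numpy as np
from math import factorial

def legendre_poly(p, x):
    """Legendre polynomial P_p(x) from Eq. 6, summing only terms with p - k even."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for k in range(p % 2, p + 1, 2):  # k has the same parity as p
        half_diff = (p - k) // 2
        half_sum = (p + k) // 2
        coeff = ((-1) ** half_diff * factorial(p + k)
                 / (2 ** p * factorial(k) * factorial(half_diff) * factorial(half_sum)))
        total += coeff * x ** k
    return total

def legendre_moment(f, p, q):
    """Legendre moment L_pq of Eq. 4 for a 2-D array f[i, j]."""
    M, N = f.shape
    # Normalized pixel coordinates in [-1, +1], Eq. 5.
    xi = (2 * np.arange(M) - (M - 1)) / (M - 1)
    yj = (2 * np.arange(N) - (N - 1)) / (N - 1)
    Pp = legendre_poly(p, xi)[:, None]
    Pq = legendre_poly(q, yj)[None, :]
    return (2 * p + 1) * (2 * q + 1) / (M * N) * float(np.sum(Pp * Pq * f))
```

A feature vector up to order n can then be assembled by collecting legendre_moment(f, p, q) for all p + q <= n.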

Zernike Moments. Zernike moments are the mapping of an image onto a set of complex Zernike polynomials. As these Zernike polynomials are orthogonal to each other, Zernike moments can represent the properties of an image with no redundancy or overlap of information between the moments [9]. Due to these characteristics, Zernike moments have been used as feature sets in many applications. The computation of Zernike moments from an input image consists of three steps: computation of the radial polynomials, computation of the Zernike polynomials and computation of the Zernike moments by projecting the image onto the Zernike polynomials [20]. The real-valued radial polynomial is defined as:

$$\begin{aligned} R_{p,q}(r)=\sum _{s=0}^{(p-|q|)/2}\frac{(-1)^{s} (p-s)! r^{p-2s}}{s!\big (\frac{p+|q|}{2}-s\big )!\big (\frac{p-|q|}{2}-s\big )!} \end{aligned}$$
(7)

with \(R_{p,q}(r)=R_{p,-q}(r)\), and p, q generally called order and repetition, respectively. The order p is a non-negative integer, and the repetition q is an integer satisfying \(p-|q|\) even and \(|q|\le p\). The discrete form of the Zernike moments of an image of size \(M \times N\) is expressed as follows:

$$\begin{aligned} Z_{pq}=\frac{p+1}{\lambda } \sum _{x=0}^{M-1}\sum _{y=0}^{N-1}R_{pq}(r_{xy})e^{-jq\theta _{xy}} f(x,y) \end{aligned}$$
(8)

where \(0\le r_{xy} \le 1\) and \(\lambda \) is a normalization factor. In the discrete implementation of Zernike moments, \(\lambda \) must be the number of pixels that the mapping transformation places inside the unit circle; it corresponds to the area \(\pi \) of the unit circle in the continuous domain. The transformed phase \(\theta _{xy}\) and distance \(r_{xy}\) at the pixel coordinates (x, y) are given by:

$$\begin{aligned} \theta _{xy}=\tan ^{-1}\bigg (\frac{(2y-(N-1))/(N-1)}{(2x-(M-1))/(M-1)}\bigg ) \end{aligned}$$
(9)

$$\begin{aligned} r_{xy}=\sqrt{\bigg (\frac{2x-(M-1)}{M-1}\bigg )^{2}+\bigg (\frac{2y-(N-1)}{N-1}\bigg )^{2}}. \end{aligned}$$
(10)
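The three computation steps can be sketched as follows, assuming NumPy and an image array indexed as f[x, y]. The function names are illustrative. Two choices here are assumptions rather than details stated in the text: the phase of Eq. 9 is evaluated with the four-quadrant np.arctan2, and pixels mapped outside the unit circle are simply discarded.

```python
import numpy as np
from math import factorial

def radial_poly(p, q, r):
    """Real-valued radial polynomial R_pq(r) of Eq. 7."""
    q = abs(q)
    r = np.asarray(r, dtype=float)
    total = np.zeros_like(r)
    for s in range((p - q) // 2 + 1):
        num = (-1) ** s * factorial(p - s)
        den = (factorial(s)
               * factorial((p + q) // 2 - s)
               * factorial((p - q) // 2 - s))
        total += num / den * r ** (p - 2 * s)
    return total

def zernike_moment(f, p, q):
    """Complex Zernike moment Z_pq of Eq. 8; requires |q| <= p and p - |q| even."""
    M, N = f.shape
    xn = ((2 * np.arange(M) - (M - 1)) / (M - 1))[:, None]
    yn = ((2 * np.arange(N) - (N - 1)) / (N - 1))[None, :]
    r = np.sqrt(xn ** 2 + yn ** 2)      # Eq. 10
    theta = np.arctan2(yn, xn)          # Eq. 9, four-quadrant form (assumption)
    inside = r <= 1.0                   # keep only pixels in the unit circle
    lam = np.count_nonzero(inside)      # normalization factor lambda
    kernel = radial_poly(p, q, r) * np.exp(-1j * q * theta)
    return (p + 1) / lam * np.sum(kernel[inside] * f[inside])
```

The magnitudes abs(zernike_moment(f, p, q)) are the quantities normally used as rotation-invariant features, since a rotation of the image only changes the phase of Z_pq.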

2.2 Co-occurrence Matrices

One of the earliest methods for extracting texture descriptors was proposed by Haralick et al. [21]. It is based on the creation of grey level co-occurrence matrices (GLCMs), from which features representing some aspects of the image can be calculated. A GLCM represents the probability of finding two pixels with grey levels i and j at distance d and orientation \(\theta \). The parameters d and \(\theta \) can assume different values, but the most used are \(d = 1\) and \(\theta = [0^\circ , 45^\circ , 90^\circ , 135^\circ ]\). A GLCM for an image of size \(M \times N\) with \(N_{g}\) grey levels is a 2D array of size \(N_{g} \times N_{g}\). Haralick proposed thirteen descriptors that can be extracted from these matrices. Several methods have been presented to extend the original implementation of the GLCM. In [22] different values of the distance parameter influencing the matrix computation are evaluated, in [23] the GLCM descriptors are extracted by calculating the weighted sum of the GLCM elements, and in [24] the GLCM features are calculated by using the local gradient of the matrix. Furthermore, the GLCM has been extracted using the colour information from single channels [25] or by combining them in pairs [26, 27]. Since invariant descriptors are our main focus in this work, we compute the GLCM using only the grey level intensities, and convert the rotation-dependent descriptors into rotationally invariant ones with the following approach. We start by considering all the possible circular shifts of a feature vector \(f_k=[f_1, \ldots , f_m]\). Then, we construct a matrix of size \(m\times m\) whose rows are the circular shifts of \(f_k\), arranged regularly so as to generate a symmetric matrix:

$$ \left( \begin{array}{ccccc} f_1 & f_2 & \cdots & f_{m-1} & f_m \\ f_2 & f_3 & \cdots & f_m & f_1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ f_{m-1} & f_m & \cdots & f_{m-3} & f_{m-2} \\ f_m & f_1 & \cdots & f_{m-2} & f_{m-1} \end{array} \right) $$

The eigenvalues of this matrix are the new invariant descriptors, GLCMri, as they preserve the dimension and direction of the original feature vector.
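A possible sketch of this pipeline is given below, assuming NumPy and an image already quantized to integer grey levels. It builds a symmetric, normalized GLCM for \(d = 1\) at the four orientations, takes one Haralick-style feature (contrast) per orientation as the vector \(f_k\), and computes the eigenvalues of the matrix of circular shifts. The function names, the row/column offsets chosen for each angle, the symmetric normalization and the use of contrast as the example feature are all assumptions made for illustration.

```python
import numpy as np

# (row, column) offsets for d = 1 at 0, 45, 90 and 135 degrees (assumed convention).
OFFSETS = [(0, 1), (-1, 1), (-1, 0), (-1, -1)]

def glcm(img, dr, dc, levels):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    img = np.asarray(img, dtype=np.intp)
    rows, cols = img.shape
    r0, r1 = max(0, -dr), min(rows, rows - dr)
    c0, c1 = max(0, -dc), min(cols, cols - dc)
    src = img[r0:r1, c0:c1]                      # reference pixels
    dst = img[r0 + dr:r1 + dr, c0 + dc:c1 + dc]  # neighbours at the offset
    P = np.zeros((levels, levels))
    np.add.at(P, (src.ravel(), dst.ravel()), 1)  # count grey level pairs
    P = P + P.T                                  # count each pair in both directions
    return P / P.sum()

def contrast(P):
    """Contrast of a GLCM: sum over (i - j)^2 * P[i, j]."""
    i, j = np.indices(P.shape)
    return float(np.sum((i - j) ** 2 * P))

def circular_shift_matrix(f):
    """Symmetric m x m matrix whose row i is f circularly shifted by i."""
    f = np.asarray(f, dtype=float)
    m = f.size
    idx = (np.arange(m)[:, None] + np.arange(m)[None, :]) % m
    return f[idx]

def glcm_ri(f):
    """Eigenvalues (ascending) of the circular shift matrix of f."""
    return np.linalg.eigvalsh(circular_shift_matrix(f))
```

For example, f = [contrast(glcm(img, dr, dc, levels)) for dr, dc in OFFSETS] gives a four-element vector, and glcm_ri(f) returns four real eigenvalues. In this sketch, circularly shifting f leaves the magnitudes of the eigenvalues unchanged.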

2.3 LBP Descriptors

LBPs, instead, are a more recent tool for texture analysis, originally proposed in [28] and widely used for grey level texture classification due to their simplicity and robustness. The operator transforms the image by thresholding the neighbourhood of each pixel against the value of the pixel itself and coding the result as a binary number. The histogram of the resulting image can be used as a feature vector for texture classification. The radius and the number of neighbourhood pixels are the two main parameters of the LBP operator. Although the LBP has been extended in many different ways, the most useful version, proposed by the same authors [29], realizes a rotation invariant descriptor, called LBPri. The LBPri is obtained by circularly rotating the binary digits of each code and keeping the smallest value reached.
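The operator can be sketched as follows for the common case of 8 neighbours at radius 1, assuming NumPy and a grey-level image array. The function names, the choice of neighbour ordering, the "greater than or equal" threshold and the skipping of border pixels are assumptions of this sketch, not details given in the text.

```python
import numpy as np

# (row, column) offsets of the 8 neighbours at radius 1, in circular order.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]

def lbp_codes(img):
    """8-bit LBP code of every interior pixel (border pixels are skipped)."""
    img = np.asarray(img, dtype=float)
    rows, cols = img.shape
    centre = img[1:-1, 1:-1]
    codes = np.zeros(centre.shape, dtype=np.uint8)
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        neighbour = img[1 + dr:rows - 1 + dr, 1 + dc:cols - 1 + dc]
        # Threshold against the centre pixel and store the result in one bit.
        codes += (neighbour >= centre).astype(np.uint8) * np.uint8(1 << bit)
    return codes

def min_rotation(code, bits=8):
    """Smallest value reached by circularly rotating the bits of code."""
    best = code
    for _ in range(bits - 1):
        code = (code >> 1) | ((code & 1) << (bits - 1))
        best = min(best, code)
    return best

# Lookup table mapping each 8-bit code to its rotation invariant value.
RI_TABLE = np.array([min_rotation(c) for c in range(256)], dtype=np.uint8)

def lbp_ri_histogram(img):
    """Histogram of rotation invariant LBP codes, usable as a feature vector."""
    ri = RI_TABLE[lbp_codes(img)]
    return np.bincount(ri.ravel(), minlength=256)
```

With 8 neighbours only 36 distinct rotation invariant codes exist, so most of the 256 histogram bins stay empty and could be dropped to obtain a compact feature vector.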

3 Datasets

The experiments have been carried out on seven well-known colour histology image databases: HistologyDS, Pap-smear, Lymphoma, Liver Aging Female, Liver Aging Male, Liver Gender AL and Liver Gender CR, which represent a set of very different computer vision problems.

The HistologyDS (HIS) database [30] is a collection of 20,000 histology images for the study of fundamental tissues. It is provided as a subset of 2828 images annotated with four fundamental tissues: connective, epithelial, muscular and nervous. Each tissue is captured in a 24-bit RGB image of size \(720\times 480\). Some sample tissue images from the HIS database are shown in Fig. 1.

Fig. 1. Four different tissues from the HistologyDS database.

The Pap-smear (PAP) database [31] is a collection of pap-smear images acquired from healthy and cancerous smears coming from the Herlev University Hospital. It is composed of 917 images containing cells, annotated into seven classes: four represent abnormal cells and three represent normal cases. Nevertheless, from the medical diagnosis viewpoint the most important requirement is the general two-class problem of correctly separating normal from abnormal cells. For this reason we have considered only the binary case. Each cell was captured in a 24-bit RGB image without a fixed size, ranging from about \(50\times 50\) to about \(300\times 300\). Some examples are shown in Fig. 2.

Fig. 2. The seven classes of cells belonging to the Pap-smear database: first four abnormal and last three normal.

The Lymphoma (LYM) database [1] is a collection of tissues affected by malignant lymphoma, a cancer affecting the lymph nodes. Three types of malignant lymphoma are represented in the set: Chronic Lymphocytic Leukemia (CLL), Follicular Lymphoma (FL) and Mantle Cell Lymphoma (MCL). The dataset is a collection of samples from biopsies sectioned and stained with haematoxylin and eosin (H&E), prepared in different laboratories by several pathologists. Only the most expert pathologists specialised in these types of lymphoma are able to consistently and accurately classify the three types from H&E-stained biopsies. The slide collection contains significant variation in sectioning and staining, and for this reason it is more representative of the slides commonly encountered in a clinical setting. The database contains 374 slides, each captured in a 24-bit RGB image of size \(1380\times 1040\). A randomly selected image from each class is shown in Fig. 3.

Fig. 3. Three different kinds of lymphoma belonging to the Lymphoma database.

AGEMAP (Atlas of Gene Expression in Mouse Aging Project) [32] is a study by the National Institute on Aging involving 48 male and female mice of four ages (1, 6, 16 and 24 months), on ad-libitum or caloric restriction diets. Fifty colour images from each of 30 livers were manually acquired using a Carl Zeiss Axiovert 200 microscope and a 40x objective, for a total of 1500 images. Each image is of size \(1388\times 1040\) in TIFF format with a 36-bit RGB colour depth. As the acquisition was done using 12 bits of quantization per colour channel, the histograms have been compressed to fit an 8-bit encoding. All the slides were prepared by the same person, thus staining variability in this dataset is very limited. AGEMAP images can be analysed across multiple axes of differentiation (age, gender, diet or individual mice) to construct a variety of classification problems. For these reasons the authors of the dataset proposed three different experiments using three different subsets of the original images:

  • The Liver Aging Female (LAF) experiment consists of a 4-way classification problem using the four classes (1, 6, 16 and 24 months) of images of female mice on an ad-libitum diet. This set is composed of 529 images.

  • The Liver Gender AL (LGAL) experiment consists of a 2-way classification problem in which the gender of the mouse is determined from the images of 6-month-old male and female mice on an ad-libitum diet. This set is composed of 265 images.

  • The Liver Gender CR (LGCR) experiment consists of a 2-way classification problem in which the gender of the mouse is determined from the images of 6-month-old male and female mice on a caloric restriction diet. This set is composed of 303 images.

We have added one more experiment, Liver Aging Male (LAM), to those mentioned above. Like the first one, it consists of a 4-way classification problem, but it uses the four classes (1, 6, 16 and 24 months) of images of male mice on an ad-libitum diet. This set is composed of 499 images.

4 Experimental Evaluation

The performance of the descriptors has been evaluated following two strategies. In the first, each image is converted to grayscale and each descriptor is extracted from the converted image. In the second, the computation scheme is applied to each of the R, G and B channels of the colour image and the resulting descriptors are concatenated into a single vector in order to take the colour information into account, as we proposed in [27]. Classification performance has been evaluated by calculating the accuracy, which offers a good indication of the performance since it considers each class of equal importance. The classification accuracy has been estimated with a k-Nearest Neighbour (k-NN) classifier, with \(k = 1\) and the Euclidean distance. The k-NN strategy has been preferred over more complex classifiers so that the results reflect the effectiveness of the descriptors rather than that of the classifier. For both the grayscale and the colour images, we first tested the Hu, Zernike (up to order 10) and Legendre (up to order 8) moments and the GLCMri and LBPri texture descriptors individually, to assess the performance of the state-of-the-art methods. Then, we evaluated whether these descriptors could benefit from being combined. In particular, we evaluated whether the invariant moments are more discriminative when extracted from a different representation instead of directly from the original images. Thus, we computed the Hu, Zernike and Legendre moments starting from the LBP images and from the GLCMs computed with angles \(0 ^\circ , 45 ^\circ , 90 ^\circ \) and \(135 ^\circ \). To better understand the behaviour of the single descriptors and of their combinations, Fig. 5 plots the average accuracy of every descriptor over the datasets, computed from the grayscale images.

As can be observed, all the invariant moments, and in particular the Zernike and Legendre moments, are more discriminative when extracted from a different representation. To further improve the classification performance, a second experiment has been conducted. We extracted the features using the colour information of the R, G and B channels, by computing every descriptor for each colour channel and then concatenating the results of the three channels into the same feature vector, as we proposed in [27]. In that work we demonstrated that the performance of a descriptor depends on the colour model used. So, in order to make a fair comparison of our descriptors, in this work we have chosen the RGB colour space. A plot that sums up this experiment is presented in Fig. 6. The performance of all the descriptors improves considerably when colour information is used.
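The classification step can be illustrated with a minimal sketch, assuming NumPy and feature vectors that have already been extracted. The function names are hypothetical, and the way images are split into reference and test sets is not specified in this section, so any split used with this code is an assumption.

```python
import numpy as np

def rgb_descriptor(img_rgb, extract):
    """Apply a single-channel descriptor to R, G and B and concatenate the results."""
    channels = [np.ravel(extract(img_rgb[..., c])) for c in range(3)]
    return np.concatenate(channels)

def nn_predict(train_X, train_y, test_X):
    """1-NN classification with the Euclidean distance."""
    train_X = np.asarray(train_X, dtype=float)
    test_X = np.asarray(test_X, dtype=float)
    # Squared distances between every test vector and every training vector.
    diff = test_X[:, None, :] - train_X[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)
    nearest = np.argmin(d2, axis=1)
    return np.asarray(train_y)[nearest]

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
```

Here extract stands for any of the descriptors above applied to one channel. Because the squared distance has the same minimizer as the distance itself, the square root is omitted.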

Fig. 4. Four liver images representing: female mice of the different ages (top), male mice of the different ages (center), male and female mice on ad-libitum diet and on caloric restriction diet.

Fig. 5. Average performances of the descriptors extracted from the grayscale converted images. (Color figure online)

Fig. 6. Average performances of the descriptors extracted from the RGB images. (Color figure online)

5 Conclusions

In this work we have proposed a comparative study between different descriptors for the analysis of histological images. We focused the comparison on descriptors invariant to image rotations and, in particular, we measured the accuracy of moments, local binary patterns and co-occurrence matrices in classifying histological images. The experiments have been conducted on well-known public biomedical datasets: HistologyDS, Pap-smear, Lymphoma, Liver Aging Female, Liver Aging Male, Liver Gender AL and Liver Gender CR, which represent a set of very different computer vision problems. We observed that, by extracting the invariant moments from the GLCMs, the overall accuracy of the invariant moments increases considerably, outperforming the classical LBP and GLCM approaches. In particular, when extracted taking colour information into account, the Zernike and Legendre moments prove to be very powerful descriptors for histological image analysis.