
1 Introduction

Automated analysis of retinal vascular structure is important for supporting the examination, diagnosis and treatment of eye disease [1, 2]. Vascular changes in eye fundus images, such as arteriolar constriction or arteriovenous nicking, are also associated with diabetes, cardiovascular diseases and hypertension [3]. The Arteriolar-to-Venular Ratio (AVR) is considered an important characteristic sign of a wide spectrum of diseases [4]. A low AVR, i.e., narrowing of arteries and widening of veins, is a direct biomarker for diabetic retinopathy. By contrast, a high AVR has been associated with higher cholesterol levels and inflammatory markers [5]. However, manual annotation of arteries and veins is time-consuming and prone to human error. An automated method for classifying vessels as arteries or veins is therefore needed.

Fig. 1. Overview of the proposed method. (a) Original image. (b) Extracted vessels. (c) Skeletonized vessels. (d) Graph generated with significant nodes overlaid. (e) Estimated vascular network topology. (f) Classified arteries and veins, where arteries are shown in red and veins in blue.

The task of separating the vascular network into arteries and veins appears to be understudied. Martinez-Perez et al. [6] proposed a semi-automatic retinal vessel analysis method that is capable of measuring and quantifying the geometrical and topological properties of retinal vessels. Vazquez et al. [3] combined color-based clustering and vessel tracking to differentiate arteries and veins: a tracking strategy based on the minimal path approach is employed to support the resulting classification by voting. Dashtbozorg et al. [7] proposed a graph-based method for A/V classification, in which graph nodes and intensity features are analyzed to establish the artery/vein distinction. Estrada et al. [5] utilized a global likelihood model to capture the structural plausibility of each vessel, and employed a graph-theoretic method of estimating the overall vessel topology with domain-specific knowledge to accurately classify the A/V types. Huang et al. [8] introduced four new features to avoid distortions resulting from lightness inhomogeneity, and improved the accuracy of A/V classification by using a linear discriminant analysis classifier.

Numerous factors can cause the aforementioned A/V classification methods to return inaccurate results. Several methods [3, 6, 8, 9] rely on precise segmentation results: any ambiguity in distinguishing between small and midsized vessels makes the subsequent A/V classification a very difficult computational task. In addition, pathological conditions and intensity inhomogeneities affect the performance of A/V classification techniques [7, 8]. To address these problems, we propose a novel Dominant Sets-based A/V classification method (DoS) based on vessel topological features. The underlying vessel topology reveals how the different vessels are anatomically connected to each other, and makes it possible to identify and differentiate the structure of individual vessels within the entire vessel network. The concept of dominant sets clustering [10, 11] is adopted here to tackle the problems of vessel topology estimation and A/V classification.

2 Method

Figure 1 shows a graphical overview of the proposed method.

2.1 Graph Generation

Our proposed topology estimation approach can be applied either to manual annotations or to automated segmentation results. The method proposed in [1] was employed to automatically segment the retinal vessels. An iterative morphological thinning operation [12] is performed on the extracted vessels to obtain a single-pixel-wide skeleton map. The vascular bifurcations/crossovers and vessel ends can be extracted from the skeleton map by locating intersection points (pixels with more than two neighbors) and terminal points (pixels with one neighbor). All the intersection points and their neighbors are then removed from the skeleton map in order to obtain an image with clearly separated vessel segments. A vessel graph can be generated by linking the first and last nodes in the same vessel segment. The generated graph usually includes misrepresentations of the vessels: typical errors are node splitting, missing links and false links. These errors can be corrected by using the strategy proposed in [7]. In Fig. 1(d), red dots indicate terminal points, green triangles indicate bifurcations, and blue squares indicate intersection or crossover points.
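The neighbor-counting rule described above can be illustrated with a minimal pure-Python sketch. It assumes an 8-connected neighborhood and a skeleton stored as a list of 0/1 rows; the function name and the toy Y-shaped skeleton are hypothetical and are not taken from the original implementation.

```python
def classify_skeleton_pixels(skel):
    """Return (terminal_points, intersection_points) of a binary skeleton.

    skel is a list of rows holding 0/1 values. Neighbors are counted in the
    8-connected sense (an assumption; the text does not state the connectivity).
    """
    rows, cols = len(skel), len(skel[0])
    terminals, intersections = [], []
    for r in range(rows):
        for c in range(cols):
            if not skel[r][c]:
                continue
            # Count skeleton pixels among the (up to) eight neighbors.
            n = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols and skel[rr][cc]:
                        n += 1
            if n == 1:
                terminals.append((r, c))      # vessel end
            elif n > 2:
                intersections.append((r, c))  # bifurcation or crossover
    return terminals, intersections


# Toy Y-shaped skeleton: two branches merging into one trunk.
skel = [
    [1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
terminals, intersections = classify_skeleton_pixels(skel)
print(terminals)      # the three vessel ends
print(intersections)  # the single junction pixel
```

On real skeletons a junction often yields a small cluster of pixels with more than two neighbors, which is one reason the intersection points are removed together with their neighbors.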

2.2 Dominant Sets Clustering-Based Topology Estimation

The topology reconstruction can be achieved by sorting the graph nodes into four categories (node degrees 2–5): connecting points (2), bifurcation points (3, 4), and crossing/meeting points (3, 4, 5). The numbers in brackets indicate the possible number of links connected to each node (node degree). The method proposed by Dashtbozorg et al. [7] is used to handle nodes of degrees 2–3. For nodes of degrees 4 and 5, a classification method based on dominant sets clustering is proposed. The nodes to be classified are represented as an undirected edge-weighted graph \(G=(V,E,\omega )\).

Since only the pixels around the connecting point are taken into account, we have \(|V|\le 5\). The edge set \({E\subseteq V\times V}\) contains all the possible connections, and \(\omega : E\rightarrow \mathbb {R}^{*}_+\) is the positive weight function. The graph G is represented by a symmetric weighted adjacency matrix \(A=(a_{ij})\). This non-negative matrix is defined as:

$$\begin{aligned} {a_{ij}}=\left\{ \begin{array}{ll} \omega (i,j) &{} \text {, if }(i,j)\in E \wedge i \ne j\\ {0} &{} \text {, otherwise.}\\ \end{array}\right. \end{aligned}$$
(1)
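As an illustration of Eqn. (1), the following sketch builds the symmetric, zero-diagonal matrix A from a dictionary of edge weights. The function name `build_adjacency` and the example weights are assumptions made for this illustration; how the weights are derived from the vessel features is not specified at this point in the text.

```python
def build_adjacency(n, weights):
    """Build the weighted adjacency matrix of Eqn. (1).

    n       -- number of nodes |V|
    weights -- dict mapping unordered node pairs (i, j) to positive weights
    """
    A = [[0.0] * n for _ in range(n)]
    for (i, j), w in weights.items():
        if i == j:
            continue  # Eqn. (1): the diagonal stays zero
        if w <= 0:
            raise ValueError("edge weights must be positive")
        A[i][j] = w
        A[j][i] = w  # undirected graph, so A is symmetric
    return A


# Hypothetical degree-4 node: two strongly similar pairs and one weak link.
A = build_adjacency(4, {(0, 1): 0.9, (2, 3): 0.8, (0, 2): 0.1})
for row in A:
    print(row)
```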

The concept of a dominant set is similar to that of a maximum clique. In an undirected edge-weighted graph, the weights of edges within a dominant set should be large, representing high internal homogeneity or similarity, while the weights of edges linking the dominant set to nodes outside it should be small [13]. Let \(S \subseteq V\) be a nonempty subset of nodes, \(i\in S\), and \(j\notin S\). Intuitively, the similarity between nodes j and i, relative to the average similarity between i and the nodes of S, can be defined as:

$$\begin{aligned} \phi _S(i,j)=a_{ij}-\frac{1}{|S|}\sum _{k\in S}{a_{ik}} \end{aligned}$$
(2)

It is worth noting that \(\phi _S(i,j)\) can be either positive or negative. The term \(\frac{1}{|S|}\sum _{k\in S}{a_{ik}}\) is the average weighted degree of i with regard to S. Because \(a_{ii}=0\), this term vanishes for \(S=\{i\}\), hence \(\phi _{\{i\}}(i,j)=a_{ij}\). For each node \(i\in S\), the weight of i with regard to S is assigned as:

$$\begin{aligned} {\omega _S(i)}=\left\{ \begin{array}{ll} {1} &{} \text {if } |S|=1\\ \sum \nolimits _{j\in S\setminus {\{i\}}}{ \phi _{S\setminus {\{i\}} } (j,i)\omega _{S\setminus {\{i\}}} (j) }&{} \text {otherwise.}\\ \end{array}\right. \end{aligned}$$
(3)

where \(S\setminus {\{i\}}\) indicates the node set S excluding the node i. \({\omega _S(i)}\) measures the overall similarity between node i and the nodes of \(S\setminus {\{i\}}\).
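The recursion in Eqns. (2) and (3) can be sketched directly in Python for the small graphs considered here (\(|V|\le 5\)). This is a minimal illustration rather than the authors' implementation: the helper names and the 3-node example matrix are assumptions, and memoization via `functools.lru_cache` is a convenience choice.

```python
from functools import lru_cache


def make_weight_functions(A):
    """Return (phi, w) implementing Eqns. (2) and (3) for adjacency matrix A."""

    def awdeg(S, i):
        # Average weighted degree of node i with regard to S.
        return sum(A[i][k] for k in S) / len(S)

    def phi(S, i, j):
        # Eqn. (2): similarity of j to i relative to i's average in S.
        return A[i][j] - awdeg(S, i)

    @lru_cache(maxsize=None)
    def w(S, i):
        # Eqn. (3): S is a frozenset of node indices that contains i.
        if len(S) == 1:
            return 1.0
        R = S - {i}
        return sum(phi(R, j, i) * w(R, j) for j in R)

    return phi, w


# Hypothetical example: nodes 0 and 1 are similar, node 2 is an outsider.
A = [
    [0.0, 0.9, 0.1],
    [0.9, 0.0, 0.2],
    [0.1, 0.2, 0.0],
]
phi, w = make_weight_functions(A)
w_pair = w(frozenset({0, 1}), 0)         # equals a_01 for a two-node set
w_outsider = w(frozenset({0, 1, 2}), 2)  # negative: node 2 does not fit {0, 1}
print(w_pair, w_outsider)
```

In this example the weight of node 2 with regard to {0, 1, 2} is negative, reflecting that node 2 is less similar to nodes 0 and 1 than they are to each other.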

A nonempty subset of nodes \(S\subseteq V\) is called a \(\textit{dominant set}\) if it satisfies the following two conditions: (a) \(\omega _S(i)>0, \text {for all } i\in S\); and (b) \(\omega _{S\cup \{i\}}(i)<0, \text {for all } i\notin S\) [11]. Condition (a) expresses the internal homogeneity of a dominant set, whereas condition (b) expresses its external incoherence. One can find a dominant set by first localizing a solution of the program:

$$\begin{aligned} \begin{array}{ll} \mathrm {maximize} &{} f(\mathbf {x})=\mathbf {x}'A\mathbf {x} \\ \mathrm {subject\, to} &{} \mathbf {x}\in \varDelta \\ \end{array} \end{aligned}$$
(4)

where \(\mathbf {x}'\) denotes the transpose of \(\mathbf {x}\), and \(\varDelta \subset \mathbb {R}^{|V|}\) is the standard simplex

$$\begin{aligned} \varDelta =\Big \{\mathbf {x}\in \mathbb {R}^{|V|}: \sum _{k=1}^{|V|}x_k=1, \text { and }\, x_k \ge 0 \, \text { for all } \, k=1,\ldots ,|V|\Big \} \end{aligned}$$

A strict local solution \(\mathbf {x}\) of the standard quadratic program in Eqn. (4) indicates a dominant set S of G, where \(x_k > 0\) means that the corresponding node \(i_k \in S\). As suggested in [10, 11], an effective optimization approach for solving Eqn. (4) is given by the so-called replicator dynamics:

$$\begin{aligned} x_k^{(t+1)} = x_k^{(t)} \frac{{(A\mathbf {x}^{(t)})}_k}{{\mathbf {x}^{(t)}}'A\mathbf {x}^{(t)}}, \end{aligned}$$
(5)

where \(k=1,2,\cdots ,|V|\). It has been proven that for any initialization \(\mathbf {x} \in \varDelta \), the trajectory remains in \(\varDelta \) as the iteration t increases. Since A is symmetric, the objective function \(f(\mathbf {x})\) in Eqn. (4) is either strictly increasing along the trajectory or constant. In practice, the stopping criterion of Eqn. (5) can be set either as a maximal number of iterations or as a minimal increment of \(f(\mathbf {x})\) over two consecutive iterations.
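The iteration in Eqn. (5), together with the two stopping criteria just mentioned, can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the starting point is the barycentre of the simplex, and the function names, the tolerance, the support threshold and the example matrix are all hypothetical choices.

```python
def quadratic_form(A, x):
    """Objective of Eqn. (4): f(x) = x' A x."""
    n = len(x)
    return sum(x[i] * A[i][j] * x[j] for i in range(n) for j in range(n))


def replicator_dynamics(A, max_iter=1000, tol=1e-12):
    """Iterate Eqn. (5) from the barycentre of the simplex.

    Stops after max_iter iterations or when f(x) increases by less than tol.
    Returns the final point x and the objective value f(x).
    """
    n = len(A)
    x = [1.0 / n] * n
    f = quadratic_form(A, x)
    for _ in range(max_iter):
        if f <= 0.0:
            break  # no positive weights: the update in Eqn. (5) is undefined
        Ax = [sum(A[k][j] * x[j] for j in range(n)) for k in range(n)]
        x = [x[k] * Ax[k] / f for k in range(n)]
        f_new = quadratic_form(A, x)
        converged = f_new - f < tol
        f = f_new
        if converged:
            break
    return x, f


# Hypothetical 4-node graph: {0, 1} and {2, 3} are internally similar.
A = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.8],
    [0.1, 0.1, 0.8, 0.0],
]
x, f = replicator_dynamics(A)
support = [k for k, v in enumerate(x) if v > 1e-6]  # nodes with x_k > 0
print(support, round(f, 4))
```

In floating-point arithmetic the components of nodes outside the dominant set decay towards zero without reaching it exactly, so the support is read off with a small threshold rather than a strict comparison with zero.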

In the reconstruction of a vascular network topology, dominant set clustering is well suited to identifying branches of the vascular tree at nodes whose degree is above 3. In general, the weights of edges within a vessel segment should be large, representing high internal homogeneity or similarity. By contrast, the weights of edges between two or more different vessel segments will be small, because the edges connecting the vessel ends represent high inhomogeneity [10]. Intuitively, vessel branches can be identified by finding the most “dominant” vessel branch first, then the second most “dominant” vessel branch, and so on. Therefore, dominant set clustering is adopted in this step to determine the most “dominant” vessel branch pixels around each connecting point and assign them to one vessel segment. The remaining pixels are then assigned to the other vessel segment. In practice, a feature vector of 23 features is derived for each vessel segment to generate the symmetric matrix A; these features are listed in Table 1.
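The splitting step at a crossing can be sketched as follows: the first dominant set is extracted and the remaining segments are assigned to the other vessel. Several choices here are assumptions made only for illustration: the similarity is a Gaussian kernel on the Euclidean distance between feature vectors (the text does not state how A is computed from the features), the two-element feature vectors stand in for the 23 features of Table 1, and all names and values are hypothetical.

```python
import math


def similarity_matrix(features, sigma=0.5):
    """Assumed similarity: a_ij = exp(-||f_i - f_j||^2 / (2 sigma^2)), a_ii = 0."""
    n = len(features)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
            A[i][j] = A[j][i] = math.exp(-d2 / (2.0 * sigma ** 2))
    return A


def dominant_set(A, iters=500, eps=1e-6):
    """Run the replicator dynamics of Eqn. (5) and return the support of x."""
    n = len(A)
    x = [1.0 / n] * n
    for _ in range(iters):
        Ax = [sum(A[k][j] * x[j] for j in range(n)) for k in range(n)]
        f = sum(x[k] * Ax[k] for k in range(n))
        if f <= 0.0:
            break
        x = [x[k] * Ax[k] / f for k in range(n)]
    return [k for k in range(n) if x[k] > eps]


def split_at_junction(features, sigma=0.5):
    """Assign the most dominant group to one vessel, the rest to the other."""
    A = similarity_matrix(features, sigma)
    first = dominant_set(A)
    rest = [k for k in range(len(features)) if k not in first]
    return first, rest


# Hypothetical degree-4 crossing; each row is (mean width, mean green intensity).
segments = [
    (3.0, 0.60),  # segment 0
    (1.5, 0.40),  # segment 1
    (3.0, 0.62),  # segment 2
    (1.8, 0.42),  # segment 3
]
first, rest = split_at_junction(segments)
print(first, rest)
```

With these toy features, segments 0 and 2 form the most coherent pair and are grouped as one vessel passing through the crossing, while segments 1 and 3 are assigned to the other.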

2.3 Artery/Vein Classification

After the vessel topology has been estimated, the complete vessel network is separated into several subgraphs with individual labels. The final goal is to assign each of these labels to one of two classes: artery or vein. Again, the features listed in Table 1 and the DoS classifier are used to classify the vessel pixels of these subgraphs into two clusters, A and B. For each subgraph v, the probability of its being A is computed from the number of vessel pixels classified by DoS as A: \(P^v_A=n_A^v/(n^v_A+n^v_B)\), where \(n^v_A\) is the number of pixels classified as A, and \(n^v_B\) is the number of pixels classified as B. For each subgraph, the higher probability determines whether the subgraph is assigned to cluster A or B. Clusters A and B are then labeled as artery and vein based on their average intensity in the green channel: the cluster with the higher average intensity is classified as artery and the other as vein.
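The voting rule and the green-channel assignment can be sketched as follows. The data layout (a dictionary of per-pixel cluster labels for each subgraph), the function name, and the tie-breaking rule (a probability of exactly 0.5 goes to cluster A) are assumptions for illustration; the text does not specify how ties are handled.

```python
def classify_subgraphs(pixel_labels, green_mean):
    """Label each subgraph as artery or vein.

    pixel_labels -- dict: subgraph id -> list of per-pixel cluster labels 'A'/'B'
    green_mean   -- dict: {'A': mean green intensity, 'B': mean green intensity}
    Returns dict: subgraph id -> (vessel type, P_A).
    """
    # The cluster with the higher average green intensity is taken as artery.
    artery_cluster = 'A' if green_mean['A'] >= green_mean['B'] else 'B'
    result = {}
    for v, labels in pixel_labels.items():
        n_a = labels.count('A')
        n_b = labels.count('B')
        p_a = n_a / (n_a + n_b)             # P_A^v = n_A / (n_A + n_B)
        winner = 'A' if p_a >= 0.5 else 'B'  # assumed tie-break: ties go to A
        vessel_type = 'artery' if winner == artery_cluster else 'vein'
        result[v] = (vessel_type, p_a)
    return result


# Hypothetical per-pixel DoS outputs for two subgraphs.
pixel_labels = {
    'v1': ['A', 'A', 'A', 'B'],
    'v2': ['B', 'B', 'A', 'B', 'B'],
}
result = classify_subgraphs(pixel_labels, {'A': 0.55, 'B': 0.42})
print(result)
```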

Table 1. List of feature vectors for classification.

3 Experimental Results

The proposed topology estimation and A/V classification method was evaluated on three publicly available datasets: INSPIRE [14], AV-DRIVE [15], and VICAVR [16]. All of these datasets have manual annotations on A/V classification, but no manual annotations of vessel topology were made on these datasets. Therefore, an expert was asked to manually label the topological information of the vascular structure on all the images from these datasets. Each vessel tree is marked with a distinct color, as shown in the second column of Fig. 2.

Fig. 2. Examples of vascular topology estimation performance. From left to right: original image; manual annotations; results from the proposed topology estimation; and the highlighted correct and incorrect connections.

Fig. 3. A/V classification results on three different datasets. From left to right: original image; vessel topology; A/V classification results of the proposed method; and corresponding manual annotations.

3.1 Topology Estimation

The two right-hand columns of Fig. 2 illustrate the results of our vascular topology estimation method. Compared with the manual annotations shown in the second column of Fig. 2, visual inspection shows that our method traces most vascular structures correctly: only a few crossing points were incorrectly traced, as shown in the last column of Fig. 2, where the pink squares indicate the incorrectly traced significant points.

To quantify the performance of the proposed method, we computed the percentage of the relevant significant points (connecting, bifurcation, and crossing points) that were correctly identified (average accuracy): 91.5%, 92.8%, and 88.9% on INSPIRE, AV-DRIVE, and VICAVR, respectively.

3.2 A/V Classification

Figure 3 shows the A/V classification performance of the DoS classifier on sample images, based on their topological information. Overall, our proposed method correctly distinguished most of the A/V labels on all three datasets when compared with the corresponding manual annotations. Table 2 compares our method with state-of-the-art methods on the three datasets in terms of pixel-wise sensitivity (Se), specificity (Sp), and accuracy (Acc). Our method outperforms all the compared methods on all datasets, except that its Sp score on the AV-DRIVE dataset is \(0.2\%\) lower than that of [5].
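For reference, pixel-wise Se, Sp and Acc can be computed from predicted and reference labels as in the sketch below. Treating arteries as the positive class is an assumption (the text does not state which class is positive), and the function name and toy labels are hypothetical.

```python
def se_sp_acc(predicted, reference, positive='artery'):
    """Return (sensitivity, specificity, accuracy) for pixel-wise labels.

    The positive class defaults to 'artery'; this is an assumed convention.
    """
    tp = tn = fp = fn = 0
    for p, r in zip(predicted, reference):
        if r == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    se = tp / (tp + fn)                    # true positive rate
    sp = tn / (tn + fp)                    # true negative rate
    acc = (tp + tn) / (tp + tn + fp + fn)  # overall agreement
    return se, sp, acc


# Toy labels for eight vessel pixels (illustrative values only).
predicted = ['artery', 'artery', 'vein', 'vein', 'artery', 'vein', 'vein', 'artery']
reference = ['artery', 'artery', 'vein', 'artery', 'artery', 'vein', 'vein', 'vein']
se, sp, acc = se_sp_acc(predicted, reference)
print(se, sp, acc)
```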

Table 2. Performances of different A/V classification methods on three datasets.
Table 3. Classification results by different classifiers on INSPIRE dataset.

To highlight the relative performance of our DoS classifier, we also employed commonly used classifiers, namely linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), support vector machine (SVM) and k-nearest neighbor (kNN), for A/V classification based on the topology-assigned structures derived from images in the INSPIRE dataset, using the same feature vectors as listed in Table 1. It can be seen from Table 3 that our method outperforms the compared classification methods.

4 Conclusions

Development of the proposed framework was motivated by medical demands for a tool to measure vascular changes from the retinal vessel network. In this paper, we have proposed a novel artery/vein classification method based on vascular topological characteristics. We utilized the underlying vessel topology to better distinguish arteries from veins. The concept of dominant set clustering was adapted and formalized for topology estimation and A/V classification, as a pairwise clustering problem. The proposed method accurately classified the vessel types on three publicly accessible retinal datasets, outperforming several existing methods. The significance of our method is that it is capable of classifying the whole vascular network, and does not restrict itself to specific regions of interest. Future work will focus on the AVR calculation based on the proposed methodology.