1 Introduction and Motivation

The eye is the only organ allowing direct, non-invasive and inexpensive observation of a rich portion of the human microvasculature. Ophthalmoscopic instruments and techniques nowadays include fundus cameras, optical coherence tomography (OCT), scanning laser ophthalmoscopes, ultra-widefield angiography, autofluorescence and OCT-angiography. Fundus camera imaging remains the most common modality, given its use in many decades of clinical practice and research. A number of software packages have been developed to quantify the morphometry of the retinal vasculature efficiently in large numbers of images, e.g. IVAN [22], SIVA [23], QUARTZ [24] and VAMPIRE [25]. Coupled with the increasing availability of cross-linked clinical data repositories, these tools have enabled a plethora of studies on retinal vascular biomarkers for a variety of conditions, including diabetes, stroke, dementia, and cardiovascular disease. Commonly adopted morphometric vascular parameters include the central retinal arteriolar and venular equivalents (CRAE, CRVE) and their ratio, the arterio-venous ratio (AVR), all of which summarize measures of vessel calibers around the optic disc (OD); measures of tortuosity; bifurcation coefficients; and the fractal dimension (FD), which assesses the complexity of the vascular network. Details of such measurements can be found in many publications; see e.g. MacGillivray et al. [1] for an introduction.

A crucial assumption of biomarker studies is that the retinal vascular measurements are accurate, consistent and reliable. Accuracy, consistency and reliability depend, in turn, on a considerable number of factors [2, 3], not all of which are easily controllable. The effects of several of these factors have been reported in a few studies (Sect. 2). In this paper, we focus on a specific and little-investigated factor, the centering of fundus images. Centering is commonly of two types: on the macula, named Type I, or on the OD, named Type II (Fig. 1). Clinical protocols require one or both types, for one eye or both (left and right), depending on the pathology of interest. Crucially for our discussion, a standard image with a field of view (FOV) of approximately 30°–45° centered on the macula does not capture vessels in the nasal quadrants (Fig. 1) that would be visible in an OD-centered image. Many studies on retinal vascular biomarkers have drawn on existing clinical repositories of retinal images, but it is not always specified whether images of different types have been analyzed separately. To the best of our knowledge, no reports exist on the quantitative difference in retinal fundus measurements of the same eye induced by different centering. This is what we present in this paper, contributing to the body of studies summarized in the next section.

Fig. 1. Illustration of quadrants and retinal coordinates centered on the OD and circular zones used to compute retinal measurements in a left eye (left). The x axis goes through the estimated centers of OD and macula. Examples of Type I (center) and Type II (right) images of right eyes; Type I (macula-centered) images miss parts of the vasculature in nasal quadrants (Q3, Q4).

2 Related Work

The consistency and repeatability of retinal vascular measurements in fundus images, on which we focus (other imaging modalities have been considered, e.g. OCT [5]), have been investigated only sparsely. The examples below, far from a systematic review, highlight concerns raised in the last decade. Note that the cohorts, sample sizes and statistical methods used in these studies themselves vary widely.

Chandler et al. [4] studied the variations of CRAE and CRVE, measured with IVAN in 30 fundus camera images (3 to 5 Mpixels from various cameras) of 3 subjects, applying systematically increasing blur. They found significant caliber broadening in blurred images, and almost twice as much on average for CRAE (20 μm) as for CRVE (10 μm). Lim et al. [6] analyzed the effect of variations of axial length and myopic refractive errors on the retinal vasculature in an Asian diabetic population (n = 2,882, Singapore Malay Eye Study), finding narrower retinal arterioles and venules, less tortuous arterioles, and increased branching coefficients in both arterioles and venules depending on axial length and refractive errors.

Knudtson et al. [7] analyzed the effect of the pulse cycle on width estimates in fundus images of 30 subjects. The retinal vessel diameters of one large arteriole/venule and one small arteriole/venule were measured by two trained graders. Results showed that the width of large retinal venules was less variable than that of arterioles. A related, more recent study was reported by Hao et al. [13].

Motivated by conflicting findings in the literature on the association of diseases with the FD, Huang et al. [8] performed a stability analysis of three FD measures (box, information and correlation coefficient) against different vessel segmentations from human annotators, automatic segmentation methods, threshold values, and regions of interest. Using 20 images from DRIVE [9], the authors observed substantial variations, leading them to recommend the use of vessel probability maps directly in biomarker studies [10]. FD stability was also studied by Wainwright et al. [11] against variations of image quality, color, and format in a set of 30 images from the Blue Mountains Eye Study (9.6 Mb, 3,888 × 2,595 pixels), processed with IRIS-Fractal software. Simulated degradations resulted in significant variations of the FD coefficient.
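For readers unfamiliar with the box-counting FD mentioned above, the following minimal sketch illustrates the general idea: the number of boxes occupied by vessel pixels is counted at several box sizes, and the FD is estimated as the slope of log(count) against log(1/size). This is a generic illustration under our own assumptions, not the implementation used by Huang et al. [8], IRIS-Fractal or VAMPIRE; the function name `box_counting_dimension`, the representation of a segmentation as a set of pixel coordinates, and the choice of box sizes are all hypothetical.

```python
import math


def box_counting_dimension(pixels, box_sizes):
    """Estimate the box-counting dimension of a set of (x, y) pixel coordinates.

    pixels: iterable of integer (x, y) tuples marking vessel pixels (assumed input format).
    box_sizes: box side lengths in pixels, e.g. [1, 2, 4, 8, 16] (assumed choice).
    """
    log_inv_size, log_count = [], []
    for s in box_sizes:
        # A box is identified by the integer grid cell that contains the pixel.
        occupied = {(x // s, y // s) for x, y in pixels}
        log_inv_size.append(math.log(1.0 / s))
        log_count.append(math.log(len(occupied)))

    # Ordinary least-squares slope of log(count) versus log(1/size).
    n = len(box_sizes)
    mx = sum(log_inv_size) / n
    my = sum(log_count) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(log_inv_size, log_count))
    sxx = sum((a - mx) ** 2 for a in log_inv_size)
    return sxy / sxx


# Sanity checks on synthetic shapes: a straight line and a filled square.
line = {(x, 0) for x in range(64)}
square = {(x, y) for x in range(64) for y in range(64)}
sizes = [1, 2, 4, 8, 16]
print(box_counting_dimension(line, sizes))    # close to 1
print(box_counting_dimension(square, sizes))  # close to 2
```

Because the estimate depends directly on which pixels are labeled as vessel, changes in segmentation, threshold or region of interest alter the box counts and hence the fitted slope, which is consistent with the sensitivity reported in the studies above.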

Yip et al. [21] compared vessel width measurements (CRAE, CRVE) from two semi-automatic software applications, SIVA and IVAN, with 200 fundus camera images from the Singapore Chinese Eye Study. They found only moderate associations (ICC ~0.5; for ICC see Sect. 3, Analysis) and discordant associations with body mass index and arterial blood pressure. The same authors report similar results in [22], including the RA application in the comparison. Recently, McGrory et al. [3] reported a similar comparison of SIVA and VAMPIRE with 655 images of participants in the Lothian Birth Cohort 1936 studies. They found ICC values indicating poor to limited agreement for all retinal parameters (0.159–0.410), but consistent associations with systemic variables relating to blood pressure, as well as significant differences in the magnitude of association between retinal and systemic variables for 7 of 77 comparisons. We omit some reports of small-scale comparisons between software applications for reasons of space.

Other authors analyzed morphometric parameter variations induced by further factors (see [3] and references therein), including image resolution, operators, and fundus camera makes and models.

In summary, several independent authors have measured considerable variability of retinal vascular measurements. Understanding and reducing such variability seems crucial, as statistical associations within biomarker studies rely on accurate and consistent measurements. To the best of our knowledge, we contribute the first quantitative pilot study on the effect of fundus image centering, an important part of any imaging protocol.

3 Materials and Methods

Data Set.

Four fundus camera images of each of 20 subjects (2 per eye, one macula-centered and one OD-centered; 80 images in total) were sourced from the Edinburgh Type 2 Diabetes Study (ET2DS), a population-based cohort study designed to investigate potentially modifiable risk factors for cognitive decrements in type 2 diabetes [12]. Images were acquired with a TOPCON TRC-50FX digital fundus camera at 35° FOV after pupil dilation using 1% tropicamide. Ethical approval for the ET2DS was granted by the Lothian Research Ethics Committee, and written informed consent was obtained from all participants; see Prince et al. [26] for details on the recruitment protocol. The images did not present diabetic lesions that would interfere with the detection and quantification of the vasculature, and were hence considered a suitable sample for our purposes.

Retinal Measurements.

All images were measured by a trained operator (author [1]) with VAMPIRE 3.1 (Universities of Dundee and Edinburgh), obtained from its authors [1, 3], following a standard protocol (Footnote 1). For each image, VAMPIRE computes 151 measurements (Sect. 1) and their basic statistics (mean, median, standard deviation, max, min). Measurements are computed by vessel type (arteriole or venule), by region (zone, whole image, quadrant) and by vessel (path, generation). We considered the 149 measures describing vessel morphology: 39 widths and functions thereof (e.g. CRAE, CRVE, AVR, basic statistics, width gradients, different width estimation algorithms by artery and vein, average length-diameter ratio at branching points), 104 tortuosity measurements, and 6 FD coefficients (3 per vessel network type, arterial or venous).

Analysis.

Two-way mixed model intraclass correlation coefficients (ICC) were computed to evaluate the extent of correspondence between two measurements (e.g., right eye versus left eye, or OD-centered versus macula-centered) of the same parameter (e.g., CRAE). The ICC quantifies this agreement, combining a measure of correlation with a test of the difference in means, correcting for systematic bias and agreement based on chance alone. ICCs are thought to be more appropriate than Pearson’s r for assessing whether two ways of measuring a quantitative parameter provide similar results, as Pearson’s r only measures the extent to which two variables are linearly dependent [14]. Method-comparison studies have demonstrated that a perfect linear relationship does not necessarily reflect good or even moderate agreement as measured by ICC [15, 16]. ICC results are usually interpreted as follows: 0.00–0.49 = poor, 0.50–0.74 = moderate, and 0.75–1.00 = excellent [16]. Single-measure coefficients and 95% confidence intervals (CI) as well as correlations (raw, uncorrected Pearson’s r values) were also computed. Tortuosity measurements were log-transformed to improve their distributions, which were positively skewed, as done elsewhere [3]. ICCs and Pearson’s correlations were used to examine agreement between macula- and OD-centered images (right and left eye separately). In addition, we analyzed measurement symmetry between right and left eye (macula- and OD-centered images separately).
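To make the distinction between correlation and agreement concrete, the following minimal sketch computes Pearson’s r and two single-measure, two-way ICC variants for paired measurements. It is an illustration under stated assumptions rather than the analysis code used in this study: the text above does not specify whether the consistency or the absolute-agreement definition was used, so both are shown; the function names and the toy values are hypothetical, and confidence intervals and p-values are not computed.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def icc_two_way_single(x, y, kind="consistency"):
    """Single-measure, two-way ICC for k = 2 measurements per subject.

    kind="consistency": (MSR - MSE) / (MSR + (k - 1) * MSE)
    kind="agreement":   (MSR - MSE) / (MSR + (k - 1) * MSE + k / n * (MSC - MSE))
    The choice of variant is an assumption of this sketch.
    """
    n, k = len(x), 2
    values = list(x) + list(y)
    grand = sum(values) / (n * k)
    row_means = [(a + b) / k for a, b in zip(x, y)]   # one row per subject
    col_means = [sum(x) / n, sum(y) / n]              # one column per centering

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in values)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                # between-subjects mean square
    msc = ss_cols / (k - 1)                # between-measurements mean square
    mse = ss_err / ((n - 1) * (k - 1))     # residual mean square

    if kind == "consistency":
        return (msr - mse) / (msr + (k - 1) * mse)
    if kind == "agreement":
        return (msr - mse) / (msr + (k - 1) * mse + k / n * (msc - mse))
    raise ValueError("kind must be 'consistency' or 'agreement'")


def interpret_icc(icc):
    """Interpretation bands quoted in the text [16]."""
    if icc < 0.50:
        return "poor"
    if icc < 0.75:
        return "moderate"
    return "excellent"


# Toy values (not study data): the second measurement is the first plus a constant offset.
od_centered = [1.0, 2.0, 3.0, 4.0, 5.0]
macula_centered = [3.0, 4.0, 5.0, 6.0, 7.0]
print(pearson_r(od_centered, macula_centered))                        # 1.0
print(icc_two_way_single(od_centered, macula_centered, "agreement"))  # 5/9, about 0.556
```

In this toy case the two series are perfectly linearly related (r = 1), yet the absolute-agreement ICC is only about 0.56 because of the constant offset, which illustrates why a perfect linear relationship need not imply good agreement. For positively skewed quantities such as tortuosity, the same functions could be applied after a log transform, e.g. `[math.log(t) for t in values]`.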

4 Results

Full result tables are reported in the supplementary material (Footnote 2) and summarized here. Following a well-established protocol (based on [17] and subsequent developments), VAMPIRE requires a minimum number of vessels visible in Zones B and C. These are, however, only partially visible in macula-centered images (e.g. Fig. 1, center), as nasal quadrants are minimally or not at all visible, leading either to higher rejection rates than in OD-centered ones (not enough vessels) or to an analysis based on fewer vessels in fewer quadrants. Five macula-centered images of the right eye had major AVR segments missing in Q2, and one image in Q1. Similarly, three macula-centered images of the left eye had AVR segments missing in Q2. All four quadrants were visible in all OD-centered images (right and left eye).

In the right eye, 5 width-related measurements (of 39) showed at least moderate correlation, association and significance (defined for our purposes as r > 0.5, ICC > 0.6, p < 0.1) between OD- and macula-centered images, including CRAE, CRVE, the arterial average length-diameter ratio (LDR) in Zone C and the width gradient of the main artery in Q2. Between 13 and 20 images supported these computations for OD- and macula-centered images. In the left eye, only 3 width-related measures satisfied our conditions: CRAE (but not CRVE), AVR (not found in the right eye), and the venular (not arterial) LDR, with 12 to 20 images supporting the computation. Only the CRAE and arterial LDR satisfied our conditions in both eyes. For additional illustration, Bland-Altman graphs of two measures (right eyes) are shown in Fig. 2.
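The screening rule stated above (r > 0.5, ICC > 0.6, p < 0.1, then intersecting the two eyes) can be sketched as follows. This is only an illustration of the rule as worded in the text; the function names, the data layout and the placeholder values are hypothetical and are not results of this study.

```python
def meets_criteria(r, icc, p, r_min=0.5, icc_min=0.6, p_max=0.1):
    """True if a measure shows at least moderate correlation, association and significance."""
    return r > r_min and icc > icc_min and p < p_max


def consistent_in_both_eyes(right, left):
    """Names of measures that satisfy the criteria in both the right and the left eye.

    right, left: dicts mapping a measure name to a (r, icc, p) tuple (assumed layout).
    """
    shared = right.keys() & left.keys()
    return sorted(
        name for name in shared
        if meets_criteria(*right[name]) and meets_criteria(*left[name])
    )


# Placeholder values for illustration only (not taken from the study).
right_eye = {"measure_a": (0.70, 0.65, 0.01), "measure_b": (0.80, 0.75, 0.02)}
left_eye = {"measure_a": (0.60, 0.62, 0.05), "measure_b": (0.40, 0.70, 0.03)}
print(consistent_in_both_eyes(right_eye, left_eye))  # ['measure_a']
```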

Fig. 2. Bland-Altman plots visualizing, for illustration, the association of venular tortuosity in Zone C (left) and AVR (right) between OD- and macula-centered images (right eyes).
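As a reminder of what a Bland-Altman plot displays, the sketch below computes its ingredients for paired measurements: the per-subject means (horizontal axis), the per-subject differences (vertical axis), the mean difference (bias) and the conventional 95% limits of agreement, bias ± 1.96 standard deviations of the differences. The function name and the toy values are hypothetical and unrelated to the data shown in the figure.

```python
import statistics


def bland_altman(a, b):
    """Return (means, diffs, bias, (lower_limit, upper_limit)) for paired measurements."""
    means = [(p + q) / 2 for p, q in zip(a, b)]
    diffs = [p - q for p, q in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return means, diffs, bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Toy paired values (not study data).
od_centered = [10.0, 12.0, 14.0, 16.0]
macula_centered = [9.0, 12.0, 13.0, 16.0]
means, diffs, bias, limits = bland_altman(od_centered, macula_centered)
print(bias)    # 0.5
print(limits)  # limits of agreement around the bias
```

Plotting `diffs` against `means`, with horizontal lines at the bias and at the two limits, gives the usual Bland-Altman display.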

Only 17 tortuosity measures (of 104) in the right eyes and 20 in the left eyes showed at least moderate correlation and ICC (defined as above) between OD- and macula-centered images. Of these, only 10 satisfied our conditions in both eyes: 8 arterial and 2 venular tortuosity measures, including 7 taken in Q1 and mean arterial tortuosity in Zone C. For the 6 FD measures (3 for arteries, 3 for veins), only 2 of 20 images of the right eye supported full computation, leading to excellent but obviously not significant correlation and association; in the left eyes, by contrast, all 6 measures could be computed on the full set (20 images). Good and significant correlation (r ~0.7, p < 0.01) and moderate association (ICC ~0.7) were found for arterial measures only.

5 Discussion

Number of Images and Vessels Measured.

The absence or large occlusion of the nasal quadrants in macula-centered images implies higher image rejection rates (not enough vessels) or a smaller number of vessels contributing to measurements compared to OD-centered images. Results show that values for the same eye vary between the two cases. What we can say about this variation (in our sample) is summarized below.

Width-Related Parameters.

Given our results (Sect. 4), even imposing minimal requirements on r, ICC and significance, the effect of considering different sets of vessels for CRAE/CRVE and AVR calculations is considerable. This is supported by results reported by Heitmar et al. [18] in a related analysis.

Tortuosity.

Again, discrepancies between Type I and Type II images seem strong. In our sample, only 10 tortuosity measures satisfied our requirements simultaneously in both eyes. We note that tortuosity values tend to be very small numbers, hence the numerical stability [19] of calculations involving them must be considered carefully.

Fractal Dimension.

There was a marked discrepancy between OD- and macula-centered images, with venular measures missing altogether in the right eye due to the exclusion of too many macula-centered images. Again, this suggests that omitting substantial parts of the nasal quadrants induces substantial changes in FD measures compared to OD-centered images. This supports related findings and concerns by Huang et al. [8] on the stability of the FD of the retinal vasculature.

Symmetry.

The right-left symmetry of morphometric measurements of the vascular network remains an object of study [20]. Our pilot strengthens the hypothesis that good symmetry levels must not be taken for granted. For instance, CRAE, CRVE, and AVR were poorly and not significantly correlated in macula-centered measurements of either eye; in the OD-centered images, however, CRAE and CRVE showed strong correlation (r between 0.837 and 0.859, p < 0.001) and excellent agreement (ICC values between 0.837 and 0.857, 95% CI), and good, significant correlation (r = 0.673, p < 0.01) was obtained for AVR. Similar discrepancies were found for tortuosity and FD (details omitted for conciseness).

6 Conclusions

To the best of our knowledge, we have reported the first pilot study on the quantitative changes in retinal measurements commonly used in retinal biomarker studies induced by centering fundus image acquisition on the OD or on the macula. Our results suggest that different centering induces substantial differences. The main risk is that this could lead to fragile statistical conclusions in biomarker studies. Such studies should, ideally, consider both centering types and discuss the differences in associations for Type I and Type II images separately.

The main limitation of our pilot is the modest number of images and subjects (80 images, n = 20), which is nevertheless larger than those in published reports on related topics (Sect. 2). We note that the question of which statistical analysis methods are resilient to which levels of uncertainty and error itself requires attention. A second limitation is the use of only two statistics (r and ICC), albeit commonly used ones. We plan to extend our analysis to larger samples from independent populations to better understand the effects of centering on morphometric vascular measurements in the retina.

Ultimately, the many aspects of a protocol for reliable biomarker studies, of which centering is only one, require in our view an international collaborative standardization effort, which we strongly encourage.