1 Introduction

Individual motion contributions of the cervical vertebrae provide valuable information about natural neck movement and reveal abnormalities associated with spinal injuries or medical conditions [1]. Digital videofluoroscopy is an imaging modality that allows real-time in vivo analysis of unrestricted cervical motion, which is not possible with static radiographic images. Cervical range of motion has been investigated in whiplash-associated disorders [1], in neck pain [1, 2], as well as in healthy subjects [1, 3, 4], and it has been shown to be significantly decreased in whiplash and neck pain [1]. New evidence indicates that the cervical joints' contributions to the range of motion, previously thought to be regular and continuous [3], in fact prove to be opposite to the direction of movement [4, 5, 6], and that the vertebral motion patterns to and from the end ranges of movement are not mirror images of each other [5, 7]. Therefore, a rotational and translational cervical motion analysis is of considerable importance.

Motion analysis of the cervical spine requires an annotation of landmarks on vertebral corners [3]. The majority of studies analyzing cervical joint motion employed manual and semi-automated approaches for landmark annotations [3, 4, 6, 8, 9, 11]. Manual methods have been shown to be highly reliable [4, 6, 12], but also time-consuming, and thus impractical for large data analysis. Automatic vertebral tracking studies have used template matching [11, 13], Active Appearance Models [14,15,16], or feature tracking algorithms [17]. However, these methods still require manual identification of vertebral landmarks in the first frames of the videos. Fully automatic landmark identification has been successful in the lumbar spine [8, 10, 18, 19], due to the larger size and better visibility of the vertebral bodies, or when using imaging modalities providing higher contrast and spatial resolution, such as Computed Tomography [15, 20, 21] or X-ray [14, 16, 22]. Nonetheless, these approaches have not been successful when applied to the cervical vertebrae in fluoroscopic images, due to their smaller size, small field-of-view, lower image quality, and considerable presence of motion blur.

In this paper we propose a procedure for automatic identification and segmentation of cervical vertebrae in videofluoroscopic sequences. It allows an accurate computation of vertebral landmarks necessary for a real-time cervical motion analysis, and eliminates the prerequisite for manual annotation (Fig. 1a) of the C3–C7 vertebrae.

2 Method

2.1 Experimental Procedure

Four young adult subjects were included in this study: two women (age: 23.5 ± 0.71 years; height: 167.5 ± 17.7 cm; weight: 73.8 ± 26.6 kg) and two men (age: 25.0 ± 1.4 years; height: 184.5 ± 6.4 cm; weight: 77.5 ± 6.4 kg). Exclusion criteria were: neck disorders, any neck symptoms up to three months prior to the study, and possible pregnancy. Fluoroscopic video sequences (Fig. 1a) were acquired at 25 frames per second, with a resolution of 576 \(\times \) 768 pixels, using the Philips BV Libra mobile diagnostic fluoroscopic image acquisition and viewing system. For each subject, two average quality fluoroscopic sequences were recorded: one at the onset of flexion, and one at the onset of extension. The average source-to-participant distance (C7 spinous process) was 76 cm, and the average exposure of 45-kV, 208-mA, 6.0-ms X-ray pulses yielded 0.12 mSv per individual motion from upright to end-range (PCXMC software, STUK, Helsinki, Finland). Subjects were asked to sit in the normal upright position and perform movement in the sagittal plane, starting from the neutral position and moving to the end-range of movement. They all wore plastic glasses with two small metal bearings on each side, attached to the glasses by metal wires. These bearings served as external markers of the occiput visible under fluoroscopy (Fig. 1a).

Fig. 1. (a) Manually annotated corners on the cervical spine, from C3 to C7, with visible external markers. (b) Marking order for the corners, as well as the posterior and anterior midpoints (red) which form the mid-planes used for joint angle calculations, illustrated on the C5, C6, and C7 vertebrae. (Color figure online)

2.2 Automatic Identification of Vertebral Landmarks

The automatic vertebral landmark identification algorithm consisted of the following steps (Fig. 2): template matching; two parallel segmentation methods, using contrast-limited adaptive histogram equalization and gradient magnitude approaches; registration of the segmented vertebrae to the template; and identification of the vertebral corners as landmarks.

Fig. 2. Procedure workflow for automatic identification of the cervical vertebral landmarks.

Template Matching: A binary template was created to represent an average shape of the cervical vertebrae (Fig. 5a). Videofluoroscopic sequences of subjects in neutral position were preprocessed with a local range filter. Canny edge detection (sensitivity thresholds [0.02, 0.05]) and a morphological closing (spherical structuring element, radius of 2 pixels) were then performed. Next, the binary template was matched to the preprocessed image at every location, and the candidate locations where the template matched the vertebrae were identified by means of the following criteria: Dice similarity coefficient (DSC) \(>0.34\) (Eq. 1); average pixel intensity range [100, 150]; entropy threshold \(>1.99\); gray-level co-occurrence matrix properties: contrast range [0.045, 0.12], correlation range [0.93, 0.98], energy range [0.2, 0.34], and homogeneity range [0.94, 0.98].

$$\begin{aligned} DSC = \frac{2\left| X \cap Y \right| }{\left| X \right| +\left| Y \right| } \end{aligned}$$
(1)

In Eq. 1 for the Dice coefficient, \(\left| X \right| \) was the number of pixels in the template image and \(\left| Y \right| \) the number of pixels in the candidate locations. The identified candidate locations were then edge- and contrast-enhanced using a power law transformation (\(\gamma =1.1\), \(c=1\)). Finally, a quadratic anisotropic diffusion filter was applied. At the end of this step, regions-of-interest (ROIs) around the vertebral bodies were identified for segmentation, using two parallel approaches (Fig. 2), both of which were applied only to these ROIs.
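As a minimal sketch of Eq. 1, the Dice coefficient can be computed as follows. It assumes that the template and a candidate window are binary masks of equal size, and that \(\left| X \right| \) and \(\left| Y \right| \) count foreground (value 1) pixels. The function name `dice_coefficient`, the nested-list representation, and the toy masks are illustrative and are not taken from the original implementation.

```python
def dice_coefficient(template, candidate):
    """Dice similarity coefficient (Eq. 1) between two equally sized binary masks."""
    if len(template) != len(candidate) or any(
        len(t_row) != len(c_row) for t_row, c_row in zip(template, candidate)
    ):
        raise ValueError("masks must have the same shape")
    size_x = sum(map(sum, template))    # |X|: foreground pixels in the template
    size_y = sum(map(sum, candidate))   # |Y|: foreground pixels in the candidate
    overlap = sum(                      # |X ∩ Y|: pixels set in both masks
        t & c
        for t_row, c_row in zip(template, candidate)
        for t, c in zip(t_row, c_row)
    )
    if size_x + size_y == 0:
        return 0.0                      # convention chosen here for two empty masks
    return 2.0 * overlap / (size_x + size_y)


# Hypothetical 3x4 masks standing in for the template and one candidate window.
template = [[0, 1, 1, 0],
            [0, 1, 1, 0],
            [0, 1, 1, 0]]
candidate = [[0, 0, 1, 1],
             [0, 0, 1, 1],
             [0, 1, 1, 0]]
dsc = dice_coefficient(template, candidate)
is_candidate = dsc > 0.34   # DSC threshold quoted in the text
```

In this toy case four pixels overlap and each mask holds six foreground pixels, so the coefficient is 8/12 and the window would pass the DSC criterion; the remaining intensity, entropy, and co-occurrence criteria are not modelled here.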

Segmentation 1 - Contrast-Limited Adaptive Histogram Equalization: First, a contrast-limited adaptive histogram equalization (CLAHE) algorithm was applied in order to enhance the contrast in the identified gray-scale candidate ROIs (Fig. 3). The candidate locations were sharpened to enhance the contrast of the edges. Next, adaptive thresholding was applied to 3-by-3 neighborhoods of the vertebral ROIs to filter the noise, while simultaneously preserving the edges. The gray-level co-occurrence matrix was calculated once more and adaptive thresholding was applied to the scaled image. The resulting images were processed in three parallel pathways (Fig. 3). In (1), a fourth order Butterworth bandpass filter was applied (cut-off frequencies: [5, 71]), and the residual noise was removed through binarization (threshold = 0.99). The holes in the binarized objects were filled using morphological filling. In (2), no image filtering was applied before binarization and morphological hole filling. In (3), the vertebral edges were computed using Canny edge detection (sensitivity thresholds [0.02, 0.05]). The three images were fused together, so that the pixels constituting the vertebral edges were kept in the fused images if and only if they had the same value of 1 (white) at the same pixel locations. Finally, this step concluded with morphological opening and then closing. The results of Segmentation 1 are illustrated in Fig. 5b.

Fig. 3. The workflow of the Segmentation 1 procedure using contrast-limited adaptive histogram equalization.
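The fusion rule at the end of Segmentation 1 (a pixel is kept only if it is 1 in all three pathway outputs) amounts to a pixel-wise logical AND. The sketch below illustrates this under the assumption that the three outputs are binary images of equal size; `fuse_binary_images` and the small example arrays are hypothetical names and data, not part of the published pipeline.

```python
def fuse_binary_images(*images):
    """Keep a pixel only if it equals 1 in every input image (pixel-wise AND)."""
    if not images:
        raise ValueError("at least one image is required")
    rows, cols = len(images[0]), len(images[0][0])
    for img in images:
        if len(img) != rows or any(len(row) != cols for row in img):
            raise ValueError("all images must have the same shape")
    return [
        [int(all(img[r][c] == 1 for img in images)) for c in range(cols)]
        for r in range(rows)
    ]


# Toy stand-ins for the outputs of pathways (1), (2), and (3).
pathway_1 = [[1, 1, 0],
             [1, 1, 1],
             [0, 1, 1]]
pathway_2 = [[1, 1, 1],
             [0, 1, 1],
             [0, 1, 1]]
pathway_3 = [[1, 0, 0],
             [1, 1, 1],
             [0, 1, 0]]
fused = fuse_binary_images(pathway_1, pathway_2, pathway_3)
```

Only the pixels on which all three toy pathways agree survive the fusion; the subsequent morphological opening and closing are not shown.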

Segmentation 2 - Gradient Magnitude: In Segmentation 2 (Fig. 4), the gradient magnitude was computed for the fluoroscopic images that had been filtered with the quadratic anisotropic diffusion filter. The images were then filtered using an edge-preserving, local Laplacian filter (\(\sigma =0.9\), \(\alpha =0.1\)). The vertebral ROIs were then sharpened to enhance the contrast along the edges (radius = 3, sharpening strength = 2, minimum contrast threshold = 0). Next, adaptive thresholding was applied for binarization, followed by morphological opening and closing to fill the holes and bridge the edges in the segmented vertebrae. The results of Segmentation 2 are illustrated in Fig. 5c.

Fig. 4. The workflow of the Segmentation 2 procedure using the gradient magnitude.
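To illustrate the first operation of Segmentation 2, the following sketch computes a gradient magnitude image with central differences. The text does not state which gradient operator was used, so the central-difference scheme, the edge-replication border handling, and the name `gradient_magnitude` are assumptions made for illustration only.

```python
from math import hypot


def gradient_magnitude(image):
    """Per-pixel gradient magnitude using central differences.

    Border pixels reuse the nearest valid neighbour (edge replication),
    which is a choice made for this sketch, not taken from the paper.
    """
    rows, cols = len(image), len(image[0])

    def pixel(r, c):
        # Clamp indices so that out-of-range neighbours map to the border pixel.
        return image[min(max(r, 0), rows - 1)][min(max(c, 0), cols - 1)]

    magnitude = []
    for r in range(rows):
        row = []
        for c in range(cols):
            gx = (pixel(r, c + 1) - pixel(r, c - 1)) / 2.0  # horizontal derivative
            gy = (pixel(r + 1, c) - pixel(r - 1, c)) / 2.0  # vertical derivative
            row.append(hypot(gx, gy))
        magnitude.append(row)
    return magnitude


# Hypothetical ROI containing a single vertical intensity step.
roi = [[0, 0, 10, 10],
       [0, 0, 10, 10],
       [0, 0, 10, 10]]
edges = gradient_magnitude(roi)
```

The response is concentrated in the two columns adjacent to the intensity step, which is the property the later thresholding step relies on.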

Template Registration: At the beginning of this step, each vertebral ROI was segmented using the two aforementioned segmentation procedures. In order to quantitatively determine which of them provided the best results, the template image (Fig. 5a) was matched once again with the vertebral boundaries by means of affine registration (Fig. 5d and e). The registration was optimized by means of mean squared error, with a regular step gradient descent configuration, initial step length of 0.01, and 1000 iterations. The segmentation result with the highest DSC (Fig. 6a) was selected.

Fig. 5. (a) Binary template; (b) representative examples of Segmentation 1 and (c) of Segmentation 2; (d) and (e) results of template registrations to vertebrae segmented in (b) and (c), respectively.

Corner Detection: The corners of the segmented vertebrae (Fig. 6a) were located by determining the largest Euclidean distance between all the points of the vertebral boundary (Fig. 6b). The four corners obtained in this process were then selected as vertebral landmarks (Fig. 6c). The results of the corner detection are shown in red in Fig. 7, superimposed on a fluoroscopic image with manually annotated vertebral corners (blue).

Fig. 6. (a) Segmented vertebral body; (b) two largest Euclidean distances within the vertebral boundary; (c) vertebral corners.
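One possible reading of the corner detection step is sketched below: the farthest pair of boundary points gives one diagonal, and the farthest pair among the remaining points, excluding those close to the first two corners, gives the other. The text does not specify how the second pair is separated from the first, so the exclusion radius `min_separation`, the function names, and the toy boundary are assumptions of this sketch.

```python
from itertools import combinations
from math import dist


def farthest_pair(points):
    """Return the two points with the largest Euclidean distance (brute force)."""
    return max(combinations(points, 2), key=lambda pair: dist(*pair))


def detect_corners(boundary, min_separation=2.0):
    """Pick four corners as the end points of the two longest 'diagonals'.

    min_separation is a hypothetical parameter: boundary points closer than
    this to an already selected corner are ignored for the second diagonal.
    """
    first = farthest_pair(boundary)
    remaining = [
        p for p in boundary
        if all(dist(p, q) >= min_separation for q in first)
    ]
    second = farthest_pair(remaining)
    return [*first, *second]


# Toy boundary: four corners of a skewed quadrilateral plus one point per edge.
boundary = [
    (0.0, 0.0), (5.0, 0.5), (10.0, 1.0), (10.5, 4.0),
    (11.0, 7.0), (6.0, 6.5), (1.0, 6.0), (0.5, 3.0),
]
corners = detect_corners(boundary)
```

The brute-force search is quadratic in the number of boundary points, which is acceptable for the small contours of a single vertebral ROI.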

2.3 Manual Annotation of Vertebral Corners

For the purpose of validating the algorithm, vertebral corners were also manually annotated on the fluoroscopic images in C3–C7 (Fig. 1a). Additionally, intervertebral joint angles were computed using the automatically detected and manually annotated vertebral corners. The marking procedure is described in detail in Plocharski et al. [12]. Briefly, four corners were manually marked on the C3–C6 vertebrae at points where lines through soft or cancellous corners intersect with the outer edges of the compact bone. Figure 1b illustrates the placement and the order of the markings. Because C7 is often partially obscured in fluoroscopic recordings, it was only marked with two points on the superior cancellous corners under the superior vertebral plate [12]. In order to compute the cervical joint angles, we incorporated the vertebral landmark methodology developed by Frobin et al. [3]. A line connecting the posterior and anterior midpoints, defined as the points equidistant between corners 1 and 4, and 2 and 3, respectively (red points in Fig. 1b), formed a mid-plane, which was used for angle computation between two adjacent vertebrae (angles \(\theta _1\) and \(\theta _2\), Fig. 1b). The C6/C7 angle was computed between the C6 mid-plane and a line going through the two corners of C7. All angles were calculated as four-quadrant inverse tangents of the determinant and dot product of the two direction vectors, measured counterclockwise from the posterior to the anterior midpoints in the range from \(0^{\circ }\) to \(180^{\circ }\) [12].
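The angle computation described above can be sketched as follows, assuming each vertebra is given as four (x, y) corners in the marking order of Fig. 1b. The helper names are illustrative, and taking the absolute value to map the four-quadrant result into the \(0^{\circ }\) to \(180^{\circ }\) range is an assumption of this sketch; the exact sign convention is described in [12].

```python
from math import atan2, degrees


def midpoint(p, q):
    """Point equidistant between p and q."""
    return ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)


def midplane_vector(corners):
    """Direction vector from the posterior to the anterior midpoint.

    corners are ordered 1..4 as in Fig. 1b: the posterior midpoint lies
    between corners 1 and 4, the anterior midpoint between corners 2 and 3.
    """
    c1, c2, c3, c4 = corners
    posterior = midpoint(c1, c4)
    anterior = midpoint(c2, c3)
    return (anterior[0] - posterior[0], anterior[1] - posterior[1])


def joint_angle(upper_corners, lower_corners):
    """Angle in degrees between the mid-planes of two adjacent vertebrae."""
    ux, uy = midplane_vector(upper_corners)
    lx, ly = midplane_vector(lower_corners)
    det = ux * ly - uy * lx   # 2D determinant (cross product) of the vectors
    dot = ux * lx + uy * ly   # dot product of the vectors
    return abs(degrees(atan2(det, dot)))  # abs(): assumed mapping into [0, 180]


# Hypothetical corner coordinates in pixels for two adjacent vertebrae.
upper = [(0.0, 0.0), (10.0, 0.0), (10.0, 4.0), (0.0, 4.0)]
lower = [(0.0, 0.0), (10.0, 10.0), (10.0, 14.0), (0.0, 4.0)]
theta = joint_angle(upper, lower)
```

Using the determinant and dot product together avoids the quadrant ambiguity of a plain arccosine and remains numerically stable for nearly parallel mid-planes.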

3 Results

Figure 7 illustrates the automatically identified (red) and manually annotated (blue) vertebral corners on C3–C7. For the C3–C6 vertebrae, the automatic detection method provided locations in close proximity to the manual annotations. A few inaccurate detections were observed for the first and fourth corners of C3 (Fig. 7a and g), and for the second and fourth corners of C6 (Fig. 7a and d). Corner detection of C7 yielded somewhat inferior results, especially for the second corner (Fig. 7d, h), likely due to an absence of clear vertebral edges. For each vertebral corner, we compared point coordinates of the automatically identified corners and the corresponding manual annotations. Error was calculated as the average Euclidean distance between the two corresponding corners in the \(n=8\) fluoroscopic images (Eq. 2):

Fig. 7. Automatically located (red) and manually annotated (blue) vertebral landmarks in the four subjects, at the onset of extension (a–d) and flexion (e–h). (Color figure online)

$$\begin{aligned} Error = \frac{1}{n}\sum _{i=1}^{n}\sqrt{(x_{A,i}-x_{M,i})^{2}+(y_{A,i}-y_{M,i})^{2}} \end{aligned}$$
(2)

where \((x_{A,i},y_{A,i})\) was the automatically detected corner and \((x_{M,i},y_{M,i})\) the manually annotated one in image \(i\). Table 1 lists the mean errors and standard deviations in pixels. Errors smaller than five pixels were deemed acceptable. Additionally, a one-tailed t-test was computed for every automatically identified corner to test the null hypothesis that the average detection error was equal to or smaller than five pixels. The p-values for all tests are shown in Table 1. Statistical analysis was performed in SPSS (IBM Statistics, v.25). All data in Tables 1 and 2 were initially tested for normality using the Shapiro-Wilk test. Normality of the data was confirmed (\(p>0.05\)). The statistical analysis indicated that the average corner detection errors were not significantly larger than five pixels. Table 2 lists the intervertebral angles, computed using the approach illustrated in Fig. 1b, for both the manually annotated and the automatically detected vertebral corners. A paired-sample t-test was computed for each joint to determine whether the angles obtained using the two approaches differed significantly. No significant difference was found between the two methods (\(p>0.05\) for all cervical joints).
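The error measure of Eq. 2 and the test statistic of a one-sample t-test against the five-pixel limit can be sketched as follows. The coordinates are invented for illustration and are not the study data; the function names are hypothetical, and the analysis in the paper was carried out in SPSS, which this sketch does not attempt to reproduce (in particular, no p-value is computed).

```python
from math import dist, sqrt
from statistics import mean, stdev


def detection_errors(auto_points, manual_points):
    """Euclidean distance between corresponding automatic and manual corners."""
    if len(auto_points) != len(manual_points):
        raise ValueError("point lists must have the same length")
    return [dist(a, m) for a, m in zip(auto_points, manual_points)]


def one_sample_t(values, mu0):
    """t statistic of a one-sample t-test of the mean of values against mu0."""
    n = len(values)
    return (mean(values) - mu0) / (stdev(values) / sqrt(n))


# Invented coordinates of one corner in n = 8 images (pixels).
manual = [(100, 200), (110, 205), (98, 190), (105, 210),
          (102, 199), (99, 201), (107, 195), (101, 203)]
auto = [(103, 204), (110, 207), (99, 190), (105, 213),
        (102, 199), (101, 201), (107, 196), (101, 205)]

errors = detection_errors(auto, manual)
mean_error = mean(errors)            # Eq. 2
t_stat = one_sample_t(errors, 5.0)   # compared against the 5-pixel limit
```

A negative statistic, as in this toy example, indicates a sample mean below five pixels and therefore gives no evidence against the null hypothesis that the average error is at most five pixels.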

Table 1. Detection errors (pixels) computed for each corner in the C3–C7 vertebrae in the eight fluoroscopic images, as well as on average for every vertebra.
Table 2. Intervertebral joint angles obtained using the manual and automatic methods. All values are presented in degrees.

4 Discussion

Vertebral landmarks are a requirement for range of motion analysis, which is a crucial tool for understanding the spine joint mechanics [1]. The time-consuming process of manual landmark annotations is still a prerequisite of the state-of-the-art automatic vertebral tracking algorithms [11, 17, 19, 20]. In this paper we propose a method to automatically identify and segment the C3–C7 vertebral bodies in videofluoroscopic images, and to detect the vertebral landmarks necessary for cervical joint motion analysis. We compare this automatic detection method with manually annotated vertebral landmarks.

The results in Table 1 show an average detection error below five pixels in the C3–C6 vertebrae, with the lowest average error of \(1.65\pm 1.60\) pixels in the C4 vertebra. The one-sample, one-tailed t-test for each of the average detection errors of the four corners in the C3–C7 vertebrae revealed that the errors were not significantly greater than five pixels. Given the spatial resolution of 576 \(\times \) 768 pixels, five pixels corresponded to \(0.9\%\) and \(0.7\%\) of the height and width of the images, respectively. However, the average errors for the C7 vertebrae were larger than those for the other vertebrae. Possible explanations are partial occlusion, lower contrast, and a lack of well-defined edges on C7. The joint angle results for C3–C7 were also not significantly different from the angles computed with the manually annotated corners (\(p>0.05\)). This suggests that the presented method may be suitable for large data analysis of cervical joint motion using automatic tracking algorithms.

Comparison of these results with other work is difficult, since similar vertebral landmark detection methods in the cervical spine using videofluoroscopic sequences were not found in the literature. A similar study by Xu et al. [14] used a combination of Haar-like features and Active Appearance Models training algorithms for automatic segmentation of cervical vertebrae in X-ray images. They obtained a lowest average error of 4.79 pixels. Al-Arif et al. reported a lowest average median error of 2.08 mm using Haar-like features in radiographic images [20], and a lowest average error of 0.7688 mm using Active Shape Models with Random Classification Forest in X-ray images [16]. Automatic approaches to detect and label the vertebral landmarks have also been developed using deep learning [23, 24]. However, they require large data sets and high quality imaging modalities, such as CT or MRI, and thus are not directly comparable to fluoroscopic sequences of the cervical spine.

The following limitations of this study need to be addressed. First, a larger number of participants would be beneficial. Secondly, the fluoroscopic images were of relatively good quality, and thus we did not evaluate the ability of our approach to automatically identify the cervical corners in images with higher degrees of image blurring. However, the aim of this approach was vertebral detection at the onset of movement, with stationary subjects in a neutral position, and thus motion blur was not expected to occur. Finally, our approach did not aim to detect C1 or C2. C1 does not have a vertebral body and is seldom used in vertebral analyses, while C2 is often obscured and its corners are often not visible.

5 Conclusion

The proposed method to automatically detect and segment the cervical vertebrae allows a computation of the vertebral landmarks for a real-time intervertebral motion analysis in videofluoroscopy. It also eliminates the necessity for a manual annotation of the C3–C7 vertebrae for automatic landmark tracking. Additionally, our approach does not require large datasets necessary for training the algorithm to be able to detect the vertebrae, as is the case in deep learning approaches.