Automatic image segmentation based on synthetic tissue model for delineating organs at risk in spinal metastasis treatment planning
One of the main goals in software solutions for treatment planning is to automate the delineation of organs at risk (OARs). In this pilot feasibility study, computed tomography (CT)-based extracranial auto-segmentation (AS) using the Brainlab Anatomical Mapping tool (AM) was clinically validated.
The delineation of nine extracranial OARs (lungs, kidneys, trachea, heart, liver, spinal cord, esophagus) from clinical datasets of 24 treated patients was retrospectively evaluated. Manual delineation of OARs was conducted in clinical routine and compared with AS datasets using AM. The Dice similarity coefficient (DSC) and maximum Hausdorff distance (HD) were used as statistical and geometrical measurements, respectively. Additionally, all AS structures were validated using a subjective qualitative scoring system.
All patient datasets investigated were successfully processed with the evaluated AS software. For the left lung (0.97 ± 0.03), right lung (0.97 ± 0.05), left kidney (0.91 ± 0.07), and trachea (0.93 ± 0.04), the DSC was high with low variability. The DSC scores of other organs (right kidney, heart, liver, spinal cord), except the esophagus, ranged between 0.7 and 0.9. The calculated HD values yielded comparable results. Qualitative assessment showed a general acceptance in more than 85% of AS OARs—except for the esophagus.
The Brainlab AM software is ready for clinical use in most of the OARs evaluated in the thoracic and abdominal region. The software generates highly conformal structure sets compared to manual contouring. The current study design needs revision for further research.
Keywords: Radiotherapy, Extra-cranial, Automization, Elements, Atlas
Automatic image segmentation based on a tissue model for contouring organs at risk in treatment planning for spinal metastasis
The development of new software solutions for radiation treatment planning aims to automate the contouring of organs at risk (OARs). In this work, the computed tomography (CT)-based automatic segmentation (AS) of extracranial structures by the Anatomical Mapping (AM) tool from Brainlab was validated.
Using the datasets of 24 treated patients, the contouring of 9 extracranial OARs was reviewed retrospectively. The contours created manually in clinical routine were compared with datasets generated automatically by AM. The Dice similarity coefficient (DSC) and the Hausdorff distance (HD) were used as measures. In addition, all automatically generated structures were rated using a subjective qualitative scoring system.
Automatic contouring of the OARs by the AS software was completed for all datasets investigated. For the left lung, right lung, left kidney, and trachea, DSC values were high with low variability. For the remaining organs (except the esophagus), DSC values ranged between 0.7 and 0.9. The HD calculation yielded results that can be interpreted similarly. In the subjective rating, >85% of the automatically generated OARs were accepted, except for those of the esophagus.
The AM tool from Brainlab can be used in clinical routine for most of the examined OARs of the thoracic and abdominal region. Compared with manual contouring, it generates highly conformal structures.
Keywords: Radiotherapy, Extra-cranial, Automation, Elements, Atlas
Today, virtual three-dimensional planning on computed tomography (CT) scans is state of the art in radiotherapy (RT). Common techniques like intensity-modulated radiotherapy (IMRT) and volumetric-modulated arc therapy (VMAT) are increasingly used worldwide. These techniques require highly specialized treatment planning with contouring of organs at risk (OARs). The latter represents a major part of the planning workload, since commonly used slice-by-slice manual or interpolation-based semi-automatic contouring approaches are time-consuming and repetitive tasks [2, 3]. Furthermore, these approaches are associated with an interobserver variability in contouring of the OARs that is independent of the dosimetrist’s experience [4, 5]. Hence, precise, fast, and reproducible contouring methods may facilitate treatment planning. Moreover, the introduction of techniques such as adaptive RT and gating may further increase the need for repeated planning updates for the same patient. The use of automatic segmentation (AS) software could compensate for the additional expenditure of time.
Different concepts of AS, such as model fitting and rule-based or image registration-based approaches, were proposed recently for automated OAR delineation. Their functionality was well described by Haas et al. So far, image registration-based methods that rely on a priori anatomical atlas information represent the most promising approach for clinical applications. In principle, the acquired patient dataset is registered to a pre-designed atlas dataset comprising prior knowledge of the human anatomy. In a second step, the OARs are transferred from the atlas space to the patient dataset and subsequently post-processed.
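The two-step principle (register, then transfer the OARs) can be sketched with a deliberately simplified toy example. It assumes 2D pixel sets and a translation-only registration found by exhaustive search; clinical atlas methods use far more elaborate (typically deformable) registration, and the post-processing step is omitted. All function names and data here are hypothetical and do not represent any vendor's algorithm.

```python
from itertools import product


def shift(points, dx, dy):
    """Translate a set of (x, y) pixels by an integer offset."""
    return {(x + dx, y + dy) for x, y in points}


def register_translation(atlas_body, patient_body, max_shift=5):
    """Step 1 (toy registration): find the integer offset that maximizes
    the overlap between the shifted atlas body and the patient body."""
    candidates = product(range(-max_shift, max_shift + 1), repeat=2)
    return max(candidates,
               key=lambda d: len(shift(atlas_body, *d) & patient_body))


def transfer_labels(atlas_labels, offset):
    """Step 2: carry every atlas OAR into patient space with that offset."""
    return {name: shift(pixels, *offset)
            for name, pixels in atlas_labels.items()}


# Hypothetical data: a 6 x 6 atlas "body" with one labeled organ, and a
# patient body that is the same shape displaced by (+2, -1).
atlas_body = {(x, y) for x in range(6) for y in range(6)}
atlas_labels = {"organ": {(1, 1), (1, 2)}}
patient_body = shift(atlas_body, 2, -1)

offset = register_translation(atlas_body, patient_body)
patient_labels = transfer_labels(atlas_labels, offset)
print(offset)          # (2, -1)
print(patient_labels)  # organ pixels moved to (3, 0) and (3, 1)
```

The sketch only shows why the quality of the transferred OARs depends entirely on the quality of the registration: any misalignment found in step 1 is carried over unchanged into every structure in step 2.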
A variety of commercially available software products, e.g., MIM Maestro (MIM Software Inc., Cleveland, OH, USA), Velocity (Varian Medical System, Palo Alto, CA, USA), ABAS (Elekta AB, Stockholm, Sweden), iPlan RT Image (Brainlab AG, Munich, Germany), and SPICE (Philips Medical Systems DMC GmbH, Hamburg, Germany), have been evaluated [3, 8, 9]. These software solutions offer the potential for reproducible accuracy and time savings in different use cases (head and neck, prostate, breast, lung). However, most of them provide separate atlases for the different extracranial regions or are specialized for a single organ.
In this pilot feasibility study, a novel commercially available AS software (Anatomical Mapping 1.0; Brainlab AG, Munich, Germany) was evaluated, which is integrated in a specific workflow for treatment planning of spinal metastasis (including semi-automatic delineation of Clinical Target Volumes [CTV] of vertebrae). The Anatomical Mapping software is based on a versatile atlas-based Synthetic Tissue Model designed for OAR definition in thoracic, abdominal, and pelvic body regions at the same time. This method was previously introduced by Blumhofer et al. It uses the complete anatomical tissue environment in the complex extracranial region (modeling of arm positions, muscle, and subcutaneous fat proportions) and may therefore facilitate a reliable and fully automated OAR delineation. More detailed information can be found in the Methods section.
It was hypothesized that the evaluated method (a) produces highly conformal structure sets in comparison with clinically approved reference contours (RC) of the evaluated OARs (left lung, right lung, heart, liver, left kidney, right kidney, spinal cord, trachea, esophagus) and (b) is applicable for clinical usage (qualitative validation). Patients who had undergone RT for spinal metastasis were considered for this study since they typically show a large number of OARs relevant for contouring in order to prepare the RT plan.
Methods and materials
[Table 1. Rows: total number of patients; patients with no relevant OARs in field-of-view; included number of patients; primary radiotherapy (no surgery); type of radiotherapy technique. Values not recoverable.]
Manual and semi-automatic contouring
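Table 2 Results of quantitative assessment

| Organ at risk | DSC | HD (mm) |
|---|---|---|
| Left lung | 0.97 ± 0.03 | 20.8 ± 12.5 |
| Right lung | 0.97 ± 0.05 | 21.2 ± 10.5 |
| Heart | 0.78 ± 0.16 | 31.2 ± 10.2 |
| Liver | 0.80 ± 0.17 | 37.7 ± 13.8 |
| Left kidney | 0.91 ± 0.07 | 17.5 ± 12.9 |
| Right kidney | 0.81 ± 0.28 | 22.9 ± 12.9 |
| Spinal cord | 0.71 ± 0.12 | 21.4 ± 19.7 |
| Trachea | 0.93 ± 0.04 | 7.6 ± 6.9 |
| Esophagus | 0.49 ± 0.13 | 30.6 ± 11.4 |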
Software-based automatic contouring
Quantitative and qualitative assessment
The comparison between manual (RC) and automatic segmentation (AS) was based on two commonly used statistical and geometrical parameters [7, 8, 9, 10, 11, 13, 14, 15, 16, 17]: the Dice similarity coefficient (DSC) and the Hausdorff distance (HD). The DSC ranges between 0 and 1; scores greater than 0.7 are interpreted as a high degree of overlap between the generated contours and the RCs used. The HD in this evaluation ranges from 0 to 50 mm (to shorten calculation time, the maximum HD was capped at 50 mm). Lower distances mean better results. Both evaluations were performed using Matlab (MathWorks, Natick, MA, USA) scripts provided by Brainlab.
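For orientation, the two measures can be written down in a few lines. The following is a minimal Python sketch, not the Matlab scripts used in the study: it assumes each structure is given as a set of integer voxel indices (for the HD, typically the surface voxels), uses a hypothetical `spacing` parameter to convert indices to millimetres, and applies the 50 mm cap as a simple clamp on the result.

```python
from math import dist


def dice(a, b):
    """Dice similarity coefficient of two voxel sets: 2|A∩B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0  # assumption: two empty structures count as identical
    return 2 * len(a & b) / (len(a) + len(b))


def hausdorff(a, b, spacing=(1.0, 1.0, 1.0), cap_mm=50.0):
    """Maximum (symmetric) Hausdorff distance in mm, clamped to cap_mm.

    Brute force over all point pairs, so only suitable for small sets.
    """
    if not a or not b:
        raise ValueError("both structures must contain at least one voxel")
    # Convert voxel indices to physical coordinates in mm.
    pa = [tuple(i * s for i, s in zip(p, spacing)) for p in a]
    pb = [tuple(i * s for i, s in zip(p, spacing)) for p in b]

    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(dist(p, q) for q in dst) for p in src)

    return min(max(directed(pa, pb), directed(pb, pa)), cap_mm)


# Hypothetical example: a 4 x 4 single-slice reference contour and an
# auto-segmentation displaced by one voxel along x.
reference = {(x, y, 0) for x in range(4) for y in range(4)}
auto = {(x + 1, y, 0) for x, y, _ in reference}

print(dice(reference, auto))       # 0.75
print(hausdorff(reference, auto))  # 1.0
```

In the example, 12 of the 16 voxels overlap, giving a DSC of 2·12/32 = 0.75, while the farthest unmatched voxel lies exactly one voxel width (1 mm at unit spacing) from the other structure.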
To better interpret the results of the quantitative evaluation and to support the findings from a clinical perspective, an initial qualitative review was performed. This review relies on the subjective scoring system of Zhu et al. [9]. The scoring system rates every OAR per case with 1 = “useful without correction,” 2 = “useful with minor correction,” and 3 = “not useful.” A correction was considered minor if editing the AS structure in Anatomical Mapping was preferred over contouring the structure manually from scratch.
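Turning per-case scores into the percentage shares reported per OAR is a simple tally. The sketch below assumes the scores for one OAR are available as a list of integers; the function name and the example scores are hypothetical and are not data from this study.

```python
from collections import Counter

# Score definitions as given in the text.
SCORE_LABELS = {
    1: "useful without correction",
    2: "useful with minor correction",
    3: "not useful",
}


def score_percentages(scores):
    """Return the share (in %) of each score 1-3 among the rated cases."""
    if not scores:
        raise ValueError("no scores given")
    unknown = set(scores) - set(SCORE_LABELS)
    if unknown:
        raise ValueError(f"unknown scores: {sorted(unknown)}")
    counts = Counter(scores)
    return {s: 100.0 * counts[s] / len(scores) for s in SCORE_LABELS}


# Hypothetical ratings of one OAR across four cases.
example = score_percentages([1, 1, 1, 2])
print(example)  # {1: 75.0, 2: 25.0, 3: 0.0}
```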
The calculated HD values yielded comparable results, with the smallest mean values for the left lung (20.8 ± 12.5), right lung (21.2 ± 10.5), left kidney (17.5 ± 12.9), and trachea (7.6 ± 6.9) and interquartile ranges between 4 mm (25th percentile for trachea) and 25 mm (75th percentile for right lung). The right kidney (22.9 ± 12.9), heart (31.2 ± 10.2), and esophagus (30.6 ± 11.4) showed interquartile ranges of approx. 15–25 mm, 26–35 mm, and 25–35 mm, respectively. The highest interquartile ranges were found for the liver and spinal cord, ranging from approx. 24 to 50 mm and 6 to 47 mm, respectively. Of note, higher interquartile ranges and lower similarity values were not systematically associated with data derived from postoperative patient datasets (in Fig. 3 corresponding results of postoperative data are marked as red dots).
[Table 3. Results of qualitative assessment. Columns: organ at risk; expert scoring results in % for score = 1, score = 2, and score = 3. Values not recoverable.]
Auto-segmentation is a powerful tool for contouring in radiation oncology. Currently, several commercial software solutions are available (SPICE, ABAS, MIM, Velocity), which have been evaluated in the past [2, 7, 8]. It was concluded that the tools had similar performance (with regard to time-saving) with promising results (quality of AS contours). Difficulties were related to the interpretation of the quantitative parameters (DSC and HD). Jameson et al. demonstrated that these parameters are generally not able to provide a clear statement about the clinical usability of a certain contour. Structures with small volumes may even show lower and more variable DSCs than structures with large volumes. To support the quantitative evaluation, we used the aforementioned expert scoring system introduced by Zhu and coworkers [9] as an additional parameter in this investigation.
In our study, the AS of the OARs generated with the new Brainlab Anatomical Mapping software mostly showed good conformity with the RC (example: Figs. 1 and 2). Good results with high quality of the automatically generated structures were obtained for both lungs and the trachea. These air-containing organs are characterized by high contrast to their surroundings. The expert scoring results determined in this study support these findings. Acceptance of the lungs and left kidney was good, with a score of 1 in >90 and >70% of cases, respectively (i.e., no edits needed). The heart, liver, and right kidney (we had two outliers with DSC <0.5) showed acceptable scores in this investigation, although scores were lower and the HD was higher than for the lungs and left kidney. The structures of these OARs were considered acceptable without further editing in more than half and up to three quarters of the cases. The number of structures that were scored as unacceptable was low for all of these OARs (<10%). Zhu et al. obtained comparable results in their investigation of the SPICE algorithm [9].
The poorest contouring quality was observed for the esophagus. However, manual contouring of the esophagus is difficult, even for experienced radiation oncologists. The long and highly variable mediastinal course of the esophagus, its variable diameter, the inconstant visibility of its lumen, and the low contrast to its surroundings together render the esophagus the most complex OAR in the thoracic region. As a striking example, Collier et al. even demonstrated a case with no overlap of the contour in single slices outlined by two different radiation oncologists ([4], Fig. 3). In our analysis, none of the auto-segmented esophagus contours were considered useful without corrections, but almost 75% were acceptable with only minor corrections required (editing the Anatomical Mapping auto-segmentation was preferred over manual contouring).
The ambiguous results of the spinal cord (Fig. 4, good qualitative acceptance but high HD) correlate with the interobserver variabilities in contouring of the spinal cord due to its elongated shape. Nieder et al. demonstrated that even for stereotactic body radiotherapy (SBRT) the interpretation of the spinal cord is heterogeneous. In particular, variation of the cranial and caudal borders with inclusion of parts of the brainstem or the cauda equina can cause high HD (maximum distance of surface points), while the effect on the volume is less crucial (DSC). Additionally, the evaluated software is able to distinguish between the spinal cord and spinal canal, but only the spinal cord was tested in our study with qualitative scoring results ranging between 1 and 2 (no spinal cord was scored 3). This also illustrates that DSC and HD alone are not able to measure the clinical acceptance of AS. Factors such as proximity to the planning target volume (PTV), the prescribed dose, and the applied radiation method could influence the radiation oncologist’s (RO) accuracy in OAR delineation and the acceptance of AS. In the qualitative review, we rated every AS structure as if it was to be used for SBRT, even though not all RCs were delineated for this purpose (Table 1).
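The differing sensitivity of the two measures to the cranial and caudal borders can be made concrete with an idealized calculation. Assume, purely for illustration (these numbers are not from the study), a cord of constant cross-sectional area $a$, a reference contour $A$ of length 400 mm, and an auto-segmentation $B$ that matches it exactly except for extending a further 40 mm cranially:

```latex
\begin{align*}
\mathrm{DSC} &= \frac{2\,|A \cap B|}{|A| + |B|}
              = \frac{2 \cdot 400a}{400a + 440a}
              = \frac{800}{840} \approx 0.95, \\
\mathrm{HD}  &= 40~\text{mm}.
\end{align*}
```

Under these assumptions the overlap remains very high, while the HD equals the full length of the extension, which is consistent with a structure being clinically acceptable despite a high HD.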
One of the biggest shortcomings of this study is that the expert scoring was performed by only one RO and that the “minor edits” necessary to upgrade a structure from a score of 2 to a score of 1 were not applied. A second major shortcoming is the small sample size. The retrospective design of our study yielded surprisingly low numbers of contoured livers, hearts, and tracheae. Even though these OARs were included in more scans, they were not always outlined during clinical routine (no RCs available). By contrast, since AS was performed for all structures, the qualitative assessment was done with the full number of cases (resulting in larger numbers in Table 3 compared with Table 2). The OARs of the whole pelvic region (prostate, bladder, rectum, hip joints, seminal vesicles, penile bulb) were only included in the images of one case (and consequently excluded from this study). We therefore recommend more careful a priori statistical considerations for designing a larger, prospective study. The scans of different body regions should be standardized and evaluated separately. For example, for the whole pelvic region, images of prostate patients would contain all the AS pelvic structures provided, and in the thoracic region, scans of breast cancer patients would form a much more homogeneous cohort. The inclusion criteria of our study (RT for spinal metastasis in the period from November 2016 through June 2017) proved to be too simple.
Owing to the retrospective nature of our investigation, we had to omit another interesting test: the analysis of segmentation time (manual contouring compared with auto-segmentation). However, we see a large potential of AS for optimizing the workflow. After PACS import, the tested software starts automatically with AS as a background routine (5–20 min). Afterwards every structure must be evaluated (and corrected when necessary) by the RO before dose-planning is possible. To prospectively evaluate the clinical benefit of this AS method compared with currently established manual or semi-automatic contouring approaches, further studies are necessary.
The study has limitations regarding the data collection (low case numbers, inhomogeneous cohort). As a first pilot feasibility study, comparison of auto-segmentation and reference contours was done via quantitative and qualitative assessment. The evaluated organs at risk were the lungs, trachea, esophagus, heart, liver, kidneys, and spinal cord. For these OARs (excluding the esophagus), one can reasonably assume that the AS software produces highly conformal structure sets (hypothesis a) and is ready for clinical usage (hypothesis b). So far, the evaluated software (Anatomical Mapping 1.0) is embedded in treatment planning for spinal metastasis, but implementation in clinical routine for the whole thoracic and abdominal region (and not only for spinal metastasis) is possible with individual workflow optimizations. The expected large potential for time-saving and standardization of OAR contours has not yet been tested.
Conflict of interest
O. Wittenstein has received a speaker honorarium from Brainlab AG. P. Hiepe and L.H. Sowa are employees of Brainlab AG. E. Karsten, I. Fandrich and J. Dunst declare that they have no competing interests.
- 4. Collier D, Burnett SSC, Amin M et al (2002) Assessment of consistency in contouring of normal-tissue anatomic structures. J Appl Clin Med Phys 4:1
- 7. Bach Cuadra M, Duay V, Thiran JP (2015) Atlas-based Segmentation. In: Paragios N, Duncan J, Ayache N (eds) Handbook of Biomedical Imaging. Springer, Boston, MA, pp 221–244
- 9. Zhu M, Bzdusek K, Brink C et al (2013) Multi-institutional Quantitative Evaluation and Clinical Validation of Smart Probabilistic Image Contouring Engine (SPICE) Autosegmentation of Target Structures and Normal Tissues on Computer Tomography Images in the Head and Neck, Thorax, Liver, and Male Pelvis Areas. Int J Radiation Oncol Biol Phys 87:809–816
- 12. Abstracts DEGRO (2018) Strahlentherapie und Onkologie 194 (S1):1–222
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.