BCL-2 expression aids in the immunohistochemical prediction of the Oncotype DX breast cancer recurrence score
The development of molecular techniques to estimate the risk of breast cancer recurrence has been a significant addition to the suite of tools available to pathologists and breast oncologists. It has previously been shown that immunohistochemistry can provide a surrogate measure of tumor recurrence risk, effectively providing a less expensive and more rapid estimate of risk without the need for send-out. However, concordance between gene expression-based and immunohistochemistry-based approaches has been modest, making it difficult to determine when one approach can serve as an adequate substitute for the other. We investigated whether immunohistochemistry-based methods can be augmented to provide a useful therapeutic indicator of risk.
We studied whether the Oncotype DX breast cancer recurrence score can be predicted from routinely acquired immunohistochemistry of breast tumor histology. We examined the effects of two modifications to conventional scoring measures based on ER, PR, Ki-67, and Her2 expression. First, we tested a mathematical transformation that produces a more diagnostically relevant representation of the staining attributes of these markers. Second, we considered the expression of BCL-2, a protein involved in regulating apoptosis, as an additional prognostic marker.
We found that the mathematical transformation improved concordance rates over the conventional scoring model. By establishing a measure of prediction certainty, we discovered that the difference in concordance between methods was even greater among the most certain cases in the sample, demonstrating the utility of an accompanying measure of prediction certainty. Including BCL-2 expression in the scoring model increased the number of breast cancer cases in the cohort that were considered high certainty, effectively expanding the applicability of this technique to a greater proportion of patients.
Our results demonstrate an improvement in concordance between immunohistochemistry-based and gene expression-based methods to predict breast cancer recurrence risk following two simple modifications to the conventional scoring model.
Keywords: Digital pathology, Prognostic markers, Computer-assisted diagnosis, Staining
A number of prognostic factors have been described with a significant relationship to the likelihood of breast cancer recurrence. These commonly include patient age [1, 2], tumor size, lymph node status, and histologic grade, but also include biomarkers assessed by immunohistochemistry (IHC), including Her2 status, patterns of hormone receptor expression (ER and PR) [7, 8], and expression of the proliferative marker Ki-67 [9, 10]. Gene expression analysis has also been used to successfully predict recurrence likelihood, with one such test (Oncotype DX, Genomic Health, Redwood City, CA) having been added to NCCN guidelines for treatment. This test provides a recurrence score (RS), which estimates the likelihood of recurrence according to a 21-gene assay. Importantly, RS has also been shown to successfully predict chemotherapy benefit, and has extensively been shown to alter treatment recommendations when considered alongside other diagnostic factors [14, 15, 16, 17, 18, 19, 20].
Despite the clear clinical utility of this approach, it remains costly and inaccessible to many patients, leading several groups to evaluate alternative methods as a surrogate for the Oncotype DX test. Turner et al. suggested that limiting the use of the test based on routine histologic examination could reduce send-out by up to 23%, resulting in significant cost savings. More recently, Gage et al. argued that a 44% reduction in send-out could be achieved based on traditional variables used in sign-out. In addition to economic considerations, another focus has been to provide accurate predictions from immediately available data. Flanagan et al. described one of the most compelling methods, using histologic factors combined with routinely acquired protein expression data from IHC to predict RS. This approach provided an estimate of recurrence likelihood using data immediately available following routine processing of the tissue. In a subsequent study, this group offered three modifications to the original formulation with similar predictive success; one such modification, the Magee Score #3, generated a prediction of RS strictly from ER, PR, Ki-67, and Her2 status. An advantage of this modification is that it is entirely quantitative and reproducible, and does not depend on pathologist grading or interpretation, which has been shown to exhibit considerable inter-observer variability. Furthermore, with the advent of whole-slide imaging and its association with modern informatics approaches, a score based entirely on image analysis of IHC has the potential to be generated in a completely automated fashion.
Although the Magee Score #3 is an attractive alternative to the Oncotype DX test, it achieved only 54.4% overall concordance in that study. Improved concordance between this IHC-based approach and RS is needed to inspire confidence in the performance of this technique. We explored whether two modifications to this approach could be used to improve concordance and produce a viable IHC-based algorithm with clinical potential. First, we examined the impact on concordance of applying a diagnostically relevant data transformation. Second, we explored whether the expression of BCL-2 as part of a complete breast panel can improve the estimation of RS. BCL-2 expression has previously been shown to be associated with a decreased risk of recurrence [26, 27] and a higher relapse-free survival rate [28, 29, 30, 31, 32]. We hypothesized that the prognostic information provided by BCL-2, accompanied by an optimized mathematical treatment of these variables, can be harnessed to improve RS estimation.
a diagnosis of primary invasive breast carcinoma was rendered on the original pathology report;
Oncotype DX recurrence scores were available in the original pathology report;
a breast panel that included expression of ER, PR, Ki-67, Her2, and BCL-2 had been performed on the same block using IHC;
slides were scanned using whole-slide imaging and staining was assessed quantitatively using computational image analysis;
ER percent positive staining was at least 1%;
Her2 was considered negative or equivocal by image analysis and confirmed manually.
Case data were obtained by an honest broker and delivered to the investigators in a deidentified fashion. This study was considered exempt by the Drexel University College of Medicine Institutional Review Board under Category 4.
All breast specimens had ischemic and fixation times within CAP/ASCO guidelines. Biopsy cases were processed in a standard fashion after fixation in neutral buffered formalin. Lumpectomy specimens were entirely submitted after fixation in neutral buffered formalin and inking all surfaces to maintain orientation. A specimen map was used as a worksheet to document lesional tissue and distance to margins. After review of all carcinoma slides, the most representative slide was used to perform the invasive breast panel consisting of ER, PR, Ki-67, Her2, and BCL-2. Immunohistochemical stains for ER (SP1, RM, Ventana, Benchmark Ultra), PR (1E2, RM, Ventana, Benchmark Ultra), Ki-67 (30–9, RM, Ventana, Benchmark Ultra), Her2 (4B5, RM, Ventana, Benchmark Ultra), and BCL-2 (124, MM, Ventana, Benchmark Ultra) antibodies were performed on formalin-fixed paraffin-embedded sections. The antibody conditions for ER, PR, Ki-67, Her2, and BCL-2 were as follows: 8.1 pH antigen retrieval using CC1 reagent for 36–64 min, followed by primary antibody incubation for 16–44 min, and then staining with the Ultra Ultraview Universal DAB Detection Kit.
High-resolution whole-slide images were acquired at either 20x or 40x magnification using the Aperio Scanscope XT (Leica Microsystems, Wetzlar, Germany). Eight regions of interest were manually selected by a pathologist for scoring, and the average score was derived using one of three semi-automated algorithms provided by the Aperio software. The nuclear staining algorithm was applied to ER, PR, and Ki-67 slides and was used to compute the percentage of positively stained cells, as well as an H-score indicating staining intensity, consistent with CAP/ASCO guidelines. The membrane algorithm was applied to Her2 slides and generated a Her2 score consistent with CAP/ASCO guidelines. The cytoplasmic algorithm was applied to BCL-2 slides and produced an H-score based on cytoplasmic staining intensity. Importantly, these scores were obtained at the time of diagnosis and were not influenced by the purposes of this study. Original Her2 scores that were reported using previous CAP/ASCO guidelines were recomputed to meet current standards, but this did not require the image analysis portion to be modified.
We performed linear regression to derive a set of coefficients (and a constant term) that, in combination with selected IHC data, could be used to predict RS. To compute concordance, we embedded this model within a 10-fold cross-validation framework to ensure that the test sample was not used to generate the coefficients. To further ensure that the model was not susceptible to overfitting, we repeated the cross-validation 10,000 times, randomly selecting training and test data groups (folds) at each iteration. The results that we report are accompanied by a standard deviation which represents the iteration-to-iteration variability of the value under test. Generally, we observed only a small difference in the model’s coefficients between iterations.
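The repeated cross-validation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual code: the data are synthetic, the function name `repeated_kfold_predictions` is hypothetical, ordinary least squares is used for the regression, and the number of repeats is reduced so the example runs quickly.

```python
import numpy as np

def repeated_kfold_predictions(X, y, n_folds=10, n_repeats=100, seed=0):
    """Return an (n_repeats, n_samples) array of out-of-fold predictions."""
    rng = np.random.default_rng(seed)
    n = len(y)
    # Append a column of ones so the constant term is fitted with the coefficients.
    X1 = np.column_stack([X, np.ones(n)])
    preds = np.empty((n_repeats, n))
    for r in range(n_repeats):
        order = rng.permutation(n)  # new random fold assignment at each repeat
        for test_idx in np.array_split(order, n_folds):
            train_mask = np.ones(n, dtype=bool)
            train_mask[test_idx] = False
            # Fit on the training folds only; the test fold never informs the coefficients.
            coef, *_ = np.linalg.lstsq(X1[train_mask], y[train_mask], rcond=None)
            preds[r, test_idx] = X1[test_idx] @ coef
    return preds

# Synthetic stand-in data (NOT the study's data): 4 hypothetical markers, 120 cases.
rng = np.random.default_rng(1)
X = rng.random((120, 4))
y = X @ np.array([-5.0, -8.0, 12.0, 3.0]) + 20.0 + rng.normal(0, 2.0, 120)

preds = repeated_kfold_predictions(X, y, n_repeats=50)
rmse_per_repeat = np.sqrt(((preds - y) ** 2).mean(axis=1))
print(f"RMSE: {rmse_per_repeat.mean():.2f} +/- {rmse_per_repeat.std():.2f}")
```

The standard deviation across repeats plays the role of the iteration-to-iteration variability reported alongside each result; any summary statistic (for example a concordance rate) can be substituted for the error measure used here.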
This relationship holds true only for small values of Tx (i.e. when $e^{-k}$ approaches zero). For example, when Tx is 0.14, y is equal to 0.500 when x is 0.14 and 0.999 when x is 1. We selected values of Tx based on previous reports of diagnostic criteria that were successful in stratifying patients using ER, PR, and Ki-67 [9, 10, 38] (10, 10, and 14%, respectively).
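To make the worked example concrete, the sketch below implements one logistic transformation that reproduces the two stated values (0.500 at x = Tx and 0.999 at x = 1). The functional form and the way the slope k is chosen are assumptions made for illustration; the exact equation used in the study is not reproduced in this text, and the name `logistic_transform` is hypothetical.

```python
import math

def logistic_transform(x, t_x, y_at_one=0.999):
    """Map a staining fraction x in [0, 1] to a logistic score.

    ASSUMPTION: y = 1 / (1 + exp(-k * (x - t_x))), with the slope k chosen
    so that y equals y_at_one at x = 1. By construction y = 0.5 at x = t_x.
    """
    k = math.log(y_at_one / (1.0 - y_at_one)) / (1.0 - t_x)
    return 1.0 / (1.0 + math.exp(-k * (x - t_x)))

# Thresholds quoted in the text: 10% for ER, 10% for PR, 14% for Ki-67.
thresholds = {"ER": 0.10, "PR": 0.10, "Ki-67": 0.14}
for marker, t_x in thresholds.items():
    at_threshold = logistic_transform(t_x, t_x)
    at_full = logistic_transform(1.0, t_x)
    print(f"{marker}: y(Tx) = {at_threshold:.3f}, y(1) = {at_full:.3f}")
```

Under this assumed form, values of x well above the threshold are compressed toward 1, so differences in expression far from the diagnostic cut-off contribute little to the score.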
Analysis of concordance
We used a threshold of 18 to distinguish the low from intermediate group and a threshold of 31 to distinguish the intermediate from high group, consistent with thresholds established for the Oncotype DX test.
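A short sketch of how these thresholds translate into risk groups and a concordance rate is given below. The treatment of scores falling exactly on a threshold (assigned to the higher group) is an assumption, the function names are hypothetical, and the example scores are invented for illustration only.

```python
def risk_group(score, low_cut=18, high_cut=31):
    """Assign a recurrence-likelihood group from a score.

    ASSUMPTION: a score equal to a threshold belongs to the higher group.
    """
    if score < low_cut:
        return "low"
    if score < high_cut:
        return "intermediate"
    return "high"

def concordance(predicted, reference):
    """Fraction of cases where both scores fall in the same risk group."""
    matches = sum(risk_group(p) == risk_group(r) for p, r in zip(predicted, reference))
    return matches / len(reference)

# Invented example values, not data from the study.
reference_rs = [15, 14, 23.5, 35, 17]
ihc_scores = [11.6, 19.0, 25.0, 29.0, 16.0]

print(concordance(ihc_scores, reference_rs))  # 3 of 5 cases agree -> 0.6
```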
Relationship between H-score and percent positive cells
We used H-score to describe the staining attributes of ER and PR in a subset of analyses. We observed that H-score, in contrast to the logistic score described above, maintained a linear relationship with the percent positive metric. To demonstrate the relationship between H-score and the percent positive metric, we performed simulations by randomly assigning a staining intensity of 0, 1+, 2+, or 3+ to a set of $10^4$ model cells. We measured the H-score and computed the corresponding percent positive staining for the model cells, repeating this process $10^5$ times.
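A simulation of this kind can be sketched as follows, using the conventional H-score definition (1 × % of 1+ cells + 2 × % of 2+ cells + 3 × % of 3+ cells, range 0 to 300). How the intensity probabilities were varied between repeats is not stated in the text; drawing them from a flat Dirichlet distribution is an assumption made here so that the simulated tumors span a range of percent positive values. The function name is hypothetical and the number of repeats is reduced so the example runs quickly.

```python
import numpy as np

def simulate_h_score(n_cells=10_000, n_repeats=1_000, seed=0):
    """Simulate percent positive staining and H-score for model cell populations."""
    rng = np.random.default_rng(seed)
    # ASSUMPTION: each repeat uses its own random mix of 0, 1+, 2+, 3+ probabilities.
    probs = rng.dirichlet(np.ones(4), size=n_repeats)
    pct_positive = np.empty(n_repeats)
    h_score = np.empty(n_repeats)
    for i, p in enumerate(probs):
        counts = rng.multinomial(n_cells, p)        # cells at 0, 1+, 2+, 3+
        pct = 100.0 * counts / n_cells              # percentage at each intensity
        pct_positive[i] = pct[1:].sum()             # any staining counts as positive
        h_score[i] = 1 * pct[1] + 2 * pct[2] + 3 * pct[3]
    return pct_positive, h_score

pct_positive, h_score = simulate_h_score()
r = np.corrcoef(pct_positive, h_score)[0, 1]
print(f"Pearson correlation between percent positive and H-score: {r:.2f}")
```

By construction the H-score is bounded between the percent positive value and three times that value, which is one way to see why the two metrics track each other.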
The conventional scoring model is highly reproducible
Case details by histologic grade
Number of cases
Median patient age: 57 ± 8; 57.5 ± 6; 56.5 ± 7; 56.5 ± 9
Tumor size (cm): 1.6 ± 0.6; 1.5 ± 0.7; 1.6 ± 0.7; 1.5 ± 0.4
Oncotype DX RS: 15 ± 4.5; 14 ± 3; 14 ± 4; 23.5 ± 6.5
Immunohistochemistry attributes of the data set
Categories: Negative or Not overexpressed; Low positive or Equivocal; Positive or Overexpressed
Contribution to IHC score
Columns: Magee Score #3 coefficients; Linear regression coefficients; Linear regression coefficients
Transformation of staining attributes improves concordance
CAP/ASCO guidelines specify that ER and PR expression as determined by IHC should be reported according to the percentage of positively stained cells to guide diagnostic interpretation and treatment decision making. Likewise, Her2 expression is reported according to a score which measures the relative proportions of 3+, 2+, 1+, and unstained cells based on membrane staining intensity. Treatment decision making critically relies on the categorical interpretation of these quantities [50, 51, 52, 53, 54, 55], but it is difficult to reconcile the clinical utility of categorical data with the linear treatment of a continuous variable. For instance, a tumor is considered to exhibit ER immunoreactivity if the percentage of positively stained cells exceeds 1%. This implies that the binary categorization of this quantity into “positive” and “negative”, which has a profound effect on its diagnostic and therapeutic interpretation, largely ignores differences in expression levels over the vast majority of its range. Therefore, using a continuous scalar quantity such as percent positive cells or H-score to characterize ER expression is not consistent with the diagnostic interpretation of ER.
BCL-2 improves classification performance
Individual contribution of markers to the prediction of RS
We compared the independent contributions of each marker to the concordance rate of the algorithm. When markers were individually used to predict RS, we found that they were poor predictors alone. When we evaluated the performance of the algorithm when trained with only two markers at a time, we found that PR and BCL-2 together produced the highest number of high-certainty scores (> 3), achieving an overall concordance rate of 63.7%. We found that the most successful trio of markers was PR, BCL-2, and Ki-67, expanding the number of high-certainty predictions and achieving an overall concordance rate of 65.3%.
We developed an alternative measure of the individual contributions of each marker by examining the equation’s coefficients. We found that the standardized weights of the coefficients were similar to the rank order observed originally, except that Ki-67 had a much higher standardized weight than BCL-2 (Table 3, fourth column). The disparity between the results of the two methods indicates that BCL-2 and PR likely share a complementary role that is not revealed when all the markers are present.
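One common way to compare coefficients across markers measured on different scales is to standardize them, multiplying each raw coefficient by the standard deviation of its predictor and dividing by the standard deviation of the outcome. The sketch below illustrates that general technique on synthetic data; whether the study computed its standardized weights in exactly this way is an assumption, and the function name is hypothetical.

```python
import numpy as np

def standardized_weights(X, y):
    """Standardized regression coefficients: raw coefficient * sd(x) / sd(y)."""
    X1 = np.column_stack([X, np.ones(len(y))])   # last column fits the constant term
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    raw = coef[:-1]                               # drop the constant term
    return raw * X.std(axis=0, ddof=1) / y.std(ddof=1)

# Synthetic data (NOT from the study): one predictor on a 0-1 scale and one on
# a 0-300 scale, so raw coefficients are not directly comparable.
rng = np.random.default_rng(0)
x_fraction = rng.random(200)
x_hscore = rng.random(200) * 300.0
X = np.column_stack([x_fraction, x_hscore])
y = 10.0 * x_fraction + 0.05 * x_hscore + rng.normal(0, 1.0, 200)

weights = standardized_weights(X, y)
print(weights)  # the second predictor carries more weight despite its smaller raw coefficient
```

Because standardized weights are computed with all predictors in the model, they can rank markers differently from an analysis that adds markers one or two at a time, which is the kind of disparity discussed in the paragraph above.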
IHC score to predict chemotherapy benefit
The results that we describe demonstrate a quantitative approach to estimating the likelihood of breast cancer recurrence from routinely acquired protein expression data. This technique offers the ability to immediately stratify patients following IHC processing of excised tissue. The IHC score is based strictly on the results of image analysis of tissue and does not require other information that may not be available at the time of biopsy, for example. It also does not necessarily require whole-slide imaging; it can be used with any semi-quantitative method for estimating staining and therefore is accessible to underprivileged areas, offering a key advantage over some molecular approaches. However, when combined with whole-slide imaging, it may offer the potential to be robust in the presence of tumor heterogeneity by enabling the analysis to be performed with spatial precision. This not only provides a check that can confirm the reliability of the molecular interpretation of the result, but can also help guide microdissection to improve sampling for molecular send out.
IHC score: 11.6
A score of 11.6 indicates an 89% chance of belonging to the low recurrence likelihood group and an 11% chance of belonging to the intermediate recurrence likelihood group.
Refining the certainty and prediction accuracy measures on which these interpretations are based would be aided by testing this technique on additional data sets.
BCL-2 as a predictive marker
Our analysis of the individual contributions of each marker to the predictive success of the IHC score suggests that all the markers we used in this study provide unique information that aids in the prediction of RS. However, the results also indicate that only two markers (PR and BCL-2) are needed to provide approximately the same performance as conventional IHC measures for recurrence likelihood. This result illustrates the complementary contributions of PR and BCL-2 in predicting recurrence, and perhaps de-emphasizes the importance of ER, Ki-67, and Her2 in this role. Although ER, Ki-67, and Her2 are routinely analyzed in invasive breast cancer cases, the results also suggest that quantitative image analysis can likely be restricted to just PR and BCL-2 slides in underprivileged areas for the purpose of predicting breast cancer recurrence.
We suggest that conventional methods for estimation of the Oncotype DX breast recurrence score can be augmented by the addition of BCL-2, by the mathematical transformation of ER, PR, and Ki-67, or by the combination of both approaches. The results indicate that this modification improves concordance between IHC- and gene-expression-based methods, and can be extended to predict chemotherapy effect.
We would like to thank Sharon Cavone for her histology expertise.
Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.
MZ made substantial contributions to conception and design; acquired, analyzed, and interpreted the data; and drafted the manuscript. RH made substantial contributions to conception and design; and contributed to the preparation of the manuscript. NP made substantial contributions to conception and design; and contributed to the preparation of the manuscript. FG made substantial contributions to conception and design; and contributed to the preparation of the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Ethics approval and the consent to participate were considered exempt by the Institutional Review Board at Drexel University under category 4.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
5. Le Doussal V, Tubiana-Hulin M, Friedman S, Hacene K, Spyratos F, Brunet M. Prognostic value of histologic grade nuclear components of Scarff-Bloom-Richardson (SBR). An improved score modification based on a multivariate analysis of 1262 invasive ductal breast carcinomas. Cancer. 1989;64(9):1914–21.
6. Gamucci T, Vaccaro A, Ciancola F, Pizzuti L, Sperduti I, Moscetti L, Longo F, Fabbri MA, Giampaolo MA, Mentuccia L, et al. Recurrence risk in small, node-negative, early breast cancer: a multicenter retrospective analysis. J Cancer Res Clin Oncol. 2013;139(5):853–60.
7. Colleoni M, Sun Z, Price KN, Karlsson P, Forbes JF, Thurlimann B, Gianni L, Castiglione M, Gelber RD, Coates AS, et al. Annual hazard rates of recurrence for breast cancer during 24 years of follow-up: results from the International Breast Cancer Study Group trials I to V. J Clin Oncol. 2016;34(9):927–35.
8. Cheng L, Swartz MD, Zhao H, Kapadia AS, Lai D, Rowan PJ, Buchholz TA, Giordano SH. Hazard of recurrence among women after primary breast cancer treatment--a 10-year follow-up using data from SEER-Medicare. Cancer Epidemiol Biomark Prev. 2012;21(5):800–9.
9. Criscitiello C, Disalvatore D, De Laurentiis M, Gelao L, Fumagalli L, Locatelli M, Bagnardi V, Rotmensz N, Esposito A, Minchella I, et al. High Ki-67 score is indicative of a greater benefit from adjuvant chemotherapy when added to endocrine therapy in luminal B HER2 negative and node-positive breast cancer. Breast. 2014;23(1):69–75.
10. de Azambuja E, Cardoso F, de Castro G Jr, Colozza M, Mano MS, Durbecq V, Sotiriou C, Larsimont D, Piccart-Gebhart MJ, Paesmans M. Ki-67 as prognostic marker in early breast cancer: a meta-analysis of published studies involving 12,155 patients. Br J Cancer. 2007;96(10):1504–13.
15. Joh JE, Esposito NN, Kiluk JV, Laronga C, Lee MC, Loftus L, Soliman H, Boughey JC, Reynolds C, Lawton TJ, et al. The effect of Oncotype DX recurrence score on treatment recommendations for patients with estrogen receptor-positive early stage breast cancer and correlation with estimation of recurrence risk by breast cancer specialists. Oncologist. 2011;16(11):1520–6.
16. Loncaster J, Armstrong A, Howell S, Wilson G, Welch R, Chittalia A, Valentine WJ, Bundred NJ. Impact of Oncotype DX breast recurrence score testing on adjuvant chemotherapy use in early breast cancer: real world experience in Greater Manchester, UK. Eur J Surg Oncol. 2017;43(5):931–7.
19. Ozmen V, Atasoy A, Gokmen E, Ozdogan M, Guler N, Uras C, Ok E, Demircan O, Isikkdogan A, Cabioglu N, et al. Correlations between Oncotype DX recurrence score and classic risk factors in early breast cancer: results of a prospective multicenter study in Turkey. J Breast Health (2013). 2016;12(3):107–11.
22. Gage MM, Mylander WC, Rosman M, Fujii T, Le Du F, Raghavendra A, Sinha AK, Espinosa Fernandez JR, James A, Ueno NT, et al. Combined pathologic-genomic algorithm for early-stage breast cancer improves cost-effective use of the 21-gene recurrence score assay. Ann Oncol. 2018;29(5):1280–5.
32. Lee KH, Im SA, Oh DY, Lee SH, Chie EK, Han W, Kim DW, Kim TY, Park IA, Noh DY, et al. Prognostic significance of bcl-2 expression in stage III breast cancer patients who had received doxorubicin and cyclophosphamide followed by paclitaxel as adjuvant chemotherapy. BMC Cancer. 2007;7:63.
34. Wolff AC, Hammond ME, Hicks DG, Dowsett M, McShane LM, Allison KH, Allred DC, Bartlett JM, Bilous M, Fitzgibbons P, et al. Recommendations for human epidermal growth factor receptor 2 testing in breast cancer: American Society of Clinical Oncology/College of American Pathologists clinical practice guideline update. Arch Pathol Lab Med. 2014;138(2):241–56.
35. Wolff A, Hammond M, Schwartz J, Hagerty K, Allred D, Cote R, Dowsett M, Fitzgibbons P, Hanna W, Langer A. American Society of Clinical Oncology/College of American Pathologists guideline recommendations for human epidermal growth factor receptor 2 testing in breast cancer. Arch Pathol Lab Med. 2007;131(1):18–43.
39. Schwartz AM, Henson DE, Chen D, Rajamarthandan S. Histologic grade remains a prognostic factor for breast cancer regardless of the number of positive lymph nodes and tumor size: a study of 161 708 cases of breast cancer from the SEER program. Arch Pathol Lab Med. 2014;138(8):1048–52.
42. Orucevic A, Bell JL, McNabb AP, Heidel RE. Oncotype DX breast cancer recurrence score can be predicted with a novel nomogram using clinicopathologic data. Breast Cancer Res Treat. 2017;163(1):51–61.
45. Mattes MD, Mann JM, Ashamalla H, Tejwani A. Routine histopathologic characteristics can predict oncotype DX(TM) recurrence score in subsets of breast cancer patients. Cancer Investig. 2013;31(9):604–6.
48. Tang P, Wang J, Hicks DG, Wang X, Schiffhauer L, McMahon L, Yang Q, Shayne M, Huston A, Skinner KA, et al. A lower Allred score for progesterone receptor is strongly associated with a higher recurrence score of 21-gene assay in breast cancer. Cancer Investig. 2010;28(9):978–82.
51. Fountzilas G, Dafni U, Bobos M, Batistatou A, Kotoula V, Trihia H, Malamou-Mitsi V, Miliaras S, Chrisafi S, Papadopoulos S, et al. Differential response of immunohistochemically defined breast cancer subtypes to anthracycline-based adjuvant chemotherapy with or without paclitaxel. PLoS One. 2012;7(6):e37946.
52. Sanchez-Munoz A, Garcia-Tapiador AM, Martinez-Ortega E, Duenas-Garcia R, Jaen-Morago A, Ortega-Granados AL, Fernandez-Navarro M, de la Torre-Cabrera C, Duenas B, Rueda AI, et al. Tumour molecular subtyping according to hormone receptors and HER2 status defines different pathological complete response to neoadjuvant chemotherapy in patients with locally advanced breast cancer. Clin Transl Oncol. 2008;10(10):646–53.
55. Early Breast Cancer Trialists’ Collaborative Group. Effects of chemotherapy and hormonal therapy for early breast cancer on recurrence and 15-year survival: an overview of the randomised trials. Lancet. 2005;365(9472):1687–717.
57. Soran A, Bhargava R, Johnson R, Ahrendt G, Bonaventura M, Diego E, McAuliffe PF, Serrano M, Menekse E, Sezgin E, et al. The impact of Oncotype DX(R) recurrence score of paraffin-embedded core biopsy tissues in predicting response to neoadjuvant chemotherapy in women with breast cancer. Breast Dis. 2016;36(2–3):65–71.
58. Jahn B, Rochau U, Kurzthaler C, Hubalek M, Miksad R, Sroczynski G, Paulden M, Kluibenschadl M, Krahn M, Siebert U. Cost effectiveness of personalized treatment in women with early breast cancer: the application of OncotypeDX and Adjuvant! Online to guide adjuvant chemotherapy in Austria. Springerplus. 2015;4:752.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.