Estrogen independent gene expression defines clinically relevant subgroups of estrogen receptor positive breast cancer
Human breast cancer represents a significantly heterogeneous disease. Global gene expression profiling measurements have been used to classify tumors into multiple molecular subtypes. The capacity to define subtypes of breast tumors provides a framework to enable improved understanding of the mechanisms of breast oncogenesis, as well as to provide opportunities for improved therapeutic intervention in patients.
We used publicly available gene expression profiling data to identify ‘estrogen independent’ genes in estrogen receptor alpha (ER+) breast tumors, and subsequently identified 6 subgroups of ER + breast tumors.
Each of the 6 identified subgroups exhibited distinct clinical behaviors and biology. Patients whose tumors belonged to subgroups 2, 5 and 6 experienced excellent long-term survival, whereas patients whose tumors belonged to subgroups 1 and 4 experienced much poorer survival. Breast tumor cell lines representative of the different subgroups responded to therapeutic compounds in accordance with their subgroup classification.
These data support the existence of 6 distinct subgroups of ER + breast cancer and suggest that knowledge of the ER + subgroup status of patient samples has the potential to guide therapy choice.
Keywords: Breast cancer, Gene expression, Subtypes, Therapies, Estrogen
There is significant molecular and cellular diversity among human breast tumors. Indeed, this heterogeneity is evident from histopathologic features and differences in ER, progesterone receptor (PR) and ERBB2/HER2/NEU status, as well as from more recent molecular classification schemes based on the expression of large numbers of genes [1, 2, 3]. Importantly, these data indicate that breast cancer is an imprecise definition that embodies many molecularly distinct neoplastic disorders that share a common normal breast tissue origin.
The capacity to define breast cancers more accurately, and to identify tumor subgroups that represent more homogeneous disease entities, provides a framework to increase our understanding of these diseases and provides opportunities to focus treatment options for patients. To this end, investigators have completed relatively large gene expression studies and identified patterns in gene expression that reproducibly stratify breast tumors into one of 5 molecular subtypes. These breast cancer subtypes, named basal-like, ERBB2-positive, normal-like, luminal A and luminal B, were originally described by Perou et al. The various molecular subtypes possess distinct clinical behaviors, thus providing a basis for improved taxonomy for breast cancer. For example, basal-like tumors are highly aggressive, resistant to endocrine therapies but sensitive to conventional chemotherapy, whereas luminal A tumors are more indolent and responsive to endocrine therapies. Importantly, recent and more comprehensive molecular profiling of human breast tumors, including global gene expression, mutation, DNA copy number variation, and protein expression, supports the original finding that breast cancer falls into major molecular subtypes comprising subsets of genetic and epigenetic abnormalities. Currently, the additional clinical value of molecular classification over traditional histopathological methods is unclear, as the molecular subtypes show high correspondence to the expression of ER, PR, and HER2, as well as to tumor grade.
It is possible that further refinement of the ‘intrinsic’ classification scheme of Perou et al. could identify other molecular classes of breast cancer, and provide additional clinical value beyond traditional techniques. For example, ER + tumors generally fall into the luminal A and B molecular subtypes, characterized by expression of the ER as well as cytokeratins typically expressed by luminal epithelial cells [1, 3]. However, more recent studies suggest that as many as 12 molecular subgroups of ER + breast cancer exist, demonstrating that the luminal A and B stratification of ER + breast tumors does not fully capture the biological complexity of these tumors. Indeed, further dissection of ER + breast tumors into additional relevant disease subgroups would likely provide further insight into the mechanisms that underlie these tumors, as well as prevent carefully planned studies from being confounded by the heterogeneity found among un-grouped or sub-optimally grouped populations of ER + breast tumors. Notably, the molecular subtypes of breast cancer show subtype specific responses to standard chemotherapies as well as experimental compounds, highlighting the value of investigating specific disease subtypes. Hence, the identification and characterization of additional subgroups of ER + breast tumors could focus treatment options for patients with ER + breast tumors, because therapy could be rationally applied based on specific molecular characteristics of the patient’s tumor.
We hypothesized that the biology of ER + tumors comprised both estrogen-dependent and -independent components, and furthermore, that investigation and characterization of the estrogen independent component might provide a means to stratify ER + tumors into different distinct disease subgroups. To this end we used publicly available data to identify ‘estrogen independent’ genes in ER + breast tumors and subsequently identified subgroups of ER + tumors based on molecular differences between tumors identified by these genes. Importantly, we reproducibly identified 6 subgroups of ER + breast tumors that exhibited distinct clinical behavior as well as biology. Moreover, we show that these subgroups have specific responses to therapeutic compounds in vitro. Taken together these data support the existence of 6 distinct subgroups of ER + breast cancer, and advance efforts to increase the precision of therapeutic intervention in human breast cancer patients.
Human breast tumor data sets
All tumor samples were downloaded from the Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/). These included letrozole-treated tumor samples (GSE5462), the discovery cohort (GSE6532, 133A array samples, n = 327), and the validation cohort (GSE6532 133 Plus 2.0 array samples, n = 87; GSE9195, n = 77; GSE17705, n = 298; GSE2034, n = 209; GSE7390, n = 134; original samples from GSE26971, n = 136). Cell line expression profiles were downloaded from ArrayExpress (E-TABM-157). Raw data files representing the tumor samples were normalised using RMA. TCGA gene expression data were obtained from the TCGA Research Network (http://cancergenome.nih.gov/) by downloading level 3 RNAseq data (RSEM normalised) from the TCGA data portal. For GEO cohorts, ER + status was obtained from associated clinical files, which were generally based on histopathological assessment. ER + status for the TCGA cohort was determined using an expression cut-off (250 RSEM normalised transcript counts) for the ESR1 gene. ER + patients were selected from each dataset, and validation cohorts were combined after each probe set/gene was standardized and mean centered.
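The cohort-combination step described above can be illustrated with a minimal sketch. It assumes each cohort is held as a dictionary mapping a probe set identifier to a list of expression values (one per sample), that standardization means a per-cohort z-score, and that only probe sets shared by all cohorts are kept. The data layout, the function names, and the shared-probe rule are illustrative assumptions, not details given in the text.

```python
from statistics import mean, stdev

def standardize(values):
    """Mean-center and scale one probe set within one cohort (z-score)."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]

def combine_cohorts(cohorts):
    """Standardize each probe set within each cohort, then concatenate samples.

    cohorts: list of dicts, probe id -> list of expression values.
    Assumption: only probe sets present in every cohort are retained.
    """
    shared = set.intersection(*(set(c) for c in cohorts))
    return {
        probe: [z for cohort in cohorts for z in standardize(cohort[probe])]
        for probe in sorted(shared)
    }

# Hypothetical toy values; "p1" and "p2" are made-up probe identifiers.
cohort_a = {"p1": [1.0, 2.0, 3.0], "p2": [0.0, 1.0, 2.0]}
cohort_b = {"p1": [10.0, 20.0, 30.0]}
combined = combine_cohorts([cohort_a, cohort_b])
```

Standardizing within each cohort before pooling puts probe sets measured on different platforms or scales onto a comparable footing, which is presumably why the step precedes combination.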
Cell line drug sensitivity data
We obtained previously reported human breast tumor cell line sensitivity data from Heiser et al.
Definition of estrogen independent genes
We calculated within (w, treatment pairs) and between (b, independent primary tumor samples) variation for all tumors. In this fashion probe-sets with greater variation in expression between tumors than between treatment paired samples received high b/w scores, and vice versa.
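As a rough illustration of the b/w score, the sketch below computes, for a single probe set, the variance across independent tumors (b) divided by the pooled variance within treatment pairs (w). The text does not give the exact formula, so the estimators used here (sample variance of pre-treatment values for b, mean per-pair variance for w) and the function names are assumptions.

```python
from statistics import variance

def bw_score(pre, post):
    """Between/within variation ratio for one probe set.

    pre, post: expression values before and after treatment, paired by tumor.
    Assumed estimators: w is the mean variance within each pre/post pair,
    b is the sample variance of pre-treatment values across tumors.
    """
    # For two values x and y, the sample variance equals (x - y)^2 / 2.
    within = sum((x - y) ** 2 / 2 for x, y in zip(pre, post)) / len(pre)
    between = variance(pre)
    return between / within if within > 0 else float("inf")

def top_probe_sets(scores, n):
    """Return the n probe set ids with the highest b/w scores."""
    return sorted(scores, key=scores.get, reverse=True)[:n]

# Hypothetical values: expression differs between tumors but barely moves with treatment.
stable = bw_score([5.0, 7.0, 9.0], [5.2, 6.8, 9.2])
# Hypothetical values: expression shifts strongly with treatment.
responsive = bw_score([5.0, 7.0, 9.0], [2.0, 3.0, 4.0])
```

Under this sketch a probe set such as `stable` scores far higher than `responsive`, matching the stated intent that genes varying between tumors but not with treatment receive high b/w scores.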
PAM 50 subtype assignment
Subtype membership was assigned based on the nearest PAM50 centroid (Pearson correlation).
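Nearest-centroid assignment by Pearson correlation can be sketched as follows. The centroid values below are toy numbers chosen for illustration and are not the published PAM50 centroids; the function names are likewise hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def nearest_centroid(sample, centroids):
    """Return the name of the centroid most correlated with the sample."""
    return max(centroids, key=lambda name: pearson(sample, centroids[name]))

# Toy centroids over four hypothetical genes (NOT real PAM50 values).
toy_centroids = {
    "Luminal A": [1.0, 2.0, 3.0, 5.0],
    "Basal-like": [4.0, 3.0, 2.0, 1.0],
}
subtype = nearest_centroid([1.0, 2.0, 3.0, 4.0], toy_centroids)
```

Because Pearson correlation ignores overall scale and offset, the assignment depends on the shape of the expression profile across the centroid genes rather than on absolute intensity.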
Non-negative matrix factorization was carried out as previously described. Prediction analysis of microarrays (PAM) was carried out as described to discover subgroup specific genes (discovery cohort) and to classify samples (validation cohort, cell lines).
Cell growth assays
All cell lines were obtained from the ATCC and passaged minimally prior to completing these experiments. Cell lines were maintained as suggested by the ATCC, in either RPMI or DMEM supplemented with 10% fetal bovine serum (all from Life Technologies). Cells were seeded at a density of 50,000 cells/ml in the wells of a 6-well plate (Corning) in triplicate for each time point. At each time point cells were trypsinized and viable cells were counted with a hemocytometer using Trypan Blue exclusion as a marker of cell viability. Relative cell growth was calculated as the number of viable cells relative to control at each time point.
Log-rank tests were used to evaluate survival differences between patient subgroups, with 10-year distant metastasis free survival (DMFS) or disease free survival (DFS) as the clinical endpoint. Two-group comparisons of means used t-tests, whereas ANOVA followed by Dunnett’s multiple comparison test was used to compare means for 3 or more groups. Tests were two-sided and a p-value of 0.05 or less was considered statistically significant.
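For readers unfamiliar with the log-rank test, the following is a minimal two-group sketch using only the Python standard library. It is a textbook formulation, not the software used for the analyses reported here, and the input layout (parallel lists of follow-up times and event indicators) is an assumption.

```python
from math import erfc, sqrt

def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*: follow-up times; events_*: 1 if the event occurred, 0 if censored.
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_minus_expected = 0.0
    var = 0.0
    for t in event_times:
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)  # at risk, group A
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)  # at risk, group B
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A at time t.
            var += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if var == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / var
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))

# Hypothetical follow-up times (years); all patients experienced the event.
chi2, p = logrank([1, 2, 3, 4, 5], [1] * 5, [10, 11, 12, 13, 14], [1] * 5)
```

Comparisons among more than two subgroups, as in this study, would use the multi-group generalization of the same statistic, which is not shown here.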
Identification of estrogen independent genes and distinct subgroups of ER + breast cancer
To investigate whether the expression of the estrogen independent probe sets could capture the phenotypic complexity of ER + breast tumors we completed unsupervised clustering using non-negative matrix factorization (NMF). NMF is an efficient method to identify molecular patterns that is readily applicable to gene expression data, and therefore can be used as a powerful means for class discovery. In short, NMF identifies metagenes, or distinct gene expression patterns, and the optimal number of sample clusters, k, is determined by calculating a cophenetic coefficient for each value of k. We applied NMF (for k = 2-10) to gene expression data representing 262 primary ER + breast tumors (GSE6532, 133A arrays), filtered such that only the 1,000 estrogen independent probe sets were used for class identification. This data set optimally fell into 6 clusters, designated subgroups 1–6 (Figure 1C). Moreover, NMF on an additional independent data set of 298 ER + breast tumors (GSE17705) using the same 1,000 estrogen independent probe sets also suggested that these patients were optimally stratified into 6 subgroups (Additional file 2: Figure S1). Hence, we concluded that on the basis of the expression of estrogen independent genes, ER + breast tumors can be categorized optimally into 1 of 6 ER independent subgroups.
To learn whether these groups might encompass disease with different clinical outcomes we compared DFS (Figure 1D, *p < 0.05, Log-rank test) and DMFS (Figure 1E, *p < 0.05, Log-rank test) among the various subgroups. Interestingly, some subgroups displayed excellent long term outcomes, whereas other groups did not. For example, 10 year DMFS in subgroup 5 patients was 88%, whereas in subgroup 4 patients it was 48%.
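The NMF clustering step can be illustrated with a small pure-Python sketch. It uses the standard multiplicative update rules for the squared-error objective and assigns each sample to the metagene with the largest coefficient. It is not the implementation used in this study, and it omits the repeated runs and the cophenetic coefficient used to choose k. The matrix layout (genes in rows, samples in columns), the iteration count, and all function names are assumptions.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def nmf(v, k, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix v (genes x samples) as w (genes x k) times h (k x samples)."""
    rng = random.Random(seed)
    m, n = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(m)]
    h = [[rng.random() + eps for _ in range(n)] for _ in range(k)]
    for _ in range(n_iter):
        # Update h: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)] for i in range(k)]
        # Update w: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(m)]
    return w, h

def assign_clusters(h):
    """Assign each sample (column of h) to the metagene with the largest coefficient."""
    k, n = len(h), len(h[0])
    return [max(range(k), key=lambda i: h[i][j]) for j in range(n)]

# Hypothetical toy matrix: 4 genes x 4 samples with two obvious sample groups.
toy = [[5, 5, 0, 0], [4, 4, 0, 0], [0, 0, 3, 3], [0, 0, 6, 6]]
w, h = nmf(toy, k=2)
labels = assign_clusters(h)
```

In practice the factorization is repeated from many random starts for each k, and the stability of the resulting sample assignments is what the cophenetic coefficient summarizes.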
All patients comprising the various subgroups were uniformly chemotherapy naïve, suggesting that these differences in survival are likely related to the natural progression of their disease, rather than influenced by response to chemotherapy. Interestingly, in tamoxifen treated subgroup #3 (n = 27) patients we observed that the majority of DMFS events occurred after 5 years (Figure 1F), the time at which most patients cease tamoxifen treatment, possibly suggesting that these patients would have benefited from tamoxifen treatment beyond 5 years. Unfortunately, this dataset only comprised 8 subgroup #3 patients who did not receive tamoxifen, making the complementary analysis in tamoxifen naïve patients impractical.
Subgroups are independent of the molecular subtype of breast cancer
A framework for ER + breast tumor classification
As additional validation, we investigated the prevalence of the 6 subgroups in the TCGA breast data set, which comprised 801 ER + breast tumors . Using the PAM classifier described above, the mean probability for classification was 86%, and more than 70% (n = 580) of the tumors in the TCGA set were assigned a probability of 80% or higher of belonging to one of the 6 subgroups (Additional file 2: Figure S5 A&B). Hence, this extra validation data set provides additional evidence for the robustness of the classification framework.
Taken together with our previous data, these results demonstrate that the 6 identified subgroups of ER + breast tumors can be reproducibly identified in independent patient cohorts and provide a clinically relevant means of classifying ER + breast tumors.
ER + subgroups enable predictive modeling of anti-cancer drug sensitivity
To extend these findings, we looked for over-expression of other actionable targets with subgroup selective expression among the 6 subgroups (Additional file 2: Figure S6). IGF2 was over-expressed in subgroup #2 tumors, implicating IGF signaling as a therapeutic target in these tumors. Interestingly, subgroup #3 tumors significantly over-expressed the angiotensin receptor 2 (AGTR2). Whereas AGTR2 has not been a target for cancer drug development, it has been a successfully exploited target for the development of hypertension drugs. Other notable targets included over-expression of the anti-apoptotic protein BCL2 in subgroup #3 tumors, and the immune-modulatory target CTLA4 in subgroup #5 tumors. In each case, therapeutics that target these over-expressed genes are approved or under development. These observed patterns could potentially be used to target therapies in ER + breast cancer patients contingent on the subgroup membership of their tumor.
Substantial molecular heterogeneity exists among ER + tumors, which is not adequately captured by either histopathological variables or more recent molecular subtyping strategies. Accordingly, we sought to identify novel means of classifying ER + tumors, and reproducibly identified 6 subgroups of ER + tumors based on the expression of estrogen independent genes. Notably, we also observed survival and treatment sensitivity differences among the 6 subgroups. Hence, our data suggest that patient subgroup membership may be a useful tool for guiding treatment of ER + breast cancer patients.
The subgroup identification strategy was highly similar to that originally described by Perou et al. in 2000. Whereas Perou et al. employed an unsupervised clustering approach with intrinsic genes in unselected breast tumors, we employed unsupervised clustering with estrogen independent genes in breast tumors selected for ER positivity. To define these genes we analysed gene expression profiling data from 58 ER + tumors biopsied from post-menopausal women before and after treatment with letrozole. We hypothesized that genes whose expression showed minimal variation after letrozole treatment could be considered to be expressed independently of estrogen, and identified estrogen independent genes based on this assumption. However, many breast cancer patients are pre-menopausal and receive different endocrine therapies for breast cancer treatment, namely tamoxifen. It is unclear whether the definition of estrogen independent genes we propose here would be different in pre-menopausal patients, or in patients treated with alternate endocrine agents, and these possibilities represent intriguing avenues for future research. We note, however, that subgrouping ER + tumors based on estrogen independent gene expression was both robust and reproducible in cohorts of tumors that included pre-menopausal patients as well as those treated with tamoxifen, suggesting that our approach is broadly applicable.
There remain several limitations of the work reported herein. All of our conclusions are based on the analysis of retrospective data, which limits their clinical value. We validated the occurrence, and clinical attributes, of the 6 subgroups in relatively large independent cohorts; however, a true estimate of the clinical usefulness of the 6 subgroup classification for ER + breast cancers would require additional validation in clinical trial samples, as well as completion of a prospective clinical trial examining the capacity of the classification to guide therapy. In addition, it is not clear whether subgroup classification would add meaningful clinical information beyond that obtained from existing prognostic tests designed for ER + tumors, such as OncotypeDX®. For example, a relevant question that remains is whether the good prognosis subgroups identified here (subgroups 2, 5 and 6) experience similarly excellent survival to the low risk group identified by OncotypeDX®. Additionally, it is not clear whether the relationship between patient outcome and subgroup assignment is a consequence of subgroup association with natural progression of breast cancer or of tumor response to adjuvant endocrine therapy. Many of the patients obtained from publicly available sources had incomplete clinical annotations, and they comprise a mixture of patients who received no adjuvant therapy or adjuvant tamoxifen, likely lasting for 5 years. Based on these data it is difficult to discern how differences in extent or choice of endocrine therapy might influence the relationship between patient outcome and subgroup membership. Hence, although our data suggest the 6 subgroup classification of ER + breast cancer may be useful for guiding therapy in patients, many additional validation experiments are required to confirm our findings.
Ultimately, we propose that the 6 subgroups described here provide a strategy for improved understanding and treatment of ER + breast tumors. We demonstrate that the subgroups are unique and independent of the molecular subtypes of cancer, and provide a clinically relevant means of tumor classification. We anticipate that subgrouping will provide a framework to both guide optimal use of existing therapeutics, as well as gain insight into biological processes that represent relevant targets for development of the next generation of experimental therapies.
This work was generously supported by grants from the Canadian Breast Cancer Foundation to JAH. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript. We wish to acknowledge helpful discussions with Drs. Greg Pond and Anita Bane throughout the course of this work.
- 1. Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, Fluge O, Pergamenschikov A, Williams C, Zhu SX, Lønning PE, Børresen-Dale AL, Brown PO, Botstein D: Molecular portraits of human breast tumours. Nature. 2000, 406 (6797): 747-752. 10.1038/35021093.
- 2. Sorlie T, Tibshirani R, Parker J, Hastie T, Marron JS, Nobel A, Deng S, Johnsen H, Pesich R, Geisler S, Demeter J, Perou CM, Lønning PE, Brown PO, Børresen-Dale AL, Botstein D: Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci U S A. 2003, 100 (14): 8418-8423. 10.1073/pnas.0932692100.
- 6. Heiser LM, Sadanandam A, Kuo WL, Benz SC, Goldstein TC, Ng S, Gibb WJ, Wang NJ, Ziyad S, Tong F, Bayani N, Hu Z, Billig JI, Dueregger A, Lewis S, Jakkula L, Korkola JE, Durinck S, Pepin F, Guan Y, Purdom E, Neuvial P, Bengtsson H, Wood KW, Smith PG, Vassilev LT, Hennessy BT, Greshock J, Bachman KE, Hardwicke MA, et al: Subtype and pathway specific responses to anticancer compounds in breast cancer. Proc Natl Acad Sci U S A. 2012, 109 (8): 2724-2729. 10.1073/pnas.1018854108.
- 7. Miller WR, Larionov AA, Renshaw L, Anderson TJ, White S, Murray J, Murray E, Hampton G, Walker JR, Ho S, Krause A, Evans DB, Dixon JM: Changes in breast cancer transcriptional profiles after treatment with the aromatase inhibitor, letrozole. Pharmacogenet Genomics. 2007, 17 (10): 813-826. 10.1097/FPC.0b013e32820b853a.
- 8. Loi S, Haibe-Kains B, Desmedt C, Lallemand F, Tutt AM, Gillet C, Ellis P, Harris A, Bergh J, Foekens JA, Klijn JG, Larsimont D, Buyse M, Bontempi G, Delorenzi M, Piccart MJ, Sotiriou C: Definition of clinically distinct molecular subtypes in estrogen receptor-positive breast carcinomas through genomic grade. J Clin Oncol. 2007, 25 (10): 1239-1246. 10.1200/JCO.2006.07.1522.
- 9. Loi S, Haibe-Kains B, Desmedt C, Wirapati P, Lallemand F, Tutt AM, Gillet C, Ellis P, Ryder K, Reid JF, Daidone MG, Pierotti MA, Berns EM, Jansen MP, Foekens JA, Delorenzi M, Bontempi G, Piccart MJ, Sotiriou C: Predicting prognosis using molecular profiling in estrogen receptor-positive breast cancer treated with tamoxifen. BMC Genomics. 2008, 9: 239. 10.1186/1471-2164-9-239.
- 10. Symmans WF, Hatzis C, Sotiriou C, Andre F, Peintinger F, Regitnig P, Daxenbichler G, Desmedt C, Domont J, Marth C, Delaloge S, Bauernhofer T, Valero V, Booser DJ, Hortobagyi GN, Pusztai L: Genomic index of sensitivity to endocrine therapy for breast cancer. J Clin Oncol. 2010, 28 (27): 4111-4119. 10.1200/JCO.2010.28.4273.
- 11. Wang Y, Klijn JG, Zhang Y, Sieuwerts AM, Look MP, Yang F, Talantov D, Timmermans M, Meijer-van Gelder ME, Yu J, Jatkoe T, Berns EM, Atkins D, Foekens JA: Gene-expression profiles to predict distant metastasis of lymph-node-negative primary breast cancer. Lancet. 2005, 365 (9460): 671-679. 10.1016/S0140-6736(05)17947-1.
- 12. Desmedt C, Piette F, Loi S, Wang Y, Lallemand F, Haibe-Kains B, Viale G, Delorenzi M, Zhang Y, d'Assignies MS, Bergh J, Lidereau R, Ellis P, Harris AL, Klijn JG, Foekens JA, Cardoso F, Piccart MJ, Buyse M, Sotiriou C, TRANSBIG Consortium: Strong time dependence of the 76-gene prognostic signature for node-negative breast cancer patients in the TRANSBIG multicenter independent validation series. Clin Cancer Res. 2007, 13 (11): 3207-3214. 10.1158/1078-0432.CCR-06-2765.
- 13. Neve RM, Chin K, Fridlyand J, Yeh J, Baehner FL, Fevr T, Clark L, Bayani N, Coppe JP, Tong F, Speed T, Spellman PT, DeVries S, Lapuk A, Wang NJ, Kuo WL, Stilwell JL, Pinkel D, Albertson DG, Waldman FM, McCormick F, Dickson RB, Johnson MD, Lippman M, Ethier S, Gazdar A, Gray JW: A collection of breast cancer cell lines for the study of functionally distinct cancer subtypes. Cancer Cell. 2006, 10 (6): 515-527. 10.1016/j.ccr.2006.10.008.
- 16. Parker JS, Mullins M, Cheang MC, Leung S, Voduc D, Vickery T, Davies S, Fauron C, He X, Hu Z, Quackenbush JF, Stijleman IJ, Palazzo J, Marron JS, Nobel AB, Mardis E, Nielsen TO, Ellis MJ, Perou CM, Bernard PS: Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol. 2009, 27 (8): 1160-1167. 10.1200/JCO.2008.18.1370.
- 19. Sotiriou C, Wirapati P, Loi S, Harris A, Fox S, Smeds J, Nordgren H, Farmer P, Praz V, Haibe-Kains B, Desmedt C, Larsimont D, Cardoso F, Peterse H, Nuyten D, Buyse M, Van de Vijver MJ, Bergh J, Piccart M, Delorenzi M: Gene expression profiling in breast cancer: understanding the molecular basis of histologic grade to improve prognosis. J Natl Cancer Inst. 2006, 98 (4): 262-272. 10.1093/jnci/djj052.
- 20. Davies C, Pan H, Godwin J, Gray R, Arriagada R, Raina V, Abraham M, Medeiros Alencar VH, Badran A, Bonfill X, Bradbury J, Clarke M, Collins R, Davis SR, Delmestri A, Forbes JF, Haddad P, Hou MF, Inbar M, Khaled H, Kielanowska J, Kwan WH, Mathew BS, Mittra I, Müller B, Nicolucci A, Peralta O, Pernas F, Petruzelka L, Pienkowski T, et al: Long-term effects of continuing adjuvant tamoxifen to 10 years versus stopping at 5 years after diagnosis of oestrogen receptor-positive breast cancer: ATLAS, a randomised trial. Lancet. 2013, 381 (9869): 805-816. 10.1016/S0140-6736(12)61963-1.
- 25. Paik S, Shak S, Tang G, Kim C, Baker J, Cronin M, Baehner FL, Walker MG, Watson D, Park T, Hiller W, Fisher ER, Wickerham DL, Bryant J, Wolmark N: A multigene assay to predict recurrence of tamoxifen-treated, node-negative breast cancer. N Engl J Med. 2004, 351 (27): 2817-2826. 10.1056/NEJMoa041588.
- The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2407/14/871/prepub
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.