Introduction

It is well established that BRCA1-related breast tumors, as a group, differ from non-BRCA1 tumors in terms of histological phenotype. Tumors of BRCA1 mutation carriers are more likely to be high-grade with medullary subtype features, including greatly increased mitotic count, pushing margins, lymphocytic infiltrate, trabecular growth pattern, and necrosis [1]-[3]. Consistent with overrepresentation of a basal phenotype, a number of immunohistochemical (IHC) markers have been shown to be of value in assessing BRCA1 tumor phenotype in female patients, including estrogen receptor (ER), progesterone receptor (PR), human epidermal growth factor receptor 2 (HER2), p53, cytokeratin 5/6 (CK5/6), cytokeratin 14 (CK14), cytokeratin 17 (CK17), and epidermal growth factor receptor (EGFR) [4]-[8]. In addition, several studies reported that reduced expression of CK8/18 can discriminate the basal tumors of BRCA1 mutation carriers from basal tumors of noncarriers [9],[10], whereas loss of phosphatase and tensin homolog (PTEN), together with triple-negative (TN; ER-, PR-, HER2-) status, was reported to improve the sensitivity of BRCA1 mutation prediction in a study of Asian breast cancer patients [11]. The introduction of PTEN to BRCA1 mutation-prediction algorithms is supported by single-cell analyses of temporal somatic events in BRCA1 breast tumor tissue, which revealed that loss of PTEN is an early event in the development of BRCA1 basal-like tumors, whereas TP53 mutations occur first in most luminal BRCA1 tumors [12].

The breast tumor phenotype of female BRCA2 mutation carriers is less distinctive than that of BRCA1 mutation carriers [1],[13],[14]. Nevertheless, reports based on IHC or expression array analysis have shown that BRCA2 breast tumors are predominantly of the luminal B subtype [13],[15], and are more likely than non-BRCA2 tumors to be ER positive and high grade, with reduced tubule formation and continuous pushing margins [2],[13].

A number of these histopathological features have been incorporated into prediction models or have been proposed as selection criteria for prioritizing testing of breast cancer patients for BRCA1 and BRCA2 mutations [11],[16]-[24]. These findings have also served as the basis for including independently predictive tumor histopathological features as a component of the multifactorial likelihood model for clinical classification of BRCA1/2 variants of uncertain significance [25]. The current iteration of the model includes likelihood ratio (LR) estimates of pathogenicity for combined ER and grade or combined ER, CK5/6, and CK14 status, for analysis of BRCA1 variants, and tubule formation for BRCA2 [26]-[29]. However, these LR estimates were derived from analyses of relatively small datasets including a maximum of 600 mutation carriers and 288 noncarriers [4],[6], and have not been directly validated.

We conducted analyses of large pathology datasets accrued by the Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA) and the Breast Cancer Association Consortium (BCAC) to reassess previously reported histopathological predictors of BRCA1 and BRCA2 mutation status. The results provide more-refined LR estimates for downstream multifactorial likelihood analysis and for prediction of BRCA1 and BRCA2 mutation status.

Methods

Access to data and ethics approvals

ENIGMA (Evidence-based Network for the Interpretation of Germline Mutant Alleles) is a research consortium that aims to improve methods for assessing the clinical significance of variants in breast cancer susceptibility genes [30]. Considerable overlap in membership exists between ENIGMA, CIMBA, and BCAC. As a collaboration between the three consortia, investigators in ENIGMA accessed CIMBA and BCAC datasets for approved pathology-related analyses relevant to the purposes of ENIGMA. The collection of clinical, pathology, and genetic data by CIMBA and BCAC has been previously approved for ongoing research studies by the local ethics committee relevant to each of the participating CIMBA and BCAC studies, and all participants provided informed consent to the relevant participating CIMBA and BCAC sites for such ongoing studies.

Research analyses specific to this study were carried out using only de-identified data, with approval from the Human Research Ethics Committee of the QIMR Berghofer Medical Research Institute, and the Institutional Review Board of the University of Utah.

Sample sets

CIMBA

The Consortium of Investigators of Modifiers of BRCA1/2 (CIMBA; [31]) is a consortium established to conduct large-scale research studies of carriers of germline BRCA1 or BRCA2 pathogenic mutations [32]. Specifically, carriers of variants of uncertain significance are ineligible for entry into CIMBA. The major focus is discovery and validation of genetic factors that modify risk of breast and ovarian cancer in BRCA1 and BRCA2 mutation carriers, with consideration of risk stratified by tumor histologic features. Contributing centers provide information relevant to analyses, including year of birth, age at diagnosis of breast and/or ovarian cancer, cancer behavior (invasive, in situ), basic histology, and other pathology measures for breast and ovarian tumors from study participants. Pathology information is extracted mainly from pathology reports, although a small subset of contributing centers have conducted centralized pathology review and/or supplemented clinical IHC results with research testing of tumor material (for example, 5% of ER pathology results were centrally reviewed) [33]. All CIMBA centers with ER and grade data available in the CIMBA database that were from countries with pathology data available from population (presumed noncarrier) reference cases in BCAC (see later) were included in the analyses. Variables included were as follows: gene mutated, mutation nomenclature (and mutation type, for example, truncating, missense, and so on), date of birth, age and date of diagnosis of breast cancer(s), breast cancer behavior, ER status, PR status, HER2 status, Cytokeratin 5 or 5/6 status, and grade. No CK14 IHC results were available. No dual-mutation carriers were found. Only invasive breast cancer cases diagnosed before age 70 years were included, to reduce the likelihood of phenocopy tumors not directly related to mutation status. Samples were included irrespective of ovarian cancer diagnoses. 
For individuals with two breast cancers (20% of cases), the breast cancer diagnosed closest in time to the entry into the CIMBA cohort was included preferentially.

BCAC

The Breast Cancer Association Consortium (BCAC [34]) was established to discover and validate genetic factors associated with risk of breast cancer in the general population [35]. BCAC also studies risk factors associated with tumor subtypes and tumor histologic features, and pathology data from participating centers are derived from pathology reports or center-specific research efforts. BCAC pathology data were checked and cleaned centrally [36]. BCAC centers were selected for inclusion in this analysis based on availability of ER and grade data. Studies in BCAC in which cases were ascertained on the basis of tumor characteristics (for example, the TN consortium) were excluded. Variables provided for analyses were as follows: study type (to identify within-study strata, and/or to define cohorts with familial cases), age at diagnosis of breast cancer(s), breast cancer behavior, ER status, PR status, HER2 status, CK5 or 5/6 status, and grade. No CK14 IHC results were available. The study design was noted as selected (familial and/or age-selected, relevant for 13 studies) or unselected (from population-based or hospital-based design), based on study-ascertainment criteria provided by the principal investigators of individual BCAC sites.

BRCA1 and BRCA2 germline mutation testing results were provided by 13 of the 36 BCAC studies (comprising 12% of BCAC individuals overall), nine of which used age/family history selection criteria for case ascertainment (with testing for 4% to 100% of these nine studies). The 345 known mutation carriers (189 BRCA1, 156 BRCA2) identified in BCAC were excluded. Analysis included subjects known to be noncarriers or untested for BRCA1/2 mutations, with relevant pathology information for primary invasive breast cancer diagnosis younger than age 70 years. As for CIMBA, for individuals with two breast cancers (only 5% of all BCAC cases considered), the breast cancer diagnosed closest in time to the entry into the cohort was included preferentially.

Statistical analysis

ER or grade data were available for 4,477 BRCA1 mutation carriers, 2,565 BRCA2 mutation carriers, and 47,565 BCAC breast cancer cases with no known mutation in BRCA1 or BRCA2 (presumed noncarriers). The numbers of subjects by country are shown in Table 1. Only countries with ≥200 cases in BCAC and ≥100 carriers in CIMBA were included in analyses to minimize potential bias due to country-specific patterns of pathology assessment. ER-negative, PR-negative, and HER2-negative tumors were categorized as triple-negative (TN). All other combinations of known ER, PR, and HER2 status for a single breast tumor were categorized as “Not TN.” CIMBA and BCAC studies contributing pathology data are noted in Additional file 1: Table S1. Final sample sizes for analyses are reported in footnotes to Tables 2 and 3, and Additional file 1, Tables S2 to S4.
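The TN categorization rule described above can be sketched as follows. This is a minimal illustration, not the study's code: the function name and the encoding of marker status (True for positive, False for negative, None for unknown) are assumptions, as is the choice to return None whenever any of the three markers is unknown.

```python
from typing import Optional


def tn_category(er: Optional[bool], pr: Optional[bool], her2: Optional[bool]) -> Optional[str]:
    """Classify a tumor as "TN" or "Not TN" from ER, PR, and HER2 status.

    True = positive, False = negative, None = unknown (assumed encoding).
    Tumors with any unknown marker are left uncategorized (assumption).
    """
    if er is None or pr is None or her2 is None:
        return None
    if not er and not pr and not her2:
        return "TN"
    return "Not TN"


# Hypothetical tumors, for illustration only.
examples = [
    tn_category(False, False, False),  # all three negative
    tn_category(True, False, False),   # ER positive
    tn_category(False, None, False),   # PR unknown
]
```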

Table 1 Subjects in CIMBA and BCAC datasets with breast tumor ER or grade status, by country
Table 2 Estimated likelihood ratios for predicting BRCA1 or BRCA2 mutation status defined by breast tumor ER and/or grade phenotype*
Table 3 Estimated likelihood ratios for predicting BRCA1 or BRCA2 mutation status defined by breast tumor triple-negative phenotype

CK5/6 IHC data were available for only 128 BRCA1 carriers, 78 BRCA2 carriers and 6,796 BCAC cases with valid data on ER status. Numbers of carriers reduced further after country-matching, and frequencies differed significantly between countries for carriers. Cytokeratin analyses were thus not pursued further.

All statistical analyses were performed by using STATA version 12 (StataCorp, College Station, TX, USA). Statistical significance was defined as P <0.05.

We first examined whether family history was related to the predictor variables of interest in the BCAC sample set. Family-history information, defined as first-degree relative with breast cancer, was available for 30,223 individuals (7,547 reporting a family history of breast cancer). Logistic regression analyses were performed to predict ER status, grade 3, or TN status as a function of family history, adjusting for age at diagnosis and country. No significant effect was observed for family history on any of these histopathologic features, so we did not consider family history further in any analyses.

To identify the most important predictors of mutation status to be used in estimation of the likelihood ratios for classification of variants, we undertook a series of logistic regression analyses. These analyses compared BRCA1 and BRCA2 carriers with the BCAC set. We fitted a sequential series of models, starting with country and age (younger than 50 years versus 50 years or older) and then adding ER, grade, and the ER/grade combination to test for interaction between ER and grade. For those cases who had data on TN status, we examined models with ER alone, ER and grade, ER and TN, and grade and TN, and last, a model with ER, grade, and TN. Likelihood ratio tests were used to determine the most parsimonious models for each gene.

We then estimated simple likelihood ratios of the form L[path | BRCAi]/L[path | BRCA0], where i = 1, 2 denotes tumors from women with germline BRCA1 and BRCA2 mutations, respectively, and BRCA0 denotes cancers from women presumed to be without such mutations. For example, if m BRCA1 tumors have a given histopathological feature of a total of M carriers with measured data on this feature, and s noncarriers (in this case, from the BCAC set) of a total of S have the feature of interest, then the LR is estimated by (m/M)/(s/S). An approximate variance of ln(LR) is given by Var(ln(LR)) = 1/m - 1/M + 1/s - 1/S. Thus, assuming a normal distribution for ln(LR), 95% confidence limits are given by exp[ln(LR) ± 1.96√Var(ln(LR))].
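As a worked illustration of the crude estimator and its confidence limits, the calculation can be sketched in a few lines. The function name and the counts used in the example are hypothetical and are not taken from the study data.

```python
import math


def crude_lr(m, M, s, S, z=1.96):
    """Crude likelihood ratio (m/M)/(s/S) with approximate 95% confidence limits.

    m of M carriers and s of S noncarriers have the tumor feature.
    Returns (LR, lower limit, upper limit).
    """
    lr = (m / M) / (s / S)
    var_log_lr = 1 / m - 1 / M + 1 / s - 1 / S
    half_width = z * math.sqrt(var_log_lr)
    lower = math.exp(math.log(lr) - half_width)
    upper = math.exp(math.log(lr) + half_width)
    return lr, lower, upper


# Hypothetical counts, chosen only for illustration:
# 60 of 100 carriers and 200 of 1,000 noncarriers have the feature.
lr, lower, upper = crude_lr(m=60, M=100, s=200, S=1000)
```

With these illustrative counts the point estimate is 3.0, and the limits are symmetric around it on the log scale, as expected from the normal approximation for ln(LR).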

However, to account for potential differences between countries in the distributions of ER status and grade together with large differences in the ratio of carriers to noncarriers, we derived stratified estimates of LR by using a Mantel-Haenszel approach [37] with approximate 95% confidence intervals calculated according to Greenland and Robins [38]. The country-based strata considered were as follows: Australia, United Kingdom, Germany, USA, and all other countries (with smaller individual sample sizes) pooled. Both stratified and unstratified analyses were conducted for all ages, and by age at diagnosis younger than 50 years versus 50 years or older, when sufficient sample size was available in all groups.
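A stratified estimate of this kind can be sketched as below. This is a minimal illustration under stated assumptions: it uses the standard Mantel-Haenszel estimator for a ratio of proportions, with the Greenland and Robins variance for its logarithm, treating carrier status as the grouping variable and presence of the tumor feature as the outcome. The function name and the example counts are hypothetical, and the exact implementation used in the study may differ.

```python
import math


def mantel_haenszel_lr(strata, z=1.96):
    """Mantel-Haenszel stratified LR with Greenland-Robins confidence limits.

    strata: iterable of (m, M, s, S) tuples, one per stratum (e.g., country),
    where m of M carriers and s of S noncarriers have the feature.
    Returns (LR, lower limit, upper limit).
    """
    num = den = var_num = 0.0
    for m, M, s, S in strata:
        T = M + S  # stratum total
        num += m * S / T
        den += s * M / T
        var_num += (M * S * (m + s) - m * s * T) / T ** 2
    lr = num / den
    se_log_lr = math.sqrt(var_num / (num * den))
    lower = math.exp(math.log(lr) - z * se_log_lr)
    upper = math.exp(math.log(lr) + z * se_log_lr)
    return lr, lower, upper


# Two hypothetical strata with the same underlying ratio of proportions (3.0).
mh_lr, mh_lower, mh_upper = mantel_haenszel_lr([(60, 100, 200, 1000), (30, 50, 40, 200)])
```

With a single stratum, both the estimate and its variance reduce to the crude formulas given earlier, which is a convenient check on any implementation.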

We performed a series of sensitivity analyses to assess how the lack of BRCA1/2 testing in the vast majority of BCAC cases might affect the likelihood ratio estimates reported here. First, we determined the probability that each untested BCAC case was a true noncarrier as follows: We calculated the probability, a priori, that each untested BCAC case carried a pathogenic BRCA1 and/or BRCA2 mutation by using the age-specific relative risks in Antoniou et al. [39] and assuming allele frequencies of pathogenic mutations in each gene of 0.0005. Next we calculated crude LRs for ER-negative and ER-positive tumor status by using only the ~6,000 BCAC cases that tested negative for BRCA1/2, and all the BRCA1 and BRCA2 carriers. Then, assuming the prior calculated in step 1, we used these preliminary ER LRs to calculate the posterior probability that each untested BCAC case had a mutation in BRCA1 or BRCA2 based on their ER status. We calculated the probability that each BCAC case was a true noncarrier for a mutation in either gene as 1 minus the BRCA1 probability minus the BRCA2 probability.
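The posterior calculation in this first step can be sketched as follows. This is an illustration only: the function name, the joint normalization over the three states (noncarrier, BRCA1 carrier, BRCA2 carrier), and all numeric values are assumptions, and the study may have computed the per-gene posteriors differently.

```python
def carrier_posteriors(prior_brca1, prior_brca2, lr_brca1, lr_brca2):
    """Posterior probabilities of carrier status given one tumor feature.

    The LRs are expressed relative to noncarriers, so the noncarrier
    weight is simply its prior (LR = 1).
    Returns (P(noncarrier), P(BRCA1 carrier), P(BRCA2 carrier)).
    """
    prior_non = 1.0 - prior_brca1 - prior_brca2
    w1 = prior_brca1 * lr_brca1
    w2 = prior_brca2 * lr_brca2
    total = prior_non + w1 + w2
    p1 = w1 / total
    p2 = w2 / total
    # Noncarrier probability as 1 minus the two carrier probabilities.
    return 1.0 - p1 - p2, p1, p2


# Hypothetical priors and ER-based LRs, for illustration only.
p_non, p_b1, p_b2 = carrier_posteriors(
    prior_brca1=0.02, prior_brca2=0.01, lr_brca1=3.0, lr_brca2=1.0
)
```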

Second we reestimated a subset of the LRs by using iterative sampling of BCAC cases from the posterior distribution calculated, as described. We generated a uniform random number for each case, and used this and the posterior probabilities to determine whether each of the untested BCAC cases was a noncarrier, a BRCA1 carrier, or a BRCA2 carrier. We then used these simulated data to reestimate LRs from the whole data set, adjusting for country, as in the initial analysis.
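The assignment of a simulated carrier status from a uniform random number can be sketched as below. The function names, the ordering of the probability intervals, the seed, and the example posteriors are all assumptions made for illustration.

```python
import random


def sample_status(p_brca1, p_brca2, u):
    """Map a uniform random number u in [0, 1) to a simulated carrier status."""
    if u < p_brca1:
        return "BRCA1"
    if u < p_brca1 + p_brca2:
        return "BRCA2"
    return "noncarrier"


def simulate_statuses(posteriors, seed=0):
    """Draw one simulated status per untested case.

    posteriors: list of (P(BRCA1 carrier), P(BRCA2 carrier)) per case.
    """
    rng = random.Random(seed)
    return [sample_status(p1, p2, rng.random()) for p1, p2 in posteriors]


# Hypothetical posteriors for ten untested cases.
statuses = simulate_statuses([(0.05, 0.02)] * 10, seed=1)
```

Each replicate of the simulated data would then be passed back through the country-adjusted LR estimation, with cases assigned as carriers moved out of the reference group.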

Further, to examine the effects of changes in pathology over time, potential racial/ethnic differences in these features, and possible survival bias, we performed three additional analyses, one estimating overall unstratified ER/grade LRs for diagnosis after 1989; one restricted to white European ancestry cases only; and another of only cases diagnosed within 5 years of recruitment (to avoid possible bias between tumor phenotype and survival).

Results

The principal aim of this study was to reassess histopathological predictors of BRCA1 and BRCA2 mutation status by analysis of datasets considerably larger than those analyzed previously for this purpose, to provide more robust pathology-based likelihood ratios for use in assessing the pathogenicity of BRCA1 or BRCA2 variants. Our main analyses of breast tumor features included up to 3,929 BRCA1 mutation carriers, 2,273 BRCA2 mutation carriers, and 42,623 assumed BRCA1 and BRCA2 mutation-negative breast cancer cases (Tables 2 and 3). This large sample set allowed us to explore ER alone, grade alone, combined ER and grade stratified by age, and ER/PR/HER2 TN status as predictors of BRCA1 and BRCA2 mutation status.

Logistic regression determining best histopathology predictors of mutation status

For BRCA1 carriers, likelihood ratio tests indicated that both ER and grade were strong independent predictors of BRCA1 status compared with the BCAC set (P <10⁻²⁰). Marginal evidence suggested that considering grade and ER status jointly improved the fit compared with including them separately in the model (χ² = 6.25, 2 df, P = 0.04). When we considered only cases in which TN status and grade were available, TN significantly added to the model fit, even with ER status in the model; the most parsimonious model included ER, grade, and TN status, which was significantly better than any model with only two of these included (χ² = 83.8, 1 df, P <10⁻²⁰). For BRCA2, both ER and grade were highly significant predictors of mutation status, and the interaction of ER and grade was also quite significant (χ² = 28.3, 2 df, P <10⁻⁶). The addition of TN did not improve the model fit significantly (P = 0.14) when ER and grade were included in the model. We thus considered ER, grade, and TN status in deriving likelihood ratio estimates for BRCA-mutation status.
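For readers who wish to check how a likelihood ratio test statistic maps to a P value, a minimal sketch follows. It uses closed-form chi-square tail probabilities for 1 and 2 degrees of freedom only; the function names are hypothetical, and the log-likelihoods in the final example are invented for illustration.

```python
import math


def chi2_upper_tail(x, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch handles only df = 1 or df = 2")


def lrt_statistic(loglik_reduced, loglik_full):
    """Likelihood ratio test statistic comparing nested models."""
    return 2.0 * (loglik_full - loglik_reduced)


# Statistics reported in the text for the ER-by-grade interaction (2 df each).
p_brca1_interaction = chi2_upper_tail(6.25, 2)
p_brca2_interaction = chi2_upper_tail(28.3, 2)

# Invented log-likelihoods, for illustration only.
example_stat = lrt_statistic(loglik_reduced=-1005.0, loglik_full=-1000.0)
```

The first value is approximately 0.044 and the second is below 10⁻⁶, in line with the P values reported for the two 2-df interaction tests.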

ER and grade as predictors of mutation status

The estimated likelihood ratios for predicting BRCA1 or BRCA2 mutation status defined by breast tumor ER-grade phenotype, adjusted for country by using stratified analysis, are shown in Table 2. Results based on pooled data unstratified for country, including cell counts, are shown in Additional file 1: Table S2. In general, the Mantel-Haenszel stratified LR estimates were quite similar to the pooled estimates, with stratified estimates most often closer to 1.0 (although not always). Significant between-country heterogeneity for the estimated likelihood ratios was most often observed with grade rather than ER or TN status. ER-positive cases were less likely to be carriers of a BRCA1 mutation, irrespective of grade. Conversely, ER-negative cases with high-grade tumors were more likely to be BRCA1 mutation carriers. Further, our analyses showed that ER-positive grade 3 tumors were modestly predictive of positive BRCA2 mutation status (Table 2). The association of BRCA2 mutation status with ER-positive high-grade tumors was not substantially different for women diagnosed at younger or older than age 50 years (LR <50 years = 1.77 (95% CI, 1.60 to 1.95), LR ≥50 years = 1.76 (95% CI, 1.49 to 2.08)). However, ER-negative grade 3 tumor status was modestly predictive of positive BRCA2 mutation status in women diagnosed at 50 years or older (LR, 1.54; 95% CI = 1.27 to 1.88).

It is well known that ER and grade status are correlated, with ER-negative tumors more likely to present with high grade. Consistent with this, relatively few cases appeared in any of the sample sets with ER-negative grade 1 tumors. However, we estimated LRs for ER alone and grade alone to allow inclusion of pathology data in models for predicting BRCA1 and BRCA2 mutation status, in instances in which information for only one of these variables is available (Table 2). For example, for a woman diagnosed with breast cancer at 50 years or older, the LR in favor of positive BRCA1 mutation status would be 3.5 if her tumor were known to be ER negative but grade status was unknown, and 2.4 if reported as grade 3 without information on ER status.

An acknowledged caveat to the inclusion of pathology data in multifactorial likelihood modeling is the underlying assumption that missense and in-frame deletions considered to be pathogenic mutations will exhibit the same tumor histopathological characteristics as do truncating mutations. The dataset in this study included 398 known pathogenic BRCA1 missense mutation carriers (mainly C61G), and 44 pathogenic BRCA2 missense mutation carriers with information on ER status or grade. Comparing the missense variants with the truncating set of mutations, we found no significant association of BRCA1 mutation type with ER status (OR = 0.9; 95% CI, 0.7 to 1.2; P =0.4) or grade (OR = 1.15; 95% CI, 0.9 to 1.4; P =0.2), or of BRCA2 mutation type with ER status (OR = 2.7; 95% CI, 0.9 to 7.6; P =0.07) or grade (OR = 0.6; 95% CI, 0.3 to 1.2; P =0.14), although power was quite limited for BRCA2.

Triple-negative (TN) phenotype in BRCA1 and BRCA2 carriers

Secondary country-stratified analysis of 2,249 BRCA1, 1,195 BRCA2 and 19,178 assumed mutation-negative breast cancer cases (Table 3) indicated that TN tumor status is highly predictive of BRCA1 mutation status for women diagnosed at younger than 50 years (LR = 3.73; 95% CI, 3.43 to 4.05) and at age 50 years or older (LR = 4.41; 95% CI 3.86 to 5.04), and results were little different for unstratified analysis (see Additional file 1: Table S3, also displaying cell counts).

Results also indicated that TN phenotype is modestly predictive of BRCA2 mutation status in cases diagnosed at age 50 years or older (LR, 1.79; 95% CI = 1.42 to 2.24). This observation is explained by the lower frequency of the TN phenotype in noncarriers (12.9% 50 years or older) versus BRCA2 mutation carriers (23.5% 50 years or older). Additional analysis considering grade and TN status combined (see Additional file 1: Table S4) did not show substantial improvement over LRs estimated for ER and grade combined (Table 2) or TN status (Table 3), although numbers in some cells were limited.

Sensitivity analyses

With respect to the possible consequences of contamination by missed mutation carriers in the BCAC sample set, we first estimated which untested BCAC cases were more likely to be undetected mutation carriers, and then re-estimated a subset of the LRs by using iterative sampling of the control dataset. Based on age-specific relative risks, we estimated that there could be at most 796 BRCA1 (1.7%) and 433 BRCA2 (0.9%) undetected carriers in the reference dataset of 47,565 BCAC cases. Based on age and the crude ER LRs estimated from true non-carriers in BCAC, 34,869 (84%) of the 41,515 BCAC cases whose genetic status was unknown had posterior probabilities of being a true BRCA1/2-negative case greater than 0.95, with the minimum posterior probability being 0.89. When this sampling process was repeated a total of five times, the number of BRCA1 carriers within the BCAC set ranged from 688 to 784, and the number of BRCA2 carriers ranged from 410 to 455 (total carriers, 1,114 to 1,194). Re-estimation of a subset of LRs indicated that the LRs obtained by assuming that all BCAC cases do not carry a pathogenic BRCA1 or BRCA2 mutation are quite close to what we would expect had all individuals been tested. For ER-negative grade 3 cases diagnosed at younger than 50 years, the original LR for BRCA1 mutation status, assuming all BCAC cases were non-carriers, was 3.16, whereas the five replicates from iterative analysis ranged from 3.22 to 3.25. For TN tumor phenotype, the original LR for BRCA1 mutation status was 3.73, whereas the median of the five replicates was 3.76.

In additional sensitivity analyses, we recalculated unstratified LRs for ER and grade combined, restricting the analyses to the subset of 36,522 (33,260 BCAC, 3,252 CIMBA) breast cancer cases of European ancestry, of which 31,374 (28,364 BCAC, 3,010 CIMBA) were diagnosed within 5 years of interview, and 40,874 (36,414 BCAC, 4,460 CIMBA) were diagnosed after 1989. Results were similar to those from the overall analyses, with LR estimates consistently within the confidence intervals of the overall analyses.

Discussion

Histopathological predictors of mutation status

This study assessing histopathological predictors of BRCA1 and BRCA2 mutation status is based on the largest sample set reported to date, and so provides more-precise estimates that account for age at diagnosis as a potential confounder. We also provide age-stratified LRs for ER alone and grade alone, which, although not as predictive as ER and grade combined, will facilitate inclusion of minimal pathology information in multifactorial modeling of individually rare variants.

Further, we provide, for the first time, LR estimates for TN status that can be applied when grade information is not recorded, with estimates associated with TN status comparable to those for ER-negative-grade 3 (for BRCA1) and ER-positive-grade 3 (for BRCA2). Altogether, these refined LRs will improve the clinical classification of BRCA1 and BRCA2 variants, particularly those identified in women with later age at diagnosis.

Our ER-grade analysis results for BRCA1 are consistent with results from analysis of raw data for a smaller dataset of 600 BRCA1 carriers aged younger than 60 years and 258 age-matched non-carriers from the Breast Cancer Linkage Consortium, which yielded LRs of 1.94 (95% CI = 1.05 to 3.56) and 2.95 (95% CI = 2.41 to 3.62) for ER-negative grade 2 and ER-negative grade 3 tumors, respectively [26],[27]. However, the current study demonstrates that ER-negative grade 2 or 3 status is more predictive of positive BRCA1 status in women diagnosed at older than 50 years compared with younger than 50 (for example, for ER-negative-grade 3, LR ≥50 years is 4.13 (95% CI = 3.70 to 4.62) versus LR <50 years of 3.16 (95% CI = 2.96 to 3.37); P het <0.0001). These observations reflect the fact that although the overall proportion of ER-negative high-grade tumors is lower for older onset (54.5%) than younger onset (67.1%) BRCA1 carriers (as previously reported [33],[40]), the proportion of ER-negative high-grade tumors differs much more markedly for older-onset (12.8%) than younger-onset (20.8%) cases with no identified mutation in BRCA1 or BRCA2.

In addition, not reported in previous smaller studies [6],[7],[41], our results show that ER-positive grade 2 or 3 status is a stronger negative predictor of BRCA1 mutation status in women diagnosed before age 50 years compared with those diagnosed at age 50 years or older. These patterns reflect changes in the frequency of ER status and grade as a function of age in the non-carrier cases, rather than large changes in the frequency of these features in the carriers. Similarly, the findings for BRCA2 are consistent with those from a previous study of 157 BRCA2 mutation carriers and 314 mutation-negative familial breast cancer cases, which indicated that BRCA2-associated tumors were more likely to be ER-positive than were control tumors, when accounting for grade (OR, 2.09; 95% CI, 1.21 to 3.63; P =0.008) [13].

However, age-stratified analysis highlighted that ER-negative grade 3 tumor status modestly predicted positive BRCA2 mutation status in women diagnosed at age 50 years or older, indicating that grade is a more important factor than ER status in predicting BRCA2 tumors. We attempted to assess pathology difference by mutation type (missense versus truncating), an issue that has not previously been addressed rigorously because of the limited availability of pathology information for proven high-risk missense mutations. However, even in our very large dataset, the number of proven pathogenic missense mutations remained small, and it is apparent that future even larger studies will be needed to address this question.

The associations between BRCA1 mutation status and TN phenotype are consistent with those observed for ER-negative, high-grade tumors. They are also consistent with prior evidence that BRCA1 mutation carriers are enriched for the “basal” tumor phenotype that is highly concordant with TN status. A recent meta-analysis assessing the prevalence of BRCA1 mutations in TN versus non-TN breast cancer patients from largely high-risk breast cancer populations [42] estimated a risk of 5.65 (95% CI, 4.15 to 7.69) based on analysis of 236 BRCA1 mutation carriers and 2,297 non-carriers. In addition, these authors predicted that approximately two in nine women with TN breast cancer and additional high-risk features (early onset or family history) harbor a BRCA1 mutation [42]. TN status has not been obviously linked to BRCA2 mutation status previously; however, a recent study of 43 deleterious BRCA1/2 mutation carriers identified from screening of 409 Chinese familial breast cancer cases reported that TN phenotype was more likely to be exhibited by both BRCA1 (P =0.001, 69%, n = 16) and BRCA2 (P =0.01, 46%, n = 27) carriers identified in their cohort, compared with non-carriers (23%; n = 366) [43]. In contrast, a similar study of 221 Korean familial breast cancer patients [44] identified 81 deleterious mutation carriers, and demonstrated increased TN phenotype for BRCA1 mutation carriers (P <0.00001, 57%, n = 35), but not BRCA2 mutation carriers (P =0.9, 13.9%, n = 36) compared with non-carriers (13%, n = 130). Neither of these studies presented their findings for cases stratified by diagnosis age 50 years or older.

Our study has shown that TN phenotype is modestly predictive of BRCA2 mutation status in cases diagnosed at 50 years or older, due to a lower TN frequency in non-carriers versus BRCA2 mutation carriers in this age group. Reassuringly, these TN frequency differences mirror the results seen for ER-negative grade 3 status in non-carriers and BRCA2 mutation carriers, an analysis based on a much larger sample set.

Possible impact of study limitations

We acknowledge several limitations of our study. Ideally, our reference group would have been drawn from the same source as the mutation carriers, as there may be differences between non-BRCA familial cases and unselected cases. However, in the subset of 30,233 BCAC cases that had data on family history, we did not see any significant differences between this group and the remainder of the sample in terms of the pattern of histological features, nor with those who indicated no first- or second-degree relatives with breast cancer.

In our analyses, we are implicitly assuming that testing for BRCA1/2 mutations was independent of the histopathology features used for prediction of mutation status. Although recently some features with therapeutic implications, such as TN status, are being used as a criterion for testing in some centers, we believe that the vast majority of our CIMBA carriers were tested solely on the basis of their family history. This analysis assumes that mutation testing in CIMBA sample sets was not directed by tumor histology. Mutation status was not known for all BCAC samples. However, mutation testing of BCAC samples had been performed for many studies with selected design that might be expected to be enriched for BRCA1 and BRCA2 mutation carriers, and these known mutation carriers were excluded from analysis.

Further, our sensitivity analyses suggest that, at very most, 2.5% of BCAC cases might carry an undetected mutation, and also show that our results would not be substantially affected by this level of contamination of the reference group.

The various sensitivity analyses conducted for the ER-grade dataset provided no convincing evidence for obvious differences for the factors being assessed. We did not see any marked difference in LR estimates for analyses restricted to individuals of European ancestry, but the small numbers of cases from other ethnic/racial groups did not allow us to reliably assess tumor histopathological features for those groups, and so our results may not be generalizable to patients of non-European ancestry. Although it is possible that variation in pathology grading and IHC testing methods might occur between countries or over time, our investigations provided no evidence that such differences would meaningfully confound interpretation of the results, and thus should not limit the use of the information generated for multifactorial likelihood analysis of BRCA1 or BRCA2 variants across continents.

Use of revised LR estimates for future multifactorial likelihood analyses

This study has re-estimated the likelihood of BRCA1 or BRCA2 mutation status associated with breast tumor features commonly measured in the clinical setting, by analyzing much larger datasets than previously used for this purpose. Our findings provide measures of confidence in the individual LR estimates, and in particular, allow age at diagnosis to be incorporated into the pathology component of the multifactorial likelihood model. Figure 1 provides a flowchart indicating the proposed application of pathology-based LRs, dependent on what breast tumor pathology information is available for a variant carrier. As indicated, ER-grade LRs should be applied in preference to other pathology LR estimates, where both ER and grade information is available. The ER-grade LRs were derived from analysis of the largest sample sizes and thus have the greatest precision, and application of 12 strata provided by three grade categories refines both positive and negative prediction of mutation status. For example, a patient with a high-grade ER-negative tumor is three- to fourfold more likely to carry a BRCA1 mutation than not, whereas a patient with a low-grade ER-positive tumor is about 10 times more likely to be mutation-negative than mutation-positive. Given that grade and ER are almost universally used to assess prognosis and predict response to antiestrogen therapies, these features are generally readily available on standard pathology reports.
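To illustrate how a pathology LR would enter a multifactorial calculation, the sketch below multiplies prior odds of pathogenicity by one or more independent LRs and converts the result back to a probability. The function name and the prior probability of 0.10 are hypothetical; the LR of 3.16 is the estimate reported here for ER-negative grade 3 tumors in BRCA1 cases diagnosed at younger than 50 years.

```python
def posterior_probability(prior, likelihood_ratios):
    """Combine a prior probability with independent LRs on the odds scale."""
    odds = prior / (1.0 - prior)
    for lr in likelihood_ratios:
        odds *= lr
    return odds / (1.0 + odds)


# Hypothetical prior of 0.10 combined with the ER-negative grade 3 LR (<50 years).
posterior = posterior_probability(0.10, [3.16])
```

In this illustration a single ER-negative grade 3 tumor raises the probability from 0.10 to about 0.26; further independent components (for example, segregation or family history LRs) would be appended to the list in the same way.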

Figure 1

Proposed strategy for application of pathology likelihood ratios in multifactorial likelihood analysis of BRCA1 or BRCA2 rare sequence variants. Cases carrying a variant of uncertain clinical significance, and with information on relevant pathology variables, are first assessed to determine that breast tumor pathology information was not a criterion used to trigger gene testing. ER, estrogen-receptor breast tumor status; PR, progesterone-receptor breast tumor status; HER2, HER2 breast tumor status; TN, triple-negative breast tumor status; Not TN, breast tumor status not triple-negative, after measurement of ER, PR, and HER2 status; ER-neg, ER-negative status; ER-pos, ER-positive status; G, grade; <50, breast cancer diagnosis at younger than 50 years for tumor with relevant pathology data; ≥50, breast cancer diagnosis at 50 to 70 years for tumor with relevant pathology data.

This study could not provide a comparison to existing LR estimates of BRCA1 mutation status based on ER-CK status, determined from analysis of 182 BRCA1 and 109 age-matched cases [6]. However, we caution that very wide confidence intervals surround the previously estimated LRs for ER-CK characteristics, and recommend further study of large carrier and reference sample sets to provide more-robust LR estimates for ER-CK phenotype in relation to mutation status.

It is important to note that the LRs estimated in this study were from analysis of sample sets that were, to our knowledge, unselected for tumor pathology status. Therefore, it will be necessary to consider potential for bias when individuals are screened for mutations on the basis of their tumor phenotype. This is expected to occur increasingly, now that BRCA1/2 mutation-prediction programs such as BOADICEA include pathology as a component [16], and given recent evidence supporting implementation of the National Comprehensive Cancer Network (NCCN) guidelines that recommend testing of all TN breast cancer patients aged 60 years or younger [45]. In this scenario, multifactorial likelihood analysis should exclude tumor-pathology information from individuals who had previously contributed to risk prediction used to prioritize families for mutation screening. However, pathology data generated subsequently from other variant carrier relatives can still provide independent information toward variant classification.

We are aware that, in the future, other tumor characteristics could provide useful information for variant classification. Array Comparative Genomic Hybridization (CGH) has been shown to be an effective method to identify BRCA1-mutated breast cancers and sporadic cases with a BRCA1-like profile [46],[47] for appropriate chemotherapeutics, and to distinguish BRCA2-mutated tumors from sporadic breast tumors [48]. If introduced widely as a routine test, this approach might be considered in the future as an alternative predictor in multifactorial modeling. Furthermore, the mutual exclusivity of BRCA1 germline mutations and BRCA1 promoter methylation in tumors with a BRCA1-like CGH profile [49] suggests that BRCA1 promoter methylation tests would add value in distinguishing somatic from germline loss of BRCA1 function, as is established for clinical testing triage and variant classification relating to the MLH1 mismatch repair cancer-predisposition gene [50].

Alternatively, genome-wide tumor-methylation profiles may prove of value in distinguishing between individuals with and without a germline BRCA1 mutation [51]. Further, additional substratification of currently used histological features may add value in prediction of mutation status. Options include PTEN loss of expression in addition to TN status as a marker of BRCA1 mutation status [11], or gene-expression arrays to identify BRCA2 mutation carriers among the subset of luminal B tumors [15].

Recent research has also shown the value of considering further stratification of breast cancer subtype in the prediction of BRCA mutation status. For example, although ER-negative status clearly predicts BRCA1 mutation status, even ER-positive BRCA1-related breast cancers are more likely to be grade 3 and CK14+, and to show a high mitotic rate, compared with ER-positive sporadic cancers [52].

In addition, possibilities exist to extend histopathological analyses to tumors other than female breast cancer. The combination of modified Nottingham grade 3 serous or undifferentiated histology, prominent intraepithelial lymphocytes, marked nuclear atypia with giant nuclei, and high mitotic index has recently been reported to be a significant predictor of BRCA1 mutation status in women with epithelial ovarian cancer [53]. Further, breast tumors of male BRCA2 mutation carriers are more likely to present as high-grade and PR-negative, and relatively high rates of HER2 positivity with a micropapillary component to histology have been reported [54],[55]. Investigation of these features in larger sample sizes should be considered in the future.

Although this article has focused on the utility of histopathologic features of breast cancers in the context of the classification of variants in the BRCA1 and BRCA2 genes, these results should also be useful in a range of other applications. The information provided in the main tables can be used to estimate sensitivities and specificities of histopathological predictors by broad age-group (for example, triple-negative tumor status has a sensitivity of 0.67 and a specificity of 0.82 for detection of BRCA1 mutation status in women diagnosed at younger than age 50 years, whereas the sensitivity is 0.57 and the specificity 0.87 for women diagnosed at age 50 or older). As such, these results, in conjunction with other predictors of mutation status, could be useful to guide systematic genetic testing of germline DNA from patients to determine the appropriateness of the use of PARP inhibitors in therapy. The results arising from this study are also likely to inform future development of parallel models, which estimate the probability of an individual carrying a BRCA1 or BRCA2 mutation, to determine eligibility and/or priority for genetic testing (in particular, the BOADICEA model, which has recently been updated to include additional histopathologic characteristics from large data resources [56]).
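The relation between sensitivity, specificity, and likelihood ratios can be made concrete with a short calculation. The sketch below applies the standard definitions, LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity, to the rounded triple-negative figures quoted in the text. The function name is hypothetical, and because the inputs are rounded, the outputs are approximations that need not match the LR estimates reported in the main tables.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) for a binary predictor such as TN versus not TN."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)   # feature present
    lr_negative = (1.0 - sensitivity) / specificity   # feature absent
    return lr_positive, lr_negative


# Rounded values quoted in the text for TN status and BRCA1 mutation status.
lr_pos_young, lr_neg_young = likelihood_ratios(0.67, 0.82)  # diagnosed < 50 years
lr_pos_older, lr_neg_older = likelihood_ratios(0.57, 0.87)  # diagnosed >= 50 years

print(f"<50:  LR+ = {lr_pos_young:.2f}, LR- = {lr_neg_young:.2f}")
print(f">=50: LR+ = {lr_pos_older:.2f}, LR- = {lr_neg_older:.2f}")
```

With these inputs, a TN tumor gives an LR of roughly 3.7 in the younger group and roughly 4.4 in the older group, whereas a tumor that is not TN gives an LR of roughly 0.40 and 0.49, respectively.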

Conclusions

The results from this large-scale analysis refine likelihood ratio estimates for predicting BRCA1 and BRCA2 mutation status by using commonly measured histopathological features. We demonstrate the importance of considering age at diagnosis for analyses, and show that grade is more informative than ER status for BRCA2 mutation-carrier prediction. The estimates will improve BRCA1 and BRCA2 variant classification by using multifactorial likelihood analysis, and inform patient mutation testing and clinical management.

Additional file