Breast cancer is one of the most common cancers in women worldwide. For diagnosis, pathologists evaluate the expression of biomarkers such as the HER2 protein using immunohistochemistry on tissue extracted by biopsy. This assessment is performed through microscopic inspection, estimating the intensity and integrity of the cell membrane staining and scoring the sample as 0 (negative), 1+, 2+, or 3+ (positive): a subjective decision that depends on the interpretation of the pathologist.

This work aims to reach consensus among pathologists’ opinions on HER2 breast cancer biopsies, using supervised learning methods based on multiple experts. The main goal is to generate a reliable public breast cancer gold standard, to be used as a training/testing dataset in the future development of machine learning methods for automatic HER2 overexpression assessment.
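As a point of reference for what "consensus" can mean, the sketch below applies a plain majority vote to the scores several experts assign to one section. This is only a simple baseline and not the supervised multi-expert method referred to above; the section names and votes are hypothetical.

```python
from collections import Counter

def majority_consensus(votes):
    """Return the most frequent score, or None when the top count is tied."""
    if not votes:
        raise ValueError("at least one vote is required")
    ranked = Counter(votes).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: no consensus under this simple rule
    return ranked[0][0]

# Hypothetical votes from six experts for three image sections.
example_votes = {
    "section_001": ["2+", "2+", "3+", "2+", "2+", "1+"],
    "section_002": ["0", "1+", "0", "1+", "0", "1+"],
    "section_003": ["3+", "3+", "3+", "3+", "3+", "3+"],
}

consensus = {name: majority_consensus(v) for name, v in example_votes.items()}
```

A tie returns `None` rather than an arbitrary score, which makes the sections where experts genuinely disagree easy to find.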

Thirty breast cancer biopsies, with positive and negative diagnoses, were collected, and tumor regions were marked as regions of interest (ROIs). A magnification of \(20\times\) was used to crop non-overlapping rectangular sections according to a grid over the ROIs, yielding a dataset of 1,250 images.
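The grid cropping described above can be sketched as follows. This is a minimal illustration that assumes the ROI is given as an axis-aligned bounding box in pixels and that partial tiles at the border are discarded; the tile size and ROI coordinates are hypothetical, since the text does not state them.

```python
def grid_tiles(roi, tile_w, tile_h):
    """Yield (left, top, right, bottom) boxes for full, non-overlapping tiles in a rectangular ROI."""
    left, top, right, bottom = roi
    # Step by the tile size so that neighbouring tiles share no pixels.
    for y in range(top, bottom - tile_h + 1, tile_h):
        for x in range(left, right - tile_w + 1, tile_w):
            yield (x, y, x + tile_w, y + tile_h)

roi = (100, 200, 1100, 800)  # hypothetical ROI bounding box: 1000 x 600 pixels
tiles = list(grid_tiles(roi, tile_w=300, tile_h=300))
```

Each returned box can be passed to an image library's crop routine. An ROI with a free-form outline would also need a check that each tile lies inside the marked region.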

To collect the pathologists’ opinions, an Android application was developed. The biopsy sections are presented in random order, and for each image the expert must assign a score (0, 1+, 2+, or 3+). Currently, six leading Chilean breast cancer pathologists are working on the same set of samples.
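Because all experts score the same sections, agreement between any two of them can be quantified. The sketch below computes unweighted Cohen's kappa for each pair of raters; it is an illustrative assumption rather than an analysis reported here, and the rater names and scores are hypothetical. Unweighted kappa treats the four scores as unordered categories, so a weighted variant may suit this ordinal scale better.

```python
from collections import Counter
from itertools import combinations

def cohen_kappa(a, b):
    """Unweighted Cohen's kappa for two equally long sequences of labels."""
    if len(a) != len(b) or not a:
        raise ValueError("both raters must score the same non-empty set")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance, from each rater's label frequencies.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)

# Hypothetical scores from three raters on the same four sections.
scores = {
    "rater_a": ["0", "0", "1+", "1+"],
    "rater_b": ["0", "1+", "0", "1+"],
    "rater_c": ["0", "0", "1+", "1+"],
}

pairwise = {
    (r1, r2): cohen_kappa(scores[r1], scores[r2])
    for r1, r2 in combinations(sorted(scores), 2)
}
```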

Gaining the pathologists’ acceptance was a hard and time-consuming task. Moreover, obtaining the pathologists’ scores requires tactful communication and time to track their progress in using the application.


Keywords: Breast cancer, Intra-variability, Inter-variability, Expert opinion, Biopsy score consensus



Violeta Chang thanks pathologists M.D. Fernando Gabler, M.D. Valeria Cornejo, M.D. Leonor Moyano, M.D. Ivan Gallegos, M.D. Gonzalo De Toro and M.D. Claudia Ramis for their willing collaboration in the manual scoring of breast cancer biopsy sections. The author thanks Jimena Lopez for support with cancer tissue digitalization and the Biobank of Tissues and Fluids of the University of Chile for support with the collection of cancer biopsies. This research is funded by FONDECYT 3160559.



Copyright information

© Springer Nature Switzerland AG 2018

Authors and Affiliations

  1. Laboratory for Scientific Image Analysis SCIANLab, Anatomy and Developmental Biology Department, Faculty of Medicine, University of Chile, Santiago, Chile
