CpG methylation signature predicts prognosis in breast cancer

  • Tonghua Du
  • Bin Liu
  • Zhenyu Wang
  • Xiaoyu Wan
  • Yuanyu Wu
Preclinical study



DNA methylation can serve as a prognostic biomarker in various types of cancer. We aimed to identify a CpG methylation signature that predicts prognosis in breast cancer.


Using microarray data from The Cancer Genome Atlas (TCGA) and the Gene Expression Omnibus (GEO), we compared DNA methylation profiles between 97 healthy control samples and 786 breast cancer samples in a training cohort (from TCGA, n = 883) and built a gene classifier using a penalized regression model. We then validated the prognostic accuracy of this gene classifier in an internal validation cohort (from GEO, n = 72).
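The classifier-building step can be illustrated with a minimal sketch. The abstract says only "penalized regression model"; the sketch below assumes an L1-penalized (lasso) Cox model fitted by proximal gradient descent, and assumes that patients are split into risk groups at the median score. All function names, the toy methylation values, and the tuning values (`lam`, `step`, `iters`) are hypothetical and are not taken from the study.

```python
import math
from statistics import median

def linear_predictor(beta, X):
    """Risk score per patient: weighted sum of CpG methylation values."""
    return [sum(b * x for b, x in zip(beta, row)) for row in X]

def neg_log_partial_likelihood(beta, X, time, event):
    """Cox negative log partial likelihood (Breslow form)."""
    eta = linear_predictor(beta, X)
    nll = 0.0
    for i, died in enumerate(event):
        if died:
            at_risk = [j for j, t in enumerate(time) if t >= time[i]]
            nll -= eta[i] - math.log(sum(math.exp(eta[j]) for j in at_risk))
    return nll

def gradient(beta, X, time, event):
    """Gradient of the negative log partial likelihood."""
    w = [math.exp(e) for e in linear_predictor(beta, X)]
    grad = [0.0] * len(beta)
    for i, died in enumerate(event):
        if died:
            at_risk = [j for j, t in enumerate(time) if t >= time[i]]
            denom = sum(w[j] for j in at_risk)
            for k in range(len(beta)):
                weighted_mean = sum(w[j] * X[j][k] for j in at_risk) / denom
                grad[k] -= X[i][k] - weighted_mean
    return grad

def soft_threshold(z, t):
    """Proximal operator of the L1 penalty: shrinks z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_lasso_cox(X, time, event, lam, step=0.1, iters=300):
    """Proximal gradient descent (ISTA) on the L1-penalized Cox objective."""
    beta = [0.0] * len(X[0])
    for _ in range(iters):
        g = gradient(beta, X, time, event)
        beta = [soft_threshold(b - step * gk, step * lam) for b, gk in zip(beta, g)]
    return beta

def split_by_median(scores):
    """Assumed cutoff: scores above the median are labeled high risk."""
    cut = median(scores)
    return ["high" if s > cut else "low" for s in scores]

# Toy data (invented): 6 patients, 2 CpG sites with beta values in [0, 1].
X = [[0.9, 0.2], [0.8, 0.7], [0.6, 0.1], [0.4, 0.8], [0.3, 0.3], [0.1, 0.6]]
time = [1, 2, 3, 4, 5, 6]    # follow-up time
event = [1, 1, 1, 0, 1, 0]   # 1 = death observed, 0 = censored

beta = fit_lasso_cox(X, time, event, lam=0.5)
groups = split_by_median(linear_predictor(beta, X))
print(beta, groups)
```

A larger `lam` drives more coefficients to exactly zero, which is how a penalty of this kind could reduce a large candidate set to a short list of CpGs. In practice the penalty strength would typically be chosen by cross-validation using an established survival analysis package rather than a hand-written solver.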


A total of 1777 differentially methylated CpGs, corresponding to 1777 differentially methylated genes (DMGs), between breast cancer and control samples were selected for this study. From these, 16 CpGs were selected to classify patients into high-risk and low-risk groups in the training cohort. Patients with high-risk scores in the training cohort had shorter overall survival than patients with low-risk scores (hazard ratio [HR] 4.674; 95% CI 2.918 to 7.487; p = 1.678e-12). The prognostic accuracy was confirmed in the validation cohort. Furthermore, in the combined training and validation cohorts, age > 60 years compared with age < 60 years was significantly associated with overall survival in patients with a high-risk score (HR 2.088; 95% CI 1.348 to 3.235; p = 7.575e-04) but not in patients with a low-risk score (HR 1.246; 95% CI 0.515 to 3.011; p = 0.625). Radiotherapy, compared with no radiotherapy, was associated with improved overall survival in patients with a high-risk score (HR 0.418; 95% CI 0.249 to 0.703; p = 6.991e-04) but not in patients with a low-risk score (HR 2.092; 95% CI 0.574 to 7.629; p = 0.253). Recurrence, compared with no recurrence, was significantly associated with overall survival both in patients with a high-risk score (HR 7.475; 95% CI 4.333 to 12.901; p = 6.991e-04) and in patients with a low-risk score (HR 14.33; 95% CI 4.265 to 48.17; p = 4.883e-13).
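As a rough plausibility check on the reported effect sizes, the standard error of log(HR) can be recovered from a 95% CI. This sketch assumes the interval is a symmetric Wald interval on the log scale, which the abstract does not state; the helper name `wald_from_ci` is hypothetical. A p value derived this way need not equal the reported one, which may come from a different test such as the log-rank test.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE of log(HR), the Wald z statistic, and a two-sided p value
    from a hazard ratio and its 95% confidence limits."""
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return se, z, p

# High-risk versus low-risk score in the training cohort (values from the text).
se, z, p = wald_from_ci(4.674, 2.918, 7.487)

# Under the Wald assumption, the HR is the geometric mean of the CI limits.
midpoint = math.sqrt(2.918 * 7.487)

print(f"SE={se:.3f}  z={z:.2f}  p={p:.2e}  geometric midpoint={midpoint:.3f}")
```

If the geometric midpoint of the limits matches the reported HR, the interval is consistent with a log-scale Wald construction; a large mismatch would suggest a transcription error or a different interval method.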


The 16-CpG signature is a useful biomarker for predicting prognosis in patients with breast cancer.


Keywords: Breast cancer, DNA methylation, Prognosis, Overall survival, CpG sites


Compliance with ethical standards

Conflicts of interest

The authors declare that they have no conflict of interest.

Supplementary material

10549_2019_5417_MOESM1_ESM.pdf (248 kb)
Supplementary material 1 (PDF 247 kb). Figure S1: WGCNA of DMGs between the control group and the breast cancer group. (A) Hierarchical cluster tree showing co-methylation modules identified by WGCNA. Each leaf in the tree represents one gene. The major tree branches constitute 6 modules, labeled with different colors. (B) Module-trait association. Each row corresponds to a module, labeled with a color as in (A). Each column corresponds to a clinical trait (death, recurrence, age, ER, PR, HER2, type, M, N, T, stage, radiation and group). The color of each cell at the row-column intersection indicates the correlation coefficient between the module and the trait. The p value is listed in brackets under each coefficient.



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Breast Surgery, The Second Clinical Hospital of Jilin University, Changchun, China
  2. Department of Gastrointestinal and Colorectal Surgery, China-Japan Union Hospital of Jilin University, Changchun, China
