System-Scale Network Modeling of Cancer Using EPoC
One of the central problems of cancer systems biology is to understand the complex molecular changes in cancerous cells and tissues, and to use this understanding to support the development of new targeted therapies. EPoC (Endogenous Perturbation analysis of Cancer) is a network modeling technique for tumor molecular profiles. EPoC models are constructed from combined copy number aberration (CNA) and mRNA data and aim to (1) identify genes whose copy number aberrations significantly affect target mRNA expression and (2) generate markers for long- and short-term survival of cancer patients. Models are constructed by a combination of regression and bootstrapping methods. Prognostic scores are obtained from a singular value decomposition of the networks. We have previously analyzed the performance of EPoC using glioblastoma data from The Cancer Genome Atlas (TCGA) consortium, and have shown that the resulting network models contain both known and candidate disease-relevant genes as network hubs and uncover predictors of patient survival. Here, we give a practical guide to EPoC modeling in R and present a set of alternative modeling frameworks.
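The pipeline summarized above (sparse regression of mRNA on CNA, bootstrap resampling to keep stable edges, and a singular value decomposition of the resulting network) can be illustrated with a minimal sketch. This is not the chapter's R implementation. It assumes a lasso-type regression fitted by coordinate descent, a selection-frequency threshold of 0.8, and a prognostic score defined as the projection of each sample's CNA profile onto the leading left singular vector. All function names, parameter values, and the synthetic data are hypothetical.

```python
import random

def lasso_cd(X, y, lam, n_iter=100, tol=1e-8):
    """Lasso by cyclic coordinate descent: min (1/2n)||y - Xb||^2 + lam*||b||_1."""
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb; equals y while b is all zeros
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        max_delta = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # covariance of column j with the partial residual (b_j's effect added back)
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            shrunk = max(abs(rho) - lam, 0.0) / col_sq[j]  # soft-thresholding
            new = shrunk if rho > 0 else -shrunk
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new
                max_delta = max(max_delta, abs(delta))
        if max_delta < tol:
            break
    return b

def bootstrap_freq(cna, mrna, target, lam=0.3, n_boot=50, seed=0):
    """Fraction of bootstrap resamples in which each CNA gene is selected for `target`."""
    rng = random.Random(seed)
    n, p = len(cna), len(cna[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample samples with replacement
        coef = lasso_cd([cna[i] for i in idx], [mrna[i][target] for i in idx], lam)
        for j, c in enumerate(coef):
            if c != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]

def leading_singular(A, n_iter=200):
    """Leading singular value and vectors of A by power iteration on A^T A."""
    m, k = len(A), len(A[0])
    v = [1.0 / k ** 0.5] * k
    for _ in range(n_iter):
        u = [sum(A[i][j] * v[j] for j in range(k)) for i in range(m)]
        w = [sum(A[i][j] * u[i] for i in range(m)) for j in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    u = [sum(A[i][j] * v[j] for j in range(k)) for i in range(m)]
    sigma = sum(x * x for x in u) ** 0.5
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v

# Synthetic data (an assumption for illustration): 60 samples, 5 genes.
toy_rng = random.Random(1)
n, p = 60, 5
cna = [[toy_rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
mrna = [[0.0] * p for _ in range(n)]
for i in range(n):
    # mRNA of gene 3 is driven by the CNA of genes 0 and 2, plus noise
    mrna[i][3] = 2.0 * cna[i][0] - 1.5 * cna[i][2] + toy_rng.gauss(0, 0.1)

# Network matrix: rows are CNA source genes, columns are mRNA target genes.
freq_by_target = [bootstrap_freq(cna, mrna, t) for t in range(p)]
A = [[freq_by_target[t][j] for t in range(p)] for j in range(p)]
freq = freq_by_target[3]
stable = [j for j in range(p) if freq[j] >= 0.8]  # edges kept for target gene 3

# Hypothetical prognostic score: projection onto the leading left singular vector.
sigma, u, v = leading_singular(A)
scores = [sum(c * w for c, w in zip(row, u)) for row in cna]
print("stable CNA regulators of gene 3:", stable)
```

In practice the penalty `lam` and the frequency threshold would be chosen by a model selection criterion rather than fixed by hand, and a library SVD routine would replace the power iteration, which can fail when the starting vector is orthogonal to the leading singular vector.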
Keywords: Singular Value Decomposition; Bayesian Information Criterion; Copy Number Aberration; Altered Copy Number; International Cancer Genome Consortium
The authors thank the editors and reviewer for their constructive comments. This project receives funding from Cancerfonden, Barncancerfonden (NB-CNS consortium), Vetenskapsradet (SN,RJ), BioCare (SN).