Dissecting Pathway Disturbances Using Network Topology and Multi-platform Genomics Data

Abstract

Complex diseases such as cancers usually result from the accumulated disturbance of pathways rather than the disruption of one or a few major genes. Compared with single-platform analyses, integrating diverse molecular regulatory elements and their interactions is likely to yield more insight into pathway-level disturbances of biological systems and their potential consequences for disease development and progression. To explore the benefit of pathway-based analysis, we focus on multi-platform genomics, epigenomics, and transcriptomics (-omics, for short) data from 11 cancer types collected by The Cancer Genome Atlas project. Specifically, we use a well-studied oncogenic pathway, the BRAF pathway, to investigate the relevant copy number variants (CNVs), methylation, and gene expression, and to quantify their effects on discovering tumor-specific aberrations across multiple tumor lineages. We also perform simulation studies to further investigate the effects of network topology and multiple omics platforms on dissecting pathway disturbances. Our analysis shows that adding molecular regulatory elements such as CNVs and/or methylation to the baseline mRNA data can improve the power to discover tumor aberrations. In addition, incorporating CNVs with the baseline mRNA data can be more beneficial than incorporating methylation. Moreover, employing regulatory topologies can improve the discovery of tumor aberrations. Finally, our analysis reveals similarities and differences among diverse cancer types based on disturbance of the BRAF pathway.

References

  1. Benjamini Y, Hochberg Y (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B 57:289–300

  2. Efron B, Tibshirani R (2007) On testing the significance of sets of genes. Ann Appl Stat 1:107–129

  3. Fallahi-Sichani M, Moerke NJ, Niepel M, Zhang T, Gray NS, Sorger PK (2015) Systematic analysis of BRAFV600E melanomas reveals a role for JNK/c-Jun pathway in adaptive resistance to drug-induced apoptosis. Mol Syst Biol 11(3):797

  4. Hyman DM, Puzanov I, Subbiah V, Faris JE, Chau I, Blay JY, Wolf J, Raje NS, Diamond EL, Hollebecque A et al (2015) Vemurafenib in multiple nonmelanoma cancers with BRAF V600 mutations. N Engl J Med 373(8):726–736

  5. Liu L, Ruan J (2013) Network-based pathway enrichment analysis. In: 2013 IEEE international conference on bioinformatics and biomedicine (BIBM). IEEE, p 218–221

  6. Liu Q, Dinu I, Adewale AJ, Potter JD, Yasui Y (2007) Comparative evaluation of gene-set analysis methods. BMC Bioinform 8(1):431

  7. Ma J, Shojaie A, Michailidis G (2016) Network-based pathway enrichment analysis with incomplete network information. Bioinformatics 32(20):3165–3174

  8. Maciejewski H (2013) Gene set analysis methods: statistical models and methodological differences. Brief Bioinform. doi:10.1093/bib/bbt002

  9. Shojaie A, Michailidis G (2009) Analysis of gene sets based on the underlying regulatory network. J Comput Biol 16(3):407–426

  10. Shojaie A, Michailidis G (2010) Network enrichment analysis in complex experiments. Stat Appl Genet Mol Biol. doi:10.2202/1544-6115.1483

  11. Subramanian A, Tamayo P, Mootha VK, Mukherjee S, Ebert BL, Gillette MA, Paulovich A, Pomeroy SL, Golub TR, Lander ES et al (2005) Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc Natl Acad Sci USA 102(43):15545–15550

  12. Tian L, Greenberg SA, Kong SW, Altschuler J, Kohane IS, Park PJ (2005) Discovering statistically significant pathways in expression profiling studies. Proc Natl Acad Sci USA 102(38):13544–13549

  13. Tomczak K, Czerwińska P, Wiznerowicz M (2015) The cancer genome atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol 19(1A):A68

  14. Wu D, Lim E, Vaillant F, Asselin-Labat ML, Visvader JE, Smyth GK (2010) ROAST: rotation gene set tests for complex microarray experiments. Bioinformatics 26(17):2176–2182

Acknowledgements

We thank all the members of the Statistical and Applied Mathematical Sciences Institute (SAMSI) Data Integration: TCGA Working Group, part of the SAMSI Beyond Bioinformatics Program. We are grateful for the support of Dr. Sujit Ghosh at SAMSI. This research was partially supported by the InCHIP Faculty Affiliate Seed Grant at UConn (to YZ), the Faculty Research Excellence Program Award at UConn (to YZ), the CICATS PreK Career Development Award at UConn (to YZ), and the Research Starter Grant in Informatics from the PhRMA Foundation (to ZO). VB was partially supported by NIH Grants R01 CA160736, R01 CA194391, and P30 CA016672 and by NSF Grant DMS 1463233. This research was also partially supported by National Institutes of Health (NIH) Grants R01 GM59507, P01 CA154295, and P30 CA016359 (to HZ).

Author information

Corresponding author

Correspondence to Yuping Zhang.

Appendix

See Figs. 7 and 8 and Tables 2, 3 and 4.

Fig. 7 Simulation power by method, mean scenario, and gene set. Power is calculated using the B–H FDR controlling procedure [1] with a q value of 0.05. Unbalanced sample sizes with \(n_\mathrm{c}=50,\,n_\mathrm{t}=500\)

Fig. 8 Simulation power by method, mean scenario, and gene set. Power is calculated using the B–H FDR controlling procedure [1] with a q value of 0.05. Unbalanced sample sizes with \(n_\mathrm{c}=10,\, n_\mathrm{t}=500\)

Table 2 Simulation power by method, mean scenario, and gene set (expected power indicates the proportion of the gene set that is differentially expressed under the mean scenario)
Table 3 Simulation power by method, mean scenario, and gene set (expected power indicates the proportion of the gene set that is differentially expressed under the mean scenario)
Table 4 Simulation power by method, mean scenario, and gene set (expected power indicates the proportion of the gene set that is differentially expressed under the mean scenario)
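
The figure captions above report power after applying the B–H FDR controlling procedure [1] at a q value of 0.05. As a rough illustration only, the sketch below shows one way such a number could be computed: apply the B–H step-up rule across gene sets within each simulated replicate, then take the fraction of replicates in which each gene set is rejected. This definition of empirical power, the function names, and the toy p-values are all assumptions made for illustration; they are not taken from the article and may differ from the authors' simulation design.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the hypothesis is rejected."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k (1-based) with p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


def empirical_power(replicate_p_values, q=0.05):
    """Fraction of replicates in which each gene set is rejected.

    Assumption (not from the article): rows are simulation replicates,
    columns are gene sets, and B-H is applied across gene sets per replicate.
    """
    n_reps = len(replicate_p_values)
    n_sets = len(replicate_p_values[0])
    counts = [0] * n_sets
    for p_values in replicate_p_values:
        for j, is_rejected in enumerate(benjamini_hochberg(p_values, q)):
            counts[j] += is_rejected
    return [c / n_reps for c in counts]


# Hypothetical p-values: 4 replicates (rows) x 3 gene sets (columns).
toy_p_values = [
    [0.001, 0.020, 0.60],
    [0.004, 0.300, 0.45],
    [0.030, 0.010, 0.90],
    [0.200, 0.500, 0.70],
]
power = empirical_power(toy_p_values)
print(power)
```

Because the B–H rule is a step-up procedure, a p-value that misses its own rank threshold can still be rejected when a larger p-value meets a later threshold, which is why the loop keeps the largest qualifying rank instead of stopping at the first failure.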

Cite this article

Zhang, Y., Linder, M.H., Shojaie, A. et al. Dissecting Pathway Disturbances Using Network Topology and Multi-platform Genomics Data. Stat Biosci 10, 86–106 (2018). https://doi.org/10.1007/s12561-017-9193-0

Keywords

  • Data integration
  • Multi-platform genomics
  • Network topology
  • Pathway analysis