An Extension of Deep Pathway Analysis: A Pathway Route Analysis Framework Incorporating Multi-dimensional Cancer Genomics Data

  • Yue Zhao
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10847)


Recent breakthroughs in cancer research have come through the emerging field of pathway analysis. By applying statistical methods to previously known gene and protein regulatory information, pathway analysis provides a meaningful way to interpret genomic data. In this paper we propose a systematic methodological framework for studying biological pathways, one that cross-analyzes mutation, transcriptome, and proteomics data. Each pathway route is encoded as a Bayesian network initialized with a sequence of conditional probabilities specifically designed to encode the directionality of the regulatory relationships defined by the pathway. Proteomic regulation, such as phosphorylation, is modeled by a dynamically generated Bayesian network that links the relevant type of proteomics data to the regulated target. The entire pipeline is automated in R. The effectiveness of our model is demonstrated through its ability to distinguish real pathways from decoy pathways on TCGA mRNA-seq, mutation, copy number variation, and phosphorylation data from both breast cancer and ovarian cancer studies.
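
To make the idea of direction-encoding conditional probabilities concrete, the following is a minimal sketch and not the author's implementation. The paper's pipeline is written in R; Python is used here only for illustration. The binary active/inactive states, the value 0.9, the gene names, and the functions `edge_probability` and `route_likelihood` are all assumptions introduced for this example. It scores one chain-shaped route as a small Bayesian network and compares it with a decoy route over the same genes.

```python
from typing import Dict, List, Tuple

# Hypothetical parameter: probability that a target's state agrees with
# what its regulator's state and the edge type would predict.
P_CONSISTENT = 0.9

# (regulator, target, kind) where kind is "activation" or "inhibition".
Edge = Tuple[str, str, str]


def edge_probability(kind: str, parent_active: bool, child_active: bool) -> float:
    """Return P(child state | parent state) for one directed regulatory edge."""
    if kind == "activation":
        expected = parent_active        # active regulator -> active target
    elif kind == "inhibition":
        expected = not parent_active    # active regulator -> inactive target
    else:
        raise ValueError(f"unknown edge kind: {kind}")
    return P_CONSISTENT if child_active == expected else 1.0 - P_CONSISTENT


def route_likelihood(route: List[Edge], states: Dict[str, bool],
                     root_prior: float = 0.5) -> float:
    """Joint probability of the observed states under a chain-shaped network.

    The first regulator in the route is the root and gets a flat prior;
    every later node is conditioned on its regulator.
    """
    if not route:
        raise ValueError("route must contain at least one edge")
    root = route[0][0]
    likelihood = root_prior if states[root] else 1.0 - root_prior
    for regulator, target, kind in route:
        likelihood *= edge_probability(kind, states[regulator], states[target])
    return likelihood


# Hypothetical observation: geneA and geneB active, geneC inactive.
observed = {"geneA": True, "geneB": True, "geneC": False}

# A route whose edge directions and types agree with the observation ...
real_route = [("geneA", "geneB", "activation"), ("geneB", "geneC", "inhibition")]
# ... and a decoy route that rewires the same genes.
decoy_route = [("geneA", "geneC", "activation"), ("geneC", "geneB", "inhibition")]

real_score = route_likelihood(real_route, observed)
decoy_score = route_likelihood(decoy_route, observed)
```

Under these assumed numbers the route consistent with the observation receives the higher likelihood, which mirrors, in a very simplified form, the real-versus-decoy comparison used to evaluate the model.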


Keywords: Pathway analysis, Bayesian network, Data integration



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. Computer Science and Engineering Department, University of Connecticut, Storrs, USA
