Network-Based Inference of Cancer Progression from Microarray Data

  • Yongjin Park
  • Stanley Shackney
  • Russell Schwartz
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4983)


Cancer cells exhibit a common phenotype of uncontrolled cell growth, but this phenotype may arise from many different combinations of mutations. By inferring how cells evolve in individual tumors, a process called cancer progression, we may be able to identify important mutational events for different tumor types, potentially leading to new therapeutics and diagnostics. Prior work has shown that it is possible to infer frequent progression pathways by using gene expression profiles to estimate “distances” between tumors. Individual mutations can, however, result in large shifts in expression levels, making it difficult to accurately identify evolutionary distance from differences in expression. Here, we apply gene network models in order to improve our ability to estimate evolutionary distances from expression data by controlling for correlations among co-regulated genes. We test two variants of this approach, one using full regulatory networks inferred from a candidate gene set and the other using simplified modular networks inferred from clusters of similarly expressed genes. Application to a set of E2F-responsive genes from a lung cancer microarray data set shows a small improvement in phylogenies when correcting from the full network but a more substantial improvement when correcting from the modular network. These results suggest that a network correction approach can lead to better identification of tumor similarity, but that sophisticated network models are needed to control for the large hypothesis space and sparse data currently available.
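The distance-based step described above can be illustrated with a minimal sketch. It rests on assumptions not stated in the abstract: Euclidean distance between expression profiles, and a minimum spanning tree (Prim's algorithm) as a simple stand-in for the phylogeny. The function names and the toy data are hypothetical, and the network correction itself is not modeled here.

```python
import math


def pairwise_distances(profiles):
    """Euclidean distance between every pair of expression profiles (assumed metric)."""
    n = len(profiles)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(profiles[i], profiles[j])
            dist[i][j] = dist[j][i] = d
    return dist


def minimum_spanning_tree(dist):
    """Prim's algorithm from node 0; returns (parent, child, weight) edges in the order added."""
    n = len(dist)
    # best[v] = (cheapest known edge weight into the tree, tree endpoint of that edge)
    best = {v: (dist[0][v], 0) for v in range(1, n)}
    edges = []
    while best:
        child = min(best, key=lambda v: best[v][0])
        weight, parent = best.pop(child)
        edges.append((parent, child, weight))
        # Relax the remaining nodes against the newly added one.
        for v in best:
            if dist[child][v] < best[v][0]:
                best[v] = (dist[child][v], child)
    return edges


# Toy data (hypothetical): four tumors, three genes each.
profiles = [
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [1.0, 1.0, 0.0],
    [5.0, 5.0, 5.0],
]
tree = minimum_spanning_tree(pairwise_distances(profiles))
for parent, child, weight in tree:
    print(f"tumor {parent} -> tumor {child}: distance {weight:.3f}")
```

In a sketch like this, a single mutation that shifts many co-regulated genes at once would inflate the Euclidean distance, which is the problem the network correction in the abstract is meant to address.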


Keywords: Bayesian Network, Minimum Span Tree, Network Inference, Full Network, Modular Network
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yongjin Park (1)
  • Stanley Shackney (2)
  • Russell Schwartz (1)
  1. Department of Biological Sciences, Carnegie Mellon University, Pittsburgh
  2. Departments of Human Oncology and Human Genetics, Drexel University School of Medicine, Pittsburgh
