Building Networks with Microarray Data
This chapter describes methods for learning gene interaction networks from high-throughput gene expression data sets. Many genes have unknown or poorly understood functions and interactions, especially in diseases such as cancer, where the genome is frequently mutated. The gene interactions inferred by learning a network model from the data can form the basis of hypotheses that can then be tested in subsequent biological experiments. This chapter focuses specifically on Bayesian network models, which carry more mathematical detail than purely conceptual models but less than detailed differential equation models. From a network learning perspective, the most severe problem with microarray data is the limited sample size: many networks are usually plausible models of the system, and they cannot be reliably distinguished using the number of samples found in current microarray data sets. We therefore describe robust network learning strategies for reducing the number of false interactions detected. We first perform preliminary clustering using co-expression network analysis and gene shaving, and then construct Bayesian networks to obtain a global perspective of the relationships between the resulting gene clusters. Throughout the chapter, we illustrate these concepts with a running example based on a publicly available breast cancer data set.
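The summary above does not spell out the robust learning procedure, so the following is only a minimal sketch of the general bagging idea: relearn a network on bootstrap resamples of the samples and keep only the interactions that recur in most of them. Everything in it is an assumption made for illustration. The learner is a simple correlation-threshold stand-in rather than a Bayesian network structure learner, and the names `learn_edges`, `bagged_edge_frequencies`, `make_toy_data`, the threshold of 0.8, and the 0.9 stability cutoff are all hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def learn_edges(samples, threshold):
    """Stand-in learner (assumption): link gene pairs with |correlation| >= threshold."""
    n_genes = len(samples[0])
    columns = [[row[g] for row in samples] for g in range(n_genes)]
    return {
        (i, j)
        for i, j in combinations(range(n_genes), 2)
        if abs(pearson(columns[i], columns[j])) >= threshold
    }


def bagged_edge_frequencies(samples, n_boot=200, threshold=0.8, seed=0):
    """Fraction of bootstrap resamples in which each edge is learned."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_boot):
        # Resample the arrays (rows) with replacement, keeping the sample size.
        resample = [rng.choice(samples) for _ in samples]
        for edge in learn_edges(resample, threshold):
            counts[edge] = counts.get(edge, 0) + 1
    return {edge: c / n_boot for edge, c in counts.items()}


def make_toy_data(n_samples=30, seed=1):
    """Synthetic data: gene 1 tracks gene 0, gene 2 is independent noise."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        g0 = rng.gauss(0, 1)
        g1 = g0 + rng.gauss(0, 0.1)
        g2 = rng.gauss(0, 1)
        data.append((g0, g1, g2))
    return data


toy = make_toy_data()
freqs = bagged_edge_frequencies(toy)
# Keep only edges that recur in at least 90% of the resamples.
stable = {edge for edge, f in freqs.items() if f >= 0.9}
```

In this toy setting the edge between genes 0 and 1 should survive the cutoff, while chance associations involving gene 2 should not. With a real structure learner substituted for `learn_edges`, the same loop yields a per-edge confidence that can be thresholded to limit false interactions.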
Key words: Bayesian network, co-expression network, microarray, cancer, scale-free topology, gene modules, gene shaving, bagging, Bayesian bootstrap
Kim-Anh Do was partially funded by the National Institutes of Health via the University of Texas SPORE in Breast Cancer (CA-116199) and the Cancer Center Support Grant (CA016672).