Abstract
This chapter describes methods for learning gene interaction networks from high-throughput gene expression data sets. Many genes have unknown or poorly understood functions and interactions, especially in diseases such as cancer, where the genome is frequently mutated. The gene interactions inferred by learning a network model from the data can form the basis of hypotheses that can be verified by subsequent biological experiments. This chapter focuses specifically on Bayesian network models, which carry more mathematical detail than purely conceptual models but less than detailed differential equation models. From a network learning perspective, the most severe problem with microarray data is the limited sample size, since there are usually many plausible networks for modeling the system. Because these networks cannot be reliably distinguished using the number of samples found in current microarray data sets, we describe robust network learning strategies for reducing the number of false interactions detected. We first perform preliminary clustering using co-expression network analysis and gene shaving. We then construct Bayesian networks to obtain a global perspective of the relationships between the resulting gene clusters. Throughout the chapter, we illustrate these concepts with a running example based on a publicly available breast cancer data set.
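The small-sample problem described above can be illustrated with a minimal sketch. It is not the chapter's procedure: it assumes a soft-thresholded correlation adjacency (absolute correlation raised to a power) as a stand-in for co-expression network analysis, and uses bootstrap resampling of the samples to check how often each candidate interaction is re-detected. The function names, the exponent `beta=6`, the `cutoff=0.5`, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold_adjacency(expr, beta=6):
    """Return a genes x genes adjacency |cor|^beta from a samples x genes matrix.

    beta is an assumed soft-thresholding exponent, not a value from the chapter.
    """
    corr = np.corrcoef(expr, rowvar=False)  # columns (genes) are the variables
    adj = np.abs(corr) ** beta
    np.fill_diagonal(adj, 0.0)  # ignore self-interactions
    return adj


def bootstrap_edge_frequency(expr, beta=6, cutoff=0.5, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each gene pair exceeds the cutoff."""
    rng = np.random.default_rng(seed)
    n_samples, n_genes = expr.shape
    counts = np.zeros((n_genes, n_genes))
    for _ in range(n_boot):
        # Resample the arrays (rows) with replacement.
        idx = rng.integers(0, n_samples, size=n_samples)
        counts += soft_threshold_adjacency(expr[idx], beta) > cutoff
    return counts / n_boot


# Synthetic example: genes 0-2 share a latent signal, genes 3-5 are pure noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(30, 1))
module = latent + 0.2 * rng.normal(size=(30, 3))
noise = rng.normal(size=(30, 3))
expr = np.hstack([module, noise])

freq = bootstrap_edge_frequency(expr)
print(np.round(freq, 2))
```

Interactions that reappear in most resamples are more plausible than those that appear only occasionally, which is one simple way to reduce false detections when samples are scarce. A Bayesian network analysis would replace the pairwise adjacency with a structure learning step, which this sketch does not attempt.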
Acknowledgments
Kim-Anh Do was partially funded by the National Institutes of Health via the University of Texas SPORE in Breast Cancer (CA-116199) and the Cancer Center Support Grant (CA016672).
Copyright information
© 2010 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Broom, B.M., Rinsurongkawong, W., Pusztai, L., Do, KA. (2010). Building Networks with Microarray Data. In: Bang, H., Zhou, X., van Epps, H., Mazumdar, M. (eds) Statistical Methods in Molecular Biology. Methods in Molecular Biology, vol 620. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-580-4_10
DOI: https://doi.org/10.1007/978-1-60761-580-4_10
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-60761-578-1
Online ISBN: 978-1-60761-580-4
eBook Packages: Springer Protocols