Abstract
The goal of systems biology is to gain a more complete understanding of biological systems by viewing all of their components and the interactions between them simultaneously. Until recently, the most complete global view of a biological system came from gene expression or protein-protein interaction data. With the increasing number of high-throughput technologies for measuring genomic, proteomic, and metabolomic data, scientists now have the opportunity to create complex network-based models for drug discovery, protein function annotation, and many other problems. Each technology used to measure a biological system inherently presents a limited view of the system. However, the combination of multiple technologies can provide a more complete picture. Much recent work has studied how to integrate these heterogeneous data types into single networks. Here we provide a survey of integrative network-based approaches to problems in systems biology. We focus on describing the variety of algorithms used in integrative network inference. Ultimately, the survey of current approaches leads us to the conclusion that there is an urgent need for a standard set of evaluation metrics and data sets in this field.
Abbreviations
- PPI: Protein-protein interaction
- GO: Gene ontology
- TF: Transcription factor
- TFBS: Transcription factor binding site
- eQTL: Expression quantitative trait locus
Copyright information
© 2013 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Rider, A.K., Chawla, N.V., Emrich, S.J. (2013). A Survey of Current Integrative Network Algorithms for Systems Biology. In: Prokop, A., Csukás, B. (eds) Systems Biology. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-6803-1_17
DOI: https://doi.org/10.1007/978-94-007-6803-1_17
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-007-6802-4
Online ISBN: 978-94-007-6803-1
eBook Packages: Biomedical and Life Sciences (R0)