Learning Global Models of Transcriptional Regulatory Networks from Data

  • Aviv Madar
  • Richard Bonneau
Part of the Methods in Molecular Biology book series (MIMB, volume 541)


Organisms must continually adapt to changing cellular and environmental factors (e.g., oxygen levels) by altering their gene expression patterns. At the same time, all organisms must maintain stable gene expression patterns that are robust to small fluctuations in environmental factors and to genetic variation. Learning and characterizing the structure and dynamics of regulatory networks (RNs) on a whole-genome scale is a key problem in systems biology. Here, we review the challenges associated with inferring RNs in a solely data-driven manner and concisely discuss the implications and contingencies of the procedures that can be used, focusing on one such procedure, the Inferelator. Importantly, the Inferelator explicitly models the temporal component of regulation, can learn the interactions between transcription factors and environmental factors, and attaches a statistically meaningful weight to every edge. The output of the Inferelator is a dynamical model of the RN that can be used to model the time evolution of cell state.

Key words

Network inference, biclustering, network reconstruction, microarray, data-integration, cMonkey, archaea, the Inferelator



Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Aviv Madar
    • 1
  • Richard Bonneau
    • 2
  1. Center for Comparative Functional Genomics, New York University, New York, USA
  2. Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, New York, USA
