Inference of Regulatory Networks from Microarray Data with R and the Bioconductor Package qpgraph

  • Robert Castelo
  • Alberto Roverato
Part of the Methods in Molecular Biology book series (MIMB, volume 802)


Regulatory networks inferred from microarray data sets provide an estimated blueprint of the functional interactions taking place under the assayed experimental conditions. In each of these experiments, the gene expression pathway exerts a finely tuned control simultaneously over all genes relevant to the cellular state. This renders most pairs of those genes significantly correlated, and therefore the challenge faced by every method that aims to infer a molecular regulatory network from microarray data lies in distinguishing direct from indirect interactions. A straightforward solution to this problem would be to move directly from bivariate to multivariate statistical approaches. However, the daunting dimension of typical microarray data sets, in which the number of genes p is several orders of magnitude larger than the number of samples n, precludes the application of standard multivariate techniques and confronts the biologist with the sophisticated procedures that have been developed to address this situation. We have introduced a new, intuitive way to approach this problem, based on limited-order partial correlations. In this chapter we illustrate the method with the R package qpgraph, which forms part of the Bioconductor project and is available at its Web site (1).
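The distinction between direct and indirect interactions can be made concrete with a small simulation. The sketch below is only an illustration in plain Python and does not use or reproduce the qpgraph API: the three simulated "genes" g1, g2, and g3, the chain structure g1 → g2 → g3, the noise level, and the helper names pearson and partial_corr_first_order are all assumptions introduced for this example. It shows that g1 and g3 are strongly correlated marginally, while their first-order partial correlation given g2 is close to zero.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr_first_order(a, b, c):
    """First-order partial correlation of a and b given the single variable c."""
    r_ab = pearson(a, b)
    r_ac = pearson(a, c)
    r_bc = pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


# Hypothetical toy data: a regulatory chain g1 -> g2 -> g3 (assumed for illustration).
rng = random.Random(1)
n = 2000
g1 = [rng.gauss(0, 1) for _ in range(n)]
g2 = [x + 0.5 * rng.gauss(0, 1) for x in g1]  # g2 depends directly on g1
g3 = [y + 0.5 * rng.gauss(0, 1) for y in g2]  # g3 depends on g1 only through g2

marginal = pearson(g1, g3)                      # large: indirect association
partial = partial_corr_first_order(g1, g3, g2)  # near zero: no direct interaction

print(f"marginal correlation g1-g3:     {marginal:.3f}")
print(f"partial correlation g1-g3 | g2: {partial:.3f}")
```

In this toy setting a single conditioning variable is enough to separate the indirect pair; conditioning on small subsets in this way is the kind of limited-order computation that remains feasible when p is much larger than n.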

Key words

Molecular regulatory network, Microarray data, Reverse engineering, Network inference, Non-rejection rate, qpgraph



This work is supported by the Spanish Ministerio de Ciencia e Innovación (MICINN) [TIN2008-00556/TIN] and the ISCIII COMBIOMED Network [RD07/0067/0001]. R.C. is a research fellow of the “Ramon y Cajal” program from the Spanish MICINN [RYC-2006-000932]. A.R. acknowledges support from the Ministero dell’Università e della Ricerca [PRIN-2007AYHZWC].


  1.
  2. Butte AJ, Tamayo P, Slonim D et al (2000) Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks. Proc Natl Acad Sci U S A 97:12182–12186.
  3. Basso K, Margolin AA, Stolovitzky G et al (2005) Reverse engineering of regulatory networks in human B cells. Nat Genet 37:382–390.
  4. Faith JJ, Hayete B, Thaden JT et al (2007) Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol 5:e8.
  5. Edwards D (2000) Introduction to graphical modelling. Springer, New York.
  6. Dykstra RL (1970) Establishing positive definiteness of sample covariance matrix. Ann Math Statist 41:2153–2154.
  7. Barabasi A-L, Oltvai ZN (2004) Network biology: understanding the cell’s functional organization. Nat Rev Genet 5:101–113.
  8. Dobra A, Hans C, Jones B et al (2004) Sparse graphical models for exploring gene expression data. J Multivariate Anal 90:196–212.
  9. Friedman J, Hastie T, Tibshirani R (2008) Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9:432–441.
  10. Yuan M, Lin Y (2007) Model selection and estimation in the Gaussian graphical model. Biometrika 94:19–35.
  11. Schäfer J, Strimmer K (2005) A shrinkage approach to large-scale covariance matrix estimation and implications for functional genomics. Stat Appl Genet Mol Biol 4:1–32.
  12. de la Fuente A, Bing N, Hoeschele I et al (2004) Discovery of meaningful associations in genomic data using partial correlation coefficients. Bioinformatics 20:3565–3574.
  13. Wille A, Bühlmann P (2006) Low-order conditional independence graphs for inferring genetic networks. Stat Appl Genet Mol Biol 5:1.
  14. Castelo R, Roverato A (2006) A robust procedure for Gaussian graphical model search from microarray data with p larger than n. J Mach Learn Res 7:2621–2650.
  15. Castelo R, Roverato A (2009) Reverse engineering molecular regulatory networks from microarray data with qp-graphs. J Comput Biol 16:213–227.
  16.
  17. Falcon S, Gentleman R (2007) Using GOstats to test gene lists for GO term association. Bioinformatics 23:257–258.
  18. Covert MW, Knight EM, Reed JL et al (2004) Integrating high-throughput and computational data elucidates bacterial networks. Nature 429:92–96.
  19. Gama-Castro S, Jimenez-Jacinto V, Peralta-Gil M et al (2008) RegulonDB (version 6.0): gene regulation model of Escherichia coli K-12 beyond transcription, active (experimental) annotated promoters and Textpresso navigation. Nucleic Acids Res 36:D120–124.
  20.
  21.
  22. Gentleman RC, Carey VJ, Bates DM et al (2004) Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5:R80.
  23. Schmidberger M, Morgan M, Eddelbuettel D et al (2009) State-of-the-art in parallel computing with R. J Stat Softw 31:i01.
  24. Wasserman WW, Sandelin A (2004) Applied bioinformatics for the identification of regulatory elements. Nat Rev Genet 5:276–287.
  25. Fawcett T (2006) An introduction to ROC analysis. Pattern Recogn Lett 27:861–874.
  26. Cho B-K, Knight EM, Palsson BO (2006) Transcriptional regulation of the fad regulon genes of Escherichia coli by arcA. Microbiology 152:2207–2219.
  27.
  28.
  29.

Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  1. Research Program on Biomedical Informatics, Department of Experimental and Health Sciences, Universitat Pompeu Fabra, and Institut Municipal d’Investigació Mèdica, Barcelona, Spain
  2. Department of Statistical Science, Università di Bologna, Bologna, Italy
