Identification of cis-Regulatory Elements in Gene Co-expression Networks Using A-GLAM

  • Leonardo Mariño-Ramírez
  • Kannan Tharakaraman
  • Olivier Bodenreider
  • John Spouge
  • David Landsman
Part of the Methods in Molecular Biology book series (MIMB, volume 541)


Reliable identification and assignment of cis-regulatory elements in promoter regions is a challenging problem in biology. The sophistication of transcriptional regulation in higher eukaryotes, particularly in metazoans, could be an important factor contributing to their organismal complexity. Here we present an integrated approach in which networks of co-expressed genes are combined with gene ontology–derived functional networks to discover clusters of genes that share both similar expression patterns and functions. Regulatory elements are then identified in the promoter regions of these gene clusters using a Gibbs sampling algorithm implemented in the A-GLAM software package. Applying this approach to the cell-cycle co-expression network of the yeast Saccharomyces cerevisiae, we show that it correctly identifies cis-regulatory elements present in clusters of co-expressed genes.
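
The integration step summarized above can be pictured as an intersection of two gene networks: an edge is kept only if the two genes are both co-expressed and functionally related, and the surviving edges define candidate clusters. The following is a minimal sketch of that idea, not the chapter's implementation. The use of Pearson correlation, the 0.9 threshold, the treatment of connected components as clusters, and all gene names and values are assumptions made for illustration. The subsequent motif search with A-GLAM is not shown.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def coexpression_edges(profiles, threshold):
    """Edges between genes whose profiles correlate at or above threshold."""
    return {
        frozenset((g1, g2))
        for g1, g2 in combinations(sorted(profiles), 2)
        if pearson(profiles[g1], profiles[g2]) >= threshold
    }


def clusters(edges):
    """Connected components of an undirected edge set, as sorted gene lists."""
    neighbours = {}
    for edge in edges:
        a, b = tuple(edge)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, components = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            gene = stack.pop()
            if gene in component:
                continue
            component.add(gene)
            stack.extend(neighbours[gene] - component)
        seen |= component
        components.append(sorted(component))
    return components


# Toy data, invented for illustration (not taken from the chapter).
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [1.1, 2.1, 2.9, 4.2],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [0.9, 2.2, 3.1, 3.9],
}
# Hypothetical functional network, e.g. pairs with similar annotations.
functional_edges = {
    frozenset(("geneA", "geneB")),
    frozenset(("geneC", "geneD")),
}

# Keep only edges supported by both expression and function.
shared = coexpression_edges(profiles, threshold=0.9) & functional_edges
print(clusters(shared))  # prints [['geneA', 'geneB']]
```

In this toy case geneD is co-expressed with geneA and geneB but has no functional link to them, and geneC and geneD are functionally linked but not co-expressed, so only the geneA/geneB pair survives. The promoter regions of such a cluster would be the input to motif discovery.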

Key words

Promoter sequences, transcription factor–binding sites, co-expression networks, gene ontology, Gibbs sampling



Acknowledgments

The authors would like to thank King Jordan for important suggestions and helpful discussions and Alex Brick for his assistance in obtaining intergenic regions during his internship at NCBI. This research was supported by the Intramural Research Program of the NIH, NLM, NCBI.



Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • Leonardo Mariño-Ramírez (1)
  • Kannan Tharakaraman (1)
  • Olivier Bodenreider (2)
  • John Spouge (1)
  • David Landsman (1)
  1. Computational Biology Branch, National Center for Biotechnology Information, National Institutes of Health, Bethesda, USA
  2. National Library of Medicine, National Institutes of Health, Bethesda, USA
