On the Concept of Cis-regulatory Information: From Sequence Motifs to Logic Functions

  • Ryan Tarpine
  • Sorin Istrail
Part of the Natural Computing Series book series (NCS)


The regulatory genome is about the “system level organization of the core genomic regulatory apparatus, and how this is the locus of causality underlying the twin phenomena of animal development and animal evolution” (E.H. Davidson. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution, Academic Press, 2006). Information processing in the regulatory genome is carried out through regulatory states, defined as sets of transcription factors (sequence-specific DNA-binding proteins that determine gene expression) that are expressed and active at the same time. The core information-processing machinery consists of modular DNA sequence elements, called cis-modules, that interact with transcription factors. The cis-modules “read” the information contained in the regulatory state of the cell through transcription factor binding, “process” it, and directly or indirectly communicate with the basal transcription apparatus to determine gene expression. Endowing each gene with this information-receiving capacity through its cis-regulatory modules is essential for the gene to respond to every regulatory state to which it might be exposed during all phases of the life cycle and in all cell types. We present here a set of challenges addressed by our CYRENE research project, which aims to study the cis-regulatory code of the regulatory genome. The CYRENE Project is devoted to (1) the construction of a database, the cis-Lexicon, containing comprehensive information across species about experimentally validated cis-regulatory modules; and (2) the development of a next-generation genome browser, the cis-Browser, specialized for the regulatory genome. The presentation is organized around three main computational challenges: the Gene Naming Problem, the Consensus Sequence Bottleneck Problem, and the Logic Function Inference Problem.
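
The "read, process, communicate" picture described above can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the CYRENE data model: a regulatory state is taken to be a set of transcription factor names, a cis-module is taken to be a Boolean function of that set, and the names `make_module`, `gene_expressed`, `TF_A`, `TF_B`, `TF_C`, and `TF_R` are all hypothetical. The AND-NOT rule and the "any module on" rule are toy choices, not inferred logic functions.

```python
from typing import Callable, FrozenSet, List

# A regulatory state: the set of transcription factors that are expressed
# and active at the same time (names here are hypothetical placeholders).
RegulatoryState = FrozenSet[str]

# Assumption: a cis-module is modeled as a Boolean function of the state.
CisModule = Callable[[RegulatoryState], bool]


def make_module(activators: FrozenSet[str], repressors: FrozenSet[str]) -> CisModule:
    """Toy AND-NOT logic: on when every activator is present and no repressor is."""
    def module(state: RegulatoryState) -> bool:
        return activators <= state and not (repressors & state)
    return module


def gene_expressed(modules: List[CisModule], state: RegulatoryState) -> bool:
    """Toy rule: the gene is expressed if at least one of its cis-modules is on."""
    return any(m(state) for m in modules)


# Two hypothetical cis-modules attached to one gene.
module_1 = make_module(frozenset({"TF_A", "TF_B"}), frozenset({"TF_R"}))
module_2 = make_module(frozenset({"TF_C"}), frozenset())

state_on = frozenset({"TF_A", "TF_B"})
state_repressed = frozenset({"TF_A", "TF_B", "TF_R"})

print(gene_expressed([module_1, module_2], state_on))         # True
print(gene_expressed([module_1, module_2], state_repressed))  # False
```

In this toy form, the same gene gives different outputs in different regulatory states, which is the sense in which its cis-modules "process" the state they read.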


Transcription Factor Binding Site; Gene Regulatory Network; Position Weight Matrix; Gene Naming Problem; Determine Gene Expression
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  1. Department of Computer Science and Center for Computational Molecular Biology, Brown University, Providence, USA