Application of Regulatory Sequence Analysis and Metabolic Network Analysis to the Interpretation of Gene Expression Data

  • Jacques van Helden
  • David Gilbert
  • Lorenz Wernisch
  • Michael Schroeder
  • Shoshana Wodak
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2066)


We present two complementary approaches for the interpretation of clusters of co-regulated genes, such as those obtained from DNA chips and related methods. Starting from a cluster of genes with similar expression profiles, two basic questions can be asked:
  1. Which mechanism is responsible for the coordinated transcriptional response of the genes? This question is approached by extracting motifs that are shared between the upstream sequences of these genes. The extracted motifs are putative cis-acting regulatory elements.

  2. What is the physiological significance, for the cell, of expressing these genes together? One way to answer this question is to search for potential metabolic pathways that could be catalyzed by the products of the genes. This can be done by selecting the genes from the cluster that code for enzymes and trying to assemble the reactions they catalyze into metabolic pathways.
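
The first question, finding motifs shared by the upstream sequences of a cluster, can be illustrated with a small word-counting sketch. The abstract does not say which statistic the tools use, so the binomial test, the pseudocount, the function names and the toy sequences below are all assumptions made for illustration, not the authors' implementation. Overlapping words are treated as independent trials, which is a simplification.

```python
from collections import Counter
from math import comb


def count_kmers(sequences, k):
    """Count every overlapping k-mer across a list of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def binomial_tail(n, x, p):
    """Return P(X >= x) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(x, n + 1))


def overrepresented_kmers(cluster_seqs, background_seqs, k=6, alpha=0.01):
    """List k-mers seen more often in the cluster than the background predicts.

    Assumption: a plain binomial model with background frequencies
    estimated from a second sequence set (hypothetical choice).
    """
    cluster = count_kmers(cluster_seqs, k)
    background = count_kmers(background_seqs, k)
    n_cluster = sum(cluster.values())
    n_background = sum(background.values())
    hits = []
    for kmer, observed in cluster.items():
        # Pseudocount of 1 so k-mers absent from the background get p > 0.
        p = (background.get(kmer, 0) + 1) / (n_background + 4**k)
        pval = binomial_tail(n_cluster, observed, p)
        if pval < alpha:
            hits.append((kmer, observed, pval))
    # Most significant k-mers first.
    return sorted(hits, key=lambda h: h[2])


# Toy data (invented): three short "upstream" sequences sharing CACGTG.
cluster = ["AAACACGTGAAA", "TTTCACGTGTTT", "GGGCACGTGGGG"]
background = ["ACGT" * 25]
hits = overrepresented_kmers(cluster, background, k=6, alpha=1e-4)
```

With real data, the background would more plausibly be the full set of upstream regions of the genome, and the threshold would need correction for the number of words tested.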


We present tools to answer these two questions, and we illustrate their use with selected examples in the yeast Saccharomyces cerevisiae. The tools are available on the web.
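
The second question, assembling the reactions catalyzed by the enzymes of a cluster into candidate pathways, can be sketched as a small graph problem. The reaction table below is a toy, simplified view of the first steps of sulfate assimilation in yeast, with cofactors omitted; `GENE_X` and its compounds are hypothetical. The linking rule and the exhaustive search are assumptions chosen for brevity, not a description of the authors' tool.

```python
from collections import defaultdict

# gene -> (substrates, products); main compounds only, cofactors omitted.
reactions = {
    "MET3": ({"sulfate"}, {"APS"}),
    "MET14": ({"APS"}, {"PAPS"}),
    "MET16": ({"PAPS"}, {"sulfite"}),
    "MET10": ({"sulfite"}, {"sulfide"}),
    "GENE_X": ({"compound_x"}, {"compound_y"}),  # hypothetical, unrelated
}


def link_reactions(reactions):
    """Add a directed edge g1 -> g2 when a product of g1 is a substrate of g2."""
    edges = defaultdict(set)
    for g1, (_, products) in reactions.items():
        for g2, (substrates, _) in reactions.items():
            if g1 != g2 and products & substrates:
                edges[g1].add(g2)
    return edges


def longest_chain(edges, genes):
    """Find the longest simple path by depth-first search.

    Exhaustive, so only suitable for the small graphs a single
    cluster produces.
    """
    best = []

    def dfs(path):
        nonlocal best
        if len(path) > len(best):
            best = list(path)
        for nxt in sorted(edges.get(path[-1], ())):
            if nxt not in path:
                path.append(nxt)
                dfs(path)
                path.pop()

    for gene in sorted(genes):
        dfs([gene])
    return best


edges = link_reactions(reactions)
pathway = longest_chain(edges, reactions)
```

In practice, ubiquitous compounds such as ATP or water would have to be excluded before linking, otherwise nearly every pair of reactions would appear connected.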


Keywords (added by machine, not by the authors): Gene Expression Data; Significance Index; Methionine Biosynthesis; Sulfur Assimilation; Putative Regulatory Element





Copyright information

© Springer-Verlag Berlin Heidelberg 2001

Authors and Affiliations

  • Jacques van Helden (1, 2)
  • David Gilbert (2, 3)
  • Lorenz Wernisch (2)
  • Michael Schroeder (3)
  • Shoshana Wodak (1, 2)

  1. SCMBB, Université Libre de Bruxelles, Brussels, Belgium
  2. Genome Campus, European Bioinformatics Institute, Cambridge, UK
  3. Department of Computing, City University, London, UK
