Identification of Transcription Factor Binding Sites in Promoter Regions by Modularity Analysis of the Motif Co-occurrence Graph

  • Alexandre P. Francisco
  • Arlindo L. Oliveira
  • Ana T. Freitas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4983)


Many algorithms have been proposed for the problem of finding biologically significant motifs in promoter regions. They fall into two large families: combinatorial methods and probabilistic methods. Probabilistic methods have been used more extensively because their output is easier to interpret. Combinatorial methods have the potential to identify hard-to-detect motifs, but their output is much harder to interpret, since it may consist of hundreds or thousands of motifs. In this work, we propose a method that processes the output of combinatorial motif finders to find groups of motifs that represent variations of the same motif, thus reducing the output to a manageable size. This processing is done by building a graph that represents the co-occurrences of motifs and then finding communities in this graph. We show that this approach leads to a method that is as easy to use as a probabilistic motif finder and as sensitive to low-quorum motifs as a combinatorial motif finder. The method was integrated with two combinatorial motif finders and made available on the Web.
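The grouping step described in the abstract can be illustrated with a small sketch. The abstract does not say how co-occurrence is measured or which community-finding algorithm is used, so everything below is an assumption made for illustration: two motifs are linked when they are reported at the same (sequence, position) site, the edge weight is the number of shared sites, and communities are found by a naive greedy merge that maximises weighted modularity. The function names and the toy occurrence data are hypothetical.

```python
from itertools import combinations


def cooccurrence_graph(occurrences, min_shared=1):
    """Build weighted edges between motifs that share occurrence sites.

    occurrences: dict mapping motif -> set of (sequence_id, position).
    Assumption: co-occurrence means "reported at the same site".
    """
    edges = {}
    for a, b in combinations(sorted(occurrences), 2):
        shared = len(occurrences[a] & occurrences[b])
        if shared >= min_shared:
            edges[(a, b)] = shared
    return edges


def modularity(edges, communities):
    """Weighted modularity Q of a partition of the graph's nodes."""
    total = sum(edges.values())
    if total == 0:
        return 0.0
    label = {node: i for i, comm in enumerate(communities) for node in comm}
    inside = [0.0] * len(communities)   # edge weight inside each community
    degree = [0.0] * len(communities)   # summed node degrees per community
    for (a, b), w in edges.items():
        degree[label[a]] += w
        degree[label[b]] += w
        if label[a] == label[b]:
            inside[label[a]] += w
    return sum(
        inside[i] / total - (degree[i] / (2 * total)) ** 2
        for i in range(len(communities))
    )


def greedy_communities(nodes, edges):
    """Merge the pair of communities that most improves Q until none does.

    Naive illustration only: each round re-evaluates Q for every pair.
    """
    communities = [frozenset([n]) for n in nodes]
    best = modularity(edges, communities)
    while len(communities) > 1:
        candidate, candidate_q = None, best
        for x, y in combinations(communities, 2):
            merged = [c for c in communities if c not in (x, y)] + [x | y]
            q = modularity(edges, merged)
            if q > candidate_q:
                candidate, candidate_q = merged, q
        if candidate is None:
            break
        communities, best = candidate, candidate_q
    return communities


# Hypothetical output of a combinatorial motif finder (mismatches allowed),
# given as motif -> sites where it was reported.
toy_occurrences = {
    "TTACTAA": {("g1", 10), ("g2", 40), ("g3", 7)},
    "TTAGTAA": {("g1", 10), ("g2", 40)},
    "TGACTAA": {("g2", 40), ("g3", 7)},
    "GCACCC": {("g1", 90), ("g4", 15)},
    "GGCACC": {("g1", 90), ("g4", 15)},
}

toy_edges = cooccurrence_graph(toy_occurrences)
toy_groups = greedy_communities(sorted(toy_occurrences), toy_edges)
for group in toy_groups:
    print(sorted(group))
```

On this toy input the three TTACTAA-like variants end up in one group and the two GCACCC-like variants in another, which is the kind of reduction the abstract describes: many reported motifs collapse into a few groups of variants. The pairwise search is only practical for small graphs; a real pipeline would need a faster modularity algorithm.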


Keywords (added by machine, not by the authors): Transcription Factor Binding Site; Combinatorial Method; Complex Motif; Sparse Graph; Relation Graph





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Alexandre P. Francisco (1)
  • Arlindo L. Oliveira (1)
  • Ana T. Freitas (1)
  1. INESC-ID/IST, Technical University of Lisbon, Portugal
