Structure-Based Ab Initio Prediction of Transcription Factor–Binding Sites

  • L. Angela Liu
  • Joel S. Bader
Part of the Methods in Molecular Biology book series (MIMB, volume 541)


We present an all-atom molecular modeling method that predicts the binding specificity of a transcription factor from its 3D structure alone. Molecular dynamics and free energy calculations give the relative binding free energies of the transcription factor for multiple candidate DNA sequences. These relative free energies are then used to construct a position weight matrix that represents the transcription factor–binding sites. Free energy differences are calculated by morphing one base pair into another using a multi-copy representation in which multiple base pairs are superimposed at a single DNA position. Water-mediated hydrogen bonds between transcription factor side chains and DNA bases are known to contribute to binding specificity for certain transcription factors. To account for this effect, the simulation protocol includes explicit water solvent and counter-ions. For computational efficiency, we use a standard additive approximation in which each DNA base pair contributes independently to the total binding free energy. The additive approximation is not strictly necessary, and more detailed computations could be used to investigate non-additive effects.
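As a rough illustration of how per-position relative binding free energies could be turned into a position weight matrix under the additive approximation, the sketch below applies Boltzmann weighting at each DNA position. The Boltzmann weighting, the temperature, the function names (`pwm_from_ddg`, `sequence_ddg`), and the numerical values are all assumptions made for this example; the abstract does not specify the exact conversion used in the chapter.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol*K)
BASES = "ACGT"   # column order assumed for every row of the input table


def pwm_from_ddg(ddg, temperature=298.15):
    """Convert relative binding free energies into base probabilities.

    ddg: list of rows, one per DNA position, each holding four relative
         binding free energies (kcal/mol) in the order A, C, G, T.
    Returns a list of rows of probabilities; each row sums to 1.
    Assumption: probabilities are proportional to exp(-ddG / kT).
    """
    kT = K_B * temperature
    pwm = []
    for row in ddg:
        lowest = min(row)  # shift by the row minimum for numerical stability
        weights = [math.exp(-(g - lowest) / kT) for g in row]
        total = sum(weights)
        pwm.append([w / total for w in weights])
    return pwm


def sequence_ddg(ddg, seq):
    """Additive approximation: sum the per-position contributions."""
    if len(seq) != len(ddg):
        raise ValueError("sequence length must match the number of positions")
    return sum(ddg[i][BASES.index(base)] for i, base in enumerate(seq))


# Hypothetical two-position example (values are illustrative, not results).
example_ddg = [
    [0.0, 1.0, 2.0, 1.5],  # position 1 favors A
    [0.5, 0.0, 0.5, 3.0],  # position 2 favors C
]
example_pwm = pwm_from_ddg(example_ddg)
```

In this sketch the most favorable base at each position receives the largest probability, and `sequence_ddg(example_ddg, "AC")` gives the total relative free energy of the sequence AC as a plain sum over positions. Non-additive effects would require free energies computed for combinations of positions rather than a single table.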

Key words

Transcription factor–binding sites; molecular dynamics; free energy; position weight matrix (PWM); multi-copy; thermodynamic integration; protein–DNA binding



Acknowledgments

LAL acknowledges funding from the Department of Energy (DE-FG0204ER25626). JSB acknowledges funding from NSF CAREER 0546446, NIH/NCRR U54RR020839, and the Whitaker Foundation. We acknowledge a starter grant and an MRAC grant of computer time from the Pittsburgh Supercomputing Center, MCB060010P, MCB060033P, and MCB060056N.



Copyright information

© Humana Press, a part of Springer Science+Business Media, LLC 2009

Authors and Affiliations

  • L. Angela Liu (1)
  • Joel S. Bader (2)
  1. Department of Biomedical Engineering and Institute for Multiscale Modeling of Biological Interactions, Johns Hopkins University, Baltimore, USA
  2. Department of Biomedical Engineering and High-Throughput Biology Center, Johns Hopkins University, Baltimore, USA
