Predicting Transcription Factor Binding Sites Using Structural Knowledge

  • Conference paper
Research in Computational Molecular Biology (RECOMB 2005)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3500)

Abstract

Current approaches for identification and detection of transcription factor binding sites rely on an extensive set of known target genes. Here we describe a novel structure-based approach applicable to transcription factors with no prior binding data. Our approach combines sequence data and structural information to infer context-specific amino acid-nucleotide recognition preferences. These are used to predict binding sites for novel transcription factors from the same structural family. We apply our approach to the Cys2His2 Zinc Finger protein family, and show that the learned DNA-recognition preferences are compatible with various experimental results. To demonstrate the potential of our algorithm, we use the learned preferences to predict binding site models for novel proteins from the same family. These models are then used in genomic scans to find putative binding sites of the novel proteins.
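
The final step of the abstract, scanning a genome with a predicted binding site model, can be illustrated with a minimal sketch. It assumes the model is a position weight matrix scored by log-odds against a uniform background; the matrix values, the threshold, and the names `PWM`, `log_odds`, and `scan` are hypothetical and chosen only for illustration, not taken from the paper.

```python
import math

# Hypothetical 3-position binding site model (illustrative numbers only).
# Each column gives P(nucleotide) at that position of the site.
PWM = [
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
    {"A": 0.10, "C": 0.70, "G": 0.10, "T": 0.10},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
]
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # assumed uniform


def log_odds(window, pwm=PWM, background=BACKGROUND):
    """Sum of per-position log2 ratios of model vs. background probability."""
    return sum(math.log2(col[base] / background[base])
               for col, base in zip(pwm, window))


def scan(sequence, pwm=PWM, threshold=3.0):
    """Return (offset, window, score) for each window scoring >= threshold."""
    width = len(pwm)
    hits = []
    for i in range(len(sequence) - width + 1):
        window = sequence[i:i + width]
        if any(base not in BACKGROUND for base in window):
            continue  # skip windows with ambiguous bases such as N
        score = log_odds(window, pwm)
        if score >= threshold:
            hits.append((i, window, score))
    return hits


hits = scan("TTGCGAAGCGT")
```

This sketch scans the forward strand only and uses an arbitrary score cutoff; a practical scan would also score the reverse complement and would choose the threshold from a significance estimate rather than a fixed value.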

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kaplan, T., Friedman, N., Margalit, H. (2005). Predicting Transcription Factor Binding Sites Using Structural Knowledge. In: Miyano, S., Mesirov, J., Kasif, S., Istrail, S., Pevzner, P.A., Waterman, M. (eds) Research in Computational Molecular Biology. RECOMB 2005. Lecture Notes in Computer Science, vol 3500. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11415770_40

  • DOI: https://doi.org/10.1007/11415770_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-25866-7

  • Online ISBN: 978-3-540-31950-4

  • eBook Packages: Computer Science, Computer Science (R0)
