Relative Avidity, Specificity, and Sensitivity of Transcription Factor–DNA Binding in Genome-Scale Experiments

Protein Networks and Pathway Analysis

Part of the book series: Methods in Molecular Biology ((MIMB,volume 563))

Abstract

One of the most crucial problems in genome-wide experimental analysis is how to extract meaningful biological phenomena from the resulting large data sets. Here, we present modeling and prediction techniques applied to the genome-wide identification of in vivo protein–DNA binding sites from ChIP-based data sets. We develop a simple probabilistic mixture model of the occurrence of non-specific and specific transcription factor (TF)–DNA binding events at any site in the genome. We calculate the statistical significance of specific and non-specific random binding events using Kolmogorov–Waring and exponential functions, respectively. Binding events in chromosome regions associated with non-specific, non-random binding loci are also identified and filtered out. The mixture model fits equally well the data for five different TFs (ERE, CREB, STAT1, Nanog, Oct4) produced by the ChIP-PET, SACO, and ChIP-Seq methods included in this study. We present a uniform methodology for estimating the specificity, the total number of binding sites, and the sensitivity of data sets detected by these ChIP-based genome-wide experimental systems. We demonstrate strong heterogeneity of specific TF–DNA binding sites in terms of their avidity, and a correlation between the observed relative binding avidity of a specific TF–DNA binding site and the level of mRNA transcription of the nearest target gene. Finally, we conclude that the sensitivity problem has not been resolved by current ChIP-based methods, including ChIP-Seq.
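The mixture idea described above can be illustrated with a minimal sketch. It is not the chapter's implementation: the abstract does not give the Kolmogorov–Waring function, so a truncated power law stands in for the heavy-tailed specific component, and a geometric (discrete exponential) distribution stands in for non-specific binding. All names and parameter values (`w_noise`, `q`, `alpha`, `k_max`) are hypothetical and chosen only for illustration.

```python
def geometric_pmf(k, q):
    """Discrete exponential (geometric) probability of cluster size k = 1, 2, ...
    Stand-in for the non-specific (noise) component."""
    return (1.0 - q) * q ** (k - 1)


def truncated_power_law_pmf(k, alpha, k_max):
    """Power-law probability of cluster size k on 1..k_max.
    ASSUMPTION: a stand-in for the heavy-tailed specific component,
    not the Kolmogorov-Waring function named in the abstract."""
    norm = sum(j ** -alpha for j in range(1, k_max + 1))
    return k ** -alpha / norm


def specificity_at_threshold(t, w_noise, q, alpha, k_max):
    """Fraction of clusters of size >= t attributed to the specific component
    under the two-component mixture with noise weight w_noise."""
    sizes = range(t, k_max + 1)
    noise = w_noise * sum(geometric_pmf(k, q) for k in sizes)
    specific = (1.0 - w_noise) * sum(
        truncated_power_law_pmf(k, alpha, k_max) for k in sizes
    )
    return specific / (noise + specific)


if __name__ == "__main__":
    # Hypothetical parameters: 90% of clusters are noise, noise decays fast.
    for t in (1, 2, 3, 5, 8):
        s = specificity_at_threshold(t, w_noise=0.9, q=0.3, alpha=2.0, k_max=200)
        print(f"threshold >= {t}: estimated specificity {s:.3f}")
```

With these illustrative parameters, raising the cluster-size threshold increases the estimated specificity, because the exponential noise tail decays faster than the heavy-tailed specific component. The trade-off is that specific sites with low avidity fall below the threshold, which is the sensitivity problem the abstract refers to.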



Acknowledgments

I thank Chia Lin Wei, Chiu Kow Ping, and Ruan Yujin for providing access to the ChIP-Seq data sets of T2G DB and for very useful discussions of their ChIP-PET method. I also thank Piroon Jenjaroenpoon, Yuri Orlov, and Onkar Singh for partial but important computational support of the analytical part of this work. I express my special acknowledgment to Yuri Nikolsky for his stimulating interest in this study. This work was supported by BII/A-Star, Singapore.

Copyright information

© 2009 Humana Press, a part of Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Kuznetsov, V.A. (2009). Relative Avidity, Specificity, and Sensitivity of Transcription Factor–DNA Binding in Genome-Scale Experiments. In: Nikolsky, Y., Bryant, J. (eds) Protein Networks and Pathway Analysis. Methods in Molecular Biology, vol 563. Humana Press. https://doi.org/10.1007/978-1-60761-175-2_2

  • DOI: https://doi.org/10.1007/978-1-60761-175-2_2

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-174-5

  • Online ISBN: 978-1-60761-175-2

  • eBook Packages: Springer Protocols
