Using ChIPMotifs for De Novo Motif Discovery of OCT4 and ZNF263 Based on ChIP-Based High-Throughput Experiments

  • Brian A. Kennedy
  • Xun Lan
  • Tim H.-M. Huang
  • Peggy J. Farnham
  • Victor X. Jin
Part of the Methods in Molecular Biology book series (MIMB, volume 802)


DNA motifs are short sequences, typically 6 to 25 bp long, that can be highly variable and degenerate. One major approach for predicting transcription factor (TF) binding uses a position weight matrix (PWM) to represent the information content of regulatory sites; however, when used as the sole means of identifying binding sites, a PWM suffers from the limited amount of training data available and a high rate of false-positive predictions. The ChIPMotifs program is a de novo motif-finding tool developed for ChIP-based high-throughput data, and W-ChIPMotifs is a Web application for ChIPMotifs. It combines several ab initio motif discovery tools, such as MEME, MaMF, and Weeder, and optimizes the significance of the detected motifs by using bootstrap re-sampling error estimation and a Fisher test. Using these techniques, we determined a PWM for OCT4 that is similar to the canonical OCT4 consensus sequence. In a separate study, we also used de novo motif discovery to suggest that ZNF263 binds to a 24-nt site that differs in several positions from the motif predicted by the zinc finger code.
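To make the PWM idea concrete, the following is a minimal sketch, not the ChIPMotifs implementation. It builds a log-odds PWM from a handful of aligned sites, reports per-position information content, and scans a sequence for its best-scoring window. The function names, the pseudocount of 1, the uniform background of 0.25, and the toy octamer-like sites are all illustrative assumptions; real input would come from ChIP-enriched regions.

```python
import math

BASES = "ACGT"

def count_matrix(sites):
    """Per-position base counts from equal-length, aligned sites (A/C/G/T only)."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    counts = [{b: 0 for b in BASES} for _ in range(width)]
    for site in sites:
        for i, base in enumerate(site.upper()):
            counts[i][base] += 1
    return counts

def log_odds_pwm(counts, pseudocount=1.0, background=0.25):
    """Log2-odds weights; pseudocount and uniform background are assumptions."""
    pwm = []
    for col in counts:
        total = sum(col.values()) + pseudocount * len(BASES)
        pwm.append({b: math.log2((col[b] + pseudocount) / total / background)
                    for b in BASES})
    return pwm

def information_content(counts):
    """Bits per position relative to a uniform background (2 bits maximum)."""
    ic = []
    for col in counts:
        total = sum(col.values())
        entropy = -sum((n / total) * math.log2(n / total)
                       for n in col.values() if n > 0)
        ic.append(2.0 - entropy)
    return ic

def best_hit(pwm, sequence):
    """Return (start, score) of the highest-scoring window on the given strand."""
    width = len(pwm)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width].upper()
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if best is None or score > best[1]:
            best = (start, score)
    return best

# Hypothetical toy sites for illustration only, not data from the chapter.
toy_sites = ["ATGCAAAT", "ATGCAAAT", "ATGCTAAT", "ATGCAAAG"]
toy_counts = count_matrix(toy_sites)
toy_pwm = log_odds_pwm(toy_counts)
toy_ic = information_content(toy_counts)
toy_start, toy_score = best_hit(toy_pwm, "GGCCATGCAAATCCGG")
print(toy_start, round(toy_score, 2), [round(x, 2) for x in toy_ic])
```

A scan like this scores every window, which is why a PWM used alone tends to produce many false positives; the bootstrap re-sampling and Fisher test described above are the steps meant to filter candidate motifs by significance.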

Key words

Motif, ChIP, Position weight matrix, OCT4, ZNF263



Copyright information

© Springer Science+Business Media, LLC 2012

Authors and Affiliations

  • Brian A. Kennedy (1)
  • Xun Lan (1)
  • Tim H.-M. Huang (2)
  • Peggy J. Farnham (3)
  • Victor X. Jin (1)
  1. Department of Biomedical Informatics, The Ohio State University, Columbus, USA
  2. Department of Molecular Virology, Immunology & Medical Genetics, The Ohio State University, Columbus, USA
  3. Department of Biochemistry & Molecular Biology, Norris Comprehensive Cancer Center, University of Southern California, Los Angeles, USA
