Kernel-Based Identification of Regulatory Modules

  • Protocol
Part of the book series: Methods in Molecular Biology (MIMB, volume 674)

Abstract

Identifying cis-regulatory modules (CRMs) is an important milestone toward the ultimate goal of understanding transcriptional regulation in eukaryotic cells. One established approach uses motif-finding algorithms that identify overrepresented motifs in regulatory sequences. These methods succeed in finding single, well-conserved motifs but fail to identify combinations of degenerate binding sites, such as those often found in CRMs. We have developed a method that combines the strengths of existing motif finders with the discriminative power of a machine learning technique to model the regulation of genes (Schultheiss et al. (2009) Bioinformatics 25, 2126–2133). Our software is called kirmes, which stands for kernel-based identification of regulatory modules in eukaryotic sequences. Starting from a set of genes thought to be co-regulated, kirmes can identify the key CRMs responsible for this behavior and can determine whether any gene outside that set is regulated by the same mechanism. Such gene sets can be derived from microarrays or from chromatin immunoprecipitation experiments combined with either next-generation sequencing or promoter/whole-genome microarrays. Because it builds on an established machine learning method, the approach is fast and robust to noise. Easily understood visualizations make the results interpretable and provide a starting point for further analysis. Even for complex regulatory relationships, kirmes can be a helpful tool in directing the design of biological experiments.
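
The workflow described in the abstract (train on a set of co-regulated genes, then score genes outside that set) can be illustrated with a toy sketch. This is not the kirmes implementation or its API: it assumes a plain k-mer spectrum kernel in place of the motif-based kernels of the published method, and a kernel perceptron as a simple stand-in for the support vector machine covered in the cited literature. All function names and the example sequences are hypothetical and chosen only for illustration.

```python
from collections import Counter


def kmer_counts(seq, k=4):
    """Count all overlapping k-mers in a DNA sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(a, b, k=4):
    """Inner product of the k-mer count vectors of two sequences."""
    counts_a, counts_b = kmer_counts(a, k), kmer_counts(b, k)
    return sum(n * counts_b[kmer] for kmer, n in counts_a.items())


def train_kernel_perceptron(seqs, labels, k=4, epochs=20):
    """Return dual coefficients (one per training sequence).

    Labels are +1 (in the co-regulated set) or -1 (background).
    A perceptron is used here only as a simple stand-in for an SVM.
    """
    n = len(seqs)
    gram = [[spectrum_kernel(seqs[i], seqs[j], k) for j in range(n)]
            for i in range(n)]
    alpha = [0] * n
    for _ in range(epochs):
        mistakes = 0
        for i in range(n):
            score = sum(alpha[j] * labels[j] * gram[j][i] for j in range(n))
            if labels[i] * score <= 0:  # misclassified or undecided
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:  # every training sequence is classified correctly
            break
    return alpha


def decision_value(seq, seqs, labels, alpha, k=4):
    """Signed score for a new sequence; positive means 'same mechanism'."""
    return sum(a * y * spectrum_kernel(s, seq, k)
               for a, y, s in zip(alpha, labels, seqs) if a)


# Made-up promoter fragments: the positives share the site TGACGTCA.
train_seqs = [
    "AAAATGACGTCAAAAA",  # positive
    "CCCCTGACGTCACCCC",  # positive
    "AAAAAAAAAAAAAAAA",  # negative
    "CCCCCCCCCCCCCCCC",  # negative
]
train_labels = [1, 1, -1, -1]

alpha = train_kernel_perceptron(train_seqs, train_labels)

# Score two genes that were not part of the training set.
with_site = decision_value("GGGGTGACGTCAGGGG", train_seqs, train_labels, alpha)
without_site = decision_value("AAAACCCCAAAACCCC", train_seqs, train_labels, alpha)
print("with shared site:", with_site)
print("without shared site:", without_site)
```

In this sketch, a sequence carrying the shared site in an unseen flanking context receives a positive score, while a sequence built only from background k-mers receives a negative one. A real analysis would need far more sequences, a held-out evaluation, and a kernel that accounts for motif position and degeneracy.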

References

  1. Boser, B., Guyon, I., and Vapnik, V. (1992) A training algorithm for optimal margin classifiers. ACM Press Proceedings COLT ’92, 144–152.

  2. Noble, W.S. (2006) What is a support vector machine? Nat Biotechnol 24, 1565–1567.

  3. Lawrence, C.E., Altschul, S.F., Boguski, M.S. et al. (1993) Detecting subtle sequence signals: a Gibbs sampling strategy for multiple alignment. Science 262, 208–214.

  4. Gupta, M., and Liu, J. (2005) De novo cis-regulatory module elicitation for eukaryotic genomes. Proc Natl Acad Sci USA 102, 7079–7084.

  5. Howard, M.L., and Davidson, E.H. (2004) cis-Regulatory control circuits in development. Dev Biol 271, 109–118.

  6. Blanchette, M., Bataille, A.R., Chen, X. et al. (2006) Genome-wide computational prediction of transcriptional regulatory modules reveals new insights into human gene expression. Genome Res 16, 656–668.

  7. Thijs, G., Lescot, M., Marchal, K. et al. (2001) A higher order background model improves the detection of regulatory elements by Gibbs sampling. Bioinformatics 17, 1113–1122.

  8. Gordân, R., Narlikar, L., and Hartemink, A. (2008) A fast, alignment-free, conservation-based method for transcription factor binding site discovery. LNCS RECOMB Springer, Heidelberg 4955, 98–111.

  9. Schultheiss, S.J., Busch, W., Lohmann, J.U. et al. (2009) KIRMES: kernel-based identification of regulatory modules in euchromatic sequences. Bioinformatics 25, 2126–2133.

  10. Das, P.M., Ramachandran, K., van Wert, J., and Singal, R. (2004) Chromatin immunoprecipitation assay. Biotechniques 37, 961–969.

  11. Buck, M.J., and Lieb, J.D. (2004) ChIP-chip: considerations for the design, analysis, and application of genome-wide chromatin immunoprecipitation experiments. Genomics 83, 349–360.

  12. Barski, A., and Zhao, K. (2009) Genomic location analysis with ChIP-seq. J Cell Biochem 107, 11–18.

  13. Sonnenburg, S., Rätsch, G., Schäfer, C., and Schölkopf, B. (2006) Large-scale multiple kernel learning. J Mach Learn Res 7, 1531–1565.

  14. Giardine, B., Riemer, C., Hardison, R.C. et al. (2005) Galaxy: a platform for interactive large-scale genome analysis. Genome Res 15, 1451–1455.

  15. Davis, J., and Goadrich, M. (2006) The relationship between precision-recall and ROC curves. Proceedings ICML 23, 233–240.

  16. Schneider, T.D., and Stephens, R.M. (1990) Sequence logos: a new way to display consensus sequences. Nucleic Acids Res 18, 6097–6100.

  17. Sonnenburg, S., Zien, A., Philips, P., and Rätsch, G. (2008) POIMs: positional oligomer importance matrices – understanding support vector machine-based signal detectors. Bioinformatics 24, i6–i14.

  18. Rätsch, G., Sonnenburg, S., and Schölkopf, B. (2005) RASE: recognition of alternatively spliced exons in C. elegans. Bioinformatics 21(Suppl. 1), i369–i377.

  19. Hughes, T.R., Marton, M.J., Jones, A.R. et al. (2000) Functional discovery via a compendium of expression profiles. Cell 102, 109–126.

  20. Smith, B., Fang, H., Pan, Y. et al. (2007) Evolution of motif variants and positional bias of the cyclic-AMP response element. BMC Evol Biol 7(Suppl. 1), S15.

Author information

Corresponding author

Correspondence to Sebastian J. Schultheiss.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Schultheiss, S.J. (2010). Kernel-Based Identification of Regulatory Modules. In: Ladunga, I. (ed.) Computational Biology of Transcription Factor Binding. Methods in Molecular Biology, vol 674. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-854-6_13

  • DOI: https://doi.org/10.1007/978-1-60761-854-6_13

  • Publisher Name: Humana Press, Totowa, NJ

  • Print ISBN: 978-1-60761-853-9

  • Online ISBN: 978-1-60761-854-6

  • eBook Packages: Springer Protocols
