KIRMES: kernel-based identification of regulatory modules in euchromatic sequences
Keywords: Transcription Factor Binding Site Motif Finding Module Kernel Sequence Logo Weighted Degree
We predict transcription factor (TF) target genes based on their regulatory sequence. A TF binding site is a short segment (~10 bp) near a gene's regulatory region that is recognized by the respective TF. Overrepresented motifs can be identified in the regulatory sequences of a set of genes that is enriched with targets of a specific TF. Gibbs-sampling methods that try to identify position weight matrices to characterize binding sites have been successful for small genomes, but are problematic in higher eukaryotes, where motifs are degenerate and form cis-regulatory modules.
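To make the notion of a position weight matrix (PWM) concrete, the following minimal Python sketch builds a PWM from a few aligned binding sites and scans a sequence for the best-scoring window. It only illustrates what a PWM is; it is neither a Gibbs sampler nor part of KIRMES. The function names `build_pwm` and `best_hit`, the pseudocount, the uniform background, and the example sites are all assumptions made for illustration.

```python
import math


def build_pwm(sites, pseudocount=1.0, alphabet="ACGT"):
    """Estimate per-position base probabilities from aligned, equal-length sites."""
    length = len(sites[0])
    pwm = []
    for i in range(length):
        # Start every base at the pseudocount so no probability is zero.
        counts = {base: pseudocount for base in alphabet}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in alphabet})
    return pwm


def best_hit(pwm, seq, background=0.25):
    """Return (log-odds score, start) of the best-scoring window in seq."""
    width = len(pwm)
    best = (float("-inf"), -1)
    for start in range(len(seq) - width + 1):
        # Log-odds of the window under the PWM vs. a uniform background (assumed).
        score = sum(
            math.log2(pwm[i][seq[start + i]] / background) for i in range(width)
        )
        if score > best[0]:
            best = (score, start)
    return best


# Hypothetical aligned sites and promoter fragment, for illustration only.
sites = ["TGACGT", "TGACGT", "TGACGA"]
pwm = build_pwm(sites)
score, start = best_hit(pwm, "AAATGACGTAAA")
```

With degenerate motifs, many windows score similarly under a single matrix of this kind, which is one way to see why matrix-based approaches become harder to apply in higher eukaryotes.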
We compared our method with a state-of-the-art Gibbs sampler, PRIORITY, on its own dataset, using the published settings and measuring successful classification. KIRMES achieves correct predictions on 74% of the sets vs. 63% for PRIORITY. We also let KIRMES classify gene sets obtained from microarrays of Arabidopsis thaliana. Using conservation as a weighting for the weighted degree kernel with shifts (WDS) improves performance. These results illustrate the power of our approach in exploiting the relationship between motifs as well as conservation to improve the recognition of TF targets. Interpretable results and an easy-to-use web service make this a valuable tool for any researcher interested in gene regulation.
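The WDS kernel mentioned above builds on the weighted degree (WD) kernel, which compares two equal-length sequences by counting matching substrings of length 1 to D at the same positions, with weights beta_d = 2(D - d + 1) / (D(D + 1)). The sketch below implements the plain WD kernel only; the shift extension is omitted. The optional `weights` argument is a hypothetical stand-in for a per-position conservation score and is not taken from the KIRMES implementation.

```python
def wd_kernel(x, y, max_degree=3, weights=None):
    """Weighted degree kernel between two equal-length sequences (no shifts)."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    n = len(x)
    if weights is None:
        weights = [1.0] * n  # no conservation information: every position counts fully
    total = 0.0
    for d in range(1, max_degree + 1):
        # Standard WD weighting: shorter substrings receive larger weights.
        beta = 2.0 * (max_degree - d + 1) / (max_degree * (max_degree + 1))
        for pos in range(n - d + 1):
            if x[pos:pos + d] == y[pos:pos + d]:
                # Assumption for illustration: scale each match by the mean
                # per-position weight over the matching substring.
                total += beta * sum(weights[pos:pos + d]) / d
    return total


# Hypothetical sequences, for illustration only.
same = wd_kernel("ACGT", "ACGT", max_degree=2)
different = wd_kernel("ACGT", "TGCA", max_degree=2)
```

A kernel of this form can be passed to a kernel-based classifier such as a support vector machine as a precomputed similarity matrix over the sequences of interest.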
- 2. Schultheiss SJ, Busch W, Lohmann JU, Kohlbacher O, Rätsch G: KIRMES: Kernel-based identification of regulatory modules in euchromatic sequences. Bioinformatics 2009. epub: 23 April 2009.
- 4. Gordan R, Narlikar L, Hartemink A: A fast, alignment-free, conservation-based method for transcription factor binding site discovery. In Lecture Notes in Computer Science: RECOMB 2008. Volume 4955. Springer, Heidelberg, Germany; 98–111.
This article is published under license to BioMed Central Ltd.