EpiFIT: functional interpretation of transcription factors based on combination of sequence and epigenetic information
Transcription factors are among the most important regulators of the transcriptional process. Nevertheless, the functional interpretation of transcription factors remains a major challenge because methods that relate regulatory regions to genes perform poorly. Epigenetic information, such as chromatin accessibility, contains genome-wide knowledge about transcriptional regulation and may therefore shed light on the functional interpretation of transcription factors.
We propose EpiFIT (Epigenetic based Functional Interpretation of Transcription factors), a tool to infer the functions of transcription factors from ChIP-seq data. Briefly, we adopt a variable distance rule to establish associations between regulatory regions and nearby genes. The associations are then filtered so that only regions and associated genes that are co-open remain. Finally, GO enrichment is applied to all remaining associated genes, and a ranked list of GO terms is provided as the functional interpretation.
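The three steps above can be illustrated with a minimal Python sketch. It is not the EpiFIT implementation. The function names, the per-gene window used to stand in for the variable distance rule, the openness threshold, and the hypergeometric test used for enrichment are all assumptions made for illustration, since the abstract does not specify them.

```python
from math import comb

# Hypothetical inputs (not from the paper):
#   peaks: {peak_id: (chrom, midpoint)}
#   genes: {gene_id: (chrom, tss, window)}, where `window` is a per-gene
#          distance standing in for the "variable distance rule"
def associate(peaks, genes):
    """Pair each peak with every gene whose TSS lies within that gene's window."""
    pairs = []
    for peak_id, (chrom, pos) in peaks.items():
        for gene_id, (g_chrom, tss, window) in genes.items():
            if chrom == g_chrom and abs(pos - tss) <= window:
                pairs.append((peak_id, gene_id))
    return pairs

def keep_co_open(pairs, peak_open, gene_open, threshold=1.0):
    """Keep pairs in which both the peak and the gene reach the openness
    threshold (an assumed definition of 'co-open')."""
    return [(p, g) for p, g in pairs
            if peak_open[p] >= threshold and gene_open[g] >= threshold]

def hypergeom_pvalue(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

def rank_go_terms(selected, go_annotations, background):
    """Rank GO terms by enrichment p-value, smallest first."""
    background = set(background)
    selected = set(selected) & background
    ranked = []
    for term, members in go_annotations.items():
        members = set(members) & background
        k = len(selected & members)
        if k:  # skip terms with no overlap
            p = hypergeom_pvalue(k, len(members), len(selected), len(background))
            ranked.append((term, p))
    return sorted(ranked, key=lambda item: item[1])
```

In this sketch, the genes that survive `keep_co_open` would be passed to `rank_go_terms` together with a GO annotation table and a background gene set. A real analysis would also need a multiple-testing correction, which is omitted here.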
We first examined the correlation in chromatin openness between regulatory regions and their associated genes. This correlation helps EpiFIT purify regulatory region-gene associations. By evaluating EpiFIT on a set of real data, we demonstrated that EpiFIT outperforms existing methods in precisely interpreting transcription factor functions. We further verified the effectiveness of openness information in interpretation and the ability of EpiFIT to build distal region-gene associations.
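One simple way to quantify such an openness correlation is a Pearson coefficient computed over matched samples. The sketch below assumes that each region and each gene has an openness score in the same ordered set of samples (for example, cell lines). Both this setup and the choice of Pearson correlation are assumptions for illustration, not details given in the abstract.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length openness profiles.

    Returns 0.0 when either profile is constant, because the coefficient
    is undefined in that case.
    """
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0
    return cov / (sd_x * sd_y)

# Hypothetical openness scores of one region and one gene in four samples.
region_profile = [0.2, 1.5, 3.1, 0.4]
gene_profile = [0.1, 1.2, 2.8, 0.6]
r = pearson(region_profile, gene_profile)
```

A coefficient near 1 would indicate that the region and the gene tend to be open in the same samples.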
EpiFIT is a powerful tool for interpreting transcription factor functions. We believe EpiFIT will facilitate the functional interpretation of other regulatory elements, and thus open a new door to understanding regulatory mechanisms.
The application is freely accessible at http://bioinfo.au.tsinghua.edu.cn/openness/EpiFIT/.
Keywords: transcription factor; functional interpretation; epigenetic information
This work has been supported by the National Key Research and Development Program of China (No. 2018YFC0910404), the National Natural Science Foundation of China (Nos. 61873141, 61721003, 61573207, 71871019 and 71471016), and the Tsinghua-Fuzhou Institute for Data Technology.
Compliance with Ethics Guidelines
The authors Shaoming Song, Hongfei Cui, Shengquan Chen, Qiao Liu and Rui Jiang declare that they have no conflict of interests.
This article does not contain any studies with human or animal subjects performed by any of the authors.