Abstract
Transcription factors (TFs) play an important role in gene regulation. Computational identification and annotation of TFs at genome scale is the first step toward understanding the mechanisms of gene expression and regulation. We began constructing a database of Arabidopsis TFs in 2005 and developed a pipeline for the systematic identification of plant TFs from genomic and transcript sequences. In the following years, we built a database of plant TFs (PlantTFDB, http://planttfdb.cbi.pku.edu.cn), which contains putative TFs identified from 22 species, including five model organisms and 17 economically important plants with available EST sequences. To provide comprehensive information on the putative TFs, we annotated them extensively at both the family and gene levels. A brief introduction and key references are presented for each family. Functional domain information and cross-references to various well-known public databases are available for each identified TF. In addition, we predicted putative orthologs of the TFs in other species. PlantTFDB has a simple interface that allows users to make text queries or BLAST searches and to download TF sequences for local analysis. We hope that PlantTFDB will provide the user community with a useful resource for studying the function and evolution of transcription factors.
Notes
1. The PlantTFDB was updated to version 2.0 in July 2010, with predicted TFs from more species and a new interface.
Copyright information
© 2010 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
He, K. et al. (2010). Computational Identification of Plant Transcription Factors and the Construction of the PlantTFDB Database. In: Ladunga, I. (eds) Computational Biology of Transcription Factor Binding. Methods in Molecular Biology, vol 674. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-854-6_21
DOI: https://doi.org/10.1007/978-1-60761-854-6_21
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-60761-853-9
Online ISBN: 978-1-60761-854-6
eBook Packages: Springer Protocols