PFP: A Computational Framework for Phylogenetic Footprinting in Prokaryotic Genomes
Phylogenetic footprinting is a widely used approach for predicting transcription factor binding sites (TFBSs) by identifying conserved motifs in the upstream sequences of orthologous genes in eukaryotic genomes. However, this popular strategy may not be directly applicable to prokaryotic genomes, where typically about half of the genes form multi-gene transcription units, or operons. The promoter sequences for these operons are located in inter-operonic rather than inter-genic regions, which requires predicting TFBSs at the level of the transcription unit rather than the individual gene. We formulate the identification of conserved operons (both single-gene and multi-gene operons) whose individual gene members are orthologous between two genomes as a bipartite graph matching problem, and present a graph-theoretic solution. By applying this method to Escherichia coli K12 and 11 of its phylogenetically neighboring species, we have predicted 2,478 sets of conserved operons and discovered potential binding motifs for each of these operons. Comparing the predictions of our approach with those of other prediction approaches, we conclude that our approach is advantageous for predicting cis-regulatory binding sites in prokaryotes. The prediction software package PFP is available at http://csbl.bmb.uga.edu/~dongsheng/PFP.
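To illustrate the bipartite formulation mentioned above, the following is a minimal sketch and not the PFP implementation. It assumes operons are given as lists of gene identifiers and orthology as a simple gene-to-genes mapping. The function names (`build_operon_graph`, `maximum_matching`), the toy identifiers, and the rule that one shared orthologous gene pair creates an edge are all hypothetical choices made for this example. The sketch computes an unweighted maximum matching with augmenting paths; the abstract does not specify which matching criterion the authors use.

```python
from collections import defaultdict


def build_operon_graph(ref_operons, tgt_operons, orthologs):
    """Link a reference operon to a target operon when at least one
    pair of their member genes is orthologous (an assumed edge rule)."""
    gene_to_tgt_operon = {
        gene: op for op, genes in tgt_operons.items() for gene in genes
    }
    edges = defaultdict(set)
    for ref_op, genes in ref_operons.items():
        for gene in genes:
            for ortholog in orthologs.get(gene, ()):
                if ortholog in gene_to_tgt_operon:
                    edges[ref_op].add(gene_to_tgt_operon[ortholog])
    return edges


def maximum_matching(ref_ids, edges):
    """Unweighted maximum bipartite matching via augmenting paths."""
    match_of_tgt = {}  # target operon -> reference operon currently matched

    def try_assign(ref_op, visited):
        for tgt_op in sorted(edges.get(ref_op, ())):
            if tgt_op in visited:
                continue
            visited.add(tgt_op)
            # Take a free target operon, or re-route its current partner.
            if tgt_op not in match_of_tgt or try_assign(match_of_tgt[tgt_op], visited):
                match_of_tgt[tgt_op] = ref_op
                return True
        return False

    for ref_op in ref_ids:
        try_assign(ref_op, set())
    return {ref_op: tgt_op for tgt_op, ref_op in match_of_tgt.items()}


# Toy input: all identifiers below are invented for illustration.
ref_operons = {"R1": ["a1", "a2"], "R2": ["b1"], "R3": ["c1"]}
tgt_operons = {"T1": ["x1"], "T2": ["y1", "y2"]}
orthologs = {"a1": ["x1"], "a2": ["y1"], "b1": ["x1"]}

edges = build_operon_graph(ref_operons, tgt_operons, orthologs)
matching = maximum_matching(list(ref_operons), edges)
print(matching)
```

In this toy input, R1 is first paired with T1 and then re-routed to T2 so that R2 can take T1; R3 has no orthologous partner and stays unmatched. A realistic pipeline would likely weight edges, for example by the number of shared orthologous genes, but that is a design choice beyond what the abstract states.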
Keywords: Reference Genome, Orthologous Gene, Motif Discovery, Target Genome, Prokaryotic Genome