Computational Challenges in Deciphering Genomic Structures of Bacteria

  • Survey
  • Journal of Computer Science and Technology

Abstract

This article examines how the functions of a bacterium's cellular machinery may have constrained the arrangement of genes in its genome during evolution, and how such questions can be studied computationally by taking full advantage of the rapidly growing pool of sequenced bacterial genomes. Such studies could lead to a much improved understanding of why a bacterial genome is organized the way it is. The article discusses a number of challenging computational problems in elucidating genomic structures at multiple levels, and the information encoded in those structures, with the ultimate goal of understanding the rules that govern bacterial genome organization.

Author information

Corresponding author

Correspondence to Ying Xu.

Additional information

The work is supported in part by the NSF of USA (Grant Nos. DBI-0354771, ITR-IIS-0407204, DBI-0542119, CCF0621700), NIH of USA (Grant Nos. 1R01GM075331 and 1R01GM081682) and the grant for the BioEnergy Science Center.

About this article

Cite this article

Xu, Y. Computational Challenges in Deciphering Genomic Structures of Bacteria. J. Comput. Sci. Technol. 25, 53–70 (2010). https://doi.org/10.1007/s11390-010-9305-5

  • DOI: https://doi.org/10.1007/s11390-010-9305-5
