Comparative Genomics Approaches to Identifying Functionally Related Genes

  • Conference paper
Algorithms for Computational Biology (AlCoB 2014)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 8542)

Abstract

Rapid progress in genome sequencing makes it possible to address fundamental problems of biology and to gain critical insights into the functioning of living cells and entire organisms. However, the widening gap between the rapidly accumulating sequence data and our ability to annotate these data properly is a major problem that slows the progress of genome biology. This paper discusses the notion of “function” as it relates to computational biology, lists the most common ways of assigning function to new genes, particularly those that rely specifically on comparative genome analysis, and briefly reviews the drawbacks of current algorithms for semi-automated high-throughput functional annotation of genomes.

The article is a work of the United States Government; Title 17 U.S.C. 105 provides that copyright protection is not available for any work of the United States Government in the United States.

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Galperin, M.Y., Koonin, E.V. (2014). Comparative Genomics Approaches to Identifying Functionally Related Genes. In: Dediu, AH., Martín-Vide, C., Truthe, B. (eds) Algorithms for Computational Biology. AlCoB 2014. Lecture Notes in Computer Science, vol 8542. Springer, Cham. https://doi.org/10.1007/978-3-319-07953-0_1

  • DOI: https://doi.org/10.1007/978-3-319-07953-0_1

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-07952-3

  • Online ISBN: 978-3-319-07953-0

  • eBook Packages: Computer Science, Computer Science (R0)
