Infect-DB—A Data Warehouse Approach for Integrating Genomic Data of Infectious Diseases

  • Shakuntala Baichoo
  • Zahra Mungloo-Dilmohamud
  • Parinita Ujoodha
  • Veeresh Ramphull
  • Yasmina Jaufeerally-Fakim
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 863)

Abstract

As the number of biological data sources available online grows, integration has become a major challenge for researchers wishing to explore this information. Users often need to combine data from multiple, heterogeneous sources in a single investigation. This paper presents the features of Infect-DB, a data warehouse that can localize and integrate genomes of pathogenic species, retrieved from NCBI, based on information from the American Biological Safety Association (ABSA). The list of bacteria and their corresponding host specificity was accessed programmatically from ABSA and integrated into Infect-DB. This list of organisms was then used to target the automated download of the corresponding genomes from the NCBI FTP site. Infect-DB provides a set of analysis tools, including genome comparison using local BLAST, dN/dS analysis, multiple sequence alignment, phylogenetic analysis and visualization tools. To date, Infect-DB has integrated 854 bacterial genomes from 207 genera considered important pathogens causing infectious diseases.
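The abstract does not say how the organism list is matched to genome files, so the following is only a minimal sketch of one way such a targeting step could work, not the authors' implementation. It assumes a simplified tab-separated summary table with the columns `organism_name`, `assembly_level` and `ftp_path`, loosely modeled on NCBI's assembly summary files (which contain many more columns), and it assumes the genomic FASTA file is named after the last component of the assembly directory. The function name `select_genome_urls`, the sample rows and the `example.org` paths are all hypothetical placeholders.

```python
import csv
import io


def select_genome_urls(pathogen_names, summary_tsv):
    """Map each matching organism to the URLs of its genomic FASTA files.

    pathogen_names: species names (genus + species), e.g. taken from a
                    curated pathogen list (hypothetical input).
    summary_tsv:    text of a tab-separated table with a header row and the
                    columns organism_name, assembly_level, ftp_path
                    (a simplified stand-in for an assembly summary file).
    """
    wanted = {name.strip().lower() for name in pathogen_names}
    urls = {}
    reader = csv.DictReader(io.StringIO(summary_tsv), delimiter="\t")
    for row in reader:
        # Compare on the first two words (genus and species) only,
        # so strain designations do not prevent a match.
        species = " ".join(row["organism_name"].split()[:2]).lower()
        if species not in wanted:
            continue
        # Keep only finished assemblies; this filter is an assumption.
        if row["assembly_level"] != "Complete Genome":
            continue
        # Assumed layout: <dir>/<dir basename>_genomic.fna.gz
        base = row["ftp_path"].rstrip("/").rsplit("/", 1)[-1]
        url = f"{row['ftp_path'].rstrip('/')}/{base}_genomic.fna.gz"
        urls.setdefault(row["organism_name"], []).append(url)
    return urls


# Illustrative rows only; names and paths are placeholders, not real records.
SAMPLE_SUMMARY = "\n".join(
    "\t".join(fields)
    for fields in [
        ("organism_name", "assembly_level", "ftp_path"),
        ("Escherichia coli str. EXAMPLE-1", "Complete Genome",
         "https://example.org/genomes/all/EX_000001.1_demo"),
        ("Escherichia coli str. EXAMPLE-2", "Contig",
         "https://example.org/genomes/all/EX_000002.1_demo"),
        ("Bacillus subtilis str. EXAMPLE-3", "Complete Genome",
         "https://example.org/genomes/all/EX_000003.1_demo"),
    ]
)

if __name__ == "__main__":
    pathogens = ["Escherichia coli", "Salmonella enterica"]
    for organism, links in select_genome_urls(pathogens, SAMPLE_SUMMARY).items():
        print(organism, links)
```

Matching on genus and species rather than the full strain name is a design choice made here for illustration; a production pipeline would more likely match on taxonomy identifiers to avoid problems with synonyms and renamed species.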

Keywords

  • Data warehouse
  • Genome comparison
  • Infectious diseases
  • Analysis tools
  • Bacterial genomes

Copyright information

© Springer Nature Singapore Pte Ltd. 2019

Authors and Affiliations

  • Shakuntala Baichoo (1)
  • Zahra Mungloo-Dilmohamud (1)
  • Parinita Ujoodha (1)
  • Veeresh Ramphull (1)
  • Yasmina Jaufeerally-Fakim (1)

  1. University of Mauritius, Reduit, Moka, Mauritius
