Using GenBank

  • Protocol
Plant Bioinformatics

Part of the book series: Methods in Molecular Biology™ ((MIMB,volume 406))

Summary

GenBank® is a comprehensive database of publicly available DNA sequences for more than 205,000 named organisms, including more than 60,000 within the Embryophyta. Sequences are obtained through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Daily data exchange with the European Molecular Biology Laboratory (EMBL) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. GenBank is accessible through Entrez, the National Center for Biotechnology Information (NCBI) retrieval system, which integrates data from the major DNA and protein sequence databases with taxonomy, genome, mapping, protein structure, and domain information, as well as the biomedical journal literature through PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available through FTP. This chapter discusses GenBank usage scenarios ranging from local analyses of the data available through FTP to online analyses supported by the NCBI Web-based tools. To access GenBank and its related retrieval and analysis services, go to the NCBI home page at http://www.ncbi.nlm.nih.gov.
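
As a rough illustration of the online end of that range, the sketch below builds a request for a single GenBank flat-file record through the NCBI E-utilities `efetch` service. The summary itself does not describe this interface, so the endpoint, the parameter choices (`db=nuccore`, `rettype=gb`, `retmode=text`), and the helper names `build_efetch_url` and `fetch_genbank_record` are assumptions made for illustration, not the chapter's protocol.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint: the NCBI E-utilities efetch service (not named in the summary).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accession, db="nuccore", rettype="gb"):
    """Return the efetch URL for one record in GenBank flat-file text format."""
    params = {
        "db": db,            # Entrez nucleotide database
        "id": accession,     # accession number or other record identifier
        "rettype": rettype,  # "gb" requests the GenBank flat-file view
        "retmode": "text",   # plain text rather than XML
    }
    return EFETCH_URL + "?" + urlencode(params)


def fetch_genbank_record(accession, timeout=30):
    """Download one record as text. Requires network access; may raise URLError."""
    with urlopen(build_efetch_url(accession), timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `fetch_genbank_record("U49845")` with any valid accession (the one shown is only an example) would return the record text, which could then be saved for the kind of local analysis the summary mentions. Bulk work is better served by the FTP releases than by many individual requests.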

Copyright information

© 2007 Humana Press Inc.

About this protocol

Cite this protocol

Wheeler, D. (2007). Using GenBank. In: Edwards, D. (ed.) Plant Bioinformatics. Methods in Molecular Biology™, vol 406. Humana Press. https://doi.org/10.1007/978-1-59745-535-0_2

  • DOI: https://doi.org/10.1007/978-1-59745-535-0_2

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-58829-653-5

  • Online ISBN: 978-1-59745-535-0

  • eBook Packages: Springer Protocols
