Bioinformatics Tools and Databases for Genomics Research

  • Chapter
Marker-Assisted Plant Breeding: Principles and Practices

Abstract

Bioinformatics involves the development of statistical tools and techniques and computer software for the acquisition, storage, analysis, and visualization of biological information. The European Molecular Biology Laboratory (EMBL), the National Center for Biotechnology Information (NCBI), and the DNA Data Bank of Japan (DDBJ) have served researchers around the globe for decades, and the databases and tools hosted by these institutes continue to grow rapidly. Analytical tools such as BLAST and CLUSTAL have been the workhorses for sequence data search and analysis, and these programs have been maintained since the 1990s. In addition, many other tools, such as AutoSNP, SNP2CAPS, TASSEL, and STRUCTURE, are useful for sequence data analysis and for deriving biologically meaningful conclusions from these analyses. Databases such as GenBank, Phytozome, the EMBL Nucleotide Sequence Database, Swiss-Prot, and the UniProt Knowledgebase store huge amounts of nucleotide and protein sequence information that is readily accessible to the public. The Kyoto Encyclopedia of Genes and Genomes (KEGG), in turn, aims to support the understanding of higher-order biological functions by integrating gene, protein, and metabolic pathway information. This chapter describes various bioinformatics tools and databases relevant to plant breeding and discusses their key features and applications.

© 2015 Author(s)

About this chapter

Cite this chapter

Singh, B.D., Singh, A.K. (2015). Bioinformatics Tools and Databases for Genomics Research. In: Marker-Assisted Plant Breeding: Principles and Practices. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2316-0_14
