Abstract
Bioinformatics involves the development of statistical tools, techniques, and computer software for the acquisition, storage, analysis, and visualization of biological information. The European Molecular Biology Laboratory (EMBL), the National Center for Biotechnology Information (NCBI), and the DNA Data Bank of Japan (DDBJ) have served researchers around the globe for decades, and the databases and tools hosted by these institutes continue to grow rapidly. Analytical tools such as BLAST and CLUSTAL have been the workhorses of sequence data search and analysis, and these programs have been maintained since the 1990s. In addition, many other tools, such as AutoSNP, SNP2CAPS, TASSEL, and STRUCTURE, are useful for analyzing sequence data and for deriving biologically meaningful conclusions from these analyses. Databases such as GenBank, Phytozome, the EMBL Nucleotide Sequence Database, Swiss-Prot, and the UniProt Knowledgebase store huge amounts of nucleotide and protein sequence information that is readily accessible to the public. The Kyoto Encyclopedia of Genes and Genomes (KEGG), in turn, aims to support the understanding of higher-order biological functions by integrating gene, protein, and metabolic pathway information. This chapter describes various bioinformatics tools and databases relevant to plant breeding and discusses their main features and applications.
Copyright information
© 2015 Author(s)
About this chapter
Cite this chapter
Singh, B.D., Singh, A.K. (2015). Bioinformatics Tools and Databases for Genomics Research. In: Marker-Assisted Plant Breeding: Principles and Practices. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2316-0_14
DOI: https://doi.org/10.1007/978-81-322-2316-0_14
Publisher Name: Springer, New Delhi
Print ISBN: 978-81-322-2315-3
Online ISBN: 978-81-322-2316-0
eBook Packages: Biomedical and Life Sciences (R0)