Databases for Rice Omics Studies

  • Takeshi Itoh
  • Yoshihiro Kawahara
  • Tsuyoshi Tanaka
Abstract

Databases play a pivotal role in modern molecular biology. With the advent of high-throughput DNA sequencing technologies, a central question has become how to process, store, and present large-scale data sets, and databases in which researchers can search sequence data and related omics information are now indispensable. For rice, two major genome databases, RAP-DB and RGAP, and several transcriptome databases are widely used. As genome resequencing data accumulated, databases capable of displaying multiple genomes were also developed. Furthermore, new databases are being developed for comparing genome sequences among wild and cultivated Oryza species. Bioinformatics analyses of omics information will be needed in the future, and researchers will want to retrieve the data produced by such analyses efficiently; databases are therefore expected to function as hubs for multiple rice omics resources. In this way, databases will facilitate next-generation breeding science based on large-scale omics data.

Keywords

Database; Bioinformatics; High-throughput DNA sequencing; Genome; Transcriptome; Comparative omics

Notes

Acknowledgments

This work was supported by a grant from the Ministry of Agriculture, Forestry, and Fisheries of Japan (Genomics-based Technology for Agricultural Improvement, IVG2001 to T.I.) and the Japan Society for the Promotion of Science (JSPS) KAKENHI Grant-in-Aid (grant No. 17HP8029 to Y.K.).

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Takeshi Itoh (1)
  • Yoshihiro Kawahara (1)
  • Tsuyoshi Tanaka (1)
  1. Advanced Analysis Center, National Agriculture and Food Research Organization, Tsukuba, Japan
