
De novo next-generation sequencing, assembling and annotation of Arachis hypogaea L. Spanish botanical type whole plant transcriptome


Peanut is a major agronomic crop within the legume family and an important source of plant oil, proteins, vitamins, and minerals for human consumption, as well as of animal feed, bioenergy, and health products. Peanut genomic research lags behind that of other economically important legumes, mainly because of a shortage of essential genomic infrastructure, tools, and resources, and because of the complexity of the peanut genome. This pioneering study explored the whole plant transcriptome of the peanut Spanish botanical type and culminated in the development of a unigene database. The study applied modern technologies such as normalization and next-generation sequencing. In total, 8,308,655,800 nucleotides were sequenced and 26,048 unigenes were generated, of which 12,302 were annotated and 8,817 were characterized. The remaining 13,746 unigenes (52.77 %) had unknown functions. These results will serve as reference transcriptome sequences for the expanded transcriptome sequencing of the remaining three peanut botanical types (Valencia, Runner, and Virginia), which is currently in progress, and for RNA-seq, exome identification, and genomic marker development. They will also provide important tools and resources for genomic research in other legumes and plant species.
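The unigene counts reported above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch rather than part of the study's pipeline: the variable names and the `share` helper are illustrative assumptions, and only the three counts are taken from the abstract. Percentages are expressed relative to all 26,048 unigenes; the abstract does not say how the characterized set relates to the annotated set, so no relationship between them is assumed.

```python
# Counts as reported in the abstract.
total_unigenes = 26_048
annotated = 12_302
characterized = 8_817

# Unigenes without annotation are the remainder of the total.
unannotated = total_unigenes - annotated


def share(count: int, total: int = total_unigenes) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


print(f"unannotated:   {unannotated} ({share(unannotated)} %)")
print(f"annotated:     {annotated} ({share(annotated)} %)")
print(f"characterized: {characterized} ({share(characterized)} %)")
```

The first line of output reproduces the 13,746 unigenes (52.77 %) of unknown function stated in the abstract.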






The authors are grateful to Ms. Kayla Love for her laboratory assistance during this experiment. All listed authors have reviewed the manuscript and agreed to its submission, with no conflict of interest. This work was supported by a USDA-NIFA Evans-Allen Formula Grant (Accession No: 0209894).

Author information

Correspondence to Ning Wu or Kanyand Matand.

Additional information

Communicated by H. T. Nguyen.


About this article

Cite this article

Wu, N., Matand, K., Wu, H. et al. De novo next-generation sequencing, assembling and annotation of Arachis hypogaea L. Spanish botanical type whole plant transcriptome. Theor Appl Genet 126, 1145–1149 (2013). https://doi.org/10.1007/s00122-013-2042-8



  • Reference Transcriptome
  • Peanut Plant
  • Normalize cDNA Library
  • Plant Transcriptome
  • Peanut Genome