Abstract
Genome sequencing yields an exceptional resource of genetic information. Whole-genome sequence information helps characterize individual genomes, transcriptional states, and genetic variation in populations, and reveals the genetic architecture associated with each trait. After the release of the first human genome assembly, assemblies of other model organisms became available, including the model plant Arabidopsis thaliana. The soybean community published the first reference genome, of the variety Williams 82, in 2010. Soybean has important syntenic relationships with other legume species and is a model plant for the legumes. In this chapter, we discuss the soybean genome assemblies and annotations and their fine-tuning in view of next-generation sequencing technologies and bioinformatics tools. In addition, we compare the structural variations between the cultivated reference genome and the available wild soybean genome information. This is followed by a discussion of the opportunities offered by next-generation sequencing technologies and the challenges that we anticipate in developing more pangenomes and reference genomes for soybean. These resources will significantly affect the discovery of rare alleles associated with key agronomic and quality traits and shape next-generation breeding technologies and crop improvement.
Acknowledgements
The authors are grateful to the United Soybean Board and the United States Department of Agriculture for project support.
Copyright information
© 2017 Springer International Publishing AG
About this chapter
Cite this chapter
Valliyodan, B., Lee, SH., Nguyen, H.T. (2017). Sequencing, Assembly, and Annotation of the Soybean Genome. In: Nguyen, H., Bhattacharyya, M. (eds) The Soybean Genome. Compendium of Plant Genomes. Springer, Cham. https://doi.org/10.1007/978-3-319-64198-0_5
DOI: https://doi.org/10.1007/978-3-319-64198-0_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-64196-6
Online ISBN: 978-3-319-64198-0
eBook Packages: Biomedical and Life Sciences (R0)