De novo assembly of a Chinese soybean genome

  • Cover Article
Science China Life Sciences

Abstract

Soybean was domesticated in China and has become one of the most important oilseed crops. Due to bottlenecks in their introduction and dissemination, soybeans from different geographic areas exhibit extensive genetic diversity. Asia is the largest soybean market; therefore, a high-quality soybean reference genome from this area is critical for soybean research and breeding. Here, we report the de novo assembly and sequence analysis of a Chinese soybean genome, that of the cultivar “Zhonghuang 13”, using a combination of single-molecule real-time (SMRT) sequencing, Hi-C and optical mapping data. The assembled genome is 1.025 Gb in size, with a contig N50 of 3.46 Mb and a scaffold N50 of 51.87 Mb. Comparison of this genome with the previously reported reference genome (cv. Williams 82) uncovered more than 250,000 structural variations. A total of 52,051 protein-coding genes and 36,429 transposable elements were annotated in this genome, and a gene co-expression network comprising 39,967 genes was also established. This high-quality Chinese soybean genome and its sequence analysis will provide valuable information for future soybean improvement.
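The contig and scaffold N50 values quoted above follow the standard definition: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal sketch of that calculation. The function name `n50` and the toy lengths are assumptions made for illustration; they are not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one sequence length is required")

    ordered = sorted(lengths, reverse=True)  # longest first
    total = sum(ordered)

    running = 0
    for length in ordered:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Toy contig lengths (hypothetical, not from the paper).
toy_contigs = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs))  # 10 + 9 + 8 = 27 is half of 54, so N50 is 8
```

The same function applies to contigs and scaffolds alike; only the input list of lengths changes.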

Acknowledgements

This work was supported by the National Natural Science Foundation of China (91531304, 31525018, 31370266, and 31788103), the “Strategic Priority Research Program” of the Chinese Academy of Sciences (XDA08000000), and the State Key Laboratory of Plant Cell and Chromosome Engineering (PCCE–KF–2017–03).

Author information

Corresponding authors

Correspondence to Jianchang Du, Shisong Ma or Zhixi Tian.

Cite this article

Shen, Y., Liu, J., Geng, H. et al. De novo assembly of a Chinese soybean genome. Sci. China Life Sci. 61, 871–884 (2018). https://doi.org/10.1007/s11427-018-9360-0
