Genomic introgression through interspecific hybridization counteracts genetic bottleneck during soybean domestication
Evidence of introgression, the transfer of genetic material, between crops and their wild relatives through spontaneous hybridization and subsequent backcrossing has been documented; however, the evolutionary patterns and consequences of introgression and its influence on the processes of crop domestication and varietal diversification are poorly understood.
We investigate the genomic landscape and evolution of putative crop-wild-relative introgression by analyzing the nuclear and chloroplast genomes from a panel of wild (Glycine soja) and domesticated (Glycine max) soybeans. Our data suggest that naturally occurring introgression between wild and domesticated soybeans was widespread and that introgressed variation in both wild and domesticated soybeans was selected against throughout the genomes and preferentially removed from the genomic regions underlying selective sweeps and domestication quantitative trait locus (QTL). In both taxa, putative introgression was preferentially retained in recombination-repressed pericentromeric regions that exhibit lower gene densities, reflecting potential roles of recombination in purging introgression. Despite extensive removal of introgressed variation by recurrent selection for domestication-related QTL and associated genomic regions, spontaneous interspecific hybridization during soybean domestication appear to have contributed to a rapid varietal diversification with high levels of genetic diversity and asymmetric evolution between the nuclear and chloroplast genomes.
This work reveals the evolutionary forces, patterns, and consequences of putative genomic introgression between crops and their wild relatives, and the effects of introgression on the processes of crop domestication and varietal diversification. We envision that interspecific introgression serves as an important mechanism for counteracting the reduction of genetic diversity in domesticated crops, particularly the ones under single domestication.
KeywordsDomestication Genetic diversity Introgression Natural selection Recurrent selection Selective sweep Varietal diversification
Identical by descent
Normalized identical by descent
Quantitative trait locus
Relative identical by descent
Recombinant inbred line
Single nucleotide polymorphism
Wild relative of crop
Soybean (Glycine max [L.] Merr.) is one of the most economically important crops in the world, providing a source of high-quality proteins for feed and food as well as vegetable oil/fuels for human consumption and industrial use . It has been widely accepted that soybean was domesticated from its annual wild relative Glycine soja in China approximately ~ 6000–9000 years ago [2, 3], resulting in dramatic morphological and physiological modifications often referred to as the “domestication syndrome” . This process was followed by varietal diversification, forming a multitude of soybean landraces adapted to diverse eco-regions for cultivation in agricultural systems. This scenario, i.e., the single origin of cultivated soybeans, appears to be well supported by recent investigation of genome-wide diversity among cultivated and wild soybeans at population levels, in which all cultivated accessions were exclusively grouped together into a single clade interior to the G. soja clades [5, 6, 7]. The single origin model is also supported by recent identification and isolation of a few key domestication genes such as GmHS1-1 and B1, which control seed coat impermeability and seed coat bloom, respectively [8, 9]. At each of the two domestication loci, the causal mutation responsible for the domestication transition was shared by phylogenetically defined major groups of landraces that are representative of the cultivated soybean population.
Despite the general acceptance of the single-origin model, the domestication history and processes of soybean remain obscure and even under debate . An early study proposed that the transition from the wild to domesticated soybeans occurred as a gradual process . This proposition was solely based on the estimated time (~ 0.27 million years) of divergence between the soybean (cv., Williams 82) reference genome  and a sequenced G. soja accession (var. IT82932) . Given that many wild relative of crop (WRC) accessions are highly diverged from their related crops, such as maize, of single domestication origin [13, 14], such a divergence time does not necessarily suggest the existence of artificially selected intermediates or G. soja/G. max complex from which the cultivated soybean was domesticated. It is more likely that IT82932 is just one of the highly diverged wild soybean accessions in natural soybean population instead of the direct wild progenitor of cultivated soybeans, which may have been extinct after successive changes towards the transition to cultivated soybeans through domestication.
On the other hand, semi-wild soybean accessions, taxonomically described as a distinct species Glycine gracilis, were found in many soybean planting regions in China . In general, these accessions have semi-erect plants with seeds larger than those of G. soja . Thus, they were either considered as evolutionary intermediates between G. soja and G. max [17, 18] or hypothesized as hybrids between them [16, 19]. Recent genotyping-by-sequencing of 72 G. soja accessions, 404 G. max accessions, and 36 G. gracilis accessions suggests that G. gracilis is likely to be a transitional species derived from the evolutionary process of domesticated soybean [5, 10], instead of hybrids between G. soja and G. max. Nevertheless, this study proposed the occurrence of gene flow from wild soybean to cultivated soybean subpopulations .
Introgression by hybridization has been recognized as an important process that occurs to some degree between crops and WRC, including both self-pollinated and cross-pollinated species [20, 21, 22, 23], as the major avenue for gene flow. For example, despite the existence of pre-zygotic and phenological barriers to hybridization between cultivated maize (Zea mays) and its wild progenitor teosinte (Zea mays ssp. mexicana), introgression between the two taxa were detected in both directions of gene flow . More extensive gene flow was detected between cultivated rice (Oryza sativa) and its wild progenitor species (Oryza rufipogon) and between two subspecies of the cultivated rice (O. sativa indica and O. sativa japonica) based on population structure and admixture analyses [21, 23]. Such extensive introgression appears to be responsible for controversial conclusions regarding the domestication history of rice drawn from different studies [24, 25].
Although inter-subpopulation introgression has been revealed by inter-subpopulation admixture in many plant species [5, 22, 23, 25, 26], very few studies have investigated the processes, patterns, and evolutionary consequences of genomic introgression during crop domestication. As G. max is able to hybridize with G. soja to produce fertile seeds, cross-fertilization between the two taxa would have naturally occurred during soybean domestication. Here, we report the evolutionary patterns and consequences of putative interspecific introgression revealed by comparative analysis of both the nuclear and chloroplast genomes of a diverse panel of G. soja and G. max (landrace) accessions that are highly representative of the natural genetic diversity of soybean. Our results not only reveal evolutionary factors reshaping the genomic and genetic variation in the wild and cultivated soybean populations but also exemplify the complexity and dynamics of the domestication processes for crops that were even under single domestication.
Genome-wide identification of putative G. soja-G. max introgression
To investigate genome-wide introgression between cultivated and wild soybeans, we analyzed the whole genome resequencing data from a representative soybean population that includes 62 G. soja accessions, 130 landraces, and 110 improved soybean cultivars collected from diverse eco-geographic regions in China and from other countries including Korea, Japan, Russia, the USA, and Canada . These samples were distributed in nearly all major phylogenetical clades/groups of 18,480 domesticated soybean accessions and 1168 wild soybean accessions collected from 84 countries or developed in the USA that are deposited in the US Department of Agriculture (USDA) Soybean Germplasm Collection  and thus considered to be very representative of soybean genetic diversity. We first identified local regional haplotypes in each of the 62 G. soja and 240 G. max accessions that were identical by descent (IBD) to individuals within the G. soja and G. max subpopulations using all SNP data from the 302 accessions following an approach previously described . To calculate the frequencies of the shared haplotypes in different regions along each chromosome of the soybean genome, we divided each chromosome into bins of 10 kb and calculated the numbers of recorded IBD tracts between each accession and the two G. soja and G. max subpopulations per bin by pairwise comparisons. These numbers were normalized from 0 (no IBD detected) to 1 (IBD shared by all individuals within a subpopulation), and the normalized IBD between each accession and the G. soja subpopulation (nIBDG. soja) and between each accession and the G. max subpopulation (nIBDG. max) were used to calculate the relative IBD (rIBD) between the compared groups (rIBD = nIBDG. soja − nIBDG. max). Finally, the putative genomic introgression from the G. soja subpopulation to each of the cultivated soybean accessions and from the G. max subpopulation to each of the wild soybean accessions was identified.
The introgression regions in PI 578357 span the entire pericentromeric region of chromosome 2 without previously detected domestication QTLs  (Fig. 2) and were thus considered to have undergone minimal selective pressure for their retention. We found that the examined regions of chromosome 2 in PI 578357, grown in Amur, Russia, share the highest sequence similarity with their corresponding regions in a landrace Heihexiaohuangdou (m104) grown in Heihe, China (Fig. 4a, b, d, f). By contrast, the non-introgression regions of PI 578357 share the highest similarity with their corresponding regions in a G. soja accession PI 522226 (s5) grown in Primorye, Russia (Fig. 4b, c, e, f), which is one of the G. soja accessions that are most diverged from PI 578357 and from the G. max lineage (Fig. 1), and thus is unlikely to be involved in or derived from the source population used for soybean domestication. The introgression regions of chromosome 19 in PI 339734, a landrace grown in Korea, were found to share the highest sequence similarity with their corresponding regions in a G. soja accession PI 407275 (s42) that was also grown in Korea (Fig. 4h, i, k, l). By contrast, the non-introgression regions of PI 339734 share the highest similarity with their corresponding regions in a landrace PI 548456 (m111) that was also grown in Pyongyang, North Korea (Fig. 4g, h, j, l). The geographic distribution, the patterns and levels of sequence similarity and divergence of local genomic regions, and the whole genome-wide sequence diversity and phylogeny of these accessions together suggest that the detected chimerism of chromosomes, as described above, was most likely resulted from interspecific introgression instead of incomplete lineage sorting of ancestral polymorphisms in the source population for soybean domestication. Based on gene sequences in the entire genome, the divergence times between PI 578357 and Heihexiaohuangdou and between PI 339734 and PI 407275 were dated to ~ 0.37 and 0.27 million years ago (mya), respectively. As soybean domestication occurred only ~ 6000–9000 years ago [3, 29], such a high level of similarities of the examined introgression regions between the two pairs of (G. soja-G. max) accessions exemplified above should be considered as direct evidence of G. soja-G. max introgression.
Factors shaping the landscape of G. soja-G. max introgression
To understand the evolutionary forces shaping the distribution of genomic introgression in the two subpopulations, we first compared the average proportions of introgressed fragments between pericentromeric regions and chromosomal arms that were roughly defined based on the rates of local genetic recombination and physical positions of centromere-enriched repeats in the soybean genome [12, 30]. In general, pericentromeric regions exhibit severely reduced rates of genetic recombination and biased accumulation for deleterious mutations such as the insertion of transposable elements compared with chromosomal arms [13, 30, 31]. We found that, despite some exceptions, overall the pericentromeric regions have higher proportions of introgressed fragments in either the G. soja subpopulation or the G. max subpopulation (paired Student t test, p value < 0.01, Fig. 2 and Additional file 3: Table S2). D-statistic analysis for pericentromeric regions in comparison with arms was performed using the same subset of G. max and G. soja accessions. As shown in Fig. 3b, the D-statistic for the pericentromeric regions (D = − 0.12 ± 0.06) was significantly lower than the chromosome arms (D = − 0.09 ± 0.02) (Fig. 3b), suggesting a biased accumulation of gene flow in pericentromeric regions. Such a bias may be partially the outcome of the reduced rates of genetic recombination and thus reduced efficiency in purging introgressed variation/fragments in pericentromeric regions compared with chromosomal arms.
In addition, we estimated the proportions of introgressed fragments in 44 QTL regions underlying soybean domestication (Fig. 2). These QTLs have been recently identified using approximately 800 recombinant inbred lines (RIL) derived from crosses between Williams 82 and each of 2 G. soja accessions PI 468916 (s14) and PI 479752 (s15) . We found that the proportion of introgressed fragments in the corresponding domestication QTL regions of either the G. max or G. soja accessions is even smaller than detected in the selective sweep regions (Fig. 5b, d). Together, these observations indicate distinct outcomes and effects of bidirectional selection, i.e., natural selection versus artificial selection, on the retention of introgressed fragments in the G. soja and G. max subpopulations.
Introgression-mediated gene flow surrounding key domestication genes
Genomic introgression has been realized as a major avenue for gene flow; we wondered how gene flow may have affected the domestication process and the genetic architecture of the soybean genome at a population level. Recently, two key soybean domestication genes GmHs1-1 and Bloom1 (B1) which control seed hardedness and seed coat bloom, respectively, have been isolated [8, 9]. Seed coat impermeability and bloom were considered to be important or essential for the long-term survival of wild soybeans, whereas permeable seed coat without bloom was desirable for agricultural production and human consumption and targeted for selection under domestication. The causal mutation at each of the two loci for the key domestication transition was identified and functionally validated [8, 9]. A recessive mutation (C→T) in the coding region of GmHs1-1 resulted in the transition from impermeable seed coat in wild soybean to permeable seed coat in cultivated soybean, and selection for the domesticated allele Gmhs1-1 formed an ~ 160-kb selective sweep region . A recessive mutation (C→T) in the coding region of B1 was responsible for the loss of seed coat bloom in the cultivated soybeans, and selection of the domesticated allele b1 resulted in a ~ 301-kb selective sweep region . To our knowledge, these are only two genes identified to date whose two alleles can nearly exclusively distinguish the wild soybeans from the cultivated soybeans, and meanwhile, the identical causal mutation at each of the two loci for the domesticated phenotypes is shared by the cultivated soybeans.
Selective sweeps surrounding domestication QTL are generally detectable at population levels, but the boundaries of specific haplotypes under selection within a particular selective sweep region vary among individual accessions. To further track the footprints of gene flow involving these key domestication loci, we zoomed in at the two domestication genes and their adjacent sequences including respective presumable promoter and terminator sequences, with a focus on the causative mutation that resulted in the domestication allele at each locus. At the GmHs1-1/Gmhs1-1 locus, 13 landraces were found to share the G. soja or G. soja-like haplotypes including the GmHs1-1-specific nucleotide (C), while 7 G. soja accessions were found to possess the G. max or G. max-like haplotype including Gmhs1-1-specific nucleotide (T) (Fig. 6a and Additional file 2: Figure S4). Consistent with the phylogeny of the B1/b1 sweep regions among the 191 accessions (Fig. 6b), G. soja PI 549046 (s28) was found to possess the b1 haplotype, including the causal mutation (T) for b1, which is identical to that of 191 landraces, while the haplotype of landrace PI 339734 (m30) is highly identical to those of the majority of G. soja accessions including B1-specific nucleotide (C) (Additional file 2: Figure S5). Given the phylogenetic distinction of the G. soja and G. max subpopulations, the detected admixture of the selective sweep regions and haplotypes at the two domestication gene loci in the investigated population would be considered as further evidence of inter-subpopulation gene flow.
Introgression revealed by asymmetric diversification between the nuclear and organellar genomes
For the G. max and G. soja accessions possessing respective G. max and G. soja chloroplasts, more closely related accessions according to their nuclear genomes tend to share identical or more similar haplotypes of their chloroplast genomes (Fig. 7 and Additional file 2: Figure S6), suggesting co-evolution between the nuclear genomes and the chloroplast genomes as a general pattern. However, apparent exceptions were also observed. For example, some accessions with highly diverged nuclear genomes were detected to share identical or nearly identical haplotypes of the chloroplast genomes, and some accessions with more closely related nuclear genomes were detected to have more diverged haplotypes of the chloroplast genomes (Additional file 2: Figure S6). Such exceptions suggest that hybridization events between highly diverged accessions in terms of their nuclear genomes within the G. max subpopulation or within the G. soja subpopulation have also occurred, and such events are very likely to be responsible for the observed unparalleled varietal diversification between the nuclear and chloroplast genomes.
The consequences of selection against introgression
Wild soybeans are widely distributed over a wide range of eco-geographical regions of East Asia including China, the Russian Far East, the Korean Peninsula, and Japan . To survive under such diverse climate and environment, wild soybeans were diverged for a set of adaptation traits such as flowering periods and maturity that are extremely important for the plant’s photoperiod response . On the other hand, the majority of wild soybean accessions share a set of similar characteristics including a procumbent or climbing growth habit and prolific branching, with dehiscent pods and small black seeds with impermeable seed coat and a layer of seed coat bloom . Apparently, these characteristics were advantageous for the long-term survival of wild soybeans, and thus, genes/QTLs underlying these characteristics in wild soybeans were under strong purifying selection over evolutionary time [8, 9, 35]. Compared with the wild progenitor species, cultivated soybeans show upright growth dominated by a single stem, with non-shattering pods and large shiny seeds covered by with yellow and permeable seed coat, a set of domestication-related trait (DRT) that were favorably selected by ancient farmers for production of soybean seeds as a source of food. Of course, the maintenance of these DRT “adapted” for cultivation reply on sustained agricultural practices.
The role of introgression in varietal diversification
Despite nearly complete removal of wild-type alleles of domestication-related genes/QTL from landraces, our results indicate that interspecific introgression has played an important role in enhancing the genetic diversity in cultivated soybeans as illustrated in Fig. 8. This proposition was based on the observation that many of the introgressed fragments, as exemplified in Fig. 4e, were from G. soja accessions highly diverged from the G. max accessions. Given that some of these G. soja accessions are most distant from the cultivated soybeans and thus unlikely to be the direct wild progenitor of cultivated soybeans, the introgression involving those G. soja accessions would have boosted the genetic diversity of cultivated soybeans to a level that could not be reached through varietal diversification that does not involve interspecific introgression (Fig. 8). Although overall the level of genetic diversity in cultivated soybean subpopulation has been substantially reduced compared with the G. soja subpopulation, pairwise comparison estimated the time of divergence among some landraces to > 0.3 mya, very close to the estimated average time (~ 0.37 mya) of divergence among the G. soja accessions (Additional file 2: Figure S7), further suggesting potential contribution of G. soja to varietal diversification of cultivated soybeans through interspecific introgression. As observed in maize , such inter-subpopulation introgression may have contributed to adaptation traits important for radiation of cultivated soybeans to diverse environments beyond the location for soybean domestication.
Introgression and asymmetric nuclear and organellar genome evolution
Actually, the asymmetric patterns of divergence and phylogenetic relationships between nuclear genomes and chloroplast genomes of the G. soja and G. max accessions (Additional file 2: Figure S7) not only reveal interspecific introgression between the two taxa but also suggest the occurrence of introgression within either the G. soja or G. max taxon (Additional file 2: Figure S6). It is obvious that the observed 23 G. max accessions with the G soja-type chloroplast genome and the 3 G. soja accessions processing the G. max-type chloroplast genome are indicative of introgression events involving G. soja and G. max. However, whether an introgression event involving the two taxa possesses heterologous nuclear and chloroplast genomes would be determined by the maternal parent involved in the final hybridization event. Therefore, interspecific introgression events that actually occurred and shaped the genetic architecture of the wild and domesticated soybeans analyzed in this study would outnumber those reflected by the 26 accessions described above.
In addition, due to the limitation of IBD analysis and high levels of sequence similarity, intraspecific introgression events may not be easily detected. In general, the phylogenetic relationships among wild soybean accessions are highly consistent with their geographical distribution [6, 35], but exceptions were also observed, particularly, for the accessions distributed in different eco-regions of China [33, 36, 37]. If asymmetric evolution of the nuclear and chloroplast genomes among cultivated soybeans are indeed indicative of intraspecific hybridization, we envision that such events would have further shaped the genetic architecture of the soybean population, resulting in rapid diversification of local genomic regions among accessions.
Elucidating evolutionary patterns and consequences of introgression between crops and their WRC is important for understanding the processes and history of crop domestication. By profiling the genome-wide distribution of putative G. max-G. soja introgression in the context of genomic features, soybean domestication QTL, selective sweeps, and phylogenetic relationships of a panel of representative soybean accessions constructed based on their nuclear and chloroplast genomes, we illustrated the nature of selection against genomic introgression derived from G. max-G. soja hybridization and subsequent backcrossing, evolutionary forces shaping the genetic and genomic architecture of the G. max and G. soja population, and potential effects of introgression on rapid varietal diversification for adaption to various climate environments for cultivation. These results provide novel insights into the history and dynamic process of soybean domestication and explain the controversial conclusions/debates in terms of the date and location of soybean domestication from previous studies. Because of such extensiveness of deduced introgression in the soybean population, it is important to analyze and compare local genomic regions underlying domestication-related traits or other traits of interest towards effective utilization of genetic diversity for crop improvement.
Acquisition of the nuclear and chloroplast genome sequences
Nuclear and chloroplast genome sequences of soybean accessions investigated in this study were reported previously . The identities of the accessions, the fractions of introgressed fragments detected in each accession, and the alleles at the GmHs1-1 and B1 loci were listed in Additional file 1: Table S1.
Mapping of short sequence reads and detection of genetic variation
The paired-end resequencing reads were downloaded from SRP045129 in NCBI Short Reads Archive by fastq-dump program by SRA-toolkit (version 2.3.5, https://github.com/ncbi/sratoolkit). Then, the low-quality reads were filtered (Phred quality value < 20) using fastx-toolkit  (version 0.0.14, http://hannonlab.cshl.edu/fastx_toolkit/). All cleaned reads were mapped to the soybean reference genome version 1.1 (http://genome.jgi.doe.gov/Phytozome/download/) by BWA program  (version 1.7.3) with four mismatches allowed per read. Only uniquely mapped paired reads were used for the detection of genetic variation. The entire selected bam files were converted into BAM, and potential PCR duplicates were removed using the Samtools program  (version 1.3.1). Variation was called through the best practice pipeline of the Genome Analysis Toolkit (GATK, version 3.5.0)  and Picard-tools programs (picard.sourceforge.net, version 2.0.1) . The SNP VCF files were merged together by VCFtools  (version 0.1.14). SNPs with allele frequencies lower than 1% and distances to adjacent SNPs less than 3 bp were excluded in further analyses.
To investigate genetic variation among chloroplast genomes of the same set of soybean accessions, we mapped all chloroplast reads from each accession to the soybean chloroplast genome sequence from Williams 82 (DQ317523.1, Gene bank in NCBI), following the same protocol as used for the nuclear genomes, and a merged SNP matrix of chloroplast genomes was obtained for analysis of genetic variation.
Pairwise IBD detection
Local regional haplotypes in each G. soja or G. max accession that were identical by descent (IBD) to individuals within the G. soja and G. max subpopulations were identified following an approach previously described , with minor modification. The approach involves two main steps, IBD region identification and relative IBD (rIBD) statistics. The matrix of genome-wide SNPs from the investigated G. soja and G. max populations was served as input for the IBD identification pipeline. All the G. soja and G. max accessions were converted to the beagle format by the in-house Perl scripts and then phased with the fastPhase function of the Beagle program  (version 3.3.2). The shared haplotypes between any two of the G. soja and G. max accessions were detected and extracted with the Beagle fastIBD function of the Beagle program. Phasing and IBD detection were run five times with different thresholds (10−5 to 10−6), independently, for assigning all possible IBDs to the haplotypes of two accessions. The identified IBD tracks from all the five runs were merged and then extracted using custom Perl scripts.
To profile the frequency of shared haplotypes along individual chromosomes, each chromosome was divided into bins of 10 kb with a sliding window of 1 kb, and the number of recorded IBD tracts between each accession and the two groups (i.e., G. soja and G. max) of accessions was computed per bin. As the total number of pairwise comparisons differed between the groups, these numbers were normalized from 0 (no IBD detected) to 1 (IBD shared by all individuals within the group) according to the total number of accessions in the group. The normalized IBD between the accession and the G. soja group (nIBDG. soja) and the normalized IBD between this accession and the G. max group (relative nIBDG. max) were then used to calculate the relative IBD (rIBD = nIBDG. soja − nIBDG. max). We profiled rIBD blocks along chromosomes in the order of the 10-kb bins to define putative genomic introgression. The distribution of putative introgression in each of the examined accessions was illustrated using the Circos program  (version 0.69).
D-statistic has been used to distinguish between the hypotheses of introgression and shared ancestral variation at specific loci [27, 46, 47]. For a specific category of genomic regions, we computed D-statistic values for the regions by pairwise comparison between two accessions from two distinct groups, such as the group of G. soja accessions with putative G. max introgression in the regions versus the group of G. max accessions without any putative genomic introgression and the group of G. max accessions with putative G. soja introgression in the regions versus the group of G. soja accessions without any putative genomic introgression. D-statistic values were calculated following the protocol previously described .
Estimation and profiling of divergence time
To estimate the divergence time between each two accessions, coding sequences (CDS) from genes that covers ≥ 80% in G. soja and G. max accessions were used. Rates of synonymous substitution (Ks) were calculated using the maximum likelihood (ML) method of the CODEML subprogram in the PAML package  (version 4.8, http://abacus.gene.ucl.ac.uk/software/paml.html). The values of Ks were converted to divergence time by employing an average substitution rate of 6.1 × 10−9 .
SNPs from the nuclear genome, chloroplast genome, selective sweep regions (B1 and GmHs1-1) and introgressed regions with MAF > 0.05 and heterozygous rate < 0.50 were used to construct phylogenetic trees using the Neighbor-Join method in MEGA7  and visualized using Evolview , an online visualization tool for phylogenetic trees (version 2).
All statistical tests in this paper were performed using basic packages in R language  (version 3.3.1).
We thank Zhenyan Miao for initial exploration of some data presented in this manuscript.
This work was supported by North Central Soybean Research Program and Indiana Soybean Alliance, and the Agriculture and Food Research Initiative competitive grant # (2015-67013-22811) of the USDA National Institute of Food and Agriculture.
Availability of data and materials
The datasets analyzed during the current study were reported previously. Accession code, SRA: SRP045129.
XW and LC analyzed and interpreted the genomic data regarding the evolutionary patterns of introgression. JM designed the research and interpreted the data. JM was a major contributor in writing the manuscript. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
- 2.Carter T, Hymowitz T, Nelson R. Biogeography, local adaptation, Vavilov, and genetic diversity in soybean. In Biological resources and migration. Berlin: Springer; 2004. p. 47–59.Google Scholar
- 4.Hammer K. Das Domestikationssyndrom. Genet Resour Crop Evol. 1984;32:11–34.Google Scholar
- 19.Fukuda Y. Cytogenetical studies on the wild and cultivated Manchurian soybeans (Glycine L.). JPN J BOT. 1933;6:489–506.Google Scholar
- 28.Swarm SA, Sun L, Wang X, Wang W, Brown PJ, Ma J, Nelson RL. Genetic dissection of domestication-related traits in soybean through genotyping-by-sequencing of two interspecific mapping populations. Theor Appl Genet. 2019. https://doi.org/10.1007/s00122-018-3272-6 [Epub ahead of print] PubMed PMID: 30607438.
- 29.Carter TE, Nelson RL, Sneller CH, Cui Z. Genetic diversity in soybean. In soybeans: Improvement, production, and uses. Volume agronomy. Edited by Boerma HR SJ. Am Soc of Agronomy, Crop Sci Soc of Am, Soil Sci Soc of Am, Madison, 2004. p. 303–416.Google Scholar
- 42.Wysoker A, Tibbetts K, Fennell T. Picard tools version 1.90. 2013.Google Scholar
- 51.Development Core Team R. R: a language and environment for statistical computing. Vienna: the R Foundation for Statistical Computing; 2011. ISBN: 3-900051-07-0Google Scholar
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.