Training Population Design and Resource Allocation for Genomic Selection in Plant Breeding



Wide-scale adoption of genomic selection in cultivar development programs will dramatically change the role of phenotyping, from a direct source of information for selection to an indirect one that acts through the training of genomic prediction models. Some features of phenotyping and field trials in plant breeding programs, such as the measurement of relevant phenotypes in relevant environments, will remain the same, but others will almost certainly change. Potential changes include which types of individuals are phenotyped in field trials and how field plot resources are allocated to genotypes to maximize the amount and quality of information used for developing genomic prediction models. We provide a brief and intuitive review of the current literature on these two topics. By far the most important consideration in training population design is the genetic relationship between the training population and the target population, which makes defining the target population the first and most important step in genomic selection model development. Several promising algorithms for training population design have been published, and practitioners should consider them. When the goal of phenotyping is to train genomic prediction models, plots should be allocated to maximize population size, although far more flexibility exists for genomic selection model training than for QTL mapping and marker-assisted selection applications. Trials across environments can be connected through markers, which makes it possible to increase population size further and to capture genotype × environment interaction effects extensively in the early, preliminary yield trial stages.
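
The plot-allocation point can be illustrated with a small numerical sketch. It is not the chapter's own analysis. It assumes a widely used deterministic approximation of prediction accuracy, r = sqrt(N h² / (N h² + Me)), in the form introduced by Daetwyler and colleagues, together with entry-mean heritability h² = σ²g / (σ²g + σ²e / reps). The variance components, the number of independent chromosome segments (`m_e`), and the plot budget are hypothetical values chosen only for illustration.

```python
import math


def expected_accuracy(n_genotypes, reps, var_g=1.0, var_e=4.0, m_e=500.0):
    """Approximate accuracy of genomic prediction for a training set.

    Assumed model (not taken from the chapter text):
      entry-mean heritability  h2 = var_g / (var_g + var_e / reps)
      accuracy                 r  = sqrt(N * h2 / (N * h2 + m_e))
    var_g, var_e and m_e are hypothetical illustration values.
    """
    h2_entry_mean = var_g / (var_g + var_e / reps)
    nh2 = n_genotypes * h2_entry_mean
    return math.sqrt(nh2 / (nh2 + m_e))


def compare_allocations(total_plots, rep_options=(1, 2, 3, 4)):
    """Split a fixed plot budget into genotypes x replications.

    Returns a dict mapping replications per genotype to expected accuracy.
    """
    results = {}
    for reps in rep_options:
        n_genotypes = total_plots // reps  # more reps means fewer genotypes
        results[reps] = expected_accuracy(n_genotypes, reps)
    return results


if __name__ == "__main__":
    budget = 1200  # hypothetical number of field plots
    for reps, acc in compare_allocations(budget).items():
        print(f"reps={reps}  genotypes={budget // reps}  accuracy={acc:.3f}")
```

Under these assumptions N h² simplifies to P σ²g / (reps σ²g + σ²e) for a budget of P plots, so expected accuracy falls as replication rises and the unreplicated design with the largest population does best. The sketch ignores genotyping cost, spatial error control, and genotype × environment interaction, any of which could shift the optimum in practice.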


Keywords: Training population; Prediction error variance; Genomic prediction models; Linkage disequilibrium; RR-BLUP; G-BLUP



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. University of Minnesota, Minneapolis, USA
