
Classifying genotypic data from plant breeding trials: a preliminary investigation using repeated checks


Several subjective choices must be made when classifying genotypes based on data from plant breeding trials. One choice involves the method used to weight the contribution each environment makes to the classification. A second involves the use of either genotype means for each environment or genotype values for each block, i.e., considering each block to be a different environment. Another involves whether environments (or blocks) in which genotypes are not significantly different should be included in or excluded from such classifications. An alternative to the use of raw or standardized data is proposed, in which each environment is weighted by a discrimination index (DI) that is based on the concept of repeatability. In this study, three weighting methods (raw, standardized and DI), the choice of using environments or blocks, and the choice of including or excluding environments or blocks in which genotypic effects were not significant were considered in factorial combination, giving 12 options (3 × 2 × 2). A data set comprising five check cultivars, each repeated six times in each of three blocks at six environments, was used. The effect of these options on the ability of a hierarchical clustering technique to correctly classify the repeats into five groups, each consisting of all six repeats of a particular check cultivar, was investigated. It was found that the DI weighting method generally led to better recovery of the known structure. Using block data rather than environmental data also improved structure recovery for each of the three weighting methods. The exclusive use of environments in which genotypic effects were significant decreased structure recovery, while the contrary generally occurred for blocks. The best structure recovery was obtained from the DI weighting applied to blocks (whether genotypes were significant or not).
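The comparison of weighting methods described above can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not give the DI formula, so `discrimination_index` below assumes a repeatability of entry means estimated from a randomized-block analysis of variance, and the DI weighting is assumed to scale standardized environment means. The clustering strategy (Ward's method), the pair-agreement (Rand) measure of structure recovery, the function names, and the simulated data are likewise assumptions made only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def discrimination_index(y):
    """Assumed DI for one environment: repeatability of entry means.

    y has shape (entries, blocks). The formula is an illustrative
    assumption, not the definition used in the article.
    """
    n_entries, n_blocks = y.shape
    entry_means = y.mean(axis=1)
    block_means = y.mean(axis=0)
    grand = y.mean()
    ms_entry = n_blocks * np.sum((entry_means - grand) ** 2) / (n_entries - 1)
    resid = y - entry_means[:, None] - block_means[None, :] + grand
    ms_error = np.sum(resid ** 2) / ((n_entries - 1) * (n_blocks - 1))
    # Negative variance-component estimates are truncated at zero.
    var_g = max((ms_entry - ms_error) / n_blocks, 0.0)
    return var_g / (var_g + ms_error / n_blocks)


def weight_environments(data, method):
    """data: (entries, environments, blocks) -> (entries, environments)."""
    means = data.mean(axis=2)
    if method == "raw":
        return means
    z = (means - means.mean(axis=0)) / means.std(axis=0, ddof=1)
    if method == "standardized":
        return z
    if method == "di":
        # Assumption: each standardized environment is scaled by its DI.
        di = np.array([discrimination_index(data[:, j, :])
                       for j in range(data.shape[1])])
        return z * di
    raise ValueError(f"unknown weighting method: {method}")


def recovery(labels_true, labels_pred):
    """Rand index: fraction of entry pairs on which two partitions agree."""
    same_true = labels_true[:, None] == labels_true[None, :]
    same_pred = labels_pred[:, None] == labels_pred[None, :]
    upper = np.triu_indices(len(labels_true), k=1)
    return float(np.mean(same_true[upper] == same_pred[upper]))


# Simulated layout matching the abstract: 5 checks x 6 repeats = 30 entries,
# 6 environments, 3 blocks. All numeric values are made up.
rng = np.random.default_rng(0)
truth = np.repeat(np.arange(5), 6)
cultivar_effect = rng.normal(0.0, 1.0, size=(5, 6))
noise_sd = np.array([0.2, 0.2, 0.5, 1.0, 3.0, 3.0])  # later environments are noisier
data = (cultivar_effect[truth][:, :, None]
        + rng.normal(0.0, 1.0, size=(30, 6, 3)) * noise_sd[None, :, None])

for method in ("raw", "standardized", "di"):
    x = weight_environments(data, method)
    tree = linkage(x, method="ward")
    groups = fcluster(tree, t=5, criterion="maxclust")
    print(method, round(recovery(truth, groups), 3))
```

Under these assumptions, environments in which entries are poorly discriminated receive a DI near zero and contribute little to the distances between entries, which is the intuition behind weighting by repeatability.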





Additional information

Communicated by A. R. Hallauer


About this article

Cite this article

Bull, J.K., Basford, K.E., DeLacy, I.H. et al. Classifying genotypic data from plant breeding trials: a preliminary investigation using repeated checks. Theoret. Appl. Genetics 85, 461–469 (1992).


Key words

  • Cluster Analysis
  • Genotype x environment interaction
  • Heritability
  • Repeatability
  • Structure-recovery