Introduction

In field experimentation, treatment effects (for example, the effects of genotypes in crop variety trials) are evaluated against the experimental error variability, which arises from several factors such as local soil fertility, moisture profile, microbial activity and landscape variation. Various standard approaches are followed to control field variability by means of experimental design and statistical analysis. The experimental designs commonly used by crop improvement programmes are based on blocks of suitable sizes (incomplete blocks) that are anticipated, though often not confirmed, to be homogeneous [2, 4, 6, 10]. Coupled with the incomplete block design, accounting for possible correlation between the plot errors, particularly auto-correlations along rows and columns, has been found useful in enhancing the efficiency of crop variety trials [3, 5, 12, 13]. These experimental designs and analyses led to higher efficiency for genotypic comparisons and higher genetic gain than analyses based on complete blocks. Similar advantages were found in chickpea trials [8, 12]. These analysis approaches were based on the assumption of homogeneous error variances. Singh et al. [14] argued that the probably more common situation is one of field heterogeneity, in which plot errors have different (heterogeneous) error variances. These errors need not have any obvious structured spatial pattern. Such situations arise in experimental stations or on farms, where the residual effects of the crop genotypes may last for more than two crop seasons, making the inter-experimental cover-cropping less effective, or where the presence of pests and diseases creates the heterogeneity.

The heterogeneity of a given field is often expressed in terms of the coefficient of variation (CV, in %), where the variation is due to residuals after accounting for any systematic factors such as block differences. Field trials with a CV of less than 10 % are generally considered well-run trials, the analysis of which cannot be further improved. The findings are normally accepted, including in ISI peer-reviewed journal articles, as long as the CV is less than 10 %, as it appears that no substantial precision can be gained by further exploring the plot residuals. Since there is no such theoretical limit on the CV arising from the experimental plot errors, the aim of this study is to show that statistical analyses of such supposedly well-run trials can, in fact, be further improved and sharpened by identifying and partitioning such heterogeneities. This study also raises another basic question about the limitation of using a single index, the CV, to describe the heterogeneity of an entire field of multiple plots. The concept of a single homogeneous CV for all plots is based on the assumption of an equal error variance over all plots. It is argued that, since plot error variances are most likely to be heterogeneous, field heterogeneity will be better described by calculating CVs over individual plots. The summary of their distribution could include the average of the CVs over the plots, together with the minimum, maximum or various quantiles. We address the heterogeneity issue in trials that are otherwise accepted as well-run and precise in practice, and illustrate the procedure using legume crop trials. Crop variety trials are generally conducted as multi-environment trials, where plot error variances are expected to vary with the environment as per the response scale in those environments. In these cases, a homogeneous error variance is still assumed over all experimental plots in a given environment. However, the estimation of heterogeneous variances within a single experiment has received limited attention and is addressed here.

Two legume trials each in lentil and chickpea, where the CV under the randomised complete block (RCB) analysis model was found to be less than 10 % for each trial (i.e. a ‘well-run’ experiment), were studied with the objectives to: (1) explore the un-assumed presence of heterogeneous error variances, (2) identify the groups of experimental units where error variances vary with the group, (3) estimate the standard error of differences of genotype effects and compare the efficiency of the design-analysis methods over the RCB analysis and (4) estimate gain due to selection. Computational details are presented for one of these trials.

Materials and Methods

We consider a set of four trials selected to cover a range of genetic material: F7 and F8 generations in lentil, and genotypes suited for winter and spring planting in chickpea. The trials were conducted at three locations (Tel Hadya and Breda in Syria, and Terbol in Lebanon) in the years 1998, 2002, 2004 and 2009. The statistical models for analysing data from the RCB design and the lattice design will be referred to as the RCB model and the lattice model, respectively. These trials had CV values of less than 10 % when analysed by means of the RCB model. The CV values under the lattice model were equal or lower.

Lentil Trials

Lentil-Yield Trial (YT)

Sixteen lentil genotypes were evaluated in a yield trial (F7 generation) conducted in a triple lattice at Tel Hadya, Syria, in 2004 in a 3 × 16 rectangular layout. Tel Hadya is located at 36°01′N, 36°56′E and at an altitude of 284 masl with a long-term average annual rainfall of 334 mm. The harvested plot size was 3 m² (2 m × 5 rows × 0.3 m inter-row distance), and the CV for seed yield was 6.4 % when analysed by means of either the lattice model or the RCB model. We refer to this experiment as Lentil-YT.

Lentil-Advanced Yield Trial (AYT)

An AYT was conducted in 2002 at Breda in Syria (35°56′N, 37°10′E, at an elevation of 300 masl, long-term average annual rainfall of 266 mm) to evaluate 25 lentil genotypes of an F8 generation in a triple lattice in a 9 × 10 rectangular layout with buffer plots. The harvested plot size was 4.8 m² (2 m × 8 rows × 0.3 m inter-row distance). The CV was 9.4 and 9.9 % under the lattice and RCB models, respectively. We refer to this trial as Lentil-AYT.

Chickpea Trials

Chickpea-Winter Planting (WP)

An experiment was conducted to evaluate 49 genotypes of winter habit chickpeas in a simple lattice on a 4 × 25 (rows × columns) layout in 1998 in the winter season at Terbol (33°49′N, 35°59′E), in the Beqaa valley in Lebanon, which has an elevation of 950 masl and a long-term average annual rainfall of 515 mm. The harvested plot size was 2.45 m² (4 m × 4 rows × 0.35 m inter-row distance; only the middle 2 rows of 3.5 m were harvested). The CV for grain yield was 5.8 and 6.5 % under the lattice and RCB models, respectively. We refer to this trial as Chickpea-WP.

Chickpea-Spring Planting (SP)

A final experiment was conducted at Tel Hadya during the 2009 winter season to evaluate 36 spring habit genotypes of chickpea in a simple lattice. The harvested plot size was 2.45 m² (4 m × 4 rows × 0.35 m inter-row distance). The CV for yield was 7.8 % under the lattice model and 9.3 % under the RCB model. We refer to this trial as Chickpea-SP.

Statistical Method

Here, we first describe the method used by Singh et al. [14], and then illustrate its application with the example of the Lentil-YT trial with 16 genotypes.

Consider an experimental design used for generating the response (data) and a model for the data analysis. The data analysis model might have been selected from a set of candidate models using a criterion such as the Akaike information criterion (AIC) [1]. Then, plot residuals are obtained by fitting the selected model.

Step 1 (K-Means Clustering)

Taking the squared residuals as the data, the experimental plots are clustered by means of K-means clustering, using the criterion that maximizes the between-group sum of squares. The clustering is based on squared residuals because variance components are based on the squares of residuals. The K-means clustering method requires the number of groups or clusters to be set a priori. We varied the number of groups over \( k = 2, \ldots ,10 \). Then the models, in terms of block effects and any other spatial correlation structure for plot errors, were fitted while allowing the error variance to vary with the cluster/group, for each of \( k = 2, \ldots ,10 \). For a given k, the error variance for the plots of group i and the number of plots in it are denoted by \( \sigma_{i}^{2} \) and \( n_{i} \), respectively, where \( i = 1,2, \ldots ,k \). For example, if the experimental units were grouped into 3 clusters (k = 3), then the fitted model will produce estimates of the 3 error variances: \( \sigma_{1}^{2} \) on \( n_{1} \) plots, \( \sigma_{2}^{2} \) on \( n_{2} \) plots and \( \sigma_{3}^{2} \) on \( n_{3} \) plots. The model fit is assessed in terms of the maximum likelihood of the data for the fitted model. Genstat software [11], which was used for computation in this study, expresses the maximum likelihood as restricted maximum likelihood (REML) [9]. The REML value is displayed as a quantity called deviance, which is defined as ‘minus twice the logarithm of REML value’, ignoring a constant, which depends on the fixed terms in the model. Genstat also produces the residual degrees of freedom, which vary with the number of error variance components associated with the number of clusters. By means of a clustering method on squared residuals, clusters/groups of plots can be determined. Because plots with similar squared residuals fall within the same cluster, the error variances arising from different groups of plots can be expected to differ. Two questions arise: how many (heterogeneous) groups are present within plot errors, and whether the error variances are heterogeneous for a selected group of the clusters, that is, for a pre-specified value of k.
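The clustering in Step 1 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the Genstat procedure used in the study: the residual values, the function name `kmeans_1d` and the choice of starting centres are all hypothetical. The sketch uses Lloyd's algorithm, which reduces the within-group sum of squares; since the total sum of squares is fixed, this is equivalent to increasing the between-group sum of squares, although only a local optimum is guaranteed.

```python
def kmeans_1d(values, k, n_iter=100):
    """K-means on scalars (Lloyd's algorithm); returns (labels, centers)."""
    srt = sorted(values)
    # Deterministic start at k evenly spaced order statistics. This is an
    # assumption; the source does not describe how clusters are initialised.
    centers = [srt[(2 * j + 1) * len(srt) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(n_iter):
        # Assign each value to its nearest centre.
        labels = [min(range(k), key=lambda j: (v - centers[j]) ** 2)
                  for v in values]
        # Recompute each centre as the mean of its members.
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members
                               else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical plot residuals from a fitted constant-variance model.
residuals = [1.0, -1.2, 0.8, 5.0, -5.5, 12.0, -0.9, 4.8, 11.5, 1.1]
squared = [r * r for r in residuals]  # clustering is done on squared residuals

labels, centers = kmeans_1d(squared, k=3)
sizes = [labels.count(j) for j in range(3)]
print(labels, sizes)
```

The resulting `labels` play the role of the grouping factor that is later passed to the mixed model so that each cluster receives its own error variance.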

Suppose that we want to compare two groups of clusters of the experimental plots, say one group has J clusters (Group J), i.e. with J heterogeneous variances, and another J′(>J) clusters (Group J′). If the Group J′ is nested within Group J, then the change in the deviance can be used as a Chi-square test with J′–J degrees of freedom (d. f.) to test the hypothesis that the error variances in the nested group (Group J′) of units are the same as those of Group J. However, in general, one group of clusters with J′ variances is not necessarily nested within the other group of J variances, in which situation, the model selection can be carried out using a criterion such as AIC (Akaike Information Criterion). AIC has been written in terms of the deviance, AICD = deviance + 2 × number of variance–covariance parameters [13, 17].

Then, based on the AICD values for the model fitted using a constant variance (k = 1, say) and heterogeneous variances for \( k = 2, \ldots ,10 \), select the value of k, say k*, for which the AICD is lowest. Suppose that, based on the selected group of k* clusters of plots, the estimated variance components are denoted by \( \hat{\sigma }_{i}^{2} (i = 1,2, \ldots ,k^{*} ) \). Step 1 does not establish that the variances are all different, nor does it rule out scope for merging clusters whose variances are close. Step 2 therefore checks whether the two clusters with the closest variance estimates can be merged to form a new, smaller group of clusters; repeated merging may, in the extreme case, reduce the group down to just one cluster.
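The selection of k* by AICD can be sketched as below. The deviances and parameter counts are hypothetical placeholders, not values from Table 1; only the formula AICD = deviance + 2 × number of variance–covariance parameters is taken from the text.

```python
def aicd(deviance, n_var_params):
    """AIC expressed through the REML deviance, as defined in the text."""
    return deviance + 2 * n_var_params


# Hypothetical fits: k -> (deviance, number of variance-covariance parameters).
fits = {
    1: (340.0, 3),
    2: (325.0, 4),
    3: (316.0, 5),
    4: (315.5, 6),
}

aicd_by_k = {k: aicd(dev, q) for k, (dev, q) in fits.items()}
k_star = min(aicd_by_k, key=aicd_by_k.get)  # k with the lowest AICD
print(aicd_by_k, k_star)
```

In this made-up example the deviance keeps falling at k = 4, but the penalty for the extra variance parameter outweighs the small improvement, so k = 3 is retained.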

Step 2

Let \( D_{k^{*}} \) be the deviance for the model fitted with the k* groups selected by the AICD criterion. Then, merge the two clusters that have the nearest error variance estimates. In this case, the group with the larger number of clusters is nested within the group with the merged, smaller number of clusters. The model is fitted again, and the error variances and the deviance, \( D_{k^{*} - 1} \), are computed. If \( D_{k^{*} - 1} - D_{k^{*}} \) is greater than the critical value of Chi-square on 1 d.f. at a chosen significance level, then the merging is not accepted, the k* heterogeneous groups are considered final and the process is stopped. Otherwise, the two clusters are merged into one, and the model is fitted with the newly formed group of k* − 1 clusters and the associated AICD value is computed. With this group of k* − 1 clusters, repeat the above step. If merging continues, one needs to proceed until all the units are in one group, which is the (relatively rare) case of homogeneous error variances.
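A single pass of Step 2 can be sketched as follows. The function names are hypothetical, and the model refitting itself (done with REML in Genstat in the study) is not reproduced; the numbers used are the variance estimates and deviances reported for Lentil-YT in the illustration. The Chi-square tail probability on 1 d.f. is computed from the identity P(χ²₁ > x) = erfc(√(x/2)).

```python
import math
from itertools import combinations


def closest_pair(variances):
    """Indices of the two clusters whose variance estimates are nearest."""
    return min(combinations(range(len(variances)), 2),
               key=lambda ij: abs(variances[ij[0]] - variances[ij[1]]))


def chi2_sf_1df(x):
    """Upper-tail probability of a Chi-square variate on 1 d.f."""
    return math.erfc(math.sqrt(x / 2.0))


def keep_separate(dev_merged, dev_full, alpha=0.05):
    """True if merging raises the deviance significantly (merge rejected)."""
    return chi2_sf_1df(dev_merged - dev_full) < alpha


# Values reported for Lentil-YT in the illustration.
var_hat = [5443.0, 8018.0, 23247.0]
pair = closest_pair(var_hat)             # candidate clusters to merge
p_value = chi2_sf_1df(327.6 - 316.1)     # change in deviance, D2 - D3
reject_merge = keep_separate(327.6, 316.1)
print(pair, round(p_value, 5), reject_merge)
```

If the merge were accepted instead, the same check would be repeated on the reduced set of clusters after refitting the model.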

Step 3

Having decided on the clusters of heterogeneous variances, the model is then used to compare genotype effects and to assess the efficiency of the design. With multiple error variances arising in a single experiment, there will be several CVs based on these variances. The field heterogeneity can then be expressed in terms of the distribution of the CVs, or at least in terms of the minimum, maximum and mean on a plot basis. The mean will be the weighted mean of the CVs, with weights equal to the number of plots within the clusters.
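The per-plot summary of CVs can be sketched as below. The cluster variances and plot counts are those reported for Lentil-YT in the illustration; the trial mean is a hypothetical value, and the definition CV = 100 × (square root of the cluster variance) / (trial mean) is an assumption, since the text does not spell out the formula.

```python
import math

# Cluster error variances and plot counts reported for Lentil-YT.
variances = [5443.0, 8018.0, 23247.0]
n_plots = [28, 11, 9]

# Hypothetical trial mean; the actual value is not given in this section.
trial_mean = 2000.0

# Assumed definition: CV_i = 100 * sqrt(sigma_i^2) / trial mean.
cvs = [100.0 * math.sqrt(v) / trial_mean for v in variances]

# Weighted mean over plots, with weights equal to the cluster sizes.
mean_cv = sum(n * cv for n, cv in zip(n_plots, cvs)) / sum(n_plots)

print(f"min {min(cvs):.2f} %, max {max(cvs):.2f} %, mean {mean_cv:.2f} %")
```

Because most plots fall in the low-variance cluster, the weighted mean sits closer to the minimum CV than to the maximum.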

Furthermore, the heritability and genetic advance (gain) due to selection were estimated by fitting the model in which the genotypic effects are treated as independent random variables. We obtain estimates of the genotypic variance component and of the heterogeneous error variances, which are likely to differ from those of the model in which genotype effects are assumed fixed. Heritability estimates vary with the environmental variance. Thus, for a given trait, if \( \sigma_{g}^{2} \) stands for the genotypic variance component and \( \sigma_{i}^{2} \) for the experimental error variance (environmental variance for a single trial) of the i-th cluster of plots, the heritability on a plot basis is given by \( h^{2} = \sigma_{g}^{2} /(\sigma_{g}^{2} + \sigma_{i}^{2} ) \) \( (i = 1,2, \ldots ,k) \). Considering a 20 % selection intensity, the genetic gain, GA (20 %), will be given by

$$ \% GA(20\% )\,\, = \,\,100C(\sigma_{g}^{2} /\bar{Y})/(\,\sigma_{g}^{2} \, + \,\,\sigma_{i}^{2} /r)^{1/2} $$

where, for a general intensity of selection p (0 < p < 1), \( C = \frac{1}{p\sqrt{2\pi}} e^{-z_{p}^{2}/2} \), the truncation point \( z_{p} \) in the standard normal distribution is given by the equation \( \int\nolimits_{z_{p}}^{\infty} \frac{1}{\sqrt{2\pi}} e^{-x^{2}/2} dx = p \), \( \bar{Y} \) is the trial or location mean and r is the number of replications. For p = 20 %, C = 1.4 [7].
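The constant C, the per-cluster heritability and the genetic advance can be computed as in the following sketch. The genotypic variance and trial mean are hypothetical, and the error variances are borrowed from the fixed-genotype fit of the illustration purely as example inputs; the function names are illustrative. The truncation point is taken as the point that cuts off the top fraction p of a standard normal distribution.

```python
import math
from statistics import NormalDist


def selection_constant(p):
    """C = phi(z_p) / p, with z_p cutting off the top fraction p."""
    z_p = NormalDist().inv_cdf(1.0 - p)
    # The normal density is symmetric, so the sign of z_p does not affect C.
    return NormalDist().pdf(z_p) / p


def heritability(sigma_g2, sigma_i2):
    """Per-plot heritability for a cluster with error variance sigma_i2."""
    return sigma_g2 / (sigma_g2 + sigma_i2)


def genetic_advance_pct(sigma_g2, sigma_i2, trial_mean, r, p=0.20):
    """%GA = 100 C (sigma_g2 / mean) / sqrt(sigma_g2 + sigma_i2 / r)."""
    c = selection_constant(p)
    return 100.0 * c * (sigma_g2 / trial_mean) / math.sqrt(sigma_g2 + sigma_i2 / r)


c20 = selection_constant(0.20)

# Hypothetical genotypic variance and trial mean; r = 3 for a triple lattice.
sigma_g2, trial_mean, r = 20000.0, 2000.0, 3
error_variances = [5443.0, 8018.0, 23247.0]

h2 = [heritability(sigma_g2, s) for s in error_variances]
ga = [genetic_advance_pct(sigma_g2, s, trial_mean, r) for s in error_variances]
print(round(c20, 3), [round(x, 3) for x in h2], [round(x, 2) for x in ga])
```

Each cluster yields its own heritability and genetic advance, which is why the results are later summarised as a range and a per-plot mean rather than as a single value.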

An Illustration

Consider the dataset for Lentil-YT. In an earlier analysis, the best model identified, under the assumption of homogeneous error variances, was ‘randomized complete blocks with a separable first order autoregressive structure along rows and along columns’ (RCBArAr). Further details on such models' descriptions and notations are given in Singh et al. [13]. We attempted fitting heterogeneous error structures based on the clusters obtained from the residuals under RCBArAr, but convergence of the restricted maximum likelihood procedure did not take place. Therefore, we used the residuals under the lattice design and analysis model with constant variance. The K-means clustering approach, which maximizes the between group sum of squares, was applied on the squared residuals. Clusters of the experimental units were obtained for the groups of \( k = 2, \ldots ,10 \) clusters. In Table 1, row k = 1 stands for no grouping having occurred, taking all the units as a single group. For each of these groups of clusters, the mixed linear model was fitted to the data by means of the REML procedure of Genstat. The model was described in terms of random effects for replications, blocks within replications and plot errors having variances varying with the clusters of a given group \( k = 1, \ldots ,10 \). Some of the key directives in fitting the mixed linear models in this case were:

$$ \begin{aligned} & {\text{VCOMPONENTS}}[{\text{Fixed}} = {\text{Geno}}]{\text{Rep}} + {\text{Rep}}.{\text{Blk}} + {\text{f}}.{\text{Rows}}.{\text{Cols}};\;{\text{CONSTRAINTS}} = {\text{POSITIVE}} \\ & {\text{VSTRUCTURE}}[{\text{TERM}} = {\text{f}}.{\text{Rows}}.{\text{Cols}}]\;{\text{MODEL}} = {\text{diag}};\;{\text{Factor}} = {\text{f}} \\ & {\text{REML}}[{\text{PRINT}} = {\text{m}},{\text{c}},{\text{w}},{\text{mean}},{\text{d}}]\;{\text{Yield}} \\ \end{aligned} $$

where Rep, Blk, Geno, f, Rows, Cols and Yield stand for the factors/variate representing plot-wise assignment to replication, blocks within replications, genotypes, grouping factor from the cluster analysis, row and column position on the layout, and yield response, respectively. After fitting these models, we obtained the ‘deviance’, reflecting the departure of the model from the data (Table 1).

Table 1 The residual degrees of freedom, number of random terms, deviance and Akaike Information Criterion expressed as deviance (AICD) when variances were allowed to vary over various clusters

We note that in Table 1, using a k = 3 cluster group results in the lowest AICD value. Thus, there could be three heterogeneous groups. The estimates of the variances for the three groups were obtained as follows: \( \hat{\sigma }_{1}^{2} = 5443 \) on 28 plots/experimental units, \( \hat{\sigma }_{2}^{2} = 8018 \) on 11 plots and \( \hat{\sigma }_{3}^{2} = 23247 \) on 9 plots. Of these, the estimates of the first two variances, \( \sigma_{1}^{2} \) and \( \sigma_{2}^{2} \), are closest, as their difference is the smallest of all the paired differences. They were, therefore, merged to form a total of just two groups with 28 + 11 = 39 and 9 plots. Here, the three cluster group is nested within the two cluster group. The mixed model with these two groups was fitted, and the deviance was calculated as \( D_{2} = 327.6 \) (d.f. = 27).

When compared with the deviance \( D_{3} \) (316.1, d.f. = 26, Table 1) for the group with three clusters, we find a change in the deviance, \( D_{2} - D_{3} = 327.6 - 316.1 = 11.5 \), which, compared against the Chi-square on 27 − 26 = 1 d.f., has a p value of \( 6.81 \times 10^{-4} \), which is less than the 5 % level of statistical significance. Thus, the two clusters of the three cluster group should not be merged. Therefore, we conclude that the plot error variances fall into three heterogeneous groups, with variance estimates as above. The positions of the plots with the three cluster group are shown in the layout, as in Table 2.

Table 2 Field layout of the Lentil-YT and the plot positions of the three (1–3) heterogeneous error variances (1: 28 plots with \( \hat{\sigma }_{1}^{2} = 5443 \); 2: 11 plots with \( \hat{\sigma }_{2}^{2} = 8018 \); and 3: 9 plots with \( \hat{\sigma }_{3}^{2} = 23247 \))

Using this heterogeneous structure, with factor ‘f’ standing for the three cluster groups in the above directives, one can obtain the best linear unbiased estimates (BLUEs) of the genotype means and an estimated average standard error of the differences of the genotype effects. In order to compute the heritability and genetic advance due to selection, the first of the above directives can be modified as:

$$ {\text{VCOMPONENTS}}\;{\text{Rep}} + {\text{Rep}}.{\text{Blk}} + {\text{Geno}} + {\text{f}}.{\text{Rows}}.{\text{Cols}};\;{\text{CONSTRAINTS}} = {\text{POSITIVE}} $$

The other two lines of code that follow remain unchanged. After running Genstat with these directives, one would arrive at estimates of the variance component due to genotypes (\( \sigma_{g}^{2} \)) and of the heterogeneous error variances \( \sigma_{i}^{2} \) (i = 1, 2, 3), which are likely to be different from those obtained when genotype effects were assumed fixed. The Genstat code required to carry out the analysis presented here is available from the authors on request.

Table 3 Summary statistics on significance of genotypic differences, CV % and efficiency of design–analysis model duo

Results

CV, Efficiency and Genotypic Significance

Following the method described above, clusters of plots associated with heterogeneous variances were identified, and estimates of their variances, mean CVs per plot, average SED (standard error of differences of genotype effects) and efficiencies of the genotypic comparison were computed. The statistical test of significance of genotypic effects (based on the hypothesis of equality of genotype means) is given in Genstat [11] in terms of a Wald statistic with a Chi-square distribution, and the associated p values for the test of significance are given in Table 3. Table 3 also shows results for the RCB and lattice models for comparison.

Table 4 Estimates of variance components, heritability and genetic advance due to 20 % selection intensity

Lentil-YT

In this trial, the lattice design did not show any improvement over RCB. However, a spatial model with a first-order autocorrelation structure along rows and columns was found to be superior using the AICD model selection criterion [13]. This model had an efficiency of 225 % compared with RCB. Further information on modelling the heterogeneity of variances is given in the illustration above. When modelled with lattice blocks, the plot errors showed three heterogeneous variances. The efficiency in this case improved to over 500 %. Further, the average SED was much lower when the variance heterogeneity was accounted for, which is also reflected in the high value of the Wald test statistic compared with the other three cases of constant error variances. The mean per-plot CV under the heterogeneous variance model (4.4 %) was much lower than the 6.4 % under the RCB and lattice models (an almost one-third reduction) and also lower than the 4.8 % under the RCBArAr model.

Lentil-AYT

In this trial, the CV for the RCB model was 9.9 %. The lattice block design was not very effective, as its efficiency was only 103 % relative to RCB. Using the AICD criterion, the best model with constant variance was ‘randomised complete blocks with plot errors following a first-order autoregression along rows’ (RCBAr), and the efficiency of this model was 125 % compared with RCB. Although the efficiency was higher than that of the lattice design, the autocorrelation along rows increased the CV slightly (from 9.9 to 10.1 %). In this case, residuals from the RCBAr model were used for identifying heterogeneous groups for error variances. For every grouping except that with two clusters, that is, for \( k = 3, \ldots ,10 \), at least one cluster was of size 1, i.e. one plot stood out as a plot with a different variance. Since a variance estimated from just one observation has poor support, we selected the RCBAr model with two heterogeneous variances. In this case, the efficiency of genotype comparisons was 252 % compared with RCB. There was also an increased statistical significance, with a Wald test statistic of 169.7 for heterogeneous variances vs. 87.39 for constant variance (the RCBAr structure being common to both). The CV varied from 7.3 to 19.9 %, with a lower average of 8.7 % per plot.

Chickpea-WP

For the Chickpea-WP trial with 49 genotypes and two replications, there were five heterogeneous error variances, with the number of plots per group varying from 8 to 28. The lattice design had an efficiency of 114 % compared with RCB (100 %). As can be expected, the lattice model had a higher power of discriminating between the genotypes as the Wald statistic value was higher than that for RCB. When we incorporated the heterogeneous variances, the efficiency for pairwise genotypes comparisons increased to 160 % compared to RCB. The CV for RCB was 6.5 % and 5.8 % for the lattice model. In the case of heterogeneous variances, the CV varied in the range of 1.3 % (28 plots) to 10.8 % (8 plots), with a mean CV per plot of 4.6 %, which is less than those for the RCB and the lattice models. The Wald statistic was 1368, which showed an enhanced power to discriminate the genotype effects (Table 3).

Chickpea-SP

Four heterogeneous groups of plots with different variances were statistically detected. Accounting for the heterogeneity, in addition to having used a lattice model, increased the efficiency to 165 %, and the Wald statistic to 3354 from 461 for the lattice model. The CV per plot was much lower, at 6.2 % (range: 1.8 % for 34 plots to 18.6 % for 6 plots).

Effect on Genotypic Variance Component, Heritability, and Genetic Advance

Table 4 gives the estimates of the variance components due to genotypes and experimental errors, and the estimates of broad-sense per-plot heritability and of genetic advance/gain due to selection at 20 % intensity, together with their means on a per-plot basis.

Lentil-YT

There was an increase in genotypic variance from the RCB or lattice model to the spatial model, and the models with heterogeneous plot errors. The spatial model gave the highest genotypic variance component value of the four models. The heterogeneous model involved a lattice model, but not the spatial error autocorrelation structure. Heritability was nearly the same for all the models, except in the case of heterogeneous variances where it varied from 59 to 98 %. A high heritability of over 95 % for seed yield in lentil was also reported by Sarker et al. [12] in a lattice model. The genetic advance values were slightly higher on average by 1 % (from 16 to 17.2 %) over the RCB/lattice models.

Lentil-AYT

There was an increase in genotypic variance under the heterogeneous error variance model over the other three models: RCB, lattice and RCBAr. The average heritability per plot of 53 % (range: 15–58 %) showed a considerable increase over the other three models. The genetic advance increased on average by nearly 1 % (from 8.5 to 9.7 %).

Chickpea-WP and Chickpea-SP

In the case of the two chickpea trials, winter and spring planting, a similar trend as in the case of the two lentil trials was observed. Accounting for the heterogeneity of plot error variance led to a net increase in genotypic variance of 10 % in the Chickpea-WP and 9 % in the Chickpea-SP trials when compared with RCB, while in the case of lattice models, these values were 3.8 % for the Chickpea-WP and 5 % for the Chickpea-SP. There was on average a 10 % increase in heritability over RCB, and a net increase in genetic advance of more than 1 % (14.2 % for the heterogeneous variance model vs. 13 % in RCB for the Chickpea-WP; 28.5 % under the heterogeneous variance model vs. 26.8 % under RCB for the Chickpea-SP).

Discussion and Conclusions

In field trials, crop scientists focus on separating the genotypic differences or variation as precisely as possible from the factors inherent in the field, even after accounting for such known factors as replication, blocks within replications, and any other factors accounting for local fertility trends, or structural parameters such as autocorrelation between plot errors. The magnitude of the CV of plot errors provides a guide to researchers on how far one should intensify the experimental design and data analysis to achieve the highest level of precision. In orthogonal designs such as complete blocks, the CV and the number of replications determine the precision for comparisons amongst genotypes. In incomplete block (balanced or unbalanced) designs, the autocorrelated plot errors also determine the precision of genotype differences. In the usual context of data analysis from field experiments, the common assumption is that the plot errors have a homogeneous variance. However, such an assumption may not be tenable, because many factors create different degrees of interference: effects on a plot due to residual effects over years, or within-plot interference due to diseases, pests or competition between and within crop rows [15]. Errors in the methodology applied in data collection and entry may also be reflected in extreme residual values, which can be examined for the presence of outliers; otherwise such errors might be confounded with plot heterogeneity. Often, plot scores such as percentage plant stand (after emergence and at harvest) are taken to explain the heterogeneity, but such variables actually depend on the genotype and should not be used as covariates for plot heterogeneity. Even after accounting for various systematic factors, the pattern of the residuals has been found to have a non-constant variance; for example, its spatial structure was found to reflect an exponential variogram model in lentil [12] and a spherical model in herbage plant trials [16].

In the four ‘well-run’ (i.e. CV values of less than 10 %) trials studied, we noticed considerable improvement in efficiency for genotype comparisons and the standard error of differences when lattice blocks and spatial autocorrelations were compared with an RCB model (Table 3). An examination of the graphs (or the table of deviance) established that in all four cases, there was a presence of at least two heterogeneous variances of plot error, as detected by a statistically significant difference in the deviances, when compared with the homogeneous variance model (Table 5). Singh et al. [14] provided a systematic approach for exploring the number of heterogeneous variances. The effect of accounting for the heterogeneity of variances has demonstrated a substantial gain in efficiency of the experimental design, and the enhanced analysis method used in all four trials led to reduced SED values, and hence increased the power of genotypic discrimination. Singh et al. [14] listed examples of high CV values under RCB models where genotypes' effects were found to be non-significant under a homogeneous error variance model. But, by bringing heterogeneous variances into the model, the genotypic difference became apparent. Therefore, some genotypes which were not significantly different from a best check in a homogeneous error variance analysis might become significant in the heterogeneous model or vice versa. While one expects a gain by such an attempt in highly heterogeneous fields, this study showed that improvement in precision can be brought about even in trials considered ‘well-run’, that is to say, with CV values of less than 10 %.

Table 5 Comparing the case of at least two heterogeneous variance groups

There were cases where groups of heterogeneous variances could not be fitted while retaining a spatially autocorrelated error structure along rows as well as along columns, because of convergence failure in the model fitting. In the case of Lentil-YT, sacrificing the autocorrelation structure and introducing heterogeneity of error variances nevertheless led to a substantially higher efficiency (SED = 49.65 for the RCBArAr model vs. SED = 32.95 for the lattice model with three heterogeneous error variances).
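The size of this gain can be gauged with a short calculation. The sketch assumes that relative efficiency is measured as the ratio of squared average SEDs, a common convention that this section does not state explicitly; the function name is hypothetical, while the SED values and the 225 % figure are those reported in the text.

```python
def relative_efficiency_pct(sed_reference, sed_model):
    """Efficiency of a model relative to a reference, as a ratio of squared SEDs.

    Assumed definition; the section does not give the formula.
    """
    return 100.0 * (sed_reference / sed_model) ** 2


# SED values reported for Lentil-YT.
sed_rcbarar = 49.65   # RCBArAr model, constant variance
sed_hetero = 32.95    # lattice model with three heterogeneous variances

eff_vs_rcbarar = relative_efficiency_pct(sed_rcbarar, sed_hetero)

# Chaining with the reported 225 % efficiency of RCBArAr over RCB.
eff_vs_rcb = 225.0 * eff_vs_rcbarar / 100.0
print(round(eff_vs_rcbarar, 1), round(eff_vs_rcb, 1))
```

Under this assumed definition, the chained value of roughly 511 % agrees with the efficiency of over 500 % reported for Lentil-YT in the Results.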

The groups of plots with heterogeneous variances, identified as described in the Materials and Methods section, were further used to estimate the genotypic variance component and genetic advance. This process also resulted in a general increase in genotypic variance, which implies that the estimates of heritability and genetic advance could in reality be higher. The presence of multiple error variances, however, does pose the controversial and inconvenient issue of having to deal with multiple indices for heritability and genetic advance. Nevertheless, the multiple indices offer a more satisfactory explanation of reality than a compromise on a single value of heritability or genetic advance. In the present four trials, the number of heterogeneous variances varied from 2 to 5. Therefore, it is more realistic to report the distribution of heritability and genetic advance rather than mean values alone. The mean of genetic advance over the plots gave an additional gain of about 1 % in each of these cases. Although the trials analysed here have relatively low CV values, the proposed procedure still captures heterogeneous variances, which could help shorten the breeding cycle further by achieving the expected genetic gain sooner, thus increasing breeding efficiency. Modelling has shown that homozygosity can be achieved quicker if we know allele status, which is as expected. In fact, a fully homozygous individual can be identified as early as in the F2, if population size is large enough to accommodate all the loci segregating, which is pure Mendelian genetics. An extension of this study, to be addressed separately, could be to allow the plot error variance to vary with the genotype and to evaluate the precision of the predicted means of the genotypes. Furthermore, alternative approaches for forming clusters of plots with homogeneous variances may be explored and compared.

This study was conducted to explore the possibility of the role of heterogeneous variances of plot errors in field experiments, and the study supports that heterogeneous error variances are a reality even in ‘well-run’ trials. The method of analysis proposed by Singh et al. [14] has been described and illustrated here for such ‘well-run’ trials. Modelling of heterogeneous variances, in addition to accommodating the effects of systematic factors and autocorrelations between plot errors, has added substantial value to the trials in terms of a significantly higher efficiency of the experimental design and analysis model, a higher power of genotypic discrimination, a lower average CV and a higher genetic advance. This new analysis approach is recommended for adapting the analysis of field trials, but is also relevant to many other experiments (including outside of agricultural research) permitting more precise examination of the experimental residuals and improved precision of conclusions drawn.