Introduction

Barley (Hordeum vulgare L.) is an extremely important crop for Scotland. There were 291,000 hectares of barley grown in Scotland in 2017 (Agricultural Census 2017) with an estimated value of £250 million (Economic Report on Scottish Agriculture 2018). To protect the industry, all marketed seed is certified. This is to provide assurance to the purchaser that the seed is the correct variety, free from contaminants, has a high germination rate and is free from major pests and diseases. In the EU, the quality standards that certified seed must meet are set out in Council Directive 66/402/EEC (1966). The Organisation for Economic Co-operation and Development (OECD) is responsible for setting the protocols and standards for seed certification. In the UK, NIAB and SASA are the two main organisations that are responsible for crop inspections and the assessment of control plots required for seed certification. Currently around 40 different barley varieties per year are certified at SASA, the certifying authority in Scotland. The identification of the variety during certification is based on the visual inspection of specific characteristics of the seed used to produce the crop (e.g. C1 seed), the crop itself and the seed to be certified (e.g. C2 seed). The management and analysis of certification control plots at SASA is very costly and time consuming.

The taxonomic characters used to ascertain varietal identity and purity are called DUS (Distinctness, Uniformity and Stability) characteristics, and are assessed during the registration of a new variety according to national and international legislation and protocols (EU Commission Directive 2016/1914/EU; CPVO 2015; UPOV 1994). The UK DUS protocol for barley is available from the Animal and Plant Health Agency (APHA) of the UK government. Scoring of DUS characteristics is time consuming, expensive and requires extensive training and expertise.

It has been proposed that molecular testing may be a way to make variety registration and certification more efficient and cost effective (for example in barley, Cockram et al. 2012; Jones et al. 2013; Jamali et al. 2017). Molecular methods combined with appropriate sampling strategies have the potential to determine whether a variety is distinct from other varieties of the same species, and whether it is uniform and stable through propagation.

The International Union for the Protection of New Varieties of Plants (UPOV) is the international organisation with the mission to provide and promote an effective system of plant variety protection, with the aim of encouraging the development of new varieties of plants. A UPOV working group has been considering the potential of molecular markers in support of plant breeders’ rights (UPOV 2010, 2013). UPOV supports the use of “characteristic-specific molecular markers”, in other words, markers that directly predict a trait. It also supports combining phenotypic and molecular distances in the management of variety collections. However, it does not currently support the use of molecular marker characteristics to replace the DUS characteristics (UPOV 2011). The European Community joined UPOV in 2005 and the Community Plant Variety Office (CPVO) is the European Union agency which manages the European Union system of plant variety rights covering the 28 Member States. Examples of where molecular techniques are used in Europe to manage the variety reference collection are potato genotyping (using 9 or 12 SSRs) (Reid et al. 2011; Hoekstra and Reid 2015) and maize genotyping (using 312 KASP SNPs) (UPOV 2014). For potato, genotypes are compared with a database containing over 2000 varieties in common knowledge. Matches of 85% and above are reported to CPVO and to the examination office that submitted the sample.

An OECD working group on the use of molecular and biochemical techniques in seed certification was established in 2011. The working group surveys the molecular methods available in order to evaluate their usefulness and to recommend validated tests and how to use them. Internationally validated methods have been approved for use under certain conditions to complement field inspections and control plots. Another organisation that is interested in molecular tests for variety identification is the International Seed Testing Association (ISTA). A new method for wheat identification using microsatellite markers became effective from January 2017 and is described in chapter 8.10.2.1 of the ISTA Rules. Other methods using molecular markers are under development; for example, a ring trial using SSRs for barley identification was organised in 2018. A joint workshop between OECD, UPOV, ISTA and AOSA was held on 8th June 2016. The outcome was an agreement to produce a joint document explaining the principal features of the systems of OECD, UPOV, AOSA and ISTA, and to compile an inventory of the molecular techniques each organisation recommends for specific uses.

The work described here is the development of a molecular test to determine the varietal identity of barley using bulk seed samples. The molecular test is based on single nucleotide polymorphism (SNP) genotyping using KASP assays (He et al. 2014). KASP genotyping assays are based upon competitive allele specific PCR. Bi-allelic discrimination is achieved by allele specific primers, each with a different tail sequence. The advantages of SNP analysis using KASP assays are that they are cost effective to run, and simple to set up.

Barley SNPs have previously been extensively studied. The AGOUEB project was an international collaboration which involved development of two Illumina Golden Gate assays called BOPA1 and BOPA2, each able to detect 1536 SNPs (Close et al. 2009). These SNP assays were used to genotype hundreds of barley accessions. The genotypes were grouped using multivariate analysis into spring barley, two row winter barley and six row winter barley (Thomas 2014). The data also allowed association of DUS characteristics with specific SNPs (Cockram et al. 2012), genome-wide association study (GWAS) mapping of 15 morphological traits, and the identification of the HvbHLH1 gene as the causative gene for loss of anthocyanin pigmentation (Cockram et al. 2010). To improve the resolution of these GWAS, a 9k Illumina chip was developed from SNPs identified by sequencing RNA from 10 different barley cultivars (Comadran et al. 2012). Recently a 50k SNP chip has been developed using SNPs discovered by exome capture from European barley varieties (Bayer et al. 2017). Together with the sequencing of the barley genome by the International Barley Genome Sequencing Consortium in 2012 and 2017 (IBGS 2012; Mascher et al. 2017), this will enable trait linked SNPs to be discovered for selective marker breeding and causative gene detection.

In the work presented here, we identified a minimum set of 45 SNPs to identify and distinguish most of the barley accessions, using previous data from around 700 barley varieties genotyped at over 6000 SNPs (Comadran et al. 2012; Looseley et al. 2018). Of these 45 SNPs, 38 were successfully converted into working KASP SNP assays. These 38 KASP SNPs were used to create reference genotypes from the barley varieties certified at SASA. Barley seed was used rather than plant material, to save the time it takes to germinate the seed and grow enough plant material for DNA extraction. The 38 SNP set can distinguish between all of these varieties. The 38 SNP set was used with blind samples and could correctly identify samples with a known reference genotype as well as distinguish new varieties.

Materials and methods

SNP selection

To identify the smallest subset of the available 6138 SNPs that can still separate the same pairs of varieties as the complete set, we used the IRREDUNDANT directive in the software package GenStat (VSN International 2015). Due to the size of the data set, we used the sequential algorithm (Payne and Preece 1980).
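The idea of reducing a marker set while keeping varieties separable can be illustrated with a simple greedy heuristic. The sketch below is not the GenStat IRREDUNDANT directive or the sequential algorithm of Payne and Preece (1980); the function name, the string encoding of genotypes and the toy data are assumptions made purely for illustration.

```python
from itertools import combinations


def greedy_snp_subset(genotypes, min_diff=1):
    """Greedily pick SNP indices until every separable pair of varieties
    differs at `min_diff` selected SNPs (or at as many as the full set allows).

    `genotypes` maps a variety name to a sequence of SNP calls (hypothetical
    encoding: one character per SNP).
    """
    names = list(genotypes)
    n_snps = len(next(iter(genotypes.values())))

    # For each pair of varieties, record the SNPs at which they differ.
    diffs = {}
    for a, b in combinations(names, 2):
        d = {i for i in range(n_snps) if genotypes[a][i] != genotypes[b][i]}
        if d:  # pairs identical across all SNPs can never be separated
            diffs[(a, b)] = d

    # Number of additional differing SNPs each pair still needs.
    need = {pair: min(min_diff, len(d)) for pair, d in diffs.items()}

    chosen = []
    while any(need.values()):
        # Take the unused SNP that helps the most still-unsatisfied pairs.
        best = max(
            (i for i in range(n_snps) if i not in chosen),
            key=lambda i: sum(1 for p, d in diffs.items() if need[p] > 0 and i in d),
        )
        chosen.append(best)
        for p, d in diffs.items():
            if need[p] > 0 and best in d:
                need[p] -= 1
    return sorted(chosen)


# Toy data: four invented varieties scored at four SNPs.
toy = {"VarA": "AAGT", "VarB": "AAGC", "VarC": "GAGT", "VarD": "GAGC"}
subset = greedy_snp_subset(toy)
print(subset)
```

In this toy example SNPs 1 and 2 (zero-based) are monomorphic and are never selected. A greedy search of this kind does not guarantee the smallest possible subset, which is one reason a dedicated procedure was used for the real data.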

DNA extraction from seeds

For reference genotyping of varieties submitted in 2012, 100 seeds were ground using a coffee grinder until the sample was a fine powder. For the other years (2011, 2013 and 2014), 10 seeds of each variety (2 × 5 seeds, each in a 2 ml Safelock tube (Eppendorf) with a 6 mm cone ball (Retsch)) were partially homogenised using a TissueLyser (Qiagen) for 2 × 2 min at 30 Hz. Although this technique did not result in a fine powder, the embryo was fully homogenised (Marian McEwan, personal communication). The ground material from both replicate tubes was mixed together and a sub-portion of this was used for DNA extraction. The same DNA extraction method was used for samples from all years: 20 mg of ground material was weighed into a 1.5 ml Safelock tube (Eppendorf) and extracted using the DNeasy Plant Mini Kit (Qiagen). The kit instructions were followed, except that the column was washed twice with buffer AW2, then transferred to a new tube and centrifuged at 14,000 rpm for 2 min to completely dry the membrane. The DNA was eluted in 2 × 50 µl of molecular biology grade water (Sigma).

KASP SNP assay

KASP assays were designed by LGC Limited and consist of two tail sequences homologous to two different FRET probes (one VIC labelled, one FAM labelled). When an allele is preferentially amplified, the FRET probe is released from its quencher and fluorescence corresponding to the amount of that allele is emitted and can be measured at the end point of the PCR reaction. LGC Limited recommend using 5–50 ng of template DNA. As this is based on the smaller human genome, a slightly higher amount of DNA template would be preferable for barley. As there was some variability in DNA concentration between DNA extracts (measured by NanoDrop), a final DNA amount per assay of between 4.5 ng and 65 ng was used. Assays consisted of 5 µl KASP no-ROX master mix (LGC), 0.14 µl KASP assay mix (LGC provided a specific assay mix for each SNP) and 5 µl DNA template. Assays were performed in duplicate in 384 well plates, which were loaded by robot (Hamilton). Thermocycling and allelic discrimination were performed in an ABI 9700 real-time (Thermo-Fisher) with one of two sets of cycling conditions: either 94 °C 15 min, 10 cycles of 94 °C 20 s, 61–55 °C 60 s (−0.6 °C per cycle), then 26 cycles of 94 °C 20 s, 55 °C 60 s; or 94 °C 15 min, 36 cycles of 94 °C 20 s, 57 °C 60 s, then 30 °C 60 s. The cycling used was as recommended by LGC for each SNP assay. For some of the assays a further cycling step was performed to form tighter clusters: 3 cycles of 94 °C 20 s, 57 °C 60 s. Each assay was run and analysed on a separate 384 well plate. All results were checked by eye. For some SNPs, the allelic discrimination software of the ABI 9700 real-time (Thermo-Fisher) was able to call points automatically, and for others manual analysis was required.
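The touchdown cycling conditions can be written out cycle by cycle as a quick consistency check. This is a minimal sketch, assuming that the first touchdown cycle anneals at 61 °C and that each subsequent cycle is 0.6 °C cooler; the function name and its parameters are illustrative and are not part of any LGC or instrument software.

```python
def touchdown_annealing_temps(start=61.0, step=0.6, touchdown_cycles=10,
                              final=55.0, final_cycles=26):
    """Return the annealing temperature (in degrees C) of every PCR cycle."""
    # Touchdown phase: the temperature drops by `step` with each cycle.
    temps = [round(start - step * i, 1) for i in range(touchdown_cycles)]
    # Constant phase at the final annealing temperature.
    temps += [final] * final_cycles
    return temps


temps = touchdown_annealing_temps()
print(len(temps), temps[:10])
```

Under these assumptions the last touchdown cycle anneals at 55.6 °C, the following cycle reaches 55 °C, and the total of 36 cycles equals the number of cycles in the alternative fixed-temperature protocol.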

Data analysis

Genotypes were compared using Rogers’ distance (Reif et al. 2005). Dendrograms were then produced following a hierarchical clustering using UPGMA (Sokal and Michener 1958).
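The distance and clustering steps can be sketched in a few lines. The example below assumes that each SNP call is coded as the frequency of one allele (0, 0.5 for a heterozygous call, 1) and that missing calls are `None`; this coding, the function names and the toy profiles are assumptions, as the exact encoding used in the analysis is not described here. For a bi-allelic locus, the per-locus term of Rogers' distance as commonly stated, sqrt(0.5 Σ_j (p_ij − q_ij)²), reduces to |p_i − q_i|.

```python
from itertools import combinations


def rogers_distance(p, q):
    """Rogers' distance between two bi-allelic SNP profiles.

    Each profile holds the frequency of one allele per SNP (0, 0.5 or 1);
    SNPs with a missing call (None) in either profile are skipped.
    """
    pairs = [(a, b) for a, b in zip(p, q) if a is not None and b is not None]
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def upgma(profiles):
    """Average-linkage (UPGMA) clustering; returns the tree as nested tuples."""
    # Each cluster is (subtree, list of leaf names).
    clusters = [(name, [name]) for name in profiles]

    def avg_dist(c1, c2):
        # UPGMA distance: mean of all leaf-to-leaf distances between clusters.
        total = sum(rogers_distance(profiles[a], profiles[b])
                    for a in c1[1] for b in c2[1])
        return total / (len(c1[1]) * len(c2[1]))

    while len(clusters) > 1:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda ij: avg_dist(clusters[ij[0]], clusters[ij[1]]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][0]


# Toy profiles for four invented varieties at four SNPs.
profiles = {
    "W1": [1, 1, 0, 0],
    "W2": [1, 1, 0, 0.5],
    "S1": [0, 0, 1, 1],
    "S2": [0, 0, 1, 0],
}
tree = upgma(profiles)
print(tree)
```

The nested tuples give only the topology of the dendrogram; branch heights would need to be recorded at each merge to draw it to scale. This naive implementation recomputes distances at every step and is intended for small examples only.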

Results

Identifying a SNP set for variety identification

A large dataset (Comadran et al. 2012) of around 700 varieties genotyped with 6138 SNPs was used to find an optimal minimal SNP set that can discriminate as many of the varieties as possible. The varieties in this dataset are diverse accessions originating from locations throughout the world. Using the original dataset we found seven pairs of barley varieties could not be distinguished by a single SNP; in other words they were genetically identical to each other across 6138 SNPs. One of these pairs is two varieties which are visually very similar to each other, indicating they could be genetically very similar or identical. Two pairs are varieties which are visually very different to each other and we don’t have information for one or both varieties for the other four pairs (they are not listed on the plant variety database). It is likely that the lack of discrimination of these pairs of varieties may be due to a mix up (handling error during sampling, DNA extraction or genotyping). An additional pair of varieties could not be distinguished by 2 SNPs or more (the varieties only had one SNP difference) and we don’t have information for one of the varieties in the pair.

In order to find a small set of SNPs that can distinguish as many barley varieties as possible, we used the IRREDUNDANT directive of GenStat to identify a set of 45 SNPs that can discriminate nearly all of the varieties tested by at least 2 SNPs (Table 1). The 45 SNP set cannot distinguish 10 pairs of varieties by at least two SNPs: 7 pairs are genetically identical (as explained above) and three pairs have only one SNP difference. None of these varieties are certified in Scotland. The SNP ID, sequence and chromosome location (if known) of the 45 SNPs are shown in Table 1. We consider this small SNP set to be suitable for distinguishing nearly all barley varieties.
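Checking how well a candidate SNP set separates a collection amounts to counting, for every pair of varieties, the number of selected SNPs at which they differ. The following sketch lists the pairs that fall below a chosen threshold; the function name, the string encoding and the toy genotypes are hypothetical and stand in for the real data set.

```python
from itertools import combinations


def weakly_separated_pairs(genotypes, snp_indices, min_diff=2):
    """List (variety_a, variety_b, n_differences) for every pair that differs
    at fewer than `min_diff` of the selected SNPs."""
    result = []
    for a, b in combinations(sorted(genotypes), 2):
        n = sum(genotypes[a][i] != genotypes[b][i] for i in snp_indices)
        if n < min_diff:
            result.append((a, b, n))
    return result


# Toy data: VarA and VarB are identical, VarC differs from them at one SNP.
toy_panel = {"VarA": "AAGT", "VarB": "AAGT", "VarC": "AAGC", "VarD": "GGAC"}
weak = weakly_separated_pairs(toy_panel, snp_indices=range(4))
for a, b, n in weak:
    print(a, b, n)
```

Pairs reported with zero differences correspond to varieties that are indistinguishable with the chosen SNPs, and pairs with one difference correspond to varieties separated by a single SNP only.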

Table 1 Name and sequence of SNPs used in this work

Reference genotypes of Scottish varieties

In order to create reference genotypes of barley varieties certified in Scotland we set up a low cost genotyping assay for variety identification. The 45 SNPs were converted into KASP SNP assays. Of the 45 SNPs, 44 were successfully developed into KASP assays at LGC Limited and 1 failed (** in Table 1). Genotypes were created with KASP assays from seed of barley varieties that were certified at SASA in 2011, 2012, 2013, and 2014. Barley seed was used rather than plant material, such as leaf discs, to save the time it takes to germinate the seed and grow enough plant material for DNA extraction. This ensures the results of the molecular test are obtained as quickly as possible. Preliminary experiments using a few SNPs showed that seeds and sprouts gave the same genotyping results (unpublished data). The names and years of the barley varieties used are given in Table 2. Genotypes for each variety of a single year were tested in duplicate. Of the 44 assays, 38 were found to be of a suitable quality for routine use (bold in Table 1) and this set of SNPs was used for all subsequent analysis. The other six KASP assays did not give clear clustering of genotypes and the results were not easy to interpret; these SNPs (* in Table 1) were not used in the analysis or in further studies. The original large data set (around 700 varieties genotyped with 6138 SNPs) was mined to determine whether the 38 SNPs could distinguish most varieties. Eight pairs of varieties cannot be distinguished by any of the 38 SNPs. This compares with 7 pairs of varieties that cannot be distinguished by the full 6138 SNPs or by the 45 SNP set.

Table 2 Barley varieties used in this work, with the year of certification at SASA

All varieties tested can be distinguished from each other (Fig. 1). Most varieties were identical across years and the few exceptions were seen mainly when the genotype appeared heterozygous. Five SNPs showed inconsistencies in a single year of a single variety. Since these were duplicate samples, where the same DNA was used in KASP assay in the same run, this is likely due to inaccurate calling of the genotype. 17 SNPs showed inconsistencies between years for a single variety. One SNP (RS_60145) showed inconsistencies between years for six different varieties. One SNP (RS_162929) showed inconsistencies between years for two different varieties. These inconsistencies could be due to differences in the genotype or inaccurate calling of the genotype. The genotypes of all varieties and years have been kept as ‘reference genotypes’ and can be used to compare samples where the variety is to be determined or verified.

Fig. 1

Dendrogram of barley varieties in the reference collection. These are varieties that were certified at SASA in 2012 (or 2011), 2013, and 2014. Genotypes were created with the 38 working SNP KASP assays. Winter varieties are coloured blue and spring varieties are red

Genotyping certifiable seed

To validate that these 38 SNPs can be used to distinguish and identify the barley varieties certified in Scotland, the 2016/2017 sowing season barley varieties submitted to SASA for certification (pre-basic, basic and C1 seed) were genotyped in parallel with characteristic identification in field trials. During the course of this work a new genotyping platform was developed in which approximately 43,000 SNPs are simultaneously genotyped using a 50k Illumina Infinium iSelect array (Bayer et al. 2017). The barley samples were genotyped using the 50k SNP chip, which contained 35 of the 38 SNPs. The remaining 3 SNPs were genotyped in house using KASP assays. The genotype of each sample was compared with the genotypes of the reference varieties (Table 3). The closest reference variety was found based on the number of SNPs at which the submitted variety matched each reference genotype. For 32 certification samples (out of a total of 60) the sample genotype matched the closest reference genotype at all 38 SNPs, and the submitted variety name was the same as the reference variety name. For 9 certification samples there was 1 SNP difference between the sample genotype and the closest reference genotype, and for 3 samples there were 2 SNP differences. In all 12 of these cases the name of the submitted variety still matched the name of the closest reference genotype. The rest of the samples had many (5 or more) SNP differences from the closest reference genotype, because the submitted variety was not present in the reference genotype dataset.
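The comparison of a sample with the reference genotypes can be sketched as a nearest-match lookup. In the example below the function names, the string encoding of genotypes, the invented variety names and the cut-off of 2 differences are all assumptions; the cut-off simply mirrors the observation above that names agreed whenever the closest reference differed at 2 SNPs or fewer, and it is not a validated decision rule.

```python
def closest_reference(sample, references):
    """Return (reference_name, n_differences) for the reference genotype
    that matches the sample at the largest number of SNPs."""
    def n_diff(ref):
        return sum(s != r for s, r in zip(sample, ref))

    name = min(references, key=lambda k: n_diff(references[k]))
    return name, n_diff(references[name])


def interpret(sample, submitted_name, references, max_diff=2):
    """Classify a sample using an assumed cut-off of `max_diff` differences."""
    name, d = closest_reference(sample, references)
    if d > max_diff:
        return "no close reference"
    return "confirmed" if name == submitted_name else "name mismatch"


# Toy reference genotypes for three invented varieties at five SNPs.
references = {"Alpha": "AAGTC", "Beta": "GAGTC", "Gamma": "GGACT"}
print(closest_reference("AAGTT", references))
print(interpret("AAGTT", "Alpha", references))
```

A sample with no close reference would typically be a variety that has not yet been added to the reference set, so in practice such a result would prompt the creation of a new reference genotype rather than a rejection.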

Table 3 2017 certification samples genotyped with 38 SNPs

Discussion

In this work we identified a small number of SNPs that can be used to distinguish and identify barley varieties. 38 SNPs have been used to genotype the barley varieties certified since 2012 in Scotland to create reference genotypes for each variety. The 38 SNP set is able to distinguish these varieties. The reference genotype of a single variety is mostly consistent between samples and certification years. There are exceptions, especially when a heterozygous allele is called. This can be due to differences in genotype or inaccuracies in calling the genotype. Development and maintenance of this reference genotype database will be necessary and the reference genotype database will continue to expand as more varieties are genotyped with the 38 SNPs. The names of the 2017 submitted variety matched the names of the closest reference genotype when the closest reference genotype had 2 or fewer SNPs difference to that of the sample genotype. Where the sample and reference genotype are not identical, possible reasons could be a genetic difference at the SNP or mis-calling of the allele.

The 38 SNPs are expected to be able to distinguish nearly all varieties: when previous data (Comadran et al. 2012), which include diverse varieties from throughout the world, were examined, only 8 variety pairs out of around 700 varieties could not be distinguished with this 38 SNP set. For this reason it is likely that any new variety developed will have a unique genotype at these 38 SNPs. Genotyping of new varieties with the 38 SNPs is ongoing at SASA, and the SNP set will be continually assessed for its ability to distinguish new varieties. Additional SNPs will be added to the 38 SNP set if necessary.

The work presented here provides a tool which can be used to identify and confirm barley varieties. This is useful to support seed certification as well as DUS testing. The genotype of a variety can be used as a type of barcode to determine identity. A future aim is to assess the usefulness of SNP genotyping (and this SNP set in particular) for supporting plant breeders’ rights. In order to register a new variety and protect it, the variety must be shown to be distinct, uniform and stable (this is called DUS testing). Currently, DUS characteristics are morphological features which are scored by inspection of plots as determined by UPOV. For molecular testing to be used instead of scoring morphological characteristics in seed certification, a shift in practice would be required, from variety testing through scoring physical characteristics of plant and seed to molecular testing in the laboratory. This shift would need to be accepted by UPOV, CPVO and OECD, and the corresponding regulations and protocols adjusted (Van Ettekoven 2017). Currently, using the genotype of a variety as the only way to identify a variety is not supported by UPOV.

However, a working group within UPOV is focused on establishing whether there are other ways genotyping data can be used to support DUS testing, or plant breeders’ rights. Previous work has attempted to determine whether genetic distance can be used to make decisions on which reference varieties to use during DUS testing. For barley, the AGOUEB data was used to investigate this and despite a correlation between genetic distance and morphological distance (determined by DUS characteristics) there was ambiguity when using genetic data instead of morphological data in determining similar varieties (Norris et al. 2011).

Some barley DUS characters can be well predicted using molecular markers. For example, “Ear: number of rows”, “Grain: disposition of lodicules” and “Seasonal type” can be predicted correctly, and a further nine characteristics are predicted 81–99% correctly (Cockram et al. 2012). Where markers can predict a DUS character, this could in principle replace the morphological character scoring in plots. A future aim would be to use the 50k SNP chip to determine whether other SNP markers could also effectively predict additional DUS characteristics. This would reduce the number of characteristics that need to be scored in the field, and enable a reduction in plot numbers by reducing the number of reference varieties required for comparison. Many other complex barley characteristics have been linked to molecular markers (for example, Fan et al. 2017; Nadolska-Orczyk et al. 2017; Zang et al. 2015; Lakew et al. 2013; Sandhu et al. 2012; Houston et al. 2012; Zhou et al. 2012; Wang et al. 2010, 2012).

Another future aim is to develop a barley variety identification tool to determine varietal purity. This will involve either performing high throughput barley identification on single seeds or by genotyping pooled seeds using a more sensitive SNP detection system which can detect differences of genotypes at very low levels. High throughput genotyping systems are available for KASP assays as well as chip and sequencing based assays. For purity testing it may be more cost effective to use a genotyping by sequencing method (for the 38 SNPs) using DNA extracted from a pool of seeds. The output would be number of reads of specific SNPs, and the results could give an indication of purity if careful calibration with pure reference genotypes was performed. To date no molecular methods have been approved by OECD for determining varietal purity. Previous work on barley purity testing has involved microsatellite genotyping with pools of small numbers of seeds (White et al. 2004).

In conclusion, the SNP assay described in this paper can be used to identify and confirm barley varieties in support of seed certification and DUS testing.