Introduction

Knowledge about the extent and distribution of genetic diversity in crop plants is important for any breeding program (Rahman et al. 2016; Khan et al. 2007). Linum usitatissimum L. is a multipurpose crop grown in many environments for fibre, food, industrial, feed and potentially pharmaceutical uses. The bast or phloem fibres have outstanding mechanical properties (Fernández 2016), combining strength and flexibility. Fibre orientation and the ratio of fibre length to diameter are important characters (Sliseris et al. 2016). Linseed oil has a high α-linolenic acid (an omega-3 fatty acid) content (Schardt 2005) and seeds are used whole or crushed in foods. The hardening oil is used for varnishes, linoleum, putty and for leather preparation. Linseed is used in local medicines as a demulcent, emollient and laxative, and is taken orally for bronchial infection and diarrhoea (Gul et al. 2016). Cellulosic and lignified material remaining after extraction of fibre or oil is used for manufacture of paper and straw board, and for animal feed or bedding. As a food, linseed is eaten with wot (a stew), particularly in northern Ethiopia, and is used to make a beverage for fasting days. Linseed is one of the oldest cultivated plants and was an important commercial crop before petroleum products and cotton came into extensive use (McHughen 1990). Because of its diverse uses and sustainability, there is renewed interest in its cultivation. Additional information on the genetic basis of variation is required to enable modern breeding exploiting the biodiversity in the species and its wild relatives (McKenzie et al. 2008; Kurt and Evans 1998). In Ethiopia, a Vavilovian centre of crop diversity (Zohary and Hopf 2000) and a major part of the Horn of Africa endemism hot-spot (Harrison and Noss 2017), linseed is valued for food and is also exported. We have assessed the morphological variation in the highly diverse Ethiopian germplasm (Worku et al. 2015), which is grown over a wide range of tropical and sub-tropical environments from 3 to 15°N and from 1200 to 3500 m a.s.l. Ethiopia has several agro-ecological zones (Hurni 1998), and other crops such as durum wheat (Mengistu 2016); enset (Ensete ventricosum (Welw.) Cheesm.) (Olango et al. 2015); tef (Eragrostis tef (Zucc.) Trotter) (Bedane et al. 2015; Ayalew et al. 2015); coffee (Tadele et al. 2014); sesame (Gebremichael and Parzies 2011); and barley (Muhe and Assefa 2011) have many landraces shown to have high genetic diversity in Ethiopia.

DNA-based molecular markers have advanced genetic studies in the last three decades. Reports from different authors (Uysal et al. 2010; Cloutier et al. 2009; Fu et al. 2002; Wiesnerova and Wiesner 2004) showed that RAPDs, AFLPs, SSRs and ISSRs have low diversity in linseed germplasm, although Pali et al. (2015) reported higher variation within 48 Indian accessions. Oil seed cultivars have been considered to have more genetic diversity than fibre cultivars (Fu et al. 2002). As might be expected from genetic bottlenecks during selection, diversity is higher among wild species, and within cultivated germplasm it decreases from landraces to breeding lines and then to cultivars (Smykal et al. 2011; Habibollahi 2015). The wide variation in chromosome number found in the genus Linum (2n = 16 up to 84) indicates that hybridization and polyploidy have played a role in its evolution (Bolsheva et al. 2015). To contribute to a better understanding of the genetic diversity and species relationships in this genus, Bolsheva et al. (2015) recommended comparative studies of karyotypes, and the development of suitable molecular markers will complement this approach.

A high proportion of the linseed genome is composed of repetitive DNA sequences with motifs present in tens to millions of copies (Cullis 2005). Primers amplifying DNA between SSRs (inter simple sequence repeat, ISSR), retroelements (inter retroelement amplified polymorphisms, IRAP) or SSRs and retroelements (retrotransposon-microsatellite amplification polymorphism, REMAP) have proved to be valuable as informative and polymorphic markers (Kalendar and Schulman 2006; Teo et al. 2005; Alsayied et al. 2015). Both SSRs (evolving by slippage replication or recombination) and retroelements (amplifying through an RNA intermediate and reinserting in the genome) can evolve rapidly and hence show polymorphisms in germplasm with little diversity. Different markers reveal different classes of variation (Powell et al. 1996) and evolve at different rates, so a primer useful in distinguishing varieties of a crop may be too polymorphic to be of use in wild germplasm of the same species. Saeidi et al. (2006, 2008) analysed a collection of wild Aegilops tauschii Coss. germplasm from Iran, finding that SSR markers developed for use in the D genome (derived from Ae. tauschii) of wheat cultivars were very highly polymorphic. Thus, while useful in the cultivated hexaploid, the SSR markers gave little information about relationships in the wild species because of their fast evolutionary rate compared to the time of diversification of the accessions. In contrast, within the Ae. tauschii subspecies, another marker class, IRAP markers, was able to show phylogeographic evolutionary patterns over the northern Iran region.

Here, we aimed to exploit molecular (ISSR and IRAP) markers to determine the levels and patterns of genetic diversity and polymorphism of Ethiopian linseed accessions in comparison with other international lines and wild relatives. This work contributes towards developing a functional genomic map and underpins work to identify candidate genes for agronomic or morphological characters; provides data on linseed crop diversity; can help choose parents for crosses in breeding programs; and suggests appropriate sites for increased activity in germplasm conservation or collection.

Materials and methods

Plant materials

Linseed germplasm from cultivated and some wild Linum species included 203 lines, of which 192 were linseed accessions (of these, 17 genotypes were segregating landraces which were divided during the study) and 11 (one accession represented by two genotypes) were accessions from seven wild Linum relatives (Table S1). The germplasm was acquired from international and in-country gene banks, research centres, and local farmers' on-farm holdings (Worku et al. 2015). One hundred and four (92 original + 12 segregating) accessions came from the Ethiopian Biodiversity Institute (EBI); 22 (17 + 5 segregating) lines came from Ethiopian Agricultural Research Centres; 57 collections came from local farmers; six varieties (AC Carnduff, AC Emerson, AC McDuff, AC Watson, CDC Bethune and Macbeth; AC McDuff represented by two genotypes) were from Canadian Crop Development Centres; one linseed cultivar was of Irish origin; 10 accessions (PI 253971 with two genotypes) of six wild Linum species were from USDA (PI 650336, PI 650322, PI 650318, PI 650315, PI 522290, PI 253971, PI 253971, PI 231886, PI 650297, PI 440472); and L. volkensii Engl. was a new collection from Ethiopia. Germplasm from Agricultural Research Centres is coded ARCxx and germplasm from local farmers is coded WLxxxx. Selection of accessions acquired from EBI took into consideration their distribution, so as to represent the different parts of the country and agro-ecosystems.

For spatial diversity analysis, altitude information was grouped into eight classes using the Agarwal (1996) formula: \(I = \frac{L - S}{K}\), where I is the class width; L and S are the largest (3449 m a.s.l.) and smallest (1410 m a.s.l.) values in the altitude records, respectively; K is the number of classes, obtained from \(K = 1 + 3.322\log_{10} n\); and n is the total number of observations, which is 130. Former administrative regional divisions of Ethiopia were used to study the regional diversity in the germplasm.
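The class-width calculation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `altitude_classes` is hypothetical, K is rounded to the nearest integer (the text reports eight classes but does not say how K was rounded), and the class boundaries are taken as equal-width intervals starting at the lowest altitude.

```python
import math

def altitude_classes(lowest, highest, n_obs):
    """Return (number of classes, class width, list of (lower, upper) bounds).

    Hypothetical helper: rounding K to the nearest integer is an assumption.
    """
    # K = 1 + 3.322 * log10(n)
    k = round(1 + 3.322 * math.log10(n_obs))
    # I = (L - S) / K
    width = (highest - lowest) / k
    # Equal-width intervals from the lowest to the highest altitude
    bounds = [(lowest + i * width, lowest + (i + 1) * width) for i in range(k)]
    return k, width, bounds

# Values reported in the text: S = 1410 m, L = 3449 m, n = 130
k, width, bounds = altitude_classes(1410, 3449, 130)
print(k, width)
for number, (lower, upper) in enumerate(bounds, start=1):
    print(f"class {number}: {lower:.0f} to {upper:.0f} m")
```

With the reported values this gives eight classes of roughly 255 m each, in line with the altitude ranges listed for the classes in the Results.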

DNA isolation, ISSR and IRAP-PCR analysis

Genomic DNA was isolated from freshly harvested leaves of a young, single plant per accession. One hundred and fifty mg of lyophilized dry leaves from 30 to 45-day-old seedlings were ground to a fine powder using a Silamat plus Vivadent machine in a 1.5 mL microcentrifuge tube with 8–10 glass beads. To the finely ground and chilled powdered leaves, a pinch of PVP (polyvinylpyrrolidone) and 1 mL of preheated 2% CTAB with β-mercaptoethanol solution were added. The homogenized content was then transferred to a 2 mL tube, incubated for 30 min at 65 °C in a water bath and mixed 2–3 times by inverting the tubes. Tubes were cooled to ambient temperature and 500 μL chloroform:isoamyl alcohol (24:1) was added to each tube. The content in the tubes was rocked at 180° with 36 turns/min for 30 min using an orbital shaker, then centrifuged for 10 min at 13,000 rpm; the supernatant was transferred to a new 2 mL tube and the chloroform extraction steps were repeated a second time. To the final supernatant, 600 μL ice-cold isopropanol was added, the tube was rocked gently and high molecular weight DNA was spooled out. The DNA extracts were washed using 1 mL wash buffer and dried, and the dried pellets were then dissolved in 150 to 250 μL TE buffer, pH 8.0. RNase treatment was conducted, and DNA quality and concentration were checked using 0.8% agarose gel electrophoresis and a Nanodrop (Thermo Scientific NanoDrop-1000 Spectrophotometer, USA) for A260/A280 and A260/A230 ratios. Finally, the DNA extracts were stored at −20 °C until needed for PCR and other experiments.

Two ISSR primers (3PCT1 and 3PCT2) from Wiesner et al. (2001), and a set of six IRAP primers: Sukula, Nikita, 3′LTR, 5′LTR, LTR6149 and LTR6150 (Kalendar et al. 2006; Teo et al. 2005; Saeidi et al. 2008) were used (Table S2). After tests with some genotypes and optimization of annealing temperature, two ISSR and six IRAP primers giving clear and polymorphic patterns were used for further analysis of DNA from single plants of the accessions.

PCR amplification reactions were conducted in a reaction volume of 15 µL using appropriate single primers or pairs of primers, 8.15–9.15 µL Sigma water, 1.5 µL 10× buffer, 1 µL of 250 mM MgCl2, 1.2 µL of 10 mM dNTP mix, 1 µL forward primer, 1 µL reverse primer where needed, 1 µL of 100 ng template DNA and 0.15 µL Taq polymerase (KAPA Biosystems). The annealing temperature was optimized using gradient PCR, and the annealing temperatures for 3PCT2, Sukula, Nikita and 3′LTR were 60, 48, 44 and 48 °C, respectively. The PCR programme for the ISSR (3PCT2) primer was 94 °C for 4 min for initial denaturation, then 30 cycles each comprising 1 min denaturation at 94 °C, 45 s annealing at 60 °C and 2 min extension at 72 °C, with a 7 min final extension at the end of the 30th cycle and a pause at 4 °C. The PCR programme for IRAP primers was 95 °C for 2 min for initial denaturation, then 30 cycles each comprising 1 min denaturation at 95 °C, 45 s ramping at +0.5 °C s−1 to the annealing temperature for each primer and 2 min + 3 s extension at 72 °C, with a 10 min final extension at the end of the 30th cycle and a pause at 4 °C. Fragments were size-separated by electrophoresis in 2% agarose gels, using, for IRAP primers, one part "high resolution" agarose and three parts normal agarose.

Analysis

Each marker band was scored for presence (1) or absence (0), and the data were analysed using the genetic analysis package PowerMarker ver. 3.23. Genetic distances between each pair of accessions were calculated from the data table using the Nei et al. (1983) frequency-based distance (Lapitan et al. 2007; Yoon et al. 2012). ISSR and IRAP data were analysed using Neighbour-joining with Nei et al. (1983) genetic distances. In the cluster analyses, the assays of associations between pairs of individuals or groups were replicated by bootstrap resampling. Depending on the number of individuals or groups of individuals, 100 or 1000 replications were made for cluster analyses or for grouping assays into more genetically related groups. Bootstrap percentages from 50 to 70% were used to determine members of a cluster, and phylogeny trees were displayed using Geneious (www.Geneious.com).
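As an illustration of the distance step, the Nei et al. (1983) D_A distance is commonly written as \(D_A = 1 - \frac{1}{r}\sum_{j=1}^{r}\sum_{i}\sqrt{x_{ij}\,y_{ij}}\), where \(x_{ij}\) and \(y_{ij}\) are the frequencies of allele i at locus j in the two units compared and r is the number of loci. The sketch below is not the PowerMarker implementation. It assumes that each scored band is treated as a two-state locus (present or absent), which is one possible encoding of 0/1 data and is not specified in the text; the function names and the example scores are hypothetical.

```python
import math

def band_to_freqs(bands):
    """Encode 0/1 band scores as per-locus [present, absent] frequencies.

    Assumption: each band is a two-state locus; a single plant has
    frequency 1.0 for the state it shows.
    """
    return [[float(b), 1.0 - float(b)] for b in bands]

def nei_da(freqs_x, freqs_y):
    """Nei et al. (1983) D_A distance between two sets of allele frequencies."""
    if len(freqs_x) != len(freqs_y) or not freqs_x:
        raise ValueError("both inputs need the same, non-zero number of loci")
    total = 0.0
    for locus_x, locus_y in zip(freqs_x, freqs_y):
        # Sum of sqrt(x_ij * y_ij) over the alleles of this locus
        total += sum(math.sqrt(px * py) for px, py in zip(locus_x, locus_y))
    return 1.0 - total / len(freqs_x)

# Hypothetical band scores for two accessions at four band positions
accession_a = [1, 0, 1, 1]
accession_b = [1, 1, 0, 1]
print(nei_da(band_to_freqs(accession_a), band_to_freqs(accession_b)))
```

Under this encoding the distance between two single plants is simply the fraction of band positions at which they differ; for groups of accessions, the same function can be given band frequencies between 0 and 1.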

Results

ISSR and IRAP primers amplified multiple, polymorphic products from genomic DNA of linseed (L. usitatissimum) and seven wild species. The total numbers of bands, degree of polymorphism, number of unique alleles, range of band size, gene diversity (GD) and polymorphism information content (PIC) exhibited by ISSR and IRAP primers from linseed and wild genotypes were calculated (Table 1a–d). In total, 203 genotypes were studied using four optimized PCR primer amplifications; 435 independent marker bands were scored in 41 gels, so that 88,305 band positions (203 × 435) were scored for presence or absence of marker bands. From the 203 defined accessions, 11,008 marker bands (with only 4.80% missing data where amplification or gel failed or was not tested) were analysed. Genotypes were split into cultivated and wild species groups, and the results for each group are given in Table 1a–d. Wild species were polymorphic with all primers and had multiple (up to 36 out of 59) unique alleles (Table 1c). Table 1d shows the analysis of the Ethiopian flax and linseed accessions excluding the Canadian reference lines; three of the 22 unique alleles were present in the Canadian germplasm only.

Table 1 Polymorphism and gene diversity exhibited by ISSR and IRAP primers: (a) in 193 accessions from cultivars + 11 accessions from seven wild Linum spp; (b) in 193 linseed accessions (185 from Ethiopia + seven [one doubled] from Canada + one from Ireland); (c) in 11 accessions [one doubled] from seven wild spp. [one from Ethiopia]; and (d) in 185 Ethiopian linseed accessions

The gene diversity (GD) value at individual primer level varied from 0.004 to 0.500 in the assayed genotypes (for the related metric of polymorphism information content (PIC), the maximum value was 0.375). The mean GD value from regional groups of genotypes ranged between 0.209 ± 0.152 from Gondar region and 0.385 ± 0.139 from Kefa region (mean PIC 0.176 ± 0.113 and 0.301 ± 0.106 respectively). The Canadian lines scored 0.210 ± 0.188 mean GD (0.171 ± 0.148 PIC), similar to the least diverse Gondar/Shewa accessions. It was also notable that the lines from the ARC Ethiopian research centres had relatively little diversity (GD 0.242 ± 0.150 and PIC 0.202 ± 0.113).
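The upper limits quoted above (0.500 for GD and 0.375 for PIC) are what the standard formulas give for a marker with two states at equal frequency. The sketch below uses the usual definitions, \(GD = 1 - \sum_i p_i^2\) and \(PIC = 1 - \sum_i p_i^2 - \sum_{i<j} 2p_i^2 p_j^2\); treating band presence and absence as the two states of a locus is an assumption made here for illustration, and the function names are hypothetical.

```python
def gene_diversity(freqs):
    """Gene diversity: 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)

def pic(freqs):
    """Polymorphism information content for one locus."""
    homozygosity = sum(p * p for p in freqs)
    # Correction term: sum over allele pairs i < j of 2 * p_i^2 * p_j^2
    cross = sum(
        2.0 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross

# Two-state marker with band frequency p and absence frequency 1 - p
for p in (0.5, 0.25, 0.05):
    states = [p, 1.0 - p]
    print(p, round(gene_diversity(states), 3), round(pic(states), 3))
```

For p = 0.5 this gives GD = 0.5 and PIC = 0.375, and both values fall as the band becomes rarer or more common, which is why PIC is always at or below GD for such markers.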

Germplasm was grouped by its origin in 13 former administrative regions of Ethiopia where these data were available (see Fig. 1, Table S1 and Table 2a). Individual regional groups of accessions (Table 2a) showed polymorphism levels from 61.90 to 99.65%: the lowest was in Canadian lines and the highest in the Gondar region. Ethiopian accessions with collection information were grouped into eight altitude classes (Table 2b). Like regional groups, all altitude classes showed less than 100% polymorphism (91.11 to 96.82%), with the least from altitude class 7 (high, around 3000 m) and the most from altitude class 2 (low, below 2000 m), almost at the extremes of the altitude range.

Fig. 1 Map of Ethiopia with latitude–longitude position information (Wikimedia Commons) and its former administrative regions

Table 2 Genetic diversity, degree of polymorphism and number of unique alleles: (a) within regional grouping; and (b) within altitude classes

Phylogenetic relationships

The genetic-marker distance data were used for phylogenetic analysis. The overall 203 genotypes were split into different populations based on their species, region and altitude classifications and used to create trees.

The 11 wild Linum accessions (Table S1) were chosen to represent four sections of the genus and various chromosome numbers (2n = 16, 18, 20, 28, 30). L. hirsutum L. was used as an outgroup: it is in section Dasylinum, and groups with L. volkensii (an old-world yellow-flowered flax) and L. trigynum L. (both section Linopsis; see McDill et al. 2009). The two L. austriacum L. accessions (sect. Linum), from Germany and Russia, are placed in different groups: the accession from Russia forms an independent cluster, and the accession from Germany groups with L. narbonense L. (sect. Linum) and L. flavum L. (sect. Syllinum). The four L. bienne accessions (sect. Linum) form a well-supported cluster.

When grouped by the 13 former regions (Fig. 1) used for collections, there were five reasonably well-supported clusters with greater than 65% bootstrap (Fig. 2b). Accessions from North Ethiopia (Tigray) grouped into cluster I; accessions from North and northern central parts of Ethiopia (Gondar and Shewa regions) grouped into cluster II; accessions from northwest, southwest and South Ethiopia (Gojam, Wellega, Illubabour, Kefa, Gamugofa and Sidamo) grouped into cluster III; accessions from southeast Ethiopia (Bale and Arsi) grouped into cluster IV; and accessions from eastern Ethiopia (Wollo and Hararghe) grouped into cluster V.

Fig. 2 Neighbour joining trees based on ISSR and IRAP data showing the genetic relationships among linseed and other Linum accessions. Numbers represent bootstrap re-sampling values for support at nodes (%). a 11 accessions of seven wild Linum species; b 13 former Ethiopian Administrative Regions for their linseed germplasm genetic diversity based on 163 linseed germplasm; c Eight altitude classes for their linseed germplasm genetic diversity based on 107 linseed germplasm. Altitude class 8 is used as an outgroup

The eight altitude classes from a range of 1410–3449 m (Table 2b) were grouped into four clusters by the relationship tree: the high altitude class 8 (3195–3449 m); altitude classes 1 (1410–1665 m) and 3 (1920–2174 m); altitude classes 4, 5, 6 and 7 (2175–3194 m); and altitude class 2 (1665–1919 m).

Discussion

Genetic diversity and polymorphism

IRAP and ISSR markers were appropriate for assay of genetic diversity in the Ethiopian linseed accessions and the Linum genus. Levels of polymorphism and shared bands enabled inferences about genetic structure, diversity and biogeography.

The relationships between the seven Linum wild species (Fig. 2a) were broadly consistent with the results based on a combined ITS and chloroplast (ndhF + trnK intron + trnL − F) topology by McDill et al. (2009). Based on EST-SSR primer data, Fu (2011) reported that L. bienne Mill. and L. decumbens (both sect. Linum) collected from different countries grouped into different clusters. Linseed and six of the seven Linum species did not have common alleles for the ISSR and IRAP primers used in the study. Similarity between linseed and all four L. bienne Mill. genotypes (Fig. 2a, Table 1) was expected since L. bienne Mill. is the progenitor of L. usitatissimum (Soto-Cerda 2011); one allele was common to all linseed and L. bienne Mill. genotypes but not to other Linum species. Melnikova et al. (2014) also reported that L. usitatissimum and L. bienne were grouped under sect. Linum. L. strictum L., L. keniense Fries., L. holstii Engl. ex R. Wilczek (likely to be a synonym of L. volkensii Engl.) and L. trigynum L. var. sieberi (Planch.) Cuf. are the four wild Linum species in Ethiopia (Edwards 1991). Even these more distant germplasm pools may be useful to increase variation by crossing with linseed, since several Linum species can produce viable and fertile hybrids (Seetharam 1972; Friedt 1989). Diederichsen and Richards (2003) and Cullis (2011) reported that L. bienne Mill. itself is more diverse in morphology than L. usitatissimum: in the accessions in the present study, the three L. bienne Mill. showed variation in their major morphological and maturity characters, with heterogeneity in maturity, plant height, boll size and seed characters (Worku et al. 2015), supported by the high genotypic diversity revealed here (Fig. 2a; Tables 1c and S1). Additional collections and detailed studies on wild relatives may increase the germplasm pool available to breeders for introgression of novel alleles.

The present study showed high GD and PIC in Ethiopian linseed accessions compared with the results obtained for samples collected from different countries (Soto-Cerda et al. 2012; Žiarovská et al. 2012). Soto-Cerda et al. (2012) used a core set of linseed germplasm developed from 60 accessions as representative of 16 countries from three continents and obtained a mean GD value of 0.34 from 150 microsatellites (85 gSSR and 65 EST-SSR). Žiarovská et al. (2012) reported PIC values from four ISSR and two IRAP primers that ranged from 0.12 to 0.37 and from 0.22 to 0.28, respectively, from 18 accessions collected from 11 countries. These values were similar to the range of results from the present study. In Piper betle, Grewal et al. (2016) reported that ISSR polymorphism was between 43% and 100% (PIC 0.17 to 0.45, mean 0.32). IRAPs were also informative with Crocus, where the lack of diversity within the saffron crocus and high diversity in other members of the genus were evident (Alsayied 2015). Diederichsen (2001) compared the genetic variation in Canadian linseed with world collections and reported that Canadian linseed cultivars showed a higher rate of loss in genetic variation because of breeding programs which largely use selected varieties (reflected in the GD and PIC values of Canadian lines here). The selected accessions from Ethiopian Agricultural Research Centres also showed relatively low GD and PIC values.

Despite the substantial morphological variation in all the flax accessions here (Worku et al. 2015), the molecular genotype markers were able to group accessions by both altitude and region (Fig. 2), indicating lack of gene flow across the country and/or selection for particular genotypes in each environment. Ethiopia is a mountainous country including dry areas (Engels and Hawkes 1991; Friis et al. 2010) and diverse topography (particularly altitude), important for agro-ecological zonation (Hurni 1998). Linseed grows in tropical highlands where mean temperatures during the growing period correlate with altitude. Selection may be related to agronomic performance and also to regional differences in use (Worku et al. 2005, 2012). Previously, Ali et al. (2016) showed clusters of variation between the Ethiopian linseed landraces and non-native cultivars for the majority of morphological traits. Noormohammadi et al. (2015) reported that wild L. austriacum L. populations at intermediate altitudes have greater diversity than populations at lower and higher altitudes. The existing diversity of linseed in Ethiopia reflects the diversity of agroecological systems, the cultural history of the people and farmers' knowledge (Engels and Hawkes 1991; Kiros-Meles et al. 2008). In addition to their agroecological (Hurni 1998), morphological (Worku et al. 2015) and molecular similarity, and geographical proximity, the Arsi and Bale regions cultivate linseed for cash, and most growers there use selected linseed varieties to match their demand for a higher price with the demand of oil pressing factories for higher oil content from the crop (Worku et al. 2012). Some of the research centres and sites for breeding of linseed and other crops, such as Sinana, Bekoji, Kulumsa and Arsi-Robe, are located in the two regions, so knowledge and seed are easily accessible to farmers in these regions.

Gondar and Gojam are major linseed producing regions while Kefa and Gamugofa are not. The attention given to linseed in major producing regions (including by the government and local administrators to improve their linseed production) means new and uniform varieties may be more prevalent in these regions, where oil pressing machines are used and linseed is a cash crop: Victory, Concurrant, CI-1525 and CI-1652 are examples of selected varieties introduced to Ethiopia and released in 1978 and 1984 (Belayneh 1991). Accessions were collected in different decades by EBI from different regions in Ethiopia. In the cluster analysis (see Figure S2 and Table S3a & b), cultivars collected after 2000 by EBI grouped together. This suggests that newly introduced lines are currently substituting the native linseed cultivars on farm in Ethiopia. The introduction of uniform lines would be a cause of genetic erosion in Ethiopian linseed cultivars (Belayneh 1991). However, during the expansion years, the Tigray Region was not accessible. Worku et al. (2012) reported only a single variety name from this region, Yehagerie or Yehabesha, meaning cultivar from own country, whereas other regions had at least two additional names for introduced 'improved' varieties: Yeferenj (cultivar introduced from abroad) and Yemengist (cultivar given from the state). For breeding and improvement programs, unimproved germplasm (with high diversity and genetic distances) must be included in crosses (as has been successful in wheat; e.g. Ali et al. 2014), particularly given the requirement to meet new climatic stress factors. Landraces of linseed and other crops, once identified and integrated into a program, can maintain diversity; while enabling adoption of varieties with improved performance, it is important to mitigate loss of potentially valuable and adapted genotypes.

Conclusions

In Ethiopia, linseed is traditionally grown on marginal lands and farmers accept low crop yields since they do not contribute any input for its growth (Worku et al. 2012) and it is drought tolerant (Durant 1976; Seegeler 1983). Nevertheless, there are strong agro-ecological associations between linseed and other grain crops, and a cultural value in the multi-use crop within Ethiopian communities (Mulatu et al. 2002). The presence of different socio-cultural conditions and the diverse tropical and sub-tropical, mountainous topography in Ethiopia support high genetic diversity in linseed among both regional and altitude groups. In contrast, temperate linseed varieties (Canadian here) have lower genetic diversity and polymorphism, as expected from the strong selection made by breeders, farmers and researchers.

Exploitation of diversity within linseed, hybridization with L. bienne, and backcrossing with other wild Linum species can introgress novel and useful characters into linseed for quality (oil and fibre), yield, and biotic and abiotic stress resistances. The regional and altitude diversity revealed here suggests genes are available for improvement, particularly with respect to abiotic stress resistances. Both genetic mapping and perhaps association genetics, using a combination of DNA markers and performance or morphological data, may be valuable for in-country breeding centres, and have the potential to expand linseed producing regions and productivity under developing agroecological conditions.