Introduction

Hazel, Corylus avellana, has historically been a major component of woodlands across Europe and especially so in Great Britain and Ireland. Pollen records suggest that hazel was one of the first species to re-colonize Europe after the last ice age (Huntley and Birks 1983; Huntley 1993), and the species’ present-day natural geographic distribution is extensive, ranging from southern Norway and Finland to the north, the Ural mountains of Russia to the east, northern Iberia to the west and Morocco to the south. In Britain and Ireland, it is thought to have replaced the early-colonizer Betula (birch), before being relegated largely to an understory component by the arrival of the three principal tree taxa of Pinus (pine), Quercus (oak) and Ulmus (elm; Huntley and Birks 1983; Mitchell 2006). Chloroplast DNA variation in the sole phylogeographic study on wild hazels (Palme and Vendramin 2002) revealed genetic uniformity across the recolonized part of the species’ range, and it was hypothesised, taking into account the palynological data, that recolonization took place from a southwestern refugium and that populations in other refugia in Italy and the Balkans remained in situ and did not spread outside these areas.

Hazel is a small, deciduous shrub that grows mainly in forests but also in hedgerows. It is wind pollinated and monoecious but has mechanisms to prevent selfing, including dichogamy and sporophytic incompatibility (Thompson 1979). Its morphology is highly polymorphic, with the previously described species Corylus maxima, Corylus pontica and Corylus colchica now being included as members of C. avellana (Molnar 2011). Hazel trees produce nuts, which are eaten by a range of animals, including birds such as jays and nuthatches, and squirrels, dormice and other rodents (Howe and Smallwood 1982). Hazel also acts as a host for the rare fungus Hypocreopsis rhododendri, whose abundance has declined in Britain due to habitat clearance and which is consequently included on the IUCN red list as being “near threatened” (reviewed in Grundy et al. 2012). Genetic, historical and archaeological data suggest that hazel was independently domesticated in the Mediterranean (Spain and Italy) and in the Black Sea countries, and to date, over 400 cultivars have been described (Molnar 2011). There is evidence that hazelnuts were a food source for early people, and thus, hazel may have been planted or accidentally spread by human migration, potentially increasing its range and abundance beyond that which resulted from natural seed dispersal alone (reviewed in Boccacci and Botta 2009).

Like many woodland species, hazel has come under threat from a number of emergent diseases in recent years, as well as being threatened by the same factors which affect native woodlands in general such as climate change, land use change, habitat loss, invasive species and pests (Rackham 2008). It is currently at risk from Eastern Filbert Blight (Anisogramma anomala), a canker disease that is fatal in C. avellana. The fungal pathogen spreads through ascospores, which are discharged from infected branches from autumn to late spring and are spread by rainwater. It can take up to 15 years for a mature infected tree to die, but during that time, the tree retains the potential to release ascospores, which will infect other trees (Johnson et al. 1996). In the UK, the disease has been classified by the Department for Environment, Food and Rural Affairs (DEFRA) as 3/5 in terms of likelihood of entry, 5/5 in terms of likelihood of establishment and 5/5 in terms of impact (DEFRA 2014). Among other potential threats, in Greece and Italy, the bacterium Pseudomonas avellanae, which causes bacterial canker and decline of hazelnut, has resulted in the destruction of cultivated hazel trees, and to date, more than 40,000 trees have been lost (Scortichini et al. 2002).

Knowledge of the genetic structure of natural populations of trees is vital for managing these threats, as well as an essential basis for selection of material for replanting and restocking (Müller-Starck et al. 1992). Recent genetic studies on British and Irish trees suggest that “seed zones,” areas with similar ecological, geographical and climatic features (Herbert et al. 1999) such as those drawn up by the Forestry Commission, may not be necessary for ash (Sutherland et al. 2010; Beatty et al. 2015a), alder (Beatty et al. 2015b) and hawthorn (Brown et al. 2016), due to genetic homogeneity. As there is a lack of information about the level of genetic diversity at a similar scale for hazel, we used highly polymorphic microsatellite markers (Powell et al. 1996) to elucidate levels and patterns of diversity in natural populations of this key native woodland species. We also investigate fine-scale genetic structuring, since it is envisaged that the large seeds of hazel may have limited capacity for dispersal, even within individual woodlands (Howe and Smallwood 1982).

Materials and methods

Sampling and DNA extraction

Samples were collected from 25 sites across Northern Ireland along with four sites in the Republic of Ireland (Fig. 1; Table 1). The samples were taken from a combination of hedgerows and woodland, depending on the site, but mainly the latter. Woodlands were selected that had been designated as ancient or semi-natural, based on the data collected for the Woodland Trust Inventory of ancient and long-established woodland in Northern Ireland (www.backonthemap.org.uk) and the National Survey of Native Woodlands 2003–2008 in the Republic of Ireland (www.npws.ie). Woodlands were also selected based on government information from the Department of the Environment such as Areas of Special Scientific Interest (ASSIs), as well as the landscape character areas listing the woodlands and species present in each region (https://www.doeni.gov.uk). A set of samples were also taken from the natural vegetation surrounding the botanical garden at Sospirolo, Italy. For sampling, a number of leaves were taken from each of a maximum of 30 trees and stored in silica gel. The GPS coordinates of each tree were recorded. DNA was extracted using the cetyltrimethylammonium bromide (CTAB) method (Doyle and Doyle 1987).

Fig. 1 Locations of hazel (Corylus avellana) populations used in this study. Numbers correspond to those in Table 1. Not shown: Sospirolo population (Italy)

Table 1 Populations of hazel (Corylus avellana) analysed in this study

Genotyping

All samples were genotyped for seven nuclear microsatellites (A604, A605, A613, A614, B671, B767, B791), which had previously been developed for use in European hazelnut (Gürcan et al. 2010a). PCR was carried out in a total volume of 10 μl containing 100 ng genomic DNA, 5 pmol of 6-FAM-labelled M13 primer, 0.05 pmol of each M13-tailed forward primer, 5 pmol each reverse primer, 1× PCR reaction buffer, 200 μM each dNTP, 2.5 mM MgCl2 and 0.25 U GoTaq Flexi DNA polymerase (Promega, Sunnyvale, CA, USA). PCR was carried out on a number of machines: the MWG Primus thermal cycler (Ebersberg, Germany), MJ Research PTC-200 and PTC-220 Gradient Peltier thermal cyclers (Quebec, Canada) and Biometra T-Gradient thermal cycler (Göttingen, Germany) using the following conditions: initial denaturation at 94 °C for 3 min followed by either 35 (A604, A605) or 40 (A613, A614, B671, B767, B791) cycles of denaturation at 94 °C for 30 s, annealing at 58 °C for 30 s, extension at 72 °C for 30 s and a final extension at 72 °C for 5 min. Genotyping was carried out on an AB3730xl capillary genotyping system (Applied Biosystems, Foster City, CA, USA). Allele sizes were scored using the GeneMarker software (V1.8, SoftGenetics).

Data analysis

GENEPOP (V3.4; Raymond and Rousset 1995) was used to test for linkage disequilibrium between nuclear microsatellite loci. As C. avellana possesses the capacity for clonal reproduction, we tested for clonemates by calculating the probability (P GEN) of each multilocus genotype (MLG) arising through sexual as opposed to clonal reproduction following the method of Parks and Werth (1993):

$$ {P}_{GEN}={\left[{2}^h\prod \left({x}_{1i}{x}_{2i}\right)\right]}^{n-1} $$

where h is the number of loci at which the genotype is heterozygous, x 1i is the allele frequency of the first allele in the genotype at locus i, and x 2i is the allele frequency of the second allele in the genotype at locus i. P GEN values were calculated using the GenClone software package (V2.0; Arnaud-Haond and Belkhir 2007). To estimate genetic diversity within the populations, levels of observed (H O ) and expected (H E) heterozygosity were calculated using the Arlequin software package (V3.5.1.2; Excoffier and Lischer 2010), whilst levels of allelic richness (A R) and fixation indices (F IS) were calculated using the FSTAT software package (V2.9.3.2; Goudet 2001). Significance of F IS was determined by 10,000 randomisation steps.
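The P GEN expression above can be illustrated with a short Python sketch. This is only a minimal illustration under stated assumptions, not the GenClone implementation used in the study: the function name and input layout are hypothetical, and the exponent term n, which the text does not define, is assumed here to be the number of sampled individuals sharing the multilocus genotype (so n = 2 for a single matching pair).

```python
from math import prod


def p_gen(genotype, allele_freqs, n=2):
    """Evaluate [2^h * prod(x_1i * x_2i)]^(n - 1) for one multilocus genotype.

    genotype     -- list of (allele_1, allele_2) tuples, one per locus
    allele_freqs -- list of dicts, one per locus, mapping allele -> frequency
    n            -- ASSUMPTION: number of individuals sharing the genotype
                    (not defined in the source text)
    """
    # h: number of loci at which the genotype is heterozygous
    h = sum(1 for a, b in genotype if a != b)
    # Product over loci of the frequencies of the two alleles carried
    product = prod(
        freqs[a] * freqs[b] for (a, b), freqs in zip(genotype, allele_freqs)
    )
    return (2 ** h * product) ** (n - 1)


# Hypothetical two-locus example: heterozygous at locus 1, homozygous at locus 2
example_genotype = [(1, 2), (3, 3)]
example_freqs = [{1: 0.5, 2: 0.5}, {3: 0.2}]
print(p_gen(example_genotype, example_freqs))
```

In this toy case the bracketed term is 2 × (0.5 × 0.5) × (0.2 × 0.2) = 0.02; with seven highly polymorphic loci, as used here, the value becomes very small very quickly.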

The overall level of genetic differentiation between populations was estimated using Φ ST, which gives an analogue of F ST (Weir and Cockerham 1984) calculated within the analysis of molecular variance (AMOVA) framework (Excoffier et al. 1992) using Arlequin. To further identify the possible patterns of genetic structuring, the software package BAPS (V6; Corander et al. 2003) was used to identify clusters of genetically similar populations using a Bayesian approach. Ten replicates were run for all possible values of the maximum number of clusters (K) up to K = 30, the number of populations sampled, with a burn-in period of 10,000 iterations followed by 100,000 iterations.

A test for isolation by distance (IBD; Rousset 1997) was carried out using the Isolde test implemented in the GENEPOP software package to assess the relationship between genetic distance, measured as F ST / (1 − F ST), and geographical distance between population pairs. One thousand permutations were used for the Mantel test.
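The logic of the IBD test can be sketched in Python as follows. This is a simplified illustration rather than a reimplementation of GENEPOP's Isolde procedure: it uses a Pearson correlation and a plain permutation of population labels, and the function names, the one-sided P value and the toy matrices are assumptions introduced here.

```python
import random
from math import sqrt


def linearise_fst(fst):
    """Genetic distance used in the text: F_ST / (1 - F_ST)."""
    return fst / (1.0 - fst)


def upper_triangle(matrix, order):
    """Pairwise values (i < j) after relabelling populations by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mantel_test(fst, geo, n_perm=1000, seed=0):
    """Return (r_obs, one-sided P) for linearised F_ST against distance."""
    rng = random.Random(seed)
    identity = list(range(len(fst)))
    gen = [[linearise_fst(v) for v in row] for row in fst]
    geo_vec = upper_triangle(geo, identity)
    r_obs = pearson(upper_triangle(gen, identity), geo_vec)
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of the genetic matrix together
        if pearson(upper_triangle(gen, perm), geo_vec) >= r_obs - 1e-12:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)


# Toy data: four hypothetical populations along a line, 10 km apart
positions = [0.0, 10.0, 20.0, 30.0]
geo = [[abs(a - b) for b in positions] for a in positions]
fst = [[(0.001 * d) / (1 + 0.001 * d) for d in row] for row in geo]
r, p = mantel_test(fst, geo, n_perm=200, seed=1)
```

Permuting whole population labels, rather than individual matrix cells, preserves the non-independence of pairwise distances, which is the point of a Mantel-type test.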

To test for spatial genetic structuring (SGS) within Irish populations, we carried out spatial autocorrelation analyses using SPAGeDi (V1.4; Hardy and Vekemans 2002). Mean coancestry coefficients (θ xy ; Loiselle et al. 1995) between pairs of individuals were calculated at 50-m distance intervals (with the exception of the Glenarm, Peatlands Park, Ness Wood and Camcor Wood populations, where 100-m intervals after the first two 50-m classes were used due to the large size of the areas sampled) and plotted as a correlogram, with 95% confidence intervals calculated from 1000 permutations of individuals within each distance class and for estimates of θ xy using 1000 permutations.
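To make the distance-class step concrete, the following Python sketch assigns pairs of trees to 50-m classes from their GPS coordinates. It is an illustration only and does not reproduce SPAGeDi: the haversine distance, the spherical Earth radius, the uniform class width and the function names are assumptions, and the coancestry coefficient itself is not computed.

```python
from itertools import combinations
from math import atan2, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000.0  # assumed mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * atan2(sqrt(a), sqrt(1 - a))


def distance_classes(coords, width_m=50.0):
    """Group pairs of individuals by distance class.

    coords  -- list of (latitude, longitude) tuples, one per tree
    returns -- dict mapping class index k (covering k*width to (k+1)*width
               metres) to a list of (i, j) index pairs
    """
    classes = {}
    for (i, p), (j, q) in combinations(enumerate(coords), 2):
        d = haversine_m(p[0], p[1], q[0], q[1])
        classes.setdefault(int(d // width_m), []).append((i, j))
    return classes
```

Mean coancestry would then be averaged over the pairs in each class and plotted against the class midpoint; for the larger sites described above, wider classes would be substituted beyond the first 100 m.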

Results

Genotypes were obtained for between 26 and 30 individuals (total = 879) from 30 sites (mean = 29.3 individuals per site; Table 1). No significant evidence of consistent linkage disequilibrium (i.e. involving the same loci) was detected between any of the seven nuclear microsatellites analysed (51 out of 621 tests). Only a single pair of putative clonal individuals was found, this being in the Clarkhill Wood population. The two trees were ca. 750 m apart and were not considered as clones (see “Discussion” section). Between 14 (A604) and 18 (A613, A614, B671 and B767) alleles were detected per locus, with a total of 119 (mean = 17 per locus). Observed heterozygosity (H O) ranged from 0.481 (A613) to 0.850 (A614) with a mean of 0.731, and expected heterozygosity (H E) ranged from 0.784 (B767) to 0.888 (A614) with a mean of 0.828. Levels of F IS ranged from −0.025 (A605) to 0.430 (A613), with a mean value of 0.121.

Within populations, levels of allelic richness (A R) averaged over loci ranged from 7.471 (Drumlamph Wood) to 9.724 (Peatlands Park) with a mean value of 8.568. Levels of observed (H O) and expected (H E) heterozygosity ranged from 0.668 (Knockmore) to 0.793 (Knockaginney Wood; mean = 0.731) and from 0.794 (Foynes) to 0.855 (Clare Glen; mean = 0.828), respectively. Levels of F IS ranged from 0.054 (Knockaginney Wood) to 0.188 (Sospirolo, Italy; mean = 0.121) with all of the values being significantly different from zero.
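The relationship between these diversity statistics can be sketched in Python. This is a simplified, single-locus illustration and not the Arlequin or FSTAT calculation: it assumes the unbiased gene diversity estimator for H E and the simple ratio F IS = 1 − H O / H E, and the function names and toy genotypes are hypothetical.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (H_O, H_E) for one locus.

    genotypes -- list of (allele_1, allele_2) tuples, one per diploid individual
    H_E uses the sample-size correction 2N / (2N - 1) (assumed estimator).
    """
    n = len(genotypes)
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n  # number of gene copies sampled
    sum_sq = sum((c / total) ** 2 for c in counts.values())
    h_exp = (total / (total - 1)) * (1.0 - sum_sq)
    return h_obs, h_exp


def f_is(h_obs, h_exp):
    """Heterozygote deficit relative to Hardy-Weinberg expectation."""
    return 1.0 - h_obs / h_exp


# Toy locus: two heterozygotes and two homozygotes
h_o, h_e = heterozygosity([(1, 2), (1, 1), (2, 2), (1, 2)])
print(f_is(h_o, h_e))

# Ratio applied to the population-level means reported above
print(round(f_is(0.731, 0.828), 3))
```

Applied to the reported means (H O = 0.731, H E = 0.828), the simple ratio gives roughly 0.117, close to the reported mean F IS of 0.121; a small difference of this kind would be expected when a different estimator and averaging scheme are used.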

The level of population differentiation calculated from the AMOVA was Φ ST = 0.017 (Table 2). The BAPS analysis assigned the populations to two genetic clusters, one of which contained five populations (Crawfordsburn, Redburn Country Park, Peatlands Park, Castle Archdale and the Sospirolo population from Italy), whilst the other contained the remaining 25 populations. Multiple independent runs gave the same outcome. There was no geographical structuring of genetic clusters. The test for IBD revealed a weak but significant correlation between genetic and geographic distances (Fig. 2). Spatial autocorrelation analysis indicated significant genetic structuring up to 100 m in Peatlands Park and up to 50 m in a further four of the populations: Breen Wood, Drumlamph Wood, Castle Archdale and Waterford (Fig. 3).

Table 2 Analysis of molecular variance (AMOVA)

Fig. 2 Mantel test for isolation by distance (IBD) between populations in Ireland

Fig. 3 Correlograms of autocorrelation coefficient (θ; y-axis) plotted against distance in kilometres (x-axis). Ninety-five percent confidence intervals are indicated by dashed red lines. Note that in some correlograms, the first two distance intervals (0–50 and 50–100 m) may be at a different scale to subsequent intervals

Discussion

The results of this study suggest that the common hazel in Northern Ireland and indeed Ireland as a whole maintains high levels of genetic diversity along with low levels of population differentiation resulting from high levels of gene flow. Whilst there have been several studies on C. avellana, the present study is the first to look at natural populations of hazel using high-resolution, codominant nuclear microsatellite markers, in contrast with those which have used low-resolution and/or dominant markers (allozymes and amplified fragment length polymorphisms (AFLPs)) or those which have focused on the genetics of cultivated varieties. Two allozyme studies, one of which examined populations from central to northern Europe (Persson et al. 2004) and another which used samples mainly from Germany but also from Italy and Hungary (Leinemann et al. 2013), found similar levels of genetic diversity, measured by both number of alleles (A) and expected heterozygosity (H E). Although it is not possible to draw meaningful comparisons between levels of diversity observed in the present study with those calculated from allozymes in natural populations, it is possible to contrast these levels with those observed in cultivated varieties and landraces based on microsatellites, which were lower and ranged from H E = 0.71 to H E = 0.78 (Boccacci et al. 2006; Gökirmak et al. 2009; Boccacci and Botta 2010; Gürcan et al. 2010b; Campa et al. 2011; Boccacci et al. 2013). In contrast, all but one sample analysed here had an H E >0.8 (Table 1).

In comparison with studies on other broadleaved tree species from Ireland that utilized microsatellites, hazel has the highest level of genetic diversity to date (mean H E = 0.828) compared to hawthorn (mean H E = 0.803; Brown et al. 2016), ash (mean H E = 0.765; Beatty et al. 2015a), sessile oak (mean H E = 0.720; Beatty et al. 2016), pedunculate oak (mean H E = 0.714; Beatty et al. 2016) and alder (mean H E = 0.663; Beatty et al. 2015b). Levels of inbreeding, measured as F IS (mean F IS = 0.121), were higher than those reported for ash (mean F IS = 0.067; Beatty et al. 2015a), alder (mean F IS = 0.078; Beatty et al. 2015b) and hawthorn (mean F IS = 0.047; Brown et al. 2016). The levels of F IS observed in hazel are somewhat surprising, given that the species shows dichogamy and has sporophytic self-incompatibility (Thompson 1979), but records of self-fertilization have been reported in hazel (Persson et al. 2004) and partial self-compatibility observed in a few cultivars (Mehlenbacher and Smith 2006; Mehlenbacher 2014).

Only a single pair (ca. 0.1%) of potential clonal individuals was identified. This is in marked contrast to the findings of Persson et al. (2004), who estimated levels of clonality to be ca. 5%. The discrepancy is most likely a methodological one, since the microsatellite markers used here have far greater discriminatory power than the allozymes used in the earlier study. Indeed, it is unlikely that the pair of individuals identified in this study truly are clones, as they were separated by around 750 m. Although hazel possesses the capacity for vegetative reproduction, this occurs via layering of adventitious shoots or branches, rather than via runners or underground shoot systems (Persson et al. 2004). The probability of the matching genotypes arising through sexual reproduction was extremely low (P = 7 × 10⁻⁸), which is still significant after Bonferroni correction for multiple tests (α = 3 × 10⁻⁶). Given the spatial relationship between the individuals concerned, however, it is likely that this represents an artefact, and our results do not support the view that asexual reproduction occurs with appreciable frequency in natural populations of hazel in Ireland.
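The Bonferroni comparison mentioned above amounts to simple arithmetic, sketched here in Python. The observed probability and the corrected threshold are taken from the text; the family-wise level of 0.05 is an assumption, as the text states neither it nor the number of tests, so the implied test count printed below is indicative only.

```python
def bonferroni_threshold(family_alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return family_alpha / n_tests


def implied_tests(family_alpha, per_test_alpha):
    """Number of tests implied by a corrected threshold."""
    return family_alpha / per_test_alpha


P_OBSERVED = 7e-8       # probability reported in the text
ALPHA_CORRECTED = 3e-6  # corrected threshold reported in the text
FAMILY_ALPHA = 0.05     # ASSUMPTION: conventional family-wise level

print(P_OBSERVED < ALPHA_CORRECTED)
print(round(implied_tests(FAMILY_ALPHA, ALPHA_CORRECTED)))
```

The observed probability is more than an order of magnitude below the corrected threshold, which is why the match remains statistically significant even though the spatial evidence argues against clonality.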

Levels of population genetic differentiation were low but significant (Φ ST = 0.017) and lower than comparable measures in hazel from the allozyme studies (G ST = 0.077; Persson et al. 2004; Φ ST = 0.035; Leinemann et al. 2013) and from AFLPs (Φ ST = 0.035; Leinemann et al. 2013). They were also comparable with those from other broadleaved tree species studied in Ireland, all of which exhibited levels of Φ ST <0.020 (Beatty et al. 2015a, b; Brown et al. 2016). The BAPS analysis indicated that the populations sampled are split into just two genetic clusters, but this is likely to be an artefact. Firstly, the clusters are not geographically localized, and secondly, the BAPS algorithm has previously been shown to overestimate the number of clusters when levels of differentiation are low (Latch et al. 2006). The observed weak pattern of IBD is also consistent with populations being well connected by gene flow in a stepping-stone fashion (Wright 1943; Slatkin 1993) and is comparable to that found in populations from Germany over a similar geographical scale (Leinemann et al. 2013).

Hazel seeds are heavy nuts, and it could be predicted that seed dispersal is limited. The present results showing a weak but significant correlation between genetic and geographic distances and absence of population structure, however, were inconsistent with this idea. Dispersal of hazelnuts is carried out by a range of different animals, including birds such as jays (Garrulus glandarius), nutcrackers (Nucifraga caryocatactes) and nuthatches (Sitta europaea), which are likely responsible for longer-distance dispersal events and mammals such as squirrels (Sciurus spp.), mice (Apodemus spp.) and hazel dormice (Muscardinus avellanarius; reviewed in Persson et al. 2004), which might be expected to only move hazelnuts short distances. A study on seed dispersal in Quercus petraea and C. avellana found that jays were able to disperse hazel seeds over several hundred meters, whilst the dispersal distance for mice was 10–20 m (Kollmann & Schill 1996). In contrast, nutcrackers have been observed to transport seeds up to 22 km (reviewed in Wästljung 1989). It is seed dispersal that ultimately determines adult vegetation composition with the processes of natural recolonization, regeneration and succession of plants relying on it (Howe & Smallwood 1982; Levine & Murrell 2003; Nathan & Muller-Landau 2000). It is interesting that of the few examples of fine-scale genetic structuring observed, four out of five (Breen Wood, Drumlamph Wood, Castle Archdale and Waterford) occurred in some of the smallest woodlands analysed. Wästljung (1989) reported slower rates of seed dispersal by animals in larger stands, but this may also be a factor of nut density.

The spatial autocorrelation in the present study may reflect restricted, pollen-mediated gene flow. In the vast majority of angiosperms, the chloroplast genome is transmitted maternally, meaning that chloroplast genetic markers can be used to determine levels of gene flow through seeds. Many previous studies have used high-resolution chloroplast microsatellites to this end (Provan et al. 2001), but in the present study, the chloroplast microsatellites that we developed for hazel (primer sequences available on request) exhibited no genetic variation in a preliminary screen (Brown et al., unpublished results). This is consistent with the only previous study to use chloroplast microsatellites in hazel (Leinemann et al. 2013), which found a single haplotype in populations from Germany and Hungary. An earlier phylogeographic study also found extremely limited variation outside the “classic” refugia of Italy and the Balkans and no variation in Britain (Palme and Vendramin 2002). Thus, it would appear that natural populations of hazel in northern Europe are characterized by an extremely narrow chloroplast gene pool and that there is limited scope for disentangling the relative roles of seed- and pollen-mediated gene flow using population genetic approaches (Ennos 1994).

In conclusion, the results of the present study indicate that natural populations of hazel in Ireland show little evidence of asexual reproduction and exhibit high levels of diversity and low levels of population differentiation, indicating effective gene flow across the populations. The relative contributions of seed- and pollen-mediated gene flow to this connectivity could not be resolved, as the chloroplast markers tested were invariant. The findings suggest that overall seed zones may not be needed in Northern Ireland and indeed Ireland as a whole. These results mirror those of other native tree species, reinforcing the idea that their management is best carried out at regional level. More work would need to be carried out, however, to see if the same pattern emerges across the rest of the UK and indeed Europe.