Conservation genetics of a threatened butterfly: comparison of allozymes, RAPDs and microsatellites

Turlure, Camille; Vandewoestijne, Sofie; Baguette, Michel

doi:10.1186/s12863-014-0114-7

Conservation genetics of a threatened butterfly: comparison of allozymes, RAPDs and microsatellites

Research article
Open access
Published: 04 November 2014

Volume 15, article number 114, (2014)
Cite this article

Download PDF

You have full access to this open access article

BMC Genetics Aims and scope Submit manuscript

Conservation genetics of a threatened butterfly: comparison of allozymes, RAPDs and microsatellites

Download PDF

Camille Turlure¹,
Sofie Vandewoestijne¹ &
Michel Baguette^2,3

7817 Accesses
16 Citations
1 Altmetric
Explore all metrics

Abstract

Background

Addressing genetic issues in the management of fragmented wild populations of threatened species is one of the most important challenges in conservation biology. Nowadays, a diverse array of molecular methods exists to assess genetic diversity and differentiation of wild populations such as allozymes, dominant markers and co-dominant markers. However it remains worthwhile i) to compare the genetic estimates obtained using those several markers in order to ii) test their relative utility, reliability and relevance and iii) the impact of these results for the design of species-specific conservation measures.

Results

Following the successful isolation of 15 microsatellites loci for the cranberry fritillary butterfly, Boloria aquilonaris, we analyzed the genetic diversity and structure of eight populations located in four different landscapes, at both the regional and the landscape scales. We confront results based on microsatellites to those obtained using allozymes and RAPDs on the same samples. Genetic population analyses using different molecular markers indicate that the B. aquilonaris populations are characterized by a weak genetic variation, likely due to low effective population size and low dispersal at the regional scale. This results in inbreeding in some populations, which may have detrimental consequences on their long term viability. However, gene flow within landscape is limited but not inexistent, with some long range movements resulting in low or no isolation by distance. Spatial structuring was detected among the most isolated populations.

Conclusions

The use of allozymes and RAPD are of very limited value to determine population structuring at small spatial (i.e. landscape) scales, microsatellites giving much higher estimate resolution. The use of RAPD data is also limited for evidencing inbreeding. However, coarse-grain spatial structure (i.e. regional scale), and gene flow estimates based on RAPD and microsatellites data gave congruent results. At a time with increasing development of new molecular methods and markers, dominant markers may still be worthwhile to consider in organisms for which no genomic information is available, and for which limited resources are available.

Genetic patterns in Neotropical Magnolias (Magnoliaceae) using de novo developed microsatellite markers

Article Open access 27 October 2018

Population differentiation in relation to conservation: nuclear microsatellite variation in the Canary Island endemic Lotus sessilifolius (Fabaceae)

Article 26 May 2017

Analysis of genetic diversity in a peatland specialist butterfly suggests an important role for habitat quality and small habitat patches

Article 25 June 2018

Background

Addressing genetic issues in the management of fragmented wild populations of threatened species is one of the most difficult challenges in conservation biology [1]. Indeed, there is now compelling evidence that inbreeding and loss of genetic diversity compromise the viability of populations [2]. Extinction risk is increased through loss of genetic diversity by inbreeding and genetic drift, particularly in small, isolated populations. Accordingly, to elaborate sound management plans, it is necessary to delineate population units, to quantify gene flow, to estimate effective population sizes and to relate genetic diversity to individual fitness and population viability. Nowadays, a diverse array of molecular methods exists to obtain information about the genetic diversity and differentiation of wild populations [3],[4]. However it remains worthwhile i) to compare the results obtained using those several methods and ii) hence to test their reliability and utility (e.g. [5]-[7]).

Variants in amino-acid sequences of the same enzyme due to mutation of genes, or allozymes, were one of the first genetic marker systems used in molecular ecology [8]. They are time and cost effective and have been efficiently applied in population genetic and phylogeographic studies [9]. However, they necessitate invasive sampling, might be subject to selection and may not reveal much variation at small spatial scales and on short time intervals, mainly in species characterized by isolated populations and/or small population sizes [10]. This lack of neutrality is suited in the context of evolutionary studies but can be particularly problematic to test population differentiation. Dominant markers (i.e. markers revealing only a single dominant allele [8]), such as RAPDs (Random Amplified Polymorphic DNA), ISSR (Inter-Simple Sequence Repeats) and AFLP’s (Amplified Fragment Length Polymorphism) have demonstrated their utility (e.g. [11],[12]), even in population assignment tests [13]. However, they remain of limited use in conservation studies where genotypic data are necessary to guide management practices aiming at avoiding inbreeding (i.e. where it is necessary to distinguish heterozygous from homozygous individuals). Although RAPD’s studies do not require much DNA, problems of reproducibility resulted in difficulties to publish RAPD based results [14]. Nonetheless, they remain the least expensive population genetic markers for use on species for which no other genomic information is available.

Co-dominant markers, such as RFLPs (Restriction Fragment Length Polymorphism), microsatellites and SNPs (Single-Nucleotide Polymorphism), (i.e. markers for which all the alleles present at a particular locus can be identified [8]) are nowadays the most widely used molecular markers in population genetic and conservation studies [14]. In particular, microsatellites have become popular and versatile, because they are hyper variable, neutral and reproducible molecular markers [15],[16]. However, developing these markers through microsatellite enrichment followed by Sanger sequencing is time consuming and costly. Genotyping errors and difficulties in isolating microsatellite markers can appear in some groups, including butterflies [17], although Lepidoptera are often considered as an ideal model for conservation biology studies. Indeed, they respond rapidly to both habitat change and global warming [18] and they are good indicators of habitat quality [19]. The difficulty of isolating microsatellites in Lepidoptera [17] has been associated to high similarity in flanking regions between different microsatellites within a same species [20] and/or the lack of conserved flanking regions leading to unrepeatable banding patterns [21]. However, next-generation sequencing offers the opportunity for more efficient microsatellite isolation than was previously possible [22],[23]. The combination of a biotin-enrichment protocol and high-throughput pyrosequencing has recently resulted in the successful isolation of microsatellites for several butterfly species, including the cranberry fritillary Boloria aquilonaris[24], a butterfly species of major conservation concern.

The cranberry fritillary is a glacial relict species whose populations are more or less continuously distributed towards the north of its range, whereas western European populations are highly fragmented. Indeed, suitable habitats are restricted to uplands where climatic conditions are cold and wet. Over the past century, drainage and afforestation of peat bogs (mainly by Norway Spruce Picea abies) has resulted in the substantial decline of the species in this part of its distribution range [25], which has subsequently increased the isolation of the remnant populations.

Intensive demographic and ecological work has been carried out on the Cranberry fritillary in Belgium showing i) a demographic decline in small local populations [26], ii) a relatively high amount of dispersing individuals between closely located populations [27] but iii) also some very long distance dispersal movements [28]. Population genetics has also been used to assess effective connectivity between populations [29],[30]. These studies were essentially based on RAPD (Random Amplified Polymorphic DNA) markers, because allozymes revealed very little polymorphism in the study area. However, RAPD’s revealed low population differentiation within the study landscape, which confirmed that populations were functionally connected, as expected from ecological studies. The genetic structure of the populations inferred from the RAPD data set was indeed remarkably congruent with the geographic configuration of the populations within the landscape and the observed movements among populations [29].

Following the successful isolation of 15 microsatellites loci for B. aquilonaris[24], here we analyze the genetic diversity and structure of eight populations located in four different landscapes. Analyses are hence carried out at both the regional scale (i.e. between landscapes) and the landscape scale (i.e. between populations located in one landscape). We then confront results based on microsatellites to those obtained using allozymes and RAPDs on the same samples [29] and discuss the utility and relevance of these three markers. We also discuss the impact of these results for the design of species-specific conservation measures.

Results

Genetic samples

All of the RAPD loci exhibited overall frequencies of band presence of less than 1 - 3/N (where N is the number of individuals sampled), reducing potential bias in the allele frequency estimates stemming from low counts of band absence phenotype [31]. The results of analyses based on biased and unbiased allele frequencies were very similar, therefore, we only report results based on the latter.

For the microsatellites data, Microchecker analyses indicated the presence of null alleles at two loci in Grande Fange and in Mirenne. However, it should be stressed that no population - primer pair had null allele frequencies of more than 0.001. Simulations showed that the bias induced by null alleles is negligible at frequencies below 0.2 [32]. After controlling for false discovery rate, no primer pair - population combinations were in linkage disequilibrium, nor were any primer - population combinations in Hardy-Weinberg disequilibrium. However, global tests at the population level indicated significant deviance from Hardy-Weinberg equilibrium for four populations (Grande Fange, P =0.0293; Mirenne, P =0.0003; Libin, P =0.0200; Grand Passage, P =0.0346). Hardy-Weinberg disequilibrium in Grande Fange and Grand Passage were due to significant heterozygote deficiency (P =0.0173 and P =0.0057 respectively). Significant heterozygote deficiency was also detected in Pisserotte (P =0.0269).

Genetic diversity estimates

The average number of polymorphic loci (P95) was 0.50 ± 0.13 for allozymes, 0.763 ± 0.118 (ranging 0.526-0.895) for RAPD and 0.895 ± 0.098 (ranging 0.692-1) for microsatellites. The average expected heterozygosity (He) was 0.188 ± 0.029 for allozymes, 0.315 ± 0.047 (ranging 0.239-0.388) for RAPD and 0.485 ± 0.079 (ranging 0.294-0.576) for microsatellites. The mean number of alleles per microsatellite locus (A) was 3.103 ± 0.615, the allelic richness (AR) was 1.250 ± 0.171 and 89.5% of microsatellite loci were polymorphic. Significant inbreeding (f) was detected in Grand Passage and Pisserotte populations (f =0.125 and f =0.130, respectively). All values per populations are presented in Table 1.

Table 1 Genetic diversity and inbreeding estimated for B. aquilonaris populations in four Belgian landscapes

Full size table

N_e estimates were obtained for the two populations with the largest sample sizes: Mirenne and Grande Fange. Bayesian and LDN_e estimates were not significantly different between populations. LD derived estimates of N_e had infinite confidence intervals. Therefore, the N_e:N_c ratio was calculated for Bayesian derived estimates only. N_e:N_c ratios were consistently smaller for the Grande Fange population, due to similar N_e estimates but high N_c estimates (Table 2). No significant bottleneck was detected, except for the Logbiermé population (Wilcoxon test, P =0.012). However, due to the small sample size, the validity of this significance is uncertain.

Table 2 Effective and consensus population size estimates on two B. aquilonaris populations

Full size table

Structure of the populations

Population differentiation was high and statistically significant for both RAPDs and microsatellites (Table 3). Many population pairs were significantly differentiated for both marker systems, especially at the regional scale (Table 3). Mirenne population was also significantly differentiated from many populations, even from those within the same upland, confirming results from spatial structure analyses. Previously, significant genetic differentiation was observed for 16 out of the 28 population pairs using allozymes [29]. F_st using microsatellites data was estimated at 0.129 (SD: 0.018) at regional scale and 0.082 (SD: 0.016) at the landscape scale (i.e. populations of the Tailles upland only). RAPD based F_st values were very similar with 0.171 and 0.091 for the regional and landscape scales, respectively.

Table 3 Pairwise estimates of F _st between populations of B. aquilonaris

Full size table

Spatial genetic structure detected at the regional scale previously identified i) two clusters (namely the Hautes Fagnes upland vs. the other uplands) based on allozymes and ii) three clusters (namely the Recogne upland vs. the Saint Hubert upland vs. both the Hautes Fagnes and Tailles uplands) based on RAPD [29]. The structure based on microsatellite data was similar between the Bayesian and multivariate analyses. Indeed, the lowest BIC value (multivariate analyses) and the highest likelihood lnP(D) (Bayesian analyses) indicated the presence of five genetic clusters with similar individual attribution to the respective clusters. The most distant uplands, namely Recogne and Saint Hubert, formed separate clusters from Tailles upland and the closely situated Hautes Fagnes upland (Figure 1). However, at the landscape scale, the use of microsatellites enabled the detection of a more fine-scale spatial structure compared to RAPD. Indeed, using RAPD, two genetically distinct clusters were detected, while three clusters were detected using microsatellites (Figure 2). For both markers, the Mirenne population formed a separate genetic cluster. In addition, the microsatellite data identified the Logbiermé population as a distinct cluster from the remaining populations of the Tailles upland. The majority of the individuals in this last cluster (7 out of 9) had a high probability of membership (>0.75).

Dispersal and gene flow

Genetic similarity decreased with increasing distance for both the RAPD (Mantel test: r =0.706, P =0.004) and microsatellite datasets (Mantel test: r =0.599, P =0.022) at the regional scale. It was similar for allozymes, despite not significantly (Mantel test: r =0.414, P =0.092; [29]). However, at the landscape scale (i.e. within the Tailles upland) isolation by distance (IBD) was not significant (P_RAPD =0.189, P_{microsatellite} =0.223).

MRR-based estimates of dispersal at the landscape scale were highly correlated to both the effective migration rate estimate (r =0.919, P =0.015) and the maximum likelihood estimate (r =0.912, P =0.019). Using the Akaike information criteria (AIC) in the maximum likelihood analyses of dispersal, the full model (i.e. dispersal between all populations) provided a significantly better fit to the observed genetic data than both stepping stone models. Indeed, the dispersal within 10 km and the between spatial genetic cluster models fitted the data less well (AIC difference with the full model: ΔAIC =18350.17 and ΔAIC =53105.35, respectively). Number of migrants per generation based on the full model were relatively important, with around 50 individual dispersers (Table 4).

Table 4 Number of migrants per generation based on maximum likelihood method

Full size table

Discussion

Genetic diversity in B. aquilonaris populations with DNA markers

Genetic diversity based on microsatellites data measured in B. aquilonaris populations (He =0.485) is low compared to many species, such as the co-occurring specialist butterfly Lycaena helle (He =0.75 in the Pisserotte population; [33],[34]) or as the common butterfly Pararge aegeria (H_e =0.828; [34]). Genetic diversity based on RAPD data (He =0.315) was also low and similar to another co-occurring and conservation-targeted species Boloria eunomia (H_e =0.368; [35]) and to that of the calcareous grassland specialist butterfly Polyommatus coridon (H_e =0.321, based on ISSR dominant markers; [11]). The lowest genetic diversity estimate was given by allozymes (H_e =0.188). This low genetic diversity may result from recurrent bottlenecks, a low effective population size N_e and/or a genetic isolation of the populations.

Possible bottleneck was detected for only one out of the 8 studied populations. Given the uncertainty of this result (see above), it is hard to conclude that the low genetic diversity emerged or not from recurrent bottlenecks in this case. However, the really low N_e and N_e:N_c ratios estimated in the two investigated populations let us think that low N_e is one of the source of the genetic paucity for B. aquilonaris. N_e:N_c ratios for B. aquilonaris populations were smaller than the typical average values of the ratio, i.e. between 0.1 and 0.5 [2]. It is although worth mentioning that the high variance in reproductive success typical for invertebrates may result in highly fluctuating N_e:N_c ratios [36]. Therefore, it would be of interest to investigate the temporal fluctuations of the N_e:N_c ratio in this species on large temporal scale. Conversely, Saarinen et al. [37] obtained very high N_e:N_c values (>1.0) in the endangered Miami blue butterfly and attributed this to genetic compensation, i.e. changes in biological interactions at low abundance and hence reduced variance in reproductive success. Higher N_e:N_c in smaller populations was also attributed to genetic compensation in an endangered damselfly [38]. Finally, it should be mentioned that sample size was low in this study, and may explain the large confidence intervals observed for LD estimates of N_e. Initially, destructive sampling was necessary because of associated allozymes studies, therefore limiting the number of samples.

Whatever its origin, genetic paucity is often associated with reduced individual fitness (e.g. [11],[39], but see a contradictory example with Drosophila flies: [40]) and may have negative consequences on the long-term viability of the metapopulation (e.g. [41]). Genetic diversity is also considered as an important asset allowing organism to adapt to climate change, although some confusion still exists as to the relationship between neutral genetic diversity, as measured here, and adaptive genetic diversity. Significant inbreeding was detected in two out of the five populations in the Tailles landscape. Recent population size estimates based on MRR indicate that these two populations decreased dramatically in N_c, while the populations without significant inbreeding did not (Turlure et al. unpublished data).

Gene flow and population structure with DNA markers

Gene flow is important to avoid local inbreeding. Consequently, quantifying gene flow enables landscape management so as to maintain viable metapopulations through recolonization and demographic rescue. Despite high differentiation of the populations, results on DNA markers suggest that some gene flow still occurs between populations of B. aquilonaris, especially within landscapes.

At the regional scale, the degree of genetic differentiation is high, whatever the marker used. Based on the microsatellite data it is very similar to the co-occurring species L. helle[33]. This differentiation is largely influenced by distance, as shown by the significance of the IBD. Note that the IBD was positive but not significant while using allozymes (r =0.414; P =0.092). These results suggest that gene flow is relatively rare at regional scale. This is not surprising as there is not a single population in between the Tailles upland and the Recogne/Saint-Hubert uplands that are separated by c.a. 40 km, which is out of the dispersal range of most specialist butterflies [42].

At the landscape scale, the significance of IBD is lost for both DNA markers. Absence of IBD at relatively small scales has been observed for other butterfly species (e.g. B. eunomia, [43]; and P. coridon, [11]). Absence of IBD at the landscape scale and maximum fit of data to the full migration model together suggest that dispersal is possible between all populations localized within the Tailles upland, and that geographic distance is no longer the principal factor determining the amount of gene flow at this spatial scale. MRR data confirm this, with detection of movements by B. aquilonaris of more than 10 km within the studied landscape ([28]; Turlure unpublished data). Dispersal estimates based on the inverse likelihood function fitted to MRR data and microsatellite data were highly correlated, as it was the case for previously published RAPD data. The ratio of migrants based on demographic estimates versus genetic estimates are respectively 0.765 (SD 1.249) and 0.339 (SD 0.168) for Slatkin’s effective migration rate and the maximum likelihood estimate based on microsatellite data. This indicates that (1) dispersal success is high, with mortality within the matrix remaining limited and (2) immigrants have a high mating success. Spatial analyses with both DNA markers illustrate that the populations of Grande Fange, Grand Passage and Pisserotte are more connected with each other, i.e. they form one genetic cluster, than with the other populations within the Tailles upland. Logbiermé (for microsatellite data only) and especially Mirenne (for both microsatellites and RAPD data) are genetically very different from the other sampled populations. This spatial structure likely results from the geographic isolation of the latter two populations within the Tailles upland.

Implication for B. aquilonaris conservation measures

Genetic population analyses using different molecular markers indicate that the B. aquilonaris populations are characterized by a weak genetic variation, likely due to low effective population size and low dispersal at the regional scale. This results in inbreeding in some populations, which may have detrimental consequences on their long term viability. However, gene flow within landscape is limited but not inexistent, with some long range movements resulting in low or no isolation by distance. Spatial structuring was detected among the most isolated populations. Gene flow between B. aquilonaris populations within landscape is generally superior to one migrant per local population per generation, a genetic rule of thumb to insure the maintenance of genetic diversity and prevent inbreeding depression in fragmented populations [44]. Despite this, significant inbreeding was detected in several populations. Wang [45] suggested that the one migrant per generation rule may be violated when there is variation in size among subpopulations and the number of subpopulations is small. This is the case in this study and it may explain the observed inbreeding despite a number of migrants per generation superior to one. Mills and Allendorf [44] also prescribed more than one migrant per generation when, amongst others, N_e is much less than the total population size. It should be added that too many migrants per generation are not favorable because it may hinders the emergence of local adaptations [46]. In this vein, Mills and Allendorf [44] suggest that up to 10 migrants per generation should not cause uniformity of allele frequencies across subpopulations. In their review, Tallmon et al. [47] highlighted that immigrants may have both positive (i.e. rescue effects) and negative (i.e. outbreeding depression) impacts on the population. Therefore, genetic population analyses of all existing populations on the Tailles upland on multiple generation time series should provide a more detailed picture of gene flow between populations of B. aquilonaris, and is currently underway.

Besides, the conjugation of those results with the knowledge accumulated on the species habitat requirement [48],[49], mobility [50],[51] and dispersal [28] could smoothly guide conservation measures for this species. As the recreation of bog habitat patches in between the current patches is no option, managers may be encouraged 1) to maintain and restore local habitat quality in order to keep suitable census population sizes and 2) to favour the presence of nectar plants along the roads and the rivers as both linear elements are used during dispersal (Turlure, personal observation).

Conclusions

As previously mentioned, allozymes remain of very limited value at small spatial scales; microsatellites giving much higher estimate resolution (e.g. [52]). This is especially true in species such as B. aquilonaris where enzyme variation is extremely limited and hence impedes the correct and accurate estimate of genetic population parameters ([29]; but see contradictory results in [53],[54]). However, here we show that dominant markers may still be worthwhile to consider in organisms for which no genomic information is available, and for which limited (financial and technical) resources are available. Indeed, coarse-grain spatial structure, genetic diversity, and gene flow estimates based on previously published RAPD data gave congruent results with both demographic data and microsatellite-based estimates. Congruent results of population structure and/or differentiation at broad scale while comparing RAPD and microsatellites have also been reported for fishes (e.g. [55]), snakes (e.g. [56],[57]) or ants (e.g. [58]), among other organisms. The use of RAPD data remain however limited for evidencing inbreeding and population structuring at small spatial scale (e.g. [56]). Also, RAPD suffer from poor reproducibility and their use is consequently nowadays limited in ecological studies [59]. New technologies for robust and less expensive genetic polymorphism detection are continuously developed [4], making valuable “new markers” more accessible for population genetics. This is especially possible since the use RAD- and next-generation sequencing generating large amount of DNA sequence or even complete genome sequencing from small amount of genetic material in a relative cost-effective way [60]-[62]. However, their price and accessibility is not always compatible with local conservation issues, as well as the storage and analysis of the volume of sequences produced. Therefore, “old markers”, such as microsatellites or AFLP, may still be valuable markers to answer questions related to species conservation.

Methods

Study system and genetic samples

The cranberry butterfly, Boloria aquilonaris, is a bog specialist butterfly species of conservation interest (listed as ‘vulnerable’ on the Red Data Book of European Butterflies, [63]). It is a glacial relict of acid peat bogs. Adults fly in one generation a year in July and females lay their eggs singly on the host plant Vaccinium oxycoccos. Samples were collected from eight populations in the Belgian Ardenne in 1996, including four different landscapes, namely the uplands of the Hautes-Fagnes, Tailles, Saint-Hubert and Recogne (Figure 1). Samples analysed in this study are identical to those in Vandewoestijne and Baguette [29] with the exception of a few samples for which sufficient DNA was lacking (Table 1).

Genomic DNA was extracted using a phenol-chloroform-isoamyl alcohol method (as described in Vandewoestijne and Baguette [29], followed by RNase treatment. Detection of enzyme polymorphism was carried out as in Vandewoestijne et al. [64]. Only four out of 17 enzyme systems proved to be polymorphic [29]. Due to this extremely low level of polymorphism (e.g. only four polymorphic loci with a mean number of 2.31 alleles), the allozyme data was not re-analyzed in this study; results obtained previously being included for the sake of a broader comparison of markers. Four RAPD primers produced 28 reproducible bands of which 18 were polymorphic. Analyses were carried out as in Vandewoestijne and Baguette [29], i.e. allele frequencies were calculated from null homozygote frequencies assuming panmixia and corrected for dominance according to Lynch & Milligan [31] using TFPGA 1.3 (http://www.marksgeneticsoftware.net/tfpga.htm). Fifteen microsatellite loci were amplified in three multiplex reactions [24]. Multiplex amplifications were conducted using QIAGEN Multiplex PCR Kit (QIAGEN) on Veriti Thermal Cycler (Applied Biosystems) with the following protocol: 95°C for 15 min, followed by 32 cycles (94°C for 30 s, 56°C for 90 s, 72°C for 90 s), and 72°C for 30 min, terminating at 25°C. Allele sizes were scored against the internal standard ROX-400SD (Applied Biosystems) using GeneMapper 3.5® (Applied Biosystems). Hardy Weinberg (HW) equilibrium was tested for all microsatellite primer - population combinations, as well as for each population. Probability values were estimated using the Markov chain method implemented in GENEPOP 4.0.10 (http://genepop.curtin.edu.au/). Linkage disequilibrium was tested for all primer pair - population combinations in GENEPOP using the log likelihood ratio statistic. Significance levels were corrected for false positives (i.e. false discovery rate) following the procedure of Benjamini and Hochberg [65]. In the case of departure from Hardy-Weinberg equilibrium, the data was analysed with Micro-Checker ver. 2.2.0.3 [66]. FREENA [67] was used to calculate null allele frequency whenever detected.