Introduction

Our understanding of the at-sea distribution of marine species has grown exponentially during the last decades, due to the major advances in remote tracking technologies (Greene et al. 2009; Block et al. 2011; Hazen et al. 2012; Hussey et al. 2015; Hays et al. 2016). Data collected by tracking individual animals have become one of the main sources of information to study the activity patterns, foraging behaviour and migratory movements of marine species (e.g. Dias et al. 2011, 2012; Hays et al. 2016). Such studies are also important for the identification of biodiversity hotspots in the ocean (Hazen et al. 2012; Lascelles et al. 2016; Dias et al. 2017). For example, more than 900 marine Important Bird and Biodiversity Areas (mIBAs) have been identified since 2010, now covering over 100 species of seabird and based on animals tagged at more than 70 locations worldwide (Lascelles et al. 2016; Dias et al. 2017).

Most previous studies identifying marine hotspots from tracking data have focused on flying seabirds, marine mammals, sea turtles and some large fish species, such as tuna and sharks (e.g. Block et al. 2011; Maxwell et al. 2011; Lascelles et al. 2016). A standardised method to define important areas for marine conservation based on tracking data has been suggested (Lascelles et al. 2016). This method (hereafter mIBA protocol) was, however, designed for (and applied mostly to) flying seabirds, especially species with very large movement ranges that breed at relatively few large colonies and that have been subject to intensive tracking studies (such as albatrosses and petrels; e.g. Dias et al. 2017).

Recent studies have shown that the mIBA protocol can be easily extended to identify important areas for non-flying animals, such as penguins and pinnipeds (e.g. Augé et al. 2018). However, one major limitation of this method is that tracking data are rarely available for all colonies in a given region. While for some species and regions this might not be important (e.g. very well-studied species, or species that concentrate in a few colonies that are easy to study; Ramos et al. 2013; Augé et al. 2018), for many others this lack of data can be problematic. For example, many seabird colonies in Antarctica are difficult to study with tracking devices, especially where colonies are spread across large areas or located in remote, inaccessible locations. The use of correlative habitat models to identify priority sites for conservation can help overcome this limitation (e.g. Raymond et al. 2015; Wakefield et al. 2017). By identifying which ecological factors influence the distribution of foraging penguins, we can predict the most important at-sea areas for birds breeding at colonies where tracking data are not available. Habitat models can also identify areas that are important to different colonies, highlighting areas used by multiple populations, thereby identifying additional areas of high conservation value (Wakefield et al. 2017; Warwick-Evans et al. 2018).

Despite these benefits, the use of habitat models as tools to prioritise areas for conservation can raise some complex issues. For example, maps of habitat suitability based purely on statistical models (especially if resulting from extrapolation from data collected elsewhere) can be more difficult for policy-makers to accept, due to the lack of direct, empirical evidence of site use. Furthermore, because habitat models are often developed using a fine spatio-temporal scale, mismatches in the spatio-temporal scales needed for site protection and long-term consistency of site occupation can exist (Lascelles et al. 2012), largely because of the dynamic nature of oceanographic conditions, such as variability in currents and sea surface temperature (e.g. Scheffer et al. 2016).

A plausible approach to resolve these issues is to combine both methods for identifying priority sites for at-sea conservation—that is, a direct, empirically based method such as the existing mIBA approach (Lascelles et al. 2016; Dias et al. 2017), and an inferential method based on predictions from correlative habitat models (e.g. Wakefield et al. 2017; Warwick-Evans et al. 2018). In this study, we evaluated if correlative habitat models can be used to complement the standard mIBA approach, especially in areas where tracking data are not available. We applied both methods to 10 datasets containing tracking data from Chinstrap Penguins Pygoscelis antarcticus breeding at four colonies across the South Orkney Islands, and compared the results. Our approach results in a number of important considerations for both methods (mIBA approach and habitat modelling), facilitating future work on the identification of new IBA boundaries for Chinstrap Penguins even when no tracking data are available to apply the standard mIBA protocols (Lascelles et al. 2016).

Materials and methods

Study area, colony information and tracking data

Tracking data were collected using GPS devices deployed on 186 individual Chinstrap Penguins from 4 colonies located in the South Orkney Islands (−60.8°, −45.5°; Fig. 1), corresponding to 24% of the colonies existing in the Archipelago classified as terrestrial IBAs (n = 17; Harris et al. 2015). Data were collected over a period of 4 years, and cover different stages of the breeding cycle: incubation, brood and crèche (Table 1). In combination, the colonies studied hold ca. 110,000 breeding pairs of Chinstrap Penguins (ca. 30% of the population breeding in the South Orkneys; Poncet and Poncet 1985). Tracking data were organised in datasets, each corresponding to a unique combination of data collected in a specific colony, during a unique breeding stage (Lascelles et al. 2016). In some cases (mentioned where appropriate) data for the same colony and stage were available for more than 1 year, so we conducted the analyses separately for each year (Table 1). Details of the deployment procedures can be found in Warwick-Evans et al. (2018).

Fig. 1

Location of the study area. Red dots indicate the location of the colonies of Chinstrap Penguins Pygoscelis antarcticus where the tracking data were collected

Table 1 Summary of the tracking data analysed for Chinstrap Penguins Pygoscelis antarcticus in the South Orkney Islands

Data analysis—mIBA approach

The datasets were analysed following the mIBA protocol (Lascelles et al. 2016), further adapted to account for the specific at-sea behaviour of penguins (Dias et al. in prep; Online Resource 1). The mIBA protocol provides a consistent framework for using animal tracking data to delineate areas of global conservation importance, based on the well-established, standardised criteria used worldwide to identify Important Bird and Biodiversity Areas (Donald et al. in press). In summary, the analysis runs through a number of stages, developed in R using common functions and packages (Lascelles et al. 2016), in order to: (i) determine hotspots of activity for each individual using kernel density analysis, with a smoothing factor of 7 km and a kernel utilisation distribution (kernel UD%, which reflects the probability density of relocating an animal at any place according to the coordinates of this place; Calenge 2006) varying between 55% and 75% depending on the breeding stage (the optimum values for Chinstrap Penguins; Dias et al. in prep; Online Resource 1); (ii) identify the boundaries of areas used intensively by different birds (i.e. areas used by more than 20% of birds from the colony), which at this step represent mIBA candidate sites; (iii) combine this with information on colony size to predict at-sea abundances (by multiplying the percentage of the population using the mIBA candidate sites by the population size of the colony of origin of the tracked birds); and (iv) test these values against IBA criteria to determine whether an area may qualify as an IBA.
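Stages (ii) and (iii) can be illustrated with a minimal sketch. It is written in Python for illustration only (the protocol itself was developed in R) and is not the authors' implementation. It assumes that stage (i) has already produced, for each tracked bird, the set of grid cells inside its core area. The grid-cell representation, the function names and the rule that a bird "uses" a site when its core area overlaps it are all assumptions made here. Stage (iv) is omitted because the IBA criteria values are not given in the text.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # hypothetical (row, col) index of a grid cell


def share_of_birds_per_cell(core_areas: List[Set[Cell]]) -> Dict[Cell, float]:
    """Fraction of tracked birds whose core area includes each grid cell."""
    counts: Dict[Cell, int] = {}
    for cells in core_areas:
        for cell in cells:
            counts[cell] = counts.get(cell, 0) + 1
    return {cell: n / len(core_areas) for cell, n in counts.items()}


def candidate_site(shares: Dict[Cell, float], min_share: float = 0.20) -> Set[Cell]:
    """Stage (ii): cells used by more than min_share of the tracked birds."""
    return {cell for cell, share in shares.items() if share > min_share}


def estimated_abundance(core_areas: List[Set[Cell]], site: Set[Cell],
                        colony_size: float) -> float:
    """Stage (iii): share of birds using the site multiplied by colony size.

    Assumption: a bird "uses" the site if its core area overlaps it.
    """
    users = sum(1 for cells in core_areas if cells & site)
    return users / len(core_areas) * colony_size


# Toy data: core areas of five tracked birds (invented for illustration).
core_areas = [
    {(0, 0), (0, 1)},
    {(0, 1), (1, 1)},
    {(0, 1)},
    {(3, 3)},
    {(1, 1), (2, 2)},
]
shares = share_of_birds_per_cell(core_areas)
site = candidate_site(shares)
abundance = estimated_abundance(core_areas, site, colony_size=1000)
```

With these toy data, only the two cells used by more than one bird in five pass the 20% rule, and four of the five birds overlap the resulting site.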

Data analysis—habitat models and comparison of both approaches

We used the models developed by Warwick-Evans et al. (2018) to evaluate the suitability of correlative habitat models to identify mIBAs for penguins. The details of the modelling procedures are described by Warwick-Evans et al. (2018). In brief, the authors used Generalized Additive Models (GAMs; Wood 2004; R package mgcv) to analyse a set of 15 remotely sensed oceanographic (e.g. mean sea level anomaly, primary productivity and ocean current) and geometric predictors (such as distance from the colony and bearing of the nearest point of the shelf edge from the colony), to model the foraging distribution of Chinstrap Penguins during incubation, brood and crèche, expressed as probability of occurrence in a given at-sea site.

The results revealed that the correlative habitat model with the best statistical support used only bearing and distance from the colony for predicting the locations of foraging dives at any point during the breeding season (Warwick-Evans et al. 2018). Cross-validation tests showed good performance of these models when predicting the locations for other colonies (AUC of 0.89, 0.96 and 0.94 for incubation, brood and crèche, respectively; Warwick-Evans et al. 2018).
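The AUC values quoted above can be read as the probability that a randomly chosen presence location receives a higher predicted value than a randomly chosen absence (or background) location. The sketch below implements that rank-based definition in Python, for illustration only. It is not the validation procedure of Warwick-Evans et al. (2018), whose details are not given here, and the function name and toy scores are hypothetical.

```python
from typing import Sequence


def auc(presence_scores: Sequence[float], absence_scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed by pairwise comparison.

    Each presence/absence pair scores 1 if the presence location has the
    higher predicted value, 0.5 if the two are tied, and 0 otherwise.
    """
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


# Toy predicted probabilities (invented for illustration).
example_auc = auc([0.9, 0.8, 0.4], [0.5, 0.2, 0.1])
```

A value of 0.5 corresponds to a model that ranks presences no better than chance, and a value of 1 to perfect ranking.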

The models developed by Warwick-Evans et al. (2018) provide a useful basis for a comparative approach with the mIBA protocol, given that the variables selected do not vary through time (i.e. only static variables, which change over space but not over time, were selected). We explored this comparison by using the predicted distribution maps for individual colonies to analyse the match with the mIBA sites identified as described above. For each colony, we compared the results of the mIBA protocol with the predicted distribution based on correlative habitat models (created with data from other colonies). By using predicted distributions of Chinstrap Penguins based on habitat models built using data from other colonies, we aimed to evaluate the possibility of using habitat modelling approaches to identify mIBAs around colonies where tracking data are not available, or where study is not possible.

The predicted distributions created by Warwick-Evans et al. (2018) were provided as raster maps. The values of the raster cells reflect the probability of occurrence of foraging penguins on a scale from quasi-zero to 1. To identify which value of probability of occurrence should be used to identify the most important sites (i.e. to select the cells with the highest habitat suitability values, which delineate the boundaries of priority sites, hereafter ‘model hotspots’), we tested 20 different values, ranging from the 90% quantile to the 99.5% quantile of the cell values after excluding the quasi-zero values (see details in Online Resource 1). These tests were performed to inform this and future studies about an appropriate threshold for extracting mIBA boundaries from the predicted distributions output by habitat models. The match between the model hotspots (i.e. the areas resulting from applying the different thresholds to the model outputs) and the results from the mIBA approach was evaluated using three complementary metrics (see details in Online Resource 1):

  • Model-IBA overlap: the percentage of model hotspots that overlap with the IBA, weighted by the relative importance of the hotspot cells (measured by the value of probability of occurrence given by the model); higher percentages represent a better fit (maximum 100% indicates that the entire model hotspot is included within the IBA);

  • Percentage of less suitable cells in the IBA: the percentage of the less suitable cells (modelled values below the median value of all cells, after excluding the quasi-zero values) that overlap with the IBA; this reflects the quantity of priority areas that were not included in the final results due to poor model performance in identifying them as highly suitable; lower percentages represent a better fit (minimum 0% indicates that the model is not missing any intensively used area);

  • Percentage of birds: average percentage of the core areas of the tracked birds (from the IBA analysis) included in the model hotspot. To estimate this we used the maps with the percentage of birds using each area, and estimated the average percentage values included in the model hotspot. We then compared these values with the average values in the IBA, using a bootstrap test (by randomly selecting the same number of cells in IBA and model hotspot, and repeating the procedure 1000 times); percentages higher than 20% represent a fit better than expected by chance.
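The thresholding step and the three metrics listed above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code used in the study. Rasters are represented as dictionaries keyed by a hypothetical (row, col) cell index. The quasi-zero cut-off `QUASI_ZERO` is an invented value (the actual definition is in Online Resource 1). The quantile uses linear interpolation. The second metric takes the less suitable cells as its denominator, which is one reading of the definition above. The bootstrap comparison of the third metric is not reproduced.

```python
from typing import Dict, Iterable, Set, Tuple

Cell = Tuple[int, int]  # hypothetical (row, col) index of a raster cell
QUASI_ZERO = 1e-6       # assumed cut-off; the source defers this to Online Resource 1


def quantile(values: Iterable[float], q: float) -> float:
    """Quantile with linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (pos - lower) * (ordered[upper] - ordered[lower])


def model_hotspot(prob: Dict[Cell, float], q: float) -> Set[Cell]:
    """Cells at or above the q-quantile of the non-quasi-zero values."""
    threshold = quantile([p for p in prob.values() if p > QUASI_ZERO], q)
    return {cell for cell, p in prob.items() if p >= threshold}


def model_iba_overlap(prob: Dict[Cell, float], hotspot: Set[Cell],
                      iba: Set[Cell]) -> float:
    """Metric 1: % of the hotspot inside the IBA, weighted by modelled probability."""
    inside = sum(prob[c] for c in hotspot & iba)
    return 100.0 * inside / sum(prob[c] for c in hotspot)


def less_suitable_in_iba(prob: Dict[Cell, float], iba: Set[Cell]) -> float:
    """Metric 2: % of below-median cells (quasi-zero excluded) that fall in the IBA."""
    usable = {c: p for c, p in prob.items() if p > QUASI_ZERO}
    median = quantile(usable.values(), 0.5)
    less = {c for c, p in usable.items() if p < median}
    if not less:
        return 0.0
    return 100.0 * len(less & iba) / len(less)


def mean_percentage_of_birds(percent_birds: Dict[Cell, float],
                             hotspot: Set[Cell]) -> float:
    """Metric 3: average % of tracked birds using the hotspot cells."""
    return sum(percent_birds.get(c, 0.0) for c in hotspot) / len(hotspot)


# Toy one-row raster with probabilities 0.0, 0.1, ..., 0.9 (invented values).
prob = {(0, i): i / 10 for i in range(10)}
iba = {(0, i) for i in range(4, 9)}
percent_birds = {(0, 7): 40.0, (0, 8): 60.0, (0, 9): 20.0}

hotspot = model_hotspot(prob, q=0.75)
overlap = model_iba_overlap(prob, hotspot, iba)
less_suitable = less_suitable_in_iba(prob, iba)
birds = mean_percentage_of_birds(percent_birds, hotspot)
```

In practice the same functions would be evaluated once per candidate threshold. One way to obtain 20 candidates between the 90% and 99.5% quantiles is an evenly spaced grid in steps of 0.5%, although the spacing actually used is an assumption here.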

All the analyses were performed in R (R Core Team 2016).

Results

The match between the mIBAs and the correlative habitat models was generally high (Table 2, Fig. 2). A high percentage of the model hotspots (i.e. cells of high values of habitat suitability; see methods) fall inside the mIBAs (65%–100% during incubation, 81–95% during brood; variable Model-IBA overlap in Table 2). Also, only a null or negligible percentage of the less suitable cells (modelled values below the median) coincided with mIBAs (always < 1%; variable Less suitable cells in IBA in Table 2).

Table 2 Comparison between the marine IBA approach and the habitat modelling approach applied to Chinstrap Penguins Pygoscelis antarcticus tracked in the South Orkney Islands
Fig. 2

Comparison between the results of the mIBA protocol and the predictive maps based on correlative habitat models. Background maps show the predicted probability of occurrence of Chinstrap Penguin Pygoscelis antarcticus from habitat models; dark blue dashed polygons represent the boundaries of model hotspots; red polygons represent the boundaries of candidate marine IBAs

The areas highlighted by the models (corresponding to 93% to 99.9% higher values; Table 2) were always considerably smaller than the mIBAs (Table 2 and Fig. 2). Nevertheless, and in all cases, these smaller model “hotspots” overlapped with the most important sites within the mIBAs; on average, model “hotspots” encompassed 42% of the core areas of the birds tracked (28%–73%), more than the minimum percentage required to be included in the IBAs (set to cover a minimum of 20% of the birds—see methods and previous results; variable Percentage of birds in Table 2).

Results in most cases had higher congruence during brood and crèche than during incubation (Table 2). The threshold that provided the best fit between both approaches was always higher than 90%, and in most cases (9 in 10) higher than 95% (global average 97%).

Discussion

This study represents a significant advance in the development of new methods to identify priority at-sea sites for the conservation of central place foraging seabirds and land-based marine mammals. By comparing and combining two well-established but independent approaches (mIBA identification and habitat modelling), we have made progress in overcoming some of the limitations of each method (i.e. the lack of tracking data to apply the mIBA method in all relevant colonies, and the difficulty of translating habitat preference surfaces into well-delimited areas of conservation priority), using the Chinstrap Penguin as a case study. We show that, for this species, correlative habitat models can predict the boundaries of mIBAs very well.

Comparisons between the mIBA approach and the habitat models showed a high overlap between the areas highlighted as most important by both methods (Table 2). The “model hotspots” tended to capture the mIBAs very well and, within this, the sites used by a higher percentage of the birds (Table 2 and Fig. 2). Furthermore, these model hotspots were smaller than the mIBAs, and always corresponded to a very small percentage (< 10%) of the extent of the areas modelled, so do not overestimate the important areas to conserve.

We note, however, that the threshold values used to delineate the model hotspots (i.e. values of habitat suitability considered sufficiently high to be included in the hotspots) were chosen based on the quality of the match with the mIBA approach. While this can partially explain the high quality of the overlap between both results (although not totally, as the models could potentially highlight different areas, given that they were created using data from colonies other than the one for which the predictions were made), the rationale for this approach was to find the optimum values and provide some guidance for future applications of the method. Threshold values corresponding to the highest 3% ± 2% of most important cells (after excluding the quasi-zero values; see methods) were the best options to coincide with the mIBA sites, and this threshold should be used in future applications of this approach. We note also that the variability of the results between stages (brood vs. incubation) was considerably higher than the variability between colonies. This provides additional confidence when applying a model to a new (untracked) colony, provided that the phase of the breeding period is known for the data from which the model was built. Naturally, with the advent of new tracking data, further review of this guidance is feasible.

The models developed for the brood stage matched the mIBA approach better than those developed for the incubation stage. This is probably because foraging trips were shorter during brood (see above; Kato et al. 2009), and thus birds were more constrained, with less flexibility to deviate from foraging trajectories, resulting in higher model performance (Warwick-Evans et al. 2018). The areas identified by the models and by the mIBA protocol during brood were consequently smaller than the ones identified during incubation, and in most cases included within the latter (see Fig. 2). Therefore, we suggest that future work should prioritise modelling the at-sea distribution of birds during brooding in order to identify priority sites, as the areas resulting from these predictions are more likely to correspond to the mIBAs, and also more likely to include the areas used during incubation.

We should also highlight that these analyses were based only on data collected for Chinstrap Penguins. The foraging distribution of this species during the breeding period is mostly driven by static factors (such as distance to the colony and bearing of the nearest point of the shelf edge from the colony; Warwick-Evans et al. 2018). The lack of temporal dynamism of these variables can have a positive influence on the efficiency of the extrapolations of the models to other colonies and on the utility of habitat models in predicting marine IBAs. Also, the fact that Chinstrap Penguins have such a spatially predictable foraging strategy (as evidenced by the very good performance of the habitat models; Warwick-Evans et al. 2018) and prey mostly on a super-abundant species (Antarctic krill Euphausia superba; Lishman 1985) may also have contributed to the good results found in this study. However, the same might not hold true for other species that are more reliant on dynamic variables, have different foraging strategies and/or feed on less predictable types of prey. Therefore, we suggest, as a precautionary approach and as a first step, that only habitat models for Chinstrap Penguins should be used to identify mIBAs. Nevertheless, given that the distance to the colony is a key factor in shaping the foraging behaviour of seabird species during the breeding season (Wakefield et al. 2009), we anticipate that these results will be mirrored in similar studies with other species.

Finally, we also note that the extrapolation performance of habitat models can be highly variable (e.g. Randin et al. 2006; Torres et al. 2015). The models we used performed considerably better at a more local scale (i.e. when extrapolating to nearby colonies) than at larger scales (Warwick-Evans et al. 2018), so we recommend some caution when using habitat models built from data collected in distant locations and/or under contrasting environmental conditions.

Conclusions and recommendations

In this study we showed, for the first time, that maps of predicted distributions from correlative habitat suitability models (built from data collected from other colonies) can be used with a high degree of confidence to identify mIBAs for Chinstrap Penguins. Results obtained with data collected during the brood phase were consistently better than those during incubation.

Given the difficulty in collecting data at many important colonies of penguins, the results shown here open a new possibility for the designation of a complete network of mIBAs for penguins in Antarctic waters. Nevertheless, tracking data should be used where possible to identify mIBAs following the already established protocols (e.g. Lascelles et al. 2016; Dias et al. in prep). Results from Trathan et al. (2018) highlight that tracking data are best derived from the region under consideration. However, in cases where tracking data are not available (or cannot be collected), correlative habitat models represent a robust alternative, especially if parameterised mainly with ‘static’ variables that can be easily and freely obtained from public databases, and which are not subject to temporal variation (Warwick-Evans et al. 2018). A broader application of this methodology around other important penguin colonies, e.g. Chinstrap Penguins breeding at the South Shetland Islands (Trathan et al. 2018), or to other central place foragers (with the necessary testing as developed in this study) could improve the basis for a precautionary, but still evidence-based, management of the fisheries in Antarctica.