Keywords

Conservation Planning

Biodiversity is at risk, and decisions must be made to tackle the biodiversity crisis. In conservation planning, one of the most important tasks, perhaps the most important, is to evaluate the quality and importance of a given area. Many metrics are available for this task, from species richness to endemicity, but these two values do not consider the evolutionary uniqueness of a species (Purvis and Hector 2000). Any useful metric must include the evolutionary value of the species (Rolland et al. 2012): the most important area, and therefore the one selected, is the one that harbors the highest biodiversity, understood not as the highest number of species but as the highest number of unique species or evolutionary fronts.

There are many approaches to phylogenetic diversity and conservation, from community ecology to taxon or area conservation. Given this broad spectrum, the questions asked vary considerably. In community ecology, the approach is to evaluate whether the community is structured with respect to the phylogeny (Cavender-Bares et al. 2009), and a null model is used to represent the null hypothesis: the species by area matrix is shuffled (see Gotelli and Graves 1996), or the species or area labels are shuffled. Here the “support” is closer to the traditional confidence limits and error evaluation.

To evaluate the diversity of an area using phylogenies as a general frame, two main perspectives can be used: evolutionary distinctiveness (ED) or phylogenetic diversity (PD). Evolutionary distinctiveness refers to species-specific measures developed to assign scores to the species and therefore to the areas they inhabit (Vane-Wright et al. 1991). The measures are topology-based indices, calculated as “the sum of basic taxic weights, Q, and the sum of standardised taxic weights, W.” (Schweiger et al. 2008), and are therefore also known as taxonomic distinctiveness indices. Phylogenetic diversity (PD) is a distance-based index that uses the minimum spanning path of the subset in the tree (Faith 1992). Redding et al. (2008) identified some of the major differences between ED and PD. PD is effective only if all the species within the optimal subset are protected, otherwise other optimal subsets are possible; unlike ED, PD is not species-specific and thus does not offer priority species rankings, which are important to species conservation approaches such as the IUCN Red List of Threatened Species. Furthermore, topologies are more stable than branch lengths. Increasing the number of characters or changing the set of characters seldom leads to complete shifts in the relationships among species, whereas branch lengths change considerably from one set of characters to another and only permit statements about the evolution of the data set that generated the topology and the branch lengths (Brown et al. 2010).

Indexes Used

I present the general protocol to evaluate species or areas in a phylogenetic context in Fig. 1. The different indices for each species are calculated to obtain the species phylogenetic values, while the sum of the indices of all species in a given area produces the areal phylogenetic values.

Fig. 1

An example for determining phylogenetic diversity metrics at species and area levels for a hypothetical topology with five species (four widespread), distributed in eight areas (modified from Lehman (2006))

I used the traditional I and W indices created by Vane-Wright et al. (1991), along with the modifications introduced by Posadas et al. (2001) to consider endemicity and widespread species (I e/W e), the size of the topology (I s/W s), or both variables at the same time (I es/W es). The standardization of the indices I and W enables the comparison of topologies with different numbers of species. Consider a topology with three species (I (II III)), with taxon I distributed in area A, II in B, and III in C. Taxon I, and therefore area A, will have a value of 2.0 for indices I and W. In a five-species topology (I (II (III (IV V)))), taxon I will have a value of 8.0 for index I and 4.0 for index W, while the standardized I s for this taxon and the area it inhabits will be 0.5 in both topologies.
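The worked values above can be checked with a short sketch. It is only an illustration, not the code used in the analyses: trees are written as nested Python tuples, and the function names are hypothetical. W is computed by node counting (each terminal's count of enclosing clades, converted to Q and scaled by the smallest Q), as I understand Vane-Wright et al. (1991). The rule used for I (a terminal receives the summed weight of its sister clade) is an assumption inferred from the numbers in the text, and it is implemented only for the case where at least one of the two sister groups is a single terminal.

```python
def node_counts(tree, depth=0, out=None):
    """For each terminal, count the clades (root included) that contain it."""
    if out is None:
        out = {}
    if isinstance(tree, str):
        out[tree] = depth
        return out
    for child in tree:
        node_counts(child, depth + 1, out)
    return out


def w_index(tree):
    """W index: Q = (sum of node counts) / (node count), scaled by the smallest Q."""
    counts = node_counts(tree)
    total = sum(counts.values())
    q = {taxon: total / c for taxon, c in counts.items()}
    q_min = min(q.values())
    return {taxon: value / q_min for taxon, value in q.items()}


def i_index(tree):
    """I index under the ASSUMED rule: a terminal gets the summed weight of its sister clade."""
    if isinstance(tree, str):
        return {tree: 1.0}
    left, right = tree
    wl, wr = i_index(left), i_index(right)
    if isinstance(left, str) and not isinstance(right, str):
        wl = {left: sum(wr.values())}
    elif isinstance(right, str) and not isinstance(left, str):
        wr = {right: sum(wl.values())}
    elif not isinstance(left, str):
        # Two non-terminal sister clades: the rule is not specified in this sketch.
        raise NotImplementedError("sister clades with several terminals each")
    return {**wl, **wr}


def standardize(values):
    """Divide each value by the total for the topology (the 's' variants)."""
    total = sum(values.values())
    return {taxon: value / total for taxon, value in values.items()}


three = ("I", ("II", "III"))
five = ("I", ("II", ("III", ("IV", "V"))))

print(i_index(three)["I"], w_index(three)["I"])   # 2.0 2.0
print(i_index(five)["I"], w_index(five)["I"])     # 8.0 4.0
print(standardize(i_index(three))["I"], standardize(i_index(five))["I"])  # 0.5 0.5
```

Under these assumptions the sketch reproduces the figures quoted in the text: 2.0 for both indices in the three-species topology, 8.0 and 4.0 in the five-species topology, and a standardized I s of 0.5 in both.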

If we consider the distributional pattern of a species, it can be endemic or widespread. We could apply the same index value to all areas where the species is present, but then areas inhabited by widespread species will tend to be selected, because the index values of all taxa present are summed, while an area inhabited only by an endemic taxon will be valued just for the single taxon it contains.

Consider a five-taxon topology (Fig. 1) with four widespread species in areas F, G, and H. If we use index I, these three areas are as important as area A, while using index W they are more important than area A, because each area obtains its final index value as the sum over all species inhabiting it. Areas F, G, and H are selected not because they are inhabited by unique species, as area A is, but because they are inhabited by widespread species. Using I e/W e or I es/W es, the most important area is A, as it contains an evolutionarily unique species that is not found elsewhere.
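The contrast between plain and endemicity-weighted area values can be sketched as follows. The distribution is a hypothetical simplification (species I only in area A, species II to V each in F, G, and H), not the actual data of Fig. 1, and the function name is illustrative. The endemicity weighting is assumed here to divide each species value by the number of areas the species inhabits. The species values are the I and W values of the five-species pectinate topology discussed above.

```python
def area_values(species_values, distribution, endemicity_weighted=False):
    """Sum species index values per area.

    species_values: {species: index value}
    distribution:   {species: set of areas}
    endemicity_weighted: if True, ASSUME each species value is divided by
    the number of areas the species inhabits before summing.
    """
    totals = {}
    for species, areas in distribution.items():
        value = species_values[species]
        if endemicity_weighted:
            value = value / len(areas)
        for area in areas:
            totals[area] = totals.get(area, 0.0) + value
    return totals


# Hypothetical distribution: one endemic species, four widespread species.
distribution = {
    "I": {"A"},
    "II": {"F", "G", "H"},
    "III": {"F", "G", "H"},
    "IV": {"F", "G", "H"},
    "V": {"F", "G", "H"},
}
i_values = {"I": 8.0, "II": 4.0, "III": 2.0, "IV": 1.0, "V": 1.0}
w_values = {"I": 4.0, "II": 2.0, "III": 4.0 / 3.0, "IV": 1.0, "V": 1.0}

plain_i = area_values(i_values, distribution)
plain_w = area_values(w_values, distribution)
weighted_i = area_values(i_values, distribution, endemicity_weighted=True)

print(plain_i["A"], plain_i["F"])        # A ties with F under plain I
print(plain_w["A"] < plain_w["F"])       # F outranks A under plain W
print(weighted_i["A"] > weighted_i["F"]) # A outranks F once endemicity is weighted
```

With this toy distribution, area A ties with F under plain I, falls behind F under plain W, and moves to the top once the values are weighted by endemicity, which is the pattern described in the text.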

Given the plethora of indices to choose from, Winter et al. (2013) raised an important question: “We also call for a comprehensive guideline through the jungle of available phylogenetic diversity indices, with particular respect to the needs of conservationists – which index helps to protect what?”. Part of the answer lies in the support for the decisions made, but in species or area prioritization the literature does not present any kind of support measure (Whiting et al. 2000; Posadas et al. 2001; Pérez-Losada et al. 2002; López-Osorio and Miranda-Esquivel 2010; Prado et al. 2010), nor do the most recent reviews cite any measure to evaluate the stability, confidence, or support of the results (Schweiger et al. 2008; Vellend et al. 2011).

Jack-Knife

In a jack-knife analysis, given a sample of observations and a parameter to evaluate, a subsample is made by eliminating a proportion of the original data, and the parameter is calculated for the subsample. This procedure is repeated n times and the results are summarized. Since the introduction of the jack-knife (Quenouille 1949), researchers have used it to define limits of confidence in many sorts of analyses, from statistics (Efron 1979; Smith and van Belle 1984) and ecology (Crowley 1992) to phylogeny. It has been used not only as a measure of support (Lanyon 1987), but also as a way to obtain the best solution for large data sets (Farris et al. 1996), to test competing hypotheses (Miller 2003), to generalize the performance of predictive models, or for cross-validation to estimate the bias of an estimator. Like bootstrapping, it can be seen as “a measure of robustness of the estimator with regard to small changes in the data” (Holmes 2003).

I use this re-sampling approach to evaluate the support of the area ranking in the context of conservation and phylogeny. Therefore, some questions could be evaluated quantitatively.

Jack-Knife in Conservation

Meta-criteria have been used widely in phylogenetic analysis to define an optimal parameter value, e.g. the incongruence length difference test to define the ts/tv/gap costs (Wheeler 1995) or jack-knife frequencies to evaluate whether concavity parsimony outperforms linear parsimony (Goloboff et al. 2008).

In conservation biology, there must be a measure of the confidence and robustness of the results. A sensitivity analysis, in which part of the information is deleted at random, helps to understand the support of the data, measured as the persistence of a given area in the ranking. Therefore, jack-knife is the appropriate tool to explore how the results respond to perturbations in the data set (Holmes 2003).

In a phylogeny-based conservation analysis, there are three different items to evaluate, as we have three input parameters: the topology, the species in a given topology, and the distribution of a species.

The first question concerns the distributional pattern of the species: what if a locality (and therefore all or some of the species in that area) is not included in the analysis? A species may be missing from a given locality for three reasons: (1) it was never present there; (2) it is locally extinct; or (3) it was not sampled, although it is present in the area. To evaluate this situation, the species can be deleted from a number of areas to quantify the effect of missing information.

The second question arises when a species included in the phylogenetic analysis is not considered in the conservation analysis: what if a species is not included? A missing species will affect the index value, as this depends on the species included in the calculation. In this context, the species is deleted from all the areas it inhabits.

The third question arises when we do not include a given phylogeny: what if a phylogeny is not included? The whole topology might not be available for the conservation analysis, and the ranking of a specific area could depend on a limited subset of phylogenies. Here, the topology, and therefore its species and their distributions, is deleted.

Given the three questions we can decide whether a phylogeny, a taxon or an area is deleted, with different probability values:

  • j.topol is the probability to choose a topology (= p)

  • j.tip is the probability to choose a species (= q)

  • j.area is the probability to choose an area (= r)

In the first scenario, an area is deleted from the distribution of a species with a probability of p × q × r (0 < p, q, r < 1), that is, the probability of selecting the topology, then the species, and then the area. An area could also be removed from the whole analysis; this needs to be run only as many times as there are areas, eliminating a single area each time. It would show the position of the area in the ranking and is equivalent to deleting the area from the final results.

In the second scenario, a species is deleted from a single topology with a probability p × q (0 < p, q < 1, r = 1.0), therefore all areas inhabited by this species will not be included.

In the third scenario, the whole topology is excluded from the analysis with a probability p (0 < p < 1, q = r = 1.0); all the species and areas belonging to that topology will not be included in the analysis.

In all three scenarios, the first decision is made on the topology. Because the number of topologies not included increases with the value of p, the absolute index values become smaller as p increases.

Areas prioritized because of their position in a single topology, or in just a few topologies, would be affected: their index values would be lower, and their position in the ranking might change. If an area is supported by all or most of the topologies, its position in the ranking should be stable, although its index value would be smaller in all the replicates. Therefore, the index values per se are meaningless, but the ranking is informative.
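One replicate of the three nested deletion decisions can be sketched as follows. This is a minimal illustration under stated assumptions, not the Jrich implementation: the function name, the data layout ({topology: {species: set of areas}}), and the toy data are all hypothetical, and the parameters j_topol, j_tip, and j_area stand for p, q, and r.

```python
import random


def jackknife_delete(data, j_topol, j_tip, j_area, rng):
    """Return a perturbed copy of data = {topology: {species: set of areas}}.

    A topology is selected with probability j_topol (p); within a selected
    topology a species is selected with probability j_tip (q); each area of a
    selected species is deleted with probability j_area (r).
    """
    perturbed = {}
    for topology, species_areas in data.items():
        topology_selected = rng.random() < j_topol
        perturbed[topology] = {}
        for species, areas in species_areas.items():
            species_selected = topology_selected and rng.random() < j_tip
            kept = set()
            for area in sorted(areas):  # sorted for reproducible draws
                deleted = species_selected and rng.random() < j_area
                if not deleted:
                    kept.add(area)
            perturbed[topology][species] = kept
    return perturbed


# Hypothetical toy data: two topologies, species names and areas are made up.
data = {
    "topology_1": {"sp1": {"A"}, "sp2": {"F", "G"}},
    "topology_2": {"sp3": {"A", "H"}, "sp4": {"G"}},
}
rng = random.Random(1)

# Third scenario: q = r = 1, so a selected topology loses all its records.
replicate = jackknife_delete(data, j_topol=0.5, j_tip=1.0, j_area=1.0, rng=rng)
print(replicate)
```

Setting j_area to 1 removes a selected species from all its areas (second scenario), and setting both j_tip and j_area to 1 empties every selected topology (third scenario). The area values of each replicate would then be recomputed from the surviving records.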

There is a fourth question, not considered here, related to the length of the branch. This question is valid in the context of Phylogenetic Diversity [PD] (Faith 1992), Genetic Diversity [GD] (Crozier 1992), or total lineage divergence (Scheiner 2012) [a metric similar to PD]. These methods require the precise estimation of the length, therefore the accuracy of the index value depends heavily on the length estimation.

Although Krajewski (1994) considers the use and calculation of divergence in systematics and in conservation to be two separate topics of debate, I consider that the same criticisms of the accuracy of branch length estimation in systematics will have a profound impact on the decisions made when the topology and its branch lengths are used in conservation. As Brown et al. (2010) state, “in any phylogenetic analysis, the biological plausibility of branch-length output must be carefully considered”. Therefore, we must be well aware of the methodological approach used to construct the phylogeny (Rannala et al. 2012).

Additionally, in some cases we must consider the sensitivity of PD value to intra-specific variation (Albert et al. 2012). Therefore, we must take into account the source of the tree (species vs. gene trees) [see for example Spinks and Shaffer (2009)].

Optimal Scenario

Given a data set and n random perturbations on this data, if the index is robust, all (or most) perturbations would yield the same general ranking. Therefore, in the context of conservation in an optimal situation, we would prefer areas that:

  1. Have the same position in the ranking (original and re-sampled), no matter if we delete areas, species, or phylogenies

    = same ranking or position, insensitive to changes in the item(s) deleted.

  2. If not, at least have the same position in the ranking when considering just a subgroup (e.g. be first or second, or first to third).

  3. Have the same position in the ranking (original and re-sampled), no matter the delete probability used (from 0.01 to 0.5).

    = same ranking or position, insensitive to changes in the delete probability.

  4. Or, have the same position for most of the probabilities used, not counting extreme situations such as a delete probability of 0.5.

    = not too sensitive to the probability values used.

In the real world, a scenario that meets the first and third conditions is too strict and may be impossible to fulfill. Therefore, my decision rules to select the best index and the best ranking are based on the second and fourth conditions. The area must have the same position in the ranking considering just a subgroup, from the first to the third position, no matter the type of item deleted, and for most of the probability values.

An alternative measure is to evaluate the behavior of an index and its success as the number of times that a replicate recovers part of the original ranking (e.g. 1st/2nd/3rd), but in any order. The researcher could also consider only the first position in the ranking and evaluate the persistence of this area, or could consider the whole ordered ranking. These last two measures could be too strict and will be sensitive to the smallest perturbation to the data set, while the first to third positions would be enough in terms of conservation planning.
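The success measures discussed above can be sketched as a comparison between the top positions of the original ranking and those of a replicate. The function names, the tie-breaking rule (alphabetical by area label), and the toy values are assumptions made for illustration only.

```python
def top_k(area_values, k=3):
    """Return the k highest-valued areas, best first (ties broken by label)."""
    ranked = sorted(area_values, key=lambda area: (-area_values[area], area))
    return tuple(ranked[:k])


def is_hit(original, replicate, k=3, ordered=True):
    """True if the replicate recovers the original top-k areas.

    ordered=True  requires the same areas in the same positions;
    ordered=False accepts the same areas in any order.
    """
    reference, recovered = top_k(original, k), top_k(replicate, k)
    if ordered:
        return reference == recovered
    return set(reference) == set(recovered)


# Hypothetical area values for the original data and one replicate.
original = {"A": 8.0, "F": 5.0, "G": 4.0, "H": 1.0}
replicate = {"A": 6.0, "F": 3.0, "G": 3.5, "H": 0.5}

print(top_k(original), top_k(replicate))
print(is_hit(original, replicate, ordered=True))    # False: F and G swapped
print(is_hit(original, replicate, ordered=False))   # True: same three areas
print(is_hit(original, replicate, k=1))             # True: A is still first
```

Setting k=1 corresponds to tracking only the first area, while a large k with ordered=True corresponds to requiring the whole ordered ranking.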

Given any measure of success, the re-sampling approach in conservation has some possible applications, such as:

  1. Which is the best index? This also answers the question: what do we want to conserve or use to prioritize?

    The best index would be defined as the most supported index, while the area used would be the one found for most of the probabilities used.

  2. How stable is the ranking (e.g. 1st/2nd/3rd position)?

    This is a variation of the previous question, but focused on the ranking: as we prefer a supported ranking, we might evaluate the support for the original ranking.

Proposed Protocol

Following the expected behavior in an optimal condition, I first evaluated the index. I considered the best index to be the one that most often recovered the original ranking (first to third areas) as an ordered ranking. Then, using the selected index, I evaluated the best area as the one found most often in first place.

I tested six scenarios by modifying the j.topol and j.tip values as follows: j.topol values of 0.50 and 0.32, and j.tip values of 1, 0.50, and 0.32. These values are used only to introduce the concept, but they correspond roughly to strong, mild, and relaxed tests. A value of 1 to delete a species means that all areas for that species will be deleted, while a value of 0.32 means that about one out of three will be deleted. Smaller values such as 0.01 were discarded, as the perturbation to the data would be negligible.

The effect of deleting areas is related to the number of areas inhabited. If the species is endemic to a single area, deleting that area is equivalent to deleting the whole species, while for a widespread species the effect should be minimal with indices such as I e/W e or I es/W es. On these grounds alone we cannot define which is the best index, as the four indices have similar properties. In all cases the probability of deleting areas was 1; therefore, I tested the effect of the topology and the species but not the effect of the distribution.
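The two-step decision rule (first the index, then the area) can be sketched as a tally over replicates. The sketch assumes that the ordered top-three areas have already been computed for the original data and for every replicate; the function names, index labels, and area labels X, Y, Z are hypothetical, and the toy tuples are not results.

```python
from collections import Counter


def best_index(original_top3, replicate_top3):
    """Count, per index, how many replicates recover the original ordered top three.

    original_top3:  {index name: (first, second, third)}
    replicate_top3: {index name: [(first, second, third), ...]}
    Returns the index with the most hits and the full hit table.
    """
    hits = {
        index: sum(rep == original_top3[index] for rep in replicates)
        for index, replicates in replicate_top3.items()
    }
    return max(hits, key=hits.get), hits


def best_area(replicates):
    """Return the area found most often in first place."""
    first_places = Counter(rep[0] for rep in replicates)
    return first_places.most_common(1)[0][0]


# Hypothetical toy input with three replicates per index.
original_top3 = {"index_1": ("X", "Y", "Z"), "index_2": ("Y", "X", "Z")}
replicate_top3 = {
    "index_1": [("X", "Y", "Z"), ("X", "Y", "Z"), ("X", "Z", "Y")],
    "index_2": [("Y", "X", "Z"), ("X", "Y", "Z"), ("Z", "Y", "X")],
}

index, hits = best_index(original_top3, replicate_top3)
print(index, hits)
print(best_area(replicate_top3[index]))
```

In a full analysis the same tally would be repeated for each combination of j.topol and j.tip, so that the chosen index and area can be compared across delete probabilities.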

Number of Replicates

Hedges (1992) presented 1825 as the number of replicates needed to obtain an accuracy of ±1 % for a bootstrap proportion of 95 %. Although the accuracy of the estimated bootstrap or jack-knife value increases with the number of replicates, Pattengale et al. (2010) introduced a stopping criterion that yields lower figures, such as 500 replicates, to obtain robust bootstrap values for a 2500-taxon analysis. I randomized each scenario 10,000 times, which can intuitively be considered an appropriate number of replicates to estimate the jack-knife proportion for conservation purposes.
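The figure of 1825 is consistent with the usual normal approximation for a binomial proportion, n = z² p (1 − p) / d², with z = 1.96 for 95 % confidence. That this is how the figure arises is an assumption of the sketch below, and the function names are illustrative. The same formula, inverted, gives the approximate half-width of the interval for a given number of replicates.

```python
import math


def replicates_needed(proportion, half_width, z=1.96):
    """Replicates needed to estimate a proportion within +/- half_width.

    Uses the normal approximation to the binomial (an assumption of this
    sketch); z = 1.96 corresponds to a 95 % confidence level.
    """
    n = z ** 2 * proportion * (1.0 - proportion) / half_width ** 2
    return math.ceil(n)


def half_width(proportion, n, z=1.96):
    """Approximate half-width of the interval for n replicates."""
    return z * math.sqrt(proportion * (1.0 - proportion) / n)


print(replicates_needed(0.95, 0.01))   # 1825
print(half_width(0.95, 10_000))        # below the 0.01 target
```

Under this approximation, 10,000 replicates give a narrower interval than the ±1 % target for a proportion of 95 %.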

For these analyses, I used a modified version of the program Richness (Posadas et al. 2001) to randomize the data and to perform the index calculations [Jrich: available from https://github.com/Dmirandae/jrich], while the data analyses were performed using the software R (R Core Team 2013) and the figures were prepared using the library ggplot2 (Wickham 2009).

Empirical Examples

First Case: The Original Ranking Does Not Mean Support

Posadas et al. (2001) evaluated the conservation ranking of areas in southern South America. They found that the selected area changed depending on the index used, as the best area could be Santiago (D), Ñuble (F), Valdivia (H), or the Malvinas islands (K). Also, for a single index, the values could be misleading, as the differences between the W index values are quite small, and the ranking could be an artifact rather than a real result (Table 1). I reanalyzed their data set and found that the best index for this analysis is I s (Fig. 2), as this index has the highest jack-knife frequency.

Table 1 First area in the ranking proposed by Posadas et al. (2001). For raw W the index values are 52.62/52.58/52.05. Labels follow Posadas et al. (2001)

Fig. 2

For each index, the number of hits with a delete value of 0.32 or 0.5 for j.topol and three j.tip values of 0.32, 0.5, and 1 (in the last situation, the whole topology is deleted). Species endemism and richness (number of species) are included for comparative purposes (Data from Posadas et al. (2001)). (a) Number of hits with a j.topol value of 0.32 (b) Number of hits with a j.topol value of 0.5

The most stable area using I s or raw W (the second best index) was the Malvinas islands, one of the candidates for best area (Fig. 3). The high uncertainty in the area chosen is eliminated when the support is included in the selection of the best area. Santiago has the highest number of species and harbors the highest number of endemic species, but it was not placed as the highest priority, while the Malvinas islands, the second most endemic area, has the highest priority. Inferences based on the un-sampled data set might be misleading, while jack-knifing can help to decide which is the most supported solution.

Fig. 3

Number of times an area is recovered as the first (number 1), second (number 2), or third area (number 3). These are the lowest (a) and the highest (b) delete probabilities used in this analysis. Acronyms for areas follow Posadas et al. (2001) and Table 1. (a) Probabilities of j.topol = 0.32, j.tip = 0.32 (b) Probabilities of j.topol = 0.5, j.tip = 1

Second Case: The Support for the Original Ranking

There are two main approaches to define Amazonian areas of endemism: eight areas from Bates et al. (1998) and Da Silva et al. (2005), or 16 areas from Da Silva and Oren (1996). López-Osorio and Miranda-Esquivel (2010) used both to establish conservation priorities for Amazonia’s areas of endemism.

Using the areas of Bates et al. (1998), they found that Guiana and Inambari are the first and second priority areas. Inambari is the richest area, while Guiana presents the highest endemicity value. Their inferences were based on W es, chosen on theoretical grounds as the index includes endemicity and standardization (López-Osorio and Miranda-Esquivel 2010).

The reanalysis showed that the best index is either W es, W e, or W s (Fig. 4). These three indices select Guiana as the first area and Inambari as the second area (Fig. 5), as stated in the original paper. In this example the re-sampling reinforces the original findings, giving stronger support to the areas chosen as first and second in the ranking.

Fig. 4

Number of hits with a delete value of 0.32 or 0.5 for j.topol and three j.tip values of 0.32, 0.5, and 1 (in the last situation, the whole topology is deleted) (Data from López-Osorio and Miranda-Esquivel (2010). Areas from Bates et al. (1998)). (a) Number of hits with a j.topol value of 0.32 (b) Number of hits with a j.topol value of 0.5

Fig. 5

Number of times an area is recovered as the first (number 1), second (number 2), or third area (number 3). These are the lowest (a) and the highest (b) delete probabilities used in this analysis. (a) Probabilities of j.topol = 0.32, j.tip = 0.32 (b) Probabilities of j.topol = 0.5, j.tip = 1

Using the areas from Da Silva and Oren (1996), López-Osorio and Miranda-Esquivel (2010) found that, depending on the index, either Guiana2 or Rondonia could be the highest priority area, while the second area could be Guiana3, Inambari2, or even Rondonia or Guiana2. Therefore, the first question is: which is the best index for conservation in Amazonia? And given that index, which areas are chosen as the first and second priority?

López-Osorio and Miranda-Esquivel (2010) found that most indices selected the same area, Guiana2, which could be read as the choice of index making no difference. The reanalysis showed that in general I s and W s are more stable than any other index, and that I s behaves better than W s. As the topologies differ in size, and large topologies with more nodes may have more impact than smaller ones, the standard I and W indices are not stable (Fig. 6). The first area is Guiana2 for all indices used, while the second area varies: Rondonia, Guiana3, or Inambari2 (Fig. 7). These results are similar to those found by López-Osorio and Miranda-Esquivel (2010). Here the re-sampling helped to resolve the initial discrepancy, as the highest priority is Guiana2 and not Rondonia, which was a possible candidate. The second area could be any of the three initially considered, so the evidence is not misleading but is inconclusive for defining the second area, even after re-sampling the data.

Fig. 6

Number of hits with a delete value of 0.32 or 0.5 for j.topol and three j.tip values of 0.32, 0.5, and 1 (in the last situation, the whole topology is deleted) (Data from López-Osorio and Miranda-Esquivel (2010). Areas from Da Silva and Oren (1996)). (a) Number of hits with a j.topol value of 0.32 (b) Number of hits with a j.topol value of 0.5

Fig. 7

Number of times an area is recovered as the first (number 1), second (number 2), or third area (number 3). These are the lowest (a) and the highest (b) delete probabilities used in this analysis (Data from López-Osorio and Miranda-Esquivel (2010). Areas from Da Silva and Oren (1996)). (Acronyms A = Guiana3, B = Guiana2, C = Guiana1, D = Imeri2, E = Imeri1, F = Napo3, G = Napo2, H = Napo1, I = Inambari1, J = Inambari2, K = Inambari3, L = Inambari4, M = Rondonia, N = Tapajos, O = Xingu, P = Belen). (a) Probabilities of j.topol = 0.32, j.tip = 0.32 (b) Probabilities of j.topol = 0.5, j.tip = 1

These brief examples show that the confidence in the original ranking should be evaluated using re-sampling, as a ranking based on the un-sampled data could be unstable when some information (phylogenies or species) is deleted. The results may range from an answer that differs from the original ranking to one that is congruent with it. Only after the re-sampling analysis can the quality of the answer be stated without hesitation. Even if we only calculate the support for a given ranking, the results after re-sampling give an indication of how the ranking behaves when the information is perturbed.