18.1 Introduction

Which facets of biodiversity are changing, and what is the magnitude and direction of these changes? How is biodiversity responding to the variety of human pressures? Are the management policies put into place effective in tackling the impact of those pressures? While the scientific community has been addressing these questions for decades, the information gap in biodiversity science remains a major obstacle for reducing the large uncertainties associated with answering those questions. Technological advances, collection of data by an increasing number of scientists and volunteer citizens, and increased access to Earth observations (EO) should help reduce this gap. Yet quantitative information is still limited, as is its ability to inform important international commitments such as the Aichi Biodiversity Targets in response to the global biodiversity crisis (Tittensor et al. 2014). Data collection and monitoring protocols are often adopted by scientists, public administrations, and environmental organizations with no effective international coordination, and there is no consensus on adopting priority metrics to quantify biodiversity change. While strengthening efforts to reduce the multiple biases present in biodiversity data remains critical (including spatial, temporal, and taxonomic biases, among others; Meyer et al. 2016; Proença et al. 2017), parallel efforts are needed to consolidate data from in-situ and remote sensing (RS) EO so as to increase their usability and information value.

The concept of essential biodiversity variables (EBVs) was proposed in 2013 as a framework to prioritize, integrate, and consolidate biodiversity observations and monitoring programs worldwide (Pereira et al. 2013). Since then, EBVs have gained acceptance among scientists, along with the interest and endorsement of the policy-making community, including the Convention on Biological Diversity (e.g., Decision XI/3 in UNEP/CBD/COP/11/35) and the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES). By providing an integrative framework for quantifying biodiversity change over time, EBVs also hold great potential for advancing research on biodiversity and responses to pressures and conservation actions. However, the concept is still evolving, and divergent viewpoints have emerged on what actually constitutes an EBV. Here we discuss recent progress in defining an operational EBV framework and the importance of this framework for biodiversity data integration. We start by discussing key attributes of EBVs. We then describe recent conceptual developments in support of their implementation. Finally, we illustrate the role of biodiversity models as a cornerstone for integrating data obtained from in-situ and satellite RS EO to support global assessments of biodiversity change and as a critical component of a global biodiversity monitoring system (Geller et al., Chap. 20).

18.2 The EBV Framework

18.2.1 Definition of Essential Biodiversity Variables

EBVs are defined as a minimum set of complementary measurements needed to detect and document biodiversity change across all levels of biodiversity, from genes to species and ecosystems (Pereira et al. 2013). EBVs are part of a larger family of Essential Variables (EVs) that was first conceptualized by the climate community with the Essential Climate Variables (Box 18.1).

Like all EVs, EBVs must meet the criteria of feasibility, cost-effectiveness, and scientific and policy relevance. Additional characteristics that might be specific to the EBVs are generalization (to the best extent possible) across terrestrial, marine, and freshwater realms and scalability. Importantly, the EBVs evolved from initially covering multiple aspects of the Driver-Pressure-State-Impact-Response (DPSIR) framework to focusing exclusively on biological state variables (i.e., EBVs describe the condition or the status of a particular biological entity). This is not to say that nonbiological variables are irrelevant for EBVs. On the contrary, some of these variables, such as temperature, fire occurrence, or elevation, may be extremely important (e.g., as covariates in biodiversity models); however, they do not constitute EBVs themselves. Furthermore, EBVs can be analyzed in relation to other variables to attribute biodiversity change to specific pressures and drivers (Pereira et al. 2012), to predict how different biodiversity metrics might behave with different scenarios of change (Kim et al. 2018), and to assess the effectiveness of management policies for biodiversity and ecosystem services (Geijzendorffer et al. 2016).

EBVs are best understood as the level of integration between primary observations, including in-situ and RS EO, and indicators of biodiversity change, calculated for a given spatial reporting unit (country, set of protected areas, etc.; Fig. 18.1). The power of EBVs emerges from their flexibility to incorporate new data as technology evolves and/or more exhaustive primary data are collected. This is already the case with the advent of citizen science and the technical progress made with, for instance, metagenomics, metabarcoding, field sensor networks, and RS (Turner 2014; Bush et al. 2017; Haase et al. 2018; Muller-Karger et al. 2018a). This means that the underlying measurement, coverage, and frequency of primary observations are likely to change (Fig. 18.2). Likewise, the needs of end users in terms of biodiversity change indicators have evolved in the past and will continue to do so. However, the EBVs are designed to remain conceptually stable, making them adaptable to different and unforeseen users’ needs. For example, even though the methods used to acquire and integrate data on species occurrence, and the indicators it can inform on, are likely to change in the future, the species distribution variable remains essential.

Fig. 18.1

EBVs are intermediate products between primary observations and biodiversity change indicators. Observations obtained with different methods and protocols require different levels of integration, often with the use of biodiversity models, to consolidate the information in an EBV. The EBV cube typically structures biological measurements in a space defined by geographic and temporal references and a biological entity, such as species or ecosystem class. While end users (including scientists, managers, public administrations, and international policy forums and bodies) determine the need for indicators, they also influence the implementation of observation systems. However, the EBV remains the same so that it is complemented with new primary data, e.g., from repeated in-situ surveys or future satellite missions

Fig. 18.2

Framework of the six classes of EBVs grouped by species-focused and ecosystem-focused approaches

Box 18.1 Essential Variables

Essential variables (EVs) emerged from the need for openly available data sets with transparent production processes that offer an appropriate spatial and temporal coverage to allow their use in policy- and decision-making (Bojinski et al. 2014). As a result, EVs are meant to allow the development of indicators that can support dynamic users’ needs while being resilient to changing and/or evolving observation systems (Reyers et al. 2017). From a pool of candidate variables, both science and technology will determine which are feasible, cost-effective, and, most importantly, relevant, and thus essential (Bojinski et al. 2014). In practice, although EVs can be interpreted and adapted differently among disciplines, the process of their development and endorsement remains similar, with a community of practice that self-organizes to provide both the scientific foundation (research, data, monitoring) and technical guidance to produce those EVs.

EVs were first adopted by the climate community as the Essential Climate Variables (ECVs) in the early 1990s, to respond to the needs of Parties of the United Nations Framework Convention on Climate Change and the Intergovernmental Panel on Climate Change, but the concept has since been expanded to go beyond climate science, including with the Essential Biodiversity Variables (EBVs) and the Essential Ocean Variables (EOVs, Miloslavich et al. 2018). While there is value in increasingly expanding the concept to other domains, a coordinated approach within disciplines to define and prioritize the EVs and avoid the duplication of efforts is currently being discussed within the Group on Earth Observations (GEO). One example is the joint effort by the Marine Biodiversity Observation Network of the Group on Earth Observations Biodiversity Observation Network (GEO BON) and the Global Ocean Observing System (GOOS) of the Intergovernmental Oceanographic Commission (IOC) to streamline the marine observations that underpin the EBVs and EOVs (Muller-Karger et al. 2018b). The discussion on EVs is also permeating other domains, such as agriculture, health, and disaster risk reduction (Reyers et al. 2017).

18.2.2 A Space-Time-Biology Cube

The data structure of an EBV can be described as a hypercube and is analogous to a multidimensional data array in computer programming. The first two dimensions of the hypercube are space (latitude and longitude) and time, while the third dimension represents biological entities (Fig. 18.1). The latter dimension can, for instance, describe taxonomy in a species-centered EBV (see below), with values that record presence/absence or population abundance (e.g., Kissling et al. 2018a). For ecosystem-level EBVs, this dimension can also inform ecosystem structure metrics (e.g., extent of different habitat types) or functions (e.g., primary productivity) in the time-space coordinates. The hypercube thus provides an intuitive representation of the EBV concept and at the same time translates directly into a data structure that suits implementation. Unlike the Essential Climate Variables (ECVs), however, the biological dimension of the EBVs makes them especially challenging in terms of developing the conceptual framework and producing the EBV data products. Some EBVs are also difficult to represent with three dimensions, even more so when their value is likely to change depending on the spatial scale and extent, as is the case for community spatial turnover.
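As a minimal sketch of this data structure, the following Python snippet stores a toy species-centered EBV as a nested array indexed by biological entity, time, latitude, and longitude (space takes two array axes here). The class name `EBVCube`, its methods, the placeholder species name, and all values are hypothetical and chosen only for illustration; an operational system would more likely rely on dedicated array libraries and file formats.

```python
from dataclasses import dataclass


@dataclass
class EBVCube:
    """Toy EBV hypercube laid out as values[entity][time][lat][lon]."""
    entities: list        # biological axis, e.g. species names
    times: list           # temporal axis, e.g. survey years
    lats: list            # grid-cell centre latitudes
    lons: list            # grid-cell centre longitudes
    values: list = None   # filled in __post_init__ when not supplied

    def __post_init__(self):
        if self.values is None:
            # None marks cells for which no estimate exists yet.
            self.values = [[[[None for _ in self.lons] for _ in self.lats]
                            for _ in self.times] for _ in self.entities]

    def _index(self, entity, time, lat, lon):
        # Translate axis labels into integer positions along each axis.
        return (self.entities.index(entity), self.times.index(time),
                self.lats.index(lat), self.lons.index(lon))

    def set(self, entity, time, lat, lon, value):
        e, t, i, j = self._index(entity, time, lat, lon)
        self.values[e][t][i][j] = value

    def get(self, entity, time, lat, lon):
        e, t, i, j = self._index(entity, time, lat, lon)
        return self.values[e][t][i][j]

    def time_series(self, entity, lat, lon):
        """Slice along the time axis for one entity in one grid cell."""
        return [self.get(entity, t, lat, lon) for t in self.times]


# Made-up example: presence (1) / absence (0) of one placeholder species.
cube = EBVCube(entities=["species_a"], times=[2000, 2010],
               lats=[37.5, 38.5], lons=[-6.5, -5.5])
cube.set("species_a", 2000, 37.5, -6.5, 1)
cube.set("species_a", 2010, 37.5, -6.5, 0)
series = cube.time_series("species_a", 37.5, -6.5)
```

Slicing along the time axis, as `time_series` does, is the operation that turns such a cube into a record of change for one entity at one location; cells left as `None` make the sparsity of the underlying observations explicit.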

18.2.3 Six EBV Classes

Each EBV measures a particular attribute (property) of a given entity (object). EBVs are grouped into six broad classes based on similarities and differences in the attributes and entities they address (Fig. 18.2). These classes are sets of variables describing the structure, composition, and function of biodiversity across its hierarchical levels (Noss 1990). The entity addressed by an EBV can be of two broad types, distinguished by the approach used to define the set of organisms forming this entity.

In the first approach, entities are formed by grouping organisms primarily on the basis of their species identity. In other words, EBVs of this broad type measure particular attributes of species—i.e., genetic diversity within a species in the case of the Genetic Composition Class; distribution and abundance of a species in the Species Populations Class; and traits of a species in the Species Traits Class.

The second approach to forming entities involves grouping organisms primarily on the basis of where they occur. EBVs of this broad type measure collective attributes of the entire ecosystem formed by all of the organisms occurring within a defined area (most typically an individual cell within a regular grid)—i.e., structural attributes of the ecosystem in the case of the Ecosystem Structure Class; functional attributes of the ecosystem in the Ecosystem Function Class; and various dimensions of compositional diversity (e.g., taxonomic, genetic/phylogenetic, functional) of organisms occurring within the ecosystem in the Community Composition Class.

The relationships between these six EBV classes are depicted in Fig. 18.2. A few key aspects of this overall typology are worth noting. First, the two broad approaches to defining the entity addressed by an EBV, species-focused and ecosystem-focused, essentially work with the same pool of individual organisms but view these organisms from two different perspectives: one grouping organisms according to species identity and the other according to location. While the entity employed in species-focused EBVs will typically be defined primarily on the basis of species identity, this could in some instances be qualified to focus, for example, on the population of a species occurring in a particular area. Likewise, ecosystem-focused EBVs might, in some instances, focus on measuring collective attributes of a particular subset of organisms occurring in an ecosystem rather than all organisms, with this subset defined in terms of taxonomy (e.g., all birds) or any other trait of interest (e.g., all pollinators). Finally, it is important to caution against directly equating the species-focused versus ecosystem-focused typology with major sources of in-situ versus RS observation. Many different sources and types of data can, and should, contribute to the population of EBVs across this entire framework. Any given EBV class can typically be populated using data from multiple sources of in-situ and remote-sensing observation, and any given source will typically contribute data to more than one EBV class. For example, EBVs in the Community Composition Class could be populated with data both from RS of compositional diversity (e.g., Morsdorf et al., Chap. 4) and from aggregation of in-situ species observations and models (e.g., Pinto-Ledezma and Cavender-Bares, Chap. 9), with the latter also contributing data simultaneously to the Species Populations Class.

18.3 Production Workflows for EBVs

The estimation of EBV information products typically involves multiple levels of data integration, from the collection of raw observations to the production of a final, consistent information set that provides comparable measurements in space and time. Data integration procedures need to be customized for almost every EBV, since they need to accommodate highly diverse biological quantities that are often specific to a particular EBV. Designing open, consistent, and fully reproducible workflows is key to supporting the full operationalization process, from data collection to publication of an EBV product that is ready to use for multiple science and policy purposes.

18.3.1 The Need for Open EBV Workflows

Workflows are defined as precise descriptions of data processing from one analytical step to another in a formal language. In recent years, the growing availability of biodiversity data, novel analytical capabilities, and virtual infrastructures have laid the foundations for producing better integrated and more detailed information for measuring biodiversity change (e.g., Jetz et al. 2012; Hansen et al. 2013). However, the increasing variety of analytical procedures and project-specific designs also means that analytical standards are difficult to establish (Borregaard and Hart 2016). Open workflows help preserve processing steps and support data interoperability and the automation of biological and environmental data integration (e.g., via virtual biodiversity e-infrastructures; La Salle et al. 2016). These workflows also require the provenance of derived products to be recorded so that others can understand the relationships among data, processing, and results (Michener and Jones 2012), which in turn facilitates product updating as new data and better processing algorithms become available. All these aspects are critical in the EBV framework since the production of EBVs depends on large research collaborations built on the basis of knowledge transfer and open access to data and production protocols.
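To make the idea of recorded provenance concrete, the sketch below wraps each processing step so that its name, parameters, and short fingerprints of its input and output are appended to a log. It is an illustration under stated assumptions rather than an established EBV tool: the helper names (`fingerprint`, `run_step`), the two example steps, and the toy records are all hypothetical.

```python
import hashlib
import json


def fingerprint(obj):
    """Short, stable hash of any JSON-serializable object."""
    payload = json.dumps(obj, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_step(name, func, data, provenance, **params):
    """Apply one processing step and append a provenance entry for it."""
    result = func(data, **params)
    provenance.append({
        "step": name,
        "params": params,
        "input": fingerprint(data),
        "output": fingerprint(result),
    })
    return result


def drop_missing(records):
    # Hypothetical step: discard records without a count.
    return [r for r in records if r["count"] is not None]


def scale_counts(records, area_km2):
    # Hypothetical step: convert counts to densities per square kilometre.
    return [{**r, "density": r["count"] / area_km2} for r in records]


# Made-up input records.
raw = [{"site": "s1", "count": 10}, {"site": "s2", "count": None}]
log = []
clean = run_step("drop_missing", drop_missing, raw, log)
final = run_step("scale_counts", scale_counts, clean, log, area_km2=2.0)
```

Because the output fingerprint of one step equals the input fingerprint of the next, the log links data, processing, and results into a chain that can be replayed when new data or better algorithms become available.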

At present, fully operational workflows that facilitate the automated and widespread production of EBVs are missing. However, recent efforts have identified critical steps and bottlenecks for the definition of workflows in support of the production of specific EBVs.

18.3.2 From Data Collection to Biodiversity Models

Generic workflows have been outlined so far for the production of a few species-centered EBVs, including species distributions, population abundances, and species traits (Kissling et al. 2018a, b). For example, 11 steps have been identified to build spatially continuous and temporally consistent EBV products for species distributions by integrating multiple data sources, including traditional direct species observations collected in many different ways, automated records from sensor networks (such as camera traps and sound detection), and emerging uses of satellite remote sensing (RS) for detecting species (Kissling et al. 2018a). These workflows pay special attention to the integration of in-situ observations and RS data. Other approaches may use in-situ observations only as ground-truth data, while the rest of the process is dominated by image processing (e.g., mapping vegetation cover; Hansen et al. 2013). Even then, traceability of the ground-truth sampling and processing remains equally important, so similar principles of annotation, uncertainty reporting, and conformance with data management guidelines apply to the entire process (see below).

The key workflow steps can be summarized into three main groups (Fig. 18.3):

  1.

    Standardization of primary biodiversity observations. At the core of the EBV concept is the aggregation of primary observations from multiple sources into a harmonized product that provides more comprehensive and richer information than each individual data set. Before this aggregation can take place, primary data must be curated, standardized, and annotated with appropriate metadata that record characteristics such as location, time, measurement units, and, ideally, sampling designs, collection procedures, and data quality control procedures (Rüegg et al. 2014). For example, for Species Populations EBVs, harmonized observations would consist of sets of species occurrence and abundance data expressed in appropriate units (such as species occupancies and number of individuals per unit area, respectively) complemented with metadata in appropriate standards such as the “Darwin Core Standard” with the “Event Core” extension (Wieczorek et al. 2012), which makes it possible to capture monitoring protocols and sampling efforts together with the data (https://www.gbif.org/darwin-core). Full documentation of sampling events using adequate metadata standards not only is critical for facilitating reuse of data by secondary users but also provides important information for quantifying the associated uncertainty and eventually applying correction techniques in subsequent steps. In practice, for a decade or so, the critical importance of annotating data with standard metadata has guided data management practices (e.g., in the context of long-term ecological research networks; Michener et al. 2011). However, poor data practices that ignore the annotation of metadata or that fail to adopt interoperable formats are still common for many biodiversity data sets, including those accessible through public data archives of scientific journals (Roche et al. 2015). These deficiencies constitute a major bottleneck for building EBVs (Hugo et al. 2017).

  2.

    Primary data aggregation. A second set of steps leads to the production of consolidated data products that typically conform to all or most of the following characteristics: They contain consistent biological quantities expressed in the same measurement unit; other relevant biological attributes have been checked and harmonized (e.g., into a harmonized taxonomy or a consistent typology of traits or of ecosystem types); spatial and temporal references are matched; and data uncertainties have been quantified. Standardized observations need to be checked at this stage using quality control (QC) mechanisms that are documented transparently (Rüegg et al. 2014), for example, looking at outliers to ensure data quality. Collation of data in support of user requirements will ideally be automated using virtual infrastructures that are able to map the different (standardized) data sets with metadata into fully interoperable formats (Hugo et al. 2017). As detailed in Kissling et al. (2018b), an excellent example of this for the Species Traits EBVs is the Global Plant Phenology Data Portal (https://www.plantphenology.org), a platform that integrates phenology observations from three different networks using disparate data frameworks (Stucky et al. 2018). Key for this integration was the design of a new “Plant Phenology Ontology” that was able to provide a semantic framework as a basis to overcome interoperability problems produced by network-specific terminologies for data recording. Finally, data integration needs to deal with, and report on, uncertainties resulting from errors that may propagate throughout the different EBV production steps, including uncertain geographic locations of in-situ data, heterogeneous sampling methods and efforts (Proença et al. 2017), and measurement errors.

  3.

    Model-based estimation. Final EBV products ideally provide continuous information in space and at different time periods so biodiversity change can be measured throughout the entire spatial domain. This is the case for EBVs that can be directly estimated using algorithms applied to satellite RS imagery with complete area coverage. On the contrary, for many EBVs that are primarily estimated from in-situ data, an additional level of integration is required to overcome the sparsity of data. Biodiversity models provide this level of integration by combining the strengths of in-situ observations and state-of-the-art RS products based on correlative or deductive approaches (Jetz et al. 2012; Ferrier et al. 2017). For instance, species distribution models are often based on a correlative relationship between environmental variables and the probability of the occurrence of a species. These models are calibrated or trained using species occurrence and sometimes absence data as response variables and environmental variables as predictors. The probability of occurrence of a species can be spatially interpolated between the observation points because environmental variables are available as continuous surfaces (i.e., wall-to-wall), which are themselves generated from models using in-situ and EO data. In deductive habitat modeling, expert-based assessment of the habitat preference and environmental constraints of species is used to refine the potential species distribution. When habitat predictors are available in high resolution, this makes it possible to go from coarse potential species distributions to fine-grain species distributions because species also respond more locally to habitat variables than to, for instance, climate (Triviño et al. 2011; Martins et al. 2014). Other EBVs can also be projected with models that integrate in-situ observations with RS data and other environmental data. 
    Community composition variables such as the beta diversity between two sites can be projected from climate and other variables using generalized dissimilarity models (Ferrier et al. 2007), while alpha and gamma diversity can be projected from land-use using the countryside species-area model (Pereira and Borda-de-Água 2013). Hence, environmental predictors derived from RS constitute the backbone of higher-resolution EBV products that are consistent in space and time. However, it is important to note that such model-based EBVs provide information that is fundamentally different from the aggregated data sets described in the preceding steps and that while it improves the spatial and temporal coverage of the data set, it also introduces additional uncertainties that need to be documented.
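The first two groups of steps above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the workflow of Kissling et al.: the raw field names, the species names, the synonym table, the `qcOutlier` flag, and the outlier rule (a score based on the median absolute deviation, with a cutoff of 3.5) are hypothetical. Only the target keys `scientificName`, `eventDate`, `decimalLatitude`, `decimalLongitude`, `individualCount`, and `datasetID` are Darwin Core terms.

```python
from statistics import median

# Made-up raw records from two sources that use different field names.
source_a = [
    {"species": "Examplea alba", "date": "2015-05-01", "lat": 52.1, "lon": 5.2, "n": 4},
    {"species": "Examplea alba", "date": "2015-05-02", "lat": 52.1, "lon": 5.2, "n": 5},
    {"species": "Examplea alba", "date": "2015-05-03", "lat": 52.1, "lon": 5.2, "n": 6},
    {"species": "Examplea alba", "date": "2015-05-04", "lat": 52.1, "lon": 5.2, "n": 400},
]
source_b = [
    {"taxon": "Examplea albus", "day": "2015-05-01", "y": 52.3, "x": 5.4, "count": 5},
]

# Mapping from source-specific fields to Darwin Core terms.
FIELD_MAPS = {
    "a": {"species": "scientificName", "date": "eventDate", "lat": "decimalLatitude",
          "lon": "decimalLongitude", "n": "individualCount"},
    "b": {"taxon": "scientificName", "day": "eventDate", "y": "decimalLatitude",
          "x": "decimalLongitude", "count": "individualCount"},
}
# Hypothetical synonym table used to harmonize the taxonomy.
SYNONYMS = {"Examplea albus": "Examplea alba"}


def standardize(record, field_map, source_id):
    """Step 1: rename fields to Darwin Core terms and keep the source identifier."""
    out = {field_map[key]: value for key, value in record.items() if key in field_map}
    out["datasetID"] = source_id
    return out


def harmonize_taxonomy(record):
    """Step 2a: map synonyms onto a single accepted name."""
    name = record["scientificName"]
    return {**record, "scientificName": SYNONYMS.get(name, name)}


def flag_outliers(records, cutoff=3.5):
    """Step 2b: flag (not delete) counts far from the median, so QC stays transparent."""
    counts = [r["individualCount"] for r in records]
    med = median(counts)
    mad = median(abs(c - med) for c in counts)
    flagged = []
    for r in records:
        score = 0.0 if mad == 0 else 0.6745 * abs(r["individualCount"] - med) / mad
        flagged.append({**r, "qcOutlier": score > cutoff})
    return flagged


standardized = ([standardize(r, FIELD_MAPS["a"], "a") for r in source_a]
                + [standardize(r, FIELD_MAPS["b"], "b") for r in source_b])
harmonized = [harmonize_taxonomy(r) for r in standardized]
checked = flag_outliers(harmonized)
```

Flagging suspicious records rather than dropping them keeps the quality control documented and reversible, which matters when uncertainties have to be reported with the aggregated product.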

Fig. 18.3

Outline of an EBV production workflow for the integration of in-situ and RS data from disparate primary sources of data to final modeled information and publishing. Some authors consider the result of the intermediate data integration level also as an EBV-ready data set from which some indicators can be calculated, even from sparse observations in space and time (Kissling et al. 2018a), while fully continuous coverage in the spatial and temporal dimensions is typically obtained only in the last level of integration

Massive integration of biodiversity data based on the EBV framework and workflows requires implementation via interoperable informatics infrastructures (Hugo et al. 2017). Projects aligned with the mission and concepts of EBVs, such as Map of Life (www.mol.org) or the Biogeographic Infrastructure for Large-scaled Biodiversity Indicators (BILBI) (Hoskins et al. 2018), already constitute a proof of concept of the potential of virtual infrastructures for developing a biodiversity-modeling framework that delivers global information from multi-sourced EO data integration. While the technological implementation of these infrastructures should not constitute a major limitation, redoubled efforts are needed, first, on making the large amounts of in-situ data being collected available and interoperable and, second, on developing and adapting biodiversity models that are able to ingest massive and novel sources of data, both in-situ (e.g., eDNA data) and RS (e.g., imaging spectroscopy).

18.3.3 Access Principles

The open publication of intermediate and final processed products and adherence to open data-sharing principles are key to maximizing the scientific and policy benefits of the EBV framework. The Group on Earth Observations (GEO) has established a set of Data Management Principles to support publication of information using open standards and to ensure discoverability and accessibility through GEOSS, the Global Earth Observation System of Systems (Fig. 18.4). These principles allow full traceability, ensuring accessible information on data sources and processing history via provenance information. All of these management principles are directly applicable to EBVs. For example, traceability is critical for facilitating the updating of the information contained in an EBV product with new data (e.g., from new monitoring and/or observation systems) and the timely incorporation of new biodiversity model developments.

Fig. 18.4

The ten GEOSS Data Management Principles promote the practical implementation of openness in scientific data and best practices ensuring that data are easily discoverable, accessible, and (re)usable. Data providers may assess conformance with each of the principles, in which case a labeling system helps the user to recognize such conformance. For detailed guidelines on the implementation of these principles, see www.geolabel.info.

In addition, GEO BON is developing an “Essential Biodiversity Variables Portal” that supports this process and enhances accessibility to EBV products. Open distribution of these products is complemented by reporting on their compliance with a set of “EBV Minimum Information Standards”. Besides ensuring good data management practices, these information standards aim to provide a guideline for the standardized description of EBV products. The purpose is to ensure consistent information about the EBV hypercube (i.e., the attributes of space, time, biological entity, and uncertainties) among the different EBV classes so that final users can easily access the relevant information (e.g., when searching for suitable EBVs for specific indicators).

18.4 Seamless Integration of Past Trends to Future Scenarios Using EBVs

Besides providing spatial interpolation of EBVs, biodiversity models can project changes in EBVs over time based on the relationship between drivers of biodiversity change and state variables of biodiversity. This means that, when historical data on drivers are available, past trends for an EBV can be backcast. In other words, a single snapshot of biodiversity and driver data at a given moment in time can be used to establish the relationship between driver variables and biodiversity variables across points in space (Fig. 18.5). Then, in order to project for other moments in time, these spatially inferred relationships are assumed to also hold when drivers evolve over time, an approach known as space-for-time substitution. When scenarios exist for the future trajectories of the drivers, the future trends in the EBV can be forecast as well (Fig. 18.5; Ferrier et al. 2017). Estimated EBVs allow for seamless comparison of historical trends of biodiversity to future scenarios of biodiversity change. Indicators aggregating spatial information can be easily calculated from the spatially explicit EBV and plotted in time for any spatial unit of interest, such as a country or region (GEO BON 2015; Navarro et al. 2017).
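The logic of this paragraph can be sketched with a deliberately simple model. The snippet below fits a straight line between one driver and one EBV-like quantity across grid cells observed at a single moment, then applies that spatial relationship to past and future driver values for one cell. All numbers, the variable names, and the choice of ordinary least squares are illustrative assumptions; the biodiversity models cited in this chapter are far richer.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Snapshot across space (made-up values): fraction of natural habitat per
# cell (driver) and species richness in the same cells (biodiversity variable).
habitat_now = [0.1, 0.3, 0.5, 0.7, 0.9]
richness_now = [12.0, 16.0, 20.0, 24.0, 28.0]
intercept, slope = fit_line(habitat_now, richness_now)

# Driver trajectory for one cell: reconstructed past, present, and a scenario.
habitat_by_year = {1900: 0.9, 1950: 0.7, 2000: 0.5, 2050: 0.4}

# Space-for-time substitution: assume the spatial relationship also holds
# as the driver changes through time in a single cell.
richness_by_year = {year: intercept + slope * h
                    for year, h in habitat_by_year.items()}
```

The same fitted relationship yields both the backcast (1900, 1950) and the forecast (2050), which is what allows past trends and future scenarios to be compared on one scale. The key assumption, that a relationship observed across space holds through time, is not tested by the snippet.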

Fig. 18.5

Estimation of EBVs using biodiversity models. In-situ observations of an EBV often are sparse in space, and only a few time series exist. Drivers are often modeled continuously in space for a given moment in time and can be used by biodiversity models to project an EBV continuously in space after calibration and validation with the in-situ observations. When driver layers exist for other moments in time, either from RS observations or from scenario projections using models, the EBV can be estimated over time

Recently, a set of EBVs was historically reconstructed and projected into the future in an inter-model comparison study carried out by the Expert Group on Scenarios and Models of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (Kim et al. 2018). Species distribution, community composition, ecosystem function variables, and ecosystem services variables were reconstructed since 1900 and projected to 2050 globally, at a 0.5° resolution, using harmonized land-use data sets developed for the shared socioeconomic pathways and climate data sets for the representative concentration pathway scenarios. In this exercise, a space-for-time substitution was used in the biodiversity models. In other words, no explicit time series biodiversity data were used to calibrate the models. Instead, current spatial patterns of biodiversity and drivers were used to infer how biodiversity changes over time when driver variables change. A future challenge for biodiversity modelers is to use biodiversity time series to fully model biodiversity across space and time (Ferrier et al. 2017).

18.5 Concluding Remarks

Since EBVs were first defined, there have been significant advances in the consolidation of the framework, substantial conceptual work on implementation, and increasing enthusiasm about their many potential applications in biodiversity science and policy. Now the scientific community needs to face the challenge of delivering EBV products and workflows that provide estimates of changes for the main facets of biodiversity and support our understanding of the driving mechanisms and the consequences of such changes. There are important opportunities for mobilizing primary data offered by the open-access movement, which continues to permeate the biodiversity community, including public institutions responsible for promoting scientific and technological advancement. Data gaps will be covered by combining technological development with appropriate biodiversity models. For example, spaceborne sensors such as the Global Ecosystem Dynamics Investigation (GEDI) Lidar launched in 2018 are providing unprecedented global coverage in vertical measurements of vegetation and topography and will most likely support model-based integration of information for biodiversity variables in unforeseen ways.

Fulfilling the EBV vision requires renewed efforts, first, in continued scientific and technological support for the mobilization of in-situ data and for designing more comprehensive and better coordinated monitoring schemes and, second, in the implementation of workflows and interoperable infrastructures that support wall-to-wall integration of biodiversity data. GEO BON as a network defined at multiple levels, from scientific to institutional and infrastructure support, is instrumental for this endeavor (Hugo et al. 2017; Navarro et al. 2017). Key priorities are the implementation of mechanisms that enhance data mobilization as exemplified by the Darwin Core “Event Core” extension; a common understanding between the biodiversity research community and the space agencies of the processes to develop the technology required for detecting biodiversity change (Paganini et al. 2016); global informatics infrastructure support that meets the requirements for operationalizing EBVs (Hardisty et al. 2019); and broad scientific cooperation in implementing and enhancing biodiversity models that integrate all types of Earth observations.