1 Data in Freshwater Research

Species observed in freshwaters are typically good indicators of the health and status of these ecosystems and are therefore frequently analyzed as part of ecological monitoring programs (see Chap. 19). The biodiversity data generated during such monitoring routines, in combination with data from other ecological studies in freshwaters, can form an invaluable source of information to support sustainable management and conservation of aquatic ecosystems (see Chap. 15). However, a large part of these data still remains scattered on individual researchers’ computers and institute servers. Driven by funding agencies such as the EU, the call for open access to data (e.g., Reichman et al. 2011), which enables the reuse of data for addressing large-scale and/or transdisciplinary research problems, is becoming increasingly important. Additionally, new data-intensive technologies in science, such as remote sensing and next-generation sequencing, demand effective management of increasingly vast amounts of data.

In addition to monitoring data, observational data generated in freshwater research typically also comprise experimental data (Fig. 20.1). If such data are adequately described through metadata (see Sect. 20.2), they can be integrated into processed data products and tools that can support management decisions, conservation priorities, or other policy-relevant issues.

Fig. 20.1 Data types arising in freshwater research and management (inspired by Thessen and Patterson 2011), including the importance of metadata in the data flow

In this chapter we discuss the importance of documenting and describing data and making these metadata available to improve the understanding and discoverability of datasets and specifically examine different facets of biodiversity data. We provide an overview of existing freshwater (biodiversity) information systems that enable data holders to adequately publish their data and find appropriate data for their research. Finally, we offer recommendations on how to implement open data publishing practices as a way to support sustainable management and conservation.

2 Documenting Generated Data and Information: The Concept of Metadata

In order to appropriately reuse biological data, it is important to understand the context in which they were acquired or generated (Thessen and Patterson 2011). Metadata play an essential role in conveying this context.

Metadata are loosely defined as “data about data”. More specifically, metadata should document and describe all aspects of a specific dataset (i.e., the who, why, what, when, and where) that would allow understanding of the physical format, content, and context of the data, as well as how to acquire, use, and cite the data (e.g., Michener 2006). For the data producer or provider, generating metadata presents an opportunity to document a dataset for their own possible future use as well as to inform prospective users of its existence and characteristics. From the perspective of the data consumer or user, metadata enable both discovering data and assessing their appropriateness for a particular use, their so-called fitness for purpose (Schmidt-Kloiber et al. 2012).
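The five questions named above can be made concrete as a small record structure. The following Python sketch is illustrative only: the class name `DatasetMetadata`, its field names, the helper `missing_fields`, and the example values are assumptions made here and do not reflect the schema of any particular metadatabase or metadata standard.

```python
from dataclasses import dataclass, fields


@dataclass
class DatasetMetadata:
    """Hypothetical minimal metadata record (who, why, what, when, where)."""
    who: str = ""     # data originator or contact
    why: str = ""     # purpose of the data collection
    what: str = ""    # content, e.g. organism group and parameters
    when: str = ""    # temporal coverage
    where: str = ""   # spatial coverage
    access: str = ""  # how to acquire, use, and cite the data


def missing_fields(record: DatasetMetadata) -> list[str]:
    """Return the names of all fields that are still empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Placeholder values for illustration only.
example = DatasetMetadata(
    who="Example Institute",
    what="Macroinvertebrate occurrences",
    when="2001-2010",
    where="Example catchment",
)
print(missing_fields(example))  # ['why', 'access']
```

A completeness check of this kind is one simple way for a data provider to see which descriptive elements a prospective user would still be missing when judging fitness for purpose.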

Basic and applied ecological research requires the availability of “high-quality” data, the definition of which varies and often depends on the specific purpose of a study. Scientists frequently reuse their own old data, but the use of data created by others and/or shared within large work groups (e.g., within EU-funded projects) remains limited. This is because research datasets are often neither made publicly available nor deposited in permanent archives and therefore risk being lost over time (Shorish 2010; Vines et al. 2014). The scientific value of reusing a dataset for purposes other than those foreseen by the data originator(s) far exceeds its perceived value. Documenting primary research datasets in metadata collections allows people to discover and understand these data and is therefore an important step toward increasing the usefulness and prolonging the lifespan of a dataset.

In ecological sciences, the role of metadata in facilitating a scientist’s work has been increasingly recognized since the 1980s (Michener 2006), and collecting metadata datasets in dedicated databases is becoming more and more common. This is especially true for biodiversity-related (occurrence) datasets for which the importance of broad data compilations is already widely accepted. The Global Biodiversity Information Facility (GBIF; see Sect. 20.4.2.1), for example, collates and centralizes not only primary biodiversity data but also offers standards and tools for (meta)data collection. More specifically, for surface waters, the Freshwater Metadatabase was developed in the framework of a series of EU-funded freshwater research projects. In connection with this metadatabase, the Freshwater Metadata Journal was founded, in order to give the publication of metadata more scientific weight and bring about a change of perception in the freshwater community (Schmidt-Kloiber et al. 2014). The aforementioned database and journal offer an easy publishing process that—together with the possibility for citation—should make data generation and compilation efforts more visible to other researchers. Both resources are available through the Freshwater Information Platform (FIP; see Sect. 20.4.1).

3 Biodiversity Data

Biodiversity, besides its intrinsic value, supports essential ecosystem functions and, consequently, many ecosystem services that are key to human well-being (Cardinale et al. 2012). It is well known that freshwater ecosystems harbor a rich diversity of species and habitats, despite their comparatively small share of the world’s surface (less than 1%). On the other hand, there is also evidence that the decline in freshwater biodiversity has been greater during the last few decades than that of marine or terrestrial counterparts (Garcia-Moreno et al. 2014; Darwall et al. 2009). The high level of connectivity of freshwater systems implies that fragmentation can have profoundly negative effects (Revenga et al. 2005). Multiple other interacting stresses, combining effects of intense agriculture, industry, or domestic activities, form a further, compounded risk for freshwater ecosystems. These impacts include water extraction, the introduction of exotic species, alteration of hydrological dynamics through the construction of dams and reservoirs, channelization, overexploitation, and increasing levels of organic and inorganic pollution (Dudgeon et al. 2006; Strayer and Dudgeon 2010; Vörösmarty et al. 2010). Climate change is anticipated to increase the intensity of these threats to freshwaters (Garcia-Moreno et al. 2014; Woodward et al. 2010).

A current estimate states that freshwater ecosystems provide suitable habitats for at least 126,000 plant and animal species (Balian et al. 2007). These species contribute to a wide range of critical goods and services for humans, including flood protection and food or water filtration (see Chap. 21) to name just a few. Securing these ecosystem services and understanding the underlying ecosystem processes require knowledge about the taxonomic, phylogenetic, genetic, and functional diversity of nature (e.g., Kissling et al. 2015). The urgency of this matter was recognized by the Parties to the United Nations Convention on Biological Diversity (CBD) who established the Aichi Targets for 2020, which aim to halt biodiversity loss, protect various levels of life forms, and implement sustainable use of natural resources (http://www.cbd.int/sp/elements/).

A recent review of these targets shows that many of them are unlikely to be met (Tittensor et al. 2014), leading to an increasing demand for comprehensive, sound, and up-to-date biodiversity data (Wetzel et al. 2015). Key gaps were identified in the knowledge about status and trends of biodiversity and associated ecosystem services. These gaps mostly arise because of barriers that prevent existing data from being discoverable, accessible, and digestible (Wetzel et al. 2015). The importance of the availability of large-scale datasets for analyzing and understanding the broad-scale patterns of spatial variation in richness and endemism is highlighted by Collen et al. (2013). This again is central to understanding the origin of diversity and the potential impacts of environmental change on current biodiversity patterns and allows for prioritization of conservation areas (Collen et al. 2013).

An earlier review had already found that considerable data on freshwater species and populations are available but are often not accessible or harmonized in a way that would allow them to be used appropriately to support management decisions (Revenga et al. 2005). This calls for an urgent paradigm shift with regard to how biodiversity data are collected, stored, published, and streamlined, so that many sustainable development challenges ahead can be successfully tackled (Wetzel et al. 2015). The authors therefore suggest that biodiversity data should be discoverable, accessible, and digestible so that, combined with appropriate expertise, they can more effectively inform and support the implementation of environmental policies (see Fig. 20.2).

Fig. 20.2 Biodiversity data requirements (inspired by Wetzel et al. 2015)

3.1 Biodiversity Observation Networks and Essential Biodiversity Variables

So-called Biodiversity Observation Networks (BONs) can help address these challenges by coordinating data collection across large areas. They play a major role in mobilizing biodiversity information for use by policy and decision-makers. In 2013 the Group on Earth Observations Biodiversity Observation Network (GEO BON) introduced the concept of essential biodiversity variables (EBVs) with the aim of identifying key measurements that are required to study, report, and manage biodiversity changes (Pereira et al. 2013; Turak et al. 2016b). The six broad EBV classes include genetic composition (1), species populations (2), species traits (3), community composition (4), ecosystem structure (5), and ecosystem function (6). Each of these classes needs a different approach to how data are collected and structured. In the following we give a few examples for selected EBV classes and their respective data availability and accessibility.

Knowledge of the genetic composition (1) plays an important role in freshwater research, as river catchments and lakes can be spatially separated and isolated from each other, which might limit gene flow such that populations of the same species may vary considerably in their genetic composition. This variability has, for example, particular applications for the management of freshwater fisheries, where loss of genetic variants may have major consequences for ecosystem service provision (Turak et al. 2016a). Most genetic data for freshwater species are accessible through the International Nucleotide Sequence Database Collaboration initiative (INSDC; see Sect. 20.4.2.4), of which GenBank is the best-known database. DNA barcoding and recent advances in environmental DNA technology greatly increase the potential to assess genetic biodiversity in freshwaters, especially in relation to conservation of rare and threatened species (Thomsen et al. 2011).

Information about the distribution and, as a consequence, the size of populations (2) of freshwater species has greatly improved in recent years. The activities of the Global Biodiversity Information Facility (GBIF; see Sect. 20.4.2.1) constitute an essential contribution to increasing this knowledge. Specifically focused on freshwaters, the Freshwater Animal Diversity Assessment (FADA; see Sect. 20.4.2.2) for the first time assessed the diversity of fauna in inland waters. FADA provides a comprehensive overview of the freshwater species richness of major taxonomic groups, also revealing many obvious taxonomic and geographic gaps, and hence the need to collect more data (Balian et al. 2007, 2008). Based on these insights, the EU-funded BioFresh project (“Biodiversity of Freshwater Ecosystems: Status, Trends, Pressures, and Conservation Priorities”; http://freshwaterbiodiversity.eu) created an online data portal focused on the mobilization of freshwater occurrence data, the Freshwater Biodiversity Data Portal (see Sect. 20.4.2.1). Another major initiative dealing with the EBV “species populations” is the Freshwater Biodiversity Unit of the International Union for Conservation of Nature (IUCN), which has been developing a global assessment of the distribution and conservation status of freshwater organisms (Darwall et al. 2009). Generally, the IUCN Red Lists (see Sect. 20.4.2.5) summarize the current knowledge on the state (including population size) and threat condition of species belonging to selected organism groups.

Species traits (3) seek to functionally classify taxa grouped by comparable biological profiles. In general, they are a powerful approach to understanding community functioning through characterizing assemblages according to aspects of morphology, function, physiology, behavior, habitat use, reproduction, or life history. Databases collecting trait information were first established in the terrestrial realm, e.g., the TRY database for global plant traits (Kattge et al. 2011) or the PanTHERIA mammal database (Jones et al. 2009). While the concept of species traits is already widely recognized in freshwater ecological assessment too, comprehensive and publicly available databases only exist in some regions and for restricted taxonomic groups. These compilations frequently provide information at the genus or family levels, since the knowledge about traits at the species level often remains limited. In the United States, the US EPA developed a trait database for about 3800 macroinvertebrate taxa (https://www.epa.gov/risk/freshwater-biological-traits-database-traits). In Europe the freshwaterecology.info database offers trait information on about 20,000 taxa (with a focus on species) belonging to five aquatic organism groups (fishes, macroinvertebrates, macrophytes, diatoms, phytoplankton). Freshwaterecology.info serves as a basis for bioassessment and monitoring and is a service for basic research, applied scientists, water managers, or other stakeholders. However, such trait-specific data are still lacking for many taxa and for most parts of the world, and even fundamental facts about the ecology of many common species are not known and require more basic research (Turak et al. 2016a; Schmidt-Kloiber and Hering 2015).

Regarding community composition (4), information on the structure of freshwater assemblages is already used with success for assessing freshwater ecosystems (Friberg et al. 2011). In Europe, environmental legislation aiming to protect and restore freshwater ecosystems and their biodiversity is based both on the Habitats Directive (HD; Council Directive 92/43/EEC) and the Water Framework Directive (WFD; Directive 2000/60/EC). The latter has placed aquatic organisms in a unique position, as the composition of freshwater biota defines the status of surface waterbodies, and thus determines the needs for restoration and associated investments. As part of implementing the WFD, monitoring of waterbodies generates much more data than the information on the “ecological status”, since most of the national assessment systems also provide a variety of measures (so-called metrics) such as the community composition. WFD data, therefore, could contribute significantly to other objectives (e.g., monitoring the effects of emerging stressors, improving the knowledge of species distributions and species invasions, or understanding broad-scale drivers shaping species assemblages). However, the lack of detailed data in central storage systems makes accessibility difficult. Central availability is hampered by the impracticality of combining data stemming from different collection methods and formulated following different data structures and storage methods. It is also hampered by concerns about a consistent data quality regarding the underlying taxonomy, identification, and taxonomic resolution (Hering et al. 2010). Currently, the Water Information System for Europe (WISE; see Sect. 20.4.1) produces Europe-wide maps of water quality and ecological status of waterbodies, but original data (e.g., taxa lists) are not stored centrally so far.
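The structural part of this problem, records that describe the same kind of observation under different field names, can be illustrated with a short Python sketch. Everything in it is hypothetical: the two source layouts, the field names, the placeholder taxon and site values, and the function `harmonize` are assumptions made for illustration and do not correspond to any national WFD reporting format.

```python
# Hypothetical mappings: source field name -> field name in a common structure.
FIELD_MAPS = {
    "source_a": {"taxon": "taxon_name", "site": "site_id", "n": "abundance"},
    "source_b": {"species": "taxon_name", "station": "site_id", "count": "abundance"},
}


def harmonize(record: dict, source: str) -> dict:
    """Rename the fields of one record to the common structure."""
    mapping = FIELD_MAPS[source]
    out = {common: record[original] for original, common in mapping.items()}
    out["abundance"] = int(out["abundance"])  # unify the data type
    out["source"] = source                    # keep track of provenance
    return out


# Placeholder records for illustration only.
combined = [
    harmonize({"taxon": "Taxon x", "site": "A1", "n": "12"}, "source_a"),
    harmonize({"species": "Taxon x", "station": "B7", "count": 3}, "source_b"),
]
for row in combined:
    print(row)
```

Renaming fields addresses only the differences in data structure. Differences in collection methods, taxonomy, identification, and taxonomic resolution, which the text identifies as obstacles as well, cannot be resolved by such a mapping and would require expert judgment.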

For the two remaining EBVs—the ecosystem structure (5) and ecosystem function (6)—the availability of centrally stored data is rather limited. Observations of ecosystem structure for tracking change in freshwater ecosystems include measuring changes in the extent of inland-water habitats such as wetlands, lakes, rivers, and aquifers (Turak et al. 2016a). Though remote sensing technology for mapping the extent of wetlands and lakes is advancing rapidly, a central repository or entry point to consult the processed information of such analyses is not available yet. The use of indicators of ecosystem function other than those that may result from water quality or ecological status assessment data is rare. Analyzing the relationship between biodiversity and ecosystem functioning is still a growing research area but will need considerable further development before it will be possible to include measures of ecosystem function in freshwater biodiversity observations (Turak et al. 2016a, b).

Alongside these six EBV categories, there are also other widely used methods to assess components of freshwater biodiversity or indicate conditions of freshwater ecosystems that do not fit neatly into these classes (see review by Friberg et al. 2011). As these are not governed by any common Europe-wide legislation, data are even more scattered and inaccessible.

4 Main Freshwater-Related Information Systems

In many cases, data and information relevant for freshwater science, management, and policy can be found on platforms that sometimes also cover other realms. In the following we introduce three rough categories: “general data portals”, “biodiversity-related data sources”, and “spatial data sources”. We further subdivide “biodiversity-related data sources” into those dealing with occurrence data, taxonomy, traits, genes, and others. An overview including examples is given in Table 20.1.

Table 20.1 Overview on different freshwater information systems including a rough classification, examples, and web links

4.1 General Data Portals

Water Information System for Europe

The Water Information System for Europe (WISE; http://water.europa.eu) is a partnership between the European Commission (formed by DG Environment, Joint Research Centre, and Eurostat) and the European Environment Agency (EEA). It is a gateway to information on European water issues, divided into four sections: EU water policies (e.g., directives, implementation reports, and supporting activities), data and themes (e.g., reported datasets, interactive maps, statistics, indicators), modeling (e.g., current and forecasting services across Europe), and projects and research (e.g., inventory of links to recently completed and ongoing water-related projects and research activities). WISE aims at reaching a wide audience covering EU, national, regional, and local administrations working in water policy development, as well as scientists, professionals, and the general public interested in water issues.

The WISE portal comprises a wide range of data and information collected by EU institutions and redirects visitors to three portals: the EEA Water Data Centre (see below), the Eurostat Water Statistics website, and the FATE website related to pollutants monitoring campaigns.

Biodiversity Information System for Europe

The Biodiversity Information System for Europe (BISE; http://www.biodiversity.europa.eu) offers information on biological diversity in general and covers all realms including freshwater. It is a partnership between the European Commission (DG Environment) and the European Environment Agency and is supported by the collaboration of the European Clearing House Mechanism network and the CBD Secretariat.

BISE is a gateway for data and information on biodiversity supporting the implementation of the EU 2020 Biodiversity Strategy and the Aichi Targets in Europe. It focuses on bringing together facts and figures about biodiversity and ecosystem services as well as linking to related policies, environmental data centers, assessments, and research findings from various sources. The portal offers six entry points: policy (e.g., policy, legislation, and supporting activities related to the Common Implementation Framework of the EU strategy), topics (e.g., state of species, habitats, ecosystems, genetic diversity, threats to biodiversity, impacts of biodiversity loss), data (e.g., data sources, statistics, and maps related to land, water, soil, air, marine), research (e.g., important EU-wide research projects related to biodiversity and ecosystem services), countries (e.g., links to information available from European countries), and networks (e.g., links to Europe-wide networks supporting information sharing across national borders).

BISE also does not host actual data itself but links to major sources of data and information, including the EEA Biodiversity Data Centre (see below) and the European Nature Information System (EUNIS).

European Environment Agency Data Centres

The European Environment Agency hosts the main data sources linked to from WISE and BISE, namely, the Water Data Centre and the Biodiversity Data Centre.

Both data centers are major sources for a variety of datasets with relevance to managers and policy makers. In addition to raw data and metadata, several data products are made available in a more digestible way, such as interactive maps and summary graphs. For both domains, water and biodiversity, users can browse through and access a wide range of spatial data using the “EEA interactive maps and data viewers”. Additionally, datasets are linked with “related content” if available.

Datasets in the Water Data Centre include the “ECRINS (European Catchments and Rivers Network System)”, the “Waterbase”, or the “WISE WFD Database”. One of the main datasets hosted by the Biodiversity Data Centre is the “Natura 2000 data—the European network of protected sites”. In addition and more generally, the EEA web portal also provides several reference datasets, such as “biogeographical regions”, “CORINE land cover”, “ecosystem types of Europe”, or “nationally designated areas” (for more examples and links, see Table 20.1).

Most EEA spatial data are offered as map services on DiscoMap (http://discomap.eea.europa.eu/).

Joint Research Centre Science Hub

The Joint Research Centre provides a Water Portal (http://water.jrc.ec.europa.eu/) with visualization and download options for JRC products on freshwater and marine water resources and offers tools to calculate summary statistics for the available data. Additionally, JRC also maintains the “WFD Ecological Methods Database”, which gives access to information about the national assessment methods used to classify the ecological status of rivers, lakes, and coastal and transitional waters as applied by the member states of the European Union in their monitoring programs according to the EU Water Framework Directive.

Freshwater Information Platform

The Freshwater Information Platform (FIP; http://freshwaterplatform.eu) represents an effort to regroup web products from several freshwater-related European research projects addressing freshwater biodiversity, ecology, and water management. It was initiated by four leading partners from the FP7 EU BioFresh project, which focused on raising awareness around freshwater biodiversity data, collating and mobilizing freshwater occurrence data, and using those data in large-scale analyses. The platform serves as one single gateway to different resources relevant for water managers, policy makers, scientists, and the interested public. It contains several complementary sections, either providing access to original data or summarizing research results in an easily explorable way. The most relevant sections are the Freshwater Biodiversity Data Portal and connected to it the Freshwater Metadatabase (see Sect. 20.4.2.1), the Global Freshwater Biodiversity Atlas (see Sect. 20.4.3), and the European freshwater species traits database freshwaterecology.info (see Sect. 20.4.2.3). Another relevant part of the FIP is the widely read Freshwater Blog, which publishes features, research highlights, interviews, and podcasts on freshwater science, policy, and conservation. The remaining sections of the FIP focus on freshwater resources (e.g., a glossary, fact sheets, “how-to” guides, etc.), freshwater-related policies (e.g., policy briefs), and freshwater networks. A specific section presents freshwater stressor-related tools. The FIP is designed as an open platform inviting contributions from a variety of aquatic ecology research fields and will be updated continuously.

Group on Earth Observations Biodiversity Observation Network

The Group on Earth Observations Biodiversity Observation Network (GEO BON; http://geobon.org) is a voluntary partnership of governments and organizations, which aims at improving the acquisition, coordination, and delivery of biodiversity observations and related services to users including decision-makers and the scientific community. The European contribution to GEO BON, the FP7 EU BON project (Building the European Biodiversity Observation Network; http://www.eubon.eu/), is developing a data platform (currently in beta stage) intended to be a central access point for biodiversity data from different sources, e.g., processed data and remote sensing imagery products. In addition to data from the GBIF network, it will link to the Long-Term Ecological Research (LTER) network, the Global Earth Observation System of Systems (GEOSS), and the Pan-European Species directories Infrastructure (PESI; see Sect. 20.4.2.2).

4.2 Biodiversity-Related Data Sources

4.2.1 Occurrence Data

Global Biodiversity Information Facility

The Global Biodiversity Information Facility (GBIF; http://www.gbif.org) is an international open data infrastructure, which is funded by governments and supported by member countries and other associated participants. GBIF started its efforts to collate global biodiversity data back in 2001 with the aim of providing free and open access to species occurrence data from one single online gateway. Currently GBIF offers more than 680 million occurrence records related to 1.6 million species provided by about 810 data publishers. The data portal covers all realms and represents a major source of occurrence data. Freshwater data can be extracted via species search.
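A species search of this kind can also be issued programmatically. The following Python sketch only builds the request URL and sends nothing. The endpoint `https://api.gbif.org/v1/occurrence/search` and the parameters `scientificName`, `limit`, and `offset` are not mentioned in this chapter. They are given here as an assumption based on the public GBIF web API and should be checked against the current GBIF documentation. The function name `occurrence_search_url` and the example species are illustrative.

```python
from urllib.parse import urlencode

# Assumed endpoint of the public GBIF occurrence search (verify before use).
GBIF_OCCURRENCE_SEARCH = "https://api.gbif.org/v1/occurrence/search"


def occurrence_search_url(scientific_name: str, limit: int = 20, offset: int = 0) -> str:
    """Build an occurrence search URL for one scientific name."""
    params = {"scientificName": scientific_name, "limit": limit, "offset": offset}
    return f"{GBIF_OCCURRENCE_SEARCH}?{urlencode(params)}"


print(occurrence_search_url("Salmo trutta", limit=5))
# https://api.gbif.org/v1/occurrence/search?scientificName=Salmo+trutta&limit=5&offset=0
```

Because the portal covers all realms, a freshwater extraction along these lines would have to start from a list of freshwater species names compiled elsewhere and repeat the search for each name.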

Freshwater Metadatabase and Freshwater Biodiversity Data Portal

The Freshwater Metadatabase and Freshwater Biodiversity Data Portal provide access to information on freshwater datasets, species, and occurrence data. The metadatabase collects descriptions of datasets, thus making them discoverable regardless of whether the data are publicly available or not. The database provides an overview of hundreds of major data sources related to freshwater research and management, and it offers the option to easily explore access rights of relevant datasets. Connected to it, the Freshwater Metadata Journal (www.freshwaterjournal.eu) allows straightforward metadata publishing. The data portal focuses on species and occurrence data. For species data, it links with the Freshwater Animal Diversity Assessment database (see below), whereas for occurrence data, it provides access to freshwater data on GBIF and acts as a data publishing platform for freshwater data.

Both the metadatabase and data portal are meant to help scientists in advertising and publishing their datasets, and they provide tools for the discovery, integration, and analysis of open and freely accessible freshwater biodiversity data. Both parts are integrated into the Freshwater Information Platform.

4.2.2 Taxonomy Data

Freshwater Animal Diversity Assessment

The Freshwater Animal Diversity Assessment (FADA; http://fada.biodiversity.be) is an informal network of scientists specialized in freshwater biodiversity. The FADA database is an information system dedicated to freshwater animal species diversity. The system provides access to authoritative species lists and global distributions compiled by world experts. The data are also integrated in the Freshwater Biodiversity Data Portal, for which they act as a taxonomic backbone.

Pan-European Species directories Infrastructure

The Pan-European Species directories Infrastructure (PESI; http://eu-nomen.eu/) aims at delivering an integrated, annotated checklist of species occurring in Europe. The PESI checklist (also called EU-nomen) serves as a taxonomic standard and backbone for Europe. Databases from Euro+Med PlantBase, Fauna Europaea, European Register of Marine Species, and Species Fungorum Europe are the base of the PESI web portal. PESI includes interactions with the geographic focal point networks, a network of taxonomic experts, and global species databases. Freshwater information is available via dedicated species search. Results link to GBIF, the Biodiversity Heritage Library (BHL, http://www.biodiversitylibrary.org), GenBank, and BOLDSYSTEMS (see below).

4.2.3 Trait Data

freshwaterecology.info

The freshwaterecology.info database (http://www.freshwaterecology.info) has been established during several EU-funded research projects and compiles information on taxonomy, ecology, and distribution of species based on extensive literature surveys performed by experts for the targeted organism groups. For five aquatic organism groups (fishes, macroinvertebrates, macrophytes, diatoms, phytoplankton), ecological preferences and biological traits (such as habitat preferences, pollution tolerance, life cycle, etc.) are available online with various options and tools for extracting these data. The freshwaterecology.info database is a part of the Freshwater Information Platform.

4.2.4 Genetic Data

Barcode of Life Data Systems

The Barcode of Life Data Systems (BOLDSYSTEMS; http://www.boldsystems.org) aims at supporting the generation and application of DNA barcode data by aiding the acquisition, storage, analysis, and publication of DNA barcode records. It assembles molecular, morphological, and distributional data. The platform consists of four main modules: a data portal, a database of barcode clusters, an educational portal, and a data collection workbench. Freshwater data are available through species names.

International Nucleotide Sequence Database Collaboration

The International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) is a long-standing, foundational initiative that links three major genetic resources, namely, the DNA Data Bank of Japan (DDBJ), the European Molecular Biology Laboratory (EMBL), and GenBank at the National Center for Biotechnology Information (NCBI) in the United States. These three organizations exchange data on a daily basis. Freshwater data can be extracted through species search.

4.2.5 Other Data

IUCN Red List

The IUCN has been working on its Red List of Threatened Species (http://www.iucnredlist.org) to assess the conservation status of species, subspecies, and varieties on a global scale for the past 50 years in order to highlight taxa threatened with extinction and thereby promote their conservation. It provides taxonomic, conservation status and distribution information on plants, fungi, and animals that have been globally evaluated using specifically defined categories and criteria. The Red List assessments bring together extensive knowledge of thousands of regional experts regarding the status of and threats to species. Regarding freshwaters, most comprehensive assessments are currently available for fishes, molluscs (mainly unionid bivalves), decapods (crabs, crayfish, and shrimps), Odonata (dragonflies and damselflies), and selected plant families.

European Alien Species Information Network

The European Alien Species Information Network (EASIN; http://easin.jrc.ec.europa.eu) is a platform developed by the European Commission’s Joint Research Centre that enables easy access to data on alien species reported in Europe. It facilitates the exploration of existing alien species information from a variety of distributed sources through freely available tools and interoperable web services. It aims to assist policy makers and scientists in their efforts to tackle alien species invasions.

4.3 Spatial Data Sources

4.3.1 Freshwater Ecoregions of the World

The Freshwater Ecoregions of the World (FEOW; http://www.feow.org) represents a global biogeographic regionalization of the Earth’s freshwater biodiversity. The FEOW were developed by the WWF Conservation Science Program in partnership with The Nature Conservancy and 200 freshwater scientists from institutions around the world. The FEOW can be used to underpin global and regional conservation planning efforts, particularly to identify outstanding and threatened freshwater systems; to serve as a logical framework for large-scale conservation strategies; and to provide a global-scale knowledge base for advancing freshwater biogeography.

4.3.2 Freshwater Key Biodiversity Areas

The Freshwater Key Biodiversity Areas website (KBAs; http://www.birdlife.org/datazone/freshwater), which is part of the BirdLife data zone, was supported through the BioFresh project and includes the results of assessments in Europe, the Mediterranean hotspot, and Kerala and Tamil Nadu (India). Key Biodiversity Areas are globally important areas for the persistence of biodiversity as identified using standard criteria. For freshwaters they are developed through spatial analysis of species information as assessed for the IUCN Red List of Threatened Species.

4.3.3 Global Freshwater Biodiversity Atlas

The Global Freshwater Biodiversity Atlas (http://atlas.freshwaterbiodiversity.eu) is another major component of the Freshwater Information Platform. The atlas features spatial information generated through freshwater research. It provides a series of interactive maps with different data layers on freshwater biodiversity richness, threats to freshwaters, and the effects of global change on freshwater ecosystems.

4.3.4 Protected Planet: World Database on Protected Areas

The Protected Planet webpage (http://www.protectedplanet.net) is the online interface for the World Database on Protected Areas (WDPA), which is a joint project of IUCN and UNEP. The WDPA is the most comprehensive global database on terrestrial and marine protected areas, with freshwaters included in the terrestrial areas. The Protected Planet webpage provides maps and search options with additional information from the WDPA, photos from Panoramio, and text descriptions from Wikipedia.

5 Challenges of Data Mobilization and the Way Forward

Freshwater data, especially biodiversity data, often remain difficult to access, despite the wide range of freshwater-specific information platforms and data portals, such as the ones mentioned above. This is because a large number of smaller datasets and individual occurrence observations are not integrated into public repositories, even though these data may already have been used in scientific papers.

In the molecular sciences, open access to primary data is already common practice, as sequences must be submitted to GenBank prior to publication. This is not the case in other areas of freshwater research. Reasons for the reluctance to publish data or even metadata include time and financial constraints, the concern that data could be used for inappropriate purposes, and the fear of relinquishing intellectual property rights (Schmidt-Kloiber et al. 2012).

Recent efforts to make freshwater data easily available (see Penev et al. 2011 for an overview) include an initiative, launched together with editors of leading freshwater journals, that encourages authors to deposit occurrence data in public repositories when publishing in one of the participating journals (De Wever et al. 2012). Similar efforts are undertaken by Dryad (http://datadryad.org), an international repository of data underlying peer-reviewed articles in basic and applied biosciences. Another approach to mobilizing biodiversity data was the creation of a dedicated journal to encourage data publication and to specifically address small datasets (“Biodiversity Data Journal”; Smith et al. 2013). Several authors extensively summarize the benefits of online data publication or the value of “data papers” as an incentive for data publishing (e.g., Chavan and Penev 2011; Costello 2009), including the argument that papers connected to publicly available data receive significantly more citations (Piwowar et al. 2007).

The importance of small datasets and their significance when compiled together can be illustrated by an initiative started within the EU-funded BioFresh project: more than 80 caddisfly experts from all over Europe were invited to contribute their personal species records to the “Distribution Atlas of European Trichoptera.” This resulted in the collection of more than 450,000 point records of adult caddisflies (Schmidt-Kloiber et al. 2015, Neu et al. 2018). Only such a broad, joint effort can produce a comprehensive picture of the distribution patterns of different species. Biodiversity hotspots as well as sensitive areas of endemic species can be identified (Schmidt-Kloiber et al. 2017). Including a broad timeline in such a compilation offers views on the effects of historic events, such as glaciation, the origin of species, or the establishment of refugial areas. This may ultimately provide the baseline data against which to measure changes in biodiversity due to different anthropogenic stressors or climate change and to establish effective management and/or conservation strategies such as the establishment of an IUCN Red List (Fig. 20.3).
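The idea of pooling many small contributions can be sketched in a few lines of Python. This is a minimal illustration only, not the workflow used for the atlas: the record layout (species, latitude, longitude, year), the coordinates, and the function names pool_records and localities_per_species are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical records contributed by two experts:
# (species, latitude, longitude, year). All values are invented.
expert_a = [
    ("Hydropsyche pellucidula", 48.21, 16.37, 1998),
    ("Rhyacophila fasciata", 47.07, 15.44, 2004),
]
expert_b = [
    ("Hydropsyche pellucidula", 48.21, 16.37, 1998),  # same record as in expert_a
    ("Hydropsyche pellucidula", 50.94, 6.96, 2011),
    ("Drusus discolor", 47.27, 11.39, 1987),
]


def pool_records(*datasets):
    """Merge datasets, dropping exact duplicates and keeping first-seen order."""
    seen = set()
    pooled = []
    for dataset in datasets:
        for record in dataset:
            if record not in seen:
                seen.add(record)
                pooled.append(record)
    return pooled


def localities_per_species(records):
    """Count the distinct (latitude, longitude) localities for each species."""
    sites = defaultdict(set)
    for species, lat, lon, _year in records:
        sites[species].add((lat, lon))
    return {species: len(points) for species, points in sites.items()}


pooled = pool_records(expert_a, expert_b)
counts = localities_per_species(pooled)
```

Neither contributor alone covers all three species or both localities of Hydropsyche pellucidula; only the pooled dataset does, which is the point the atlas makes at a much larger scale.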

Fig. 20.3

Occurrence data of Trichoptera, initiated through BioFresh and collected by European caddisfly experts (Schmidt-Kloiber et al. 2017, Neu et al. 2018)

As recovering data becomes increasingly difficult and resource-intensive with age, we advocate the adoption of data management practices that envisage data publication right from the planning and data generation stage onward. Only a wide implementation of open data publishing practices, which includes the generation of metadata and the use of community standards, will enable us to fully exploit the potential of the existing data for supporting management and conservation activities.
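As a small illustration of what the use of community standards can mean in practice, the following Python sketch checks whether an occurrence record carries a minimal set of fields before it is published. The field names follow Darwin Core terms, used here as one example of a community standard; the choice of which terms to treat as required, the function name missing_terms, and the sample records are assumptions made for this example, not requirements of any particular repository.

```python
# Darwin Core term names; treating exactly these as required is an
# assumption for this sketch, not a rule of any specific data portal.
REQUIRED_TERMS = (
    "scientificName",
    "decimalLatitude",
    "decimalLongitude",
    "eventDate",
    "basisOfRecord",
)


def missing_terms(record):
    """Return the required terms that are absent or empty in a record."""
    return [
        term for term in REQUIRED_TERMS
        if record.get(term) in (None, "")
    ]


# Hypothetical records; all values are invented for illustration.
complete = {
    "scientificName": "Drusus discolor",
    "decimalLatitude": 47.27,
    "decimalLongitude": 11.39,
    "eventDate": "1987-07-14",
    "basisOfRecord": "HumanObservation",
}
incomplete = {
    "scientificName": "Rhyacophila fasciata",
    "decimalLatitude": 47.07,
    "decimalLongitude": 15.44,
    "eventDate": "",
}

report_complete = missing_terms(complete)
report_incomplete = missing_terms(incomplete)
```

Running such a check at the data generation stage, rather than years later, reflects the argument above that missing information becomes harder to recover as data age.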