1 Introduction

1.1 Background

For over 40 years, Earth Observation (EO) satellites developed or operated by ESA and NASA have provided users with access to a wealth of data. In the coming years, the amount of accessible data will continue to grow to unprecedented levels as new missions further extend the routine monitoring of the Earth system at the global scale. Exponential data growth is a significant factor for the Earth sciences and carbon monitoring community with the launch of several high-data-volume missions, including the ESA BIOMASS mission (Le Toan et al. 2011), the NASA-ISRO SAR (NISAR) mission (Rosen et al. 2015), and the NASA Global Ecosystem Dynamics Investigation (GEDI) mission (Stavros et al. 2017). The data from these missions, which expand the operational capability of global monitoring from space, combined with data from long-term EO archives (e.g., Landsat, ERS, ENVISAT), in situ networks, and models, will provide users with unprecedented insight into how the oceans, atmosphere, land, and cryosphere function and interact as part of an interconnected Earth System.

While this growing volume of environmental data from space represents a unique opportunity for both research science and applications, it also makes it a major challenge for these communities to exploit the full potential of these data. First, the emergence of large volumes of data (the petabyte era) raises new issues related to the discovery, access, exploitation, visualization, and cost of these data, with profound implications for how users conduct “data-intensive” Earth Science (Hey 2009) research. For example, the NISAR mission will generate around 40 PB of data per year, forcing users to consider new ways of exploiting these data.

Second, the growing diversity and complexity of both data and users demand cooperation among various communities. Each community has different needs, methods, languages, and protocols for understanding a wealth of heterogeneous data provided in different structures and formats. Although the mission of space-based data providers is to provide stand-alone information on bio-geophysical parameters, and there have been some advances in terms of data sharing (Duncanson et al. 2019), further collaboration is needed to improve those estimates. Providers and programme funding organizations should work collaboratively to provide complementary, relevant, transparent, and validated in situ biomass data and information (Herold et al. 2019). Prediction uncertainties in biomass maps propagate to larger-area estimates and can lead to substantial uncertainties in national emissions estimates if not properly considered, particularly in relation to the effect of spatial autocorrelation (McRoberts et al. 2019). However, whether the products are used alone or in combination, their calibration is likely to rely on pre-stratification of forested lands, with enough reference data in each stratum, as it is unlikely that a single universal model can be transferred across forest types and regions without biases (Réjou-Méchain et al. 2019). Researchers could address these data limitations and error issues more easily with the help of a dedicated tool. In addition, bridging the gap between these various communities of practice will be essential in order to fully exploit these valuable ground and remotely sensed data (Chave et al. 2019).
The scientific community is eager to address this gap and has discussed the issue of international collaboration at multiple conferences and science meetings, including the recent workshop on Space-Based Measurement of Forest Properties held at the International Space Science Institute in Bern, Switzerland (Scipal et al. 2017). The emphasis has been on not treating each biomass mission as an individual exercise, but on combining and integrating the missions in one way or another.

Responding to both these technological and community challenges requires the development of new ways of working, capitalizing on Information and Communication Technology (ICT) developments to facilitate the exploitation, analysis, sharing, mining, and visualization of massive EO datasets and high-level products within Europe, the USA, and beyond. Evolution in information technology and the consequent shifts in user behaviour and expectations provide new opportunities for increasing exploitation of EO data. In particular, Infrastructure-as-a-Service (IaaS)—providing infrastructure as shared scientific collaboration platforms across large communities—enables data and resource sharing and allows for massively scalable ICT infrastructure under pay-per-use models; this facilitates access to these services by users who would otherwise not be able to afford them. Additionally, social networking and related tools make a new level of online collaboration among communities of practice not only possible but also mainstream.

1.2 Proposed Solution

In light of these challenges, ESA and NASA are collaborating to lower obstacles related to increased data volumes and to encourage open data policies by jointly developing a Multi-Mission Algorithm and Analysis Platform (MAAP) to help improve research on global aboveground terrestrial carbon dynamics (Albinet et al. 2018). This initiative aims at capitalizing on Ground Segment (GS) capabilities and ICT in order to maximize the exploitation of EO data from the BIOMASS, GEDI, and NISAR missions, which together represent a highly complementary set of measurements for determining Earth’s aboveground woody biomass with unprecedented resolution and accuracy. The capabilities envisioned for the MAAP will, for the first time, enable efficient and customizable inter-comparison, cross-calibration, and fusion of data from these missions by a broad community of scientists and other users.

The principal idea underpinning the MAAP is to bring the user to the collocated data and processing, rather than the data to the users, thereby enabling ultra-fast data access and processing (i.e., transferring a few megabytes of results rather than several terabytes or petabytes of data to the user). The utility of this approach has been demonstrated by ESA and NASA, for example with the ESA Grid Processing on Demand (G-POD) environment (footnote 1), and by private players with services such as Google Earth Engine (GEE) (Gorelick et al. 2016) or Amazon Web Services (AWS) (footnote 2).
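To give a feel for the orders of magnitude involved, the following minimal sketch estimates idealised transfer times for a small result file versus bulk data. The 1 Gbit/s sustained link rate and the example payload sizes are assumptions chosen purely for illustration; they are not figures from the MAAP project, and real transfers would be slower because of protocol overhead and contention.

```python
# Idealised transfer-time estimate. The link rate and payload sizes are
# illustrative assumptions, not values taken from the MAAP project.

MB = 10**6   # bytes in a megabyte (decimal units)
TB = 10**12  # bytes in a terabyte
PB = 10**15  # bytes in a petabyte

LINK_BITS_PER_S = 1e9  # assumed sustained 1 Gbit/s connection


def transfer_time_seconds(size_bytes: float, bandwidth_bits_per_s: float) -> float:
    """Return payload size divided by link rate, ignoring all overheads."""
    return size_bytes * 8 / bandwidth_bits_per_s


if __name__ == "__main__":
    payloads = [
        ("10 MB of results", 10 * MB),
        ("1 TB of data", 1 * TB),
        ("1 PB of data", 1 * PB),
    ]
    for label, size in payloads:
        seconds = transfer_time_seconds(size, LINK_BITS_PER_S)
        print(f"{label}: {seconds:,.2f} s ({seconds / 86400:.4f} days)")
```

Under these assumptions the result file moves in well under a second, the terabyte takes a little over two hours, and the petabyte takes roughly three months, which is the gap that collocating users with the data is meant to close.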

Leveraging collocated data and processing is a goal for many scientific users in ecosystems research. However, the supporting infrastructure required for such research is currently limited. To reduce these limitations and accelerate scientific research, the MAAP seeks to develop a collaboration framework between ESA and NASA so that ESA and NASA scientists can more easily share data, science algorithms, and computing resources.

As stated in Albinet et al. (2018), the MAAP will meet this goal by:

  1. Enabling scientists and other users to easily discover, process, visualize and analyse large volumes of biomass relevant data from both ESA and NASA;

  2. Providing data in the same coordinate reference frame in order to facilitate comparison, analysis, data evaluation, and data generation;

  3. Providing a version-controlled science algorithm development environment that leverages tools, co-located data, and the provided processing resources;

  4. Addressing intellectual property and sharing concerns related to both collaboratively developing algorithms and the subsequent sharing of those algorithms and the supporting data.

2 Platform Overview

2.1 Objective

The goal of this joint ESA-NASA project is to create a virtual working environment that shall enable (Albinet et al. 2018):

  • Rapid data access by avoiding the movement of large amounts of data on the network.

  • Increased data usability by easing the exploitation process so that users do not spend time on ICT matters.

  • Synergistic use of different EO data sources (e.g., in situ data for validation, airborne campaign data, ancillary data such as Total Electron Content and land cover, Digital Terrain Models, or spaceborne data from other sensors) in order to better assess and exploit the mission data.

  • Community building by fostering a spirit of resource and knowledge sharing.

  • Improved science by exposing the official processing algorithms to every user and making them fully transparent.

  • Rapid benchmarking of processing algorithms (more representative validation based on a wider dataset without having to host the data, and faster validation made possible by scalable ICT infrastructure).

  • Collaborative processing algorithm repository enabling algorithm module management, sharing, and deployment for production. This forms the basis for a processing algorithm library for rapid and iterative development and deployment of algorithms in the MAAP.

  • A fully automated data processing framework allowing generation of products.

  • Managed services providing expert support (thematic and ICT/GS) for complex exploitation tasks.

  • Replicability of results and traceability of workflow and processes.

  • A cost-effective approach for scalable ICT resources capitalizing on economies of scale through infrastructure pooling (generally cheaper than an “in-house” user investment); development of new business models such as “data rental” and new pricing models such as pay-per-use.

2.2 ESA-NASA Multi-mission Algorithm and Analysis Platform Approach

The joint ESA-NASA Multi-Mission Algorithm and Analysis Platform approach is described in Fig. 1.

Fig. 1 Multi-mission algorithm and analysis platform functions and services

The approach of the MAAP is to facilitate the exploitation of the commonalities described in Fig. 1 by providing users with a common virtual working environment (Albinet et al. 2018). This commonality does not imply a single system but rather a federation of two (or more) platforms, which would give both US and European users transparent access to the content of both platforms. This project will allow users to access up-to-date data and algorithms for biomass estimation and will increase the cohesion of the transatlantic P/L-band Synthetic Aperture Radar (SAR) and biosphere communities. It will also further strengthen the collaboration between ESA and NASA around the Earth Observation communities, notably with respect to the ground segment and Calibration/Validation (Cal/Val) activities.

2.3 Typology of ESA-NASA MAAP Users

Five types of users have been defined for the MAAP. The first three types of users (i.e., EO Data Explorers, EO Specialists, and Algorithm Developers) are based on the generic typology approach of users of Exploitation Platforms, as described in footnote 3. The other two user types (Operators and Other Systems) are focused on platform operations, administration, and resource access.

Whenever possible, the MAAP shall enforce a principle of non-discriminatory data access so that all users will be treated equally. For data products supplied from an international partner or other agency, ESA and NASA will restrict access only to the extent required by the appropriate Memorandum of Understanding. The MAAP shall implement the following user roles and data access permissions, which accumulate at each level (see Table 1).

Table 1 MAAP user types and data access permissions. Data access permissions are accumulative

Users who are not remote sensing experts will be able to use the platform as “EO Data Explorers.” They will have access to basic functionalities, such as data discovery and visualization. More advanced users, such as specialists in remote sensing, will want to switch to “EO Specialists” in order to access additional functionalities such as data generation or download. Finally, specialists in SAR or LIDAR will switch to “Algorithm Developers” to be able to modify and generate algorithms and to validate the corresponding products.
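The cumulative permission model described above can be sketched as follows. This is only an illustration under stated assumptions: the permission names are hypothetical labels derived from the paragraph above, not the entries of Table 1, and the two operational user types (Operators and Other Systems) are left out.

```python
from enum import IntEnum


class Role(IntEnum):
    """The three exploitation-oriented user types, ordered by access level."""
    EO_DATA_EXPLORER = 1
    EO_SPECIALIST = 2
    ALGORITHM_DEVELOPER = 3


# Permissions newly granted at each level. The names are hypothetical
# labels based on the prose description, not the contents of Table 1.
GRANTED_AT = {
    Role.EO_DATA_EXPLORER: {"discover_data", "visualize_data"},
    Role.EO_SPECIALIST: {"generate_data", "download_data"},
    Role.ALGORITHM_DEVELOPER: {
        "modify_algorithm",
        "generate_algorithm",
        "validate_product",
    },
}


def permissions(role: Role) -> set:
    """Return every permission granted at or below the given role."""
    allowed = set()
    for level in Role:
        if level <= role:
            allowed |= GRANTED_AT[level]
    return allowed


def can(role: Role, action: str) -> bool:
    """Check whether a role may perform an action."""
    return action in permissions(role)


if __name__ == "__main__":
    for role in Role:
        print(role.name, sorted(permissions(role)))
```

Ordering the roles as integers keeps the accumulation rule in one place: a higher role inherits everything granted to the roles below it, so a new permission only has to be declared once at the lowest level that receives it.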

2.4 MAAP Data and Information Policy

All data and algorithms made available in the MAAP shall conform to ESA’s Revised Earth Observation Data Policy (footnote 4) and NASA’s Data and Information Policy (footnote 5). This includes data and algorithms from ESA, NASA, and other data providers. The MAAP promotes the full and open sharing of all data with the research and applications communities, private industry, academia, and the general public. The greater the availability of the data, the more quickly and effectively the user communities can utilize the information to address basic Earth science questions and provide the basis for developing innovative practical applications to benefit the general public. For data access, the MAAP’s data policy includes the following:

  • The MAAP commits to the full and open sharing of Earth science data obtained from Earth observing satellites, sub-orbital platforms, and field campaigns with all users as soon as such data become available.

  • There will be no period of exclusive access to MAAP Earth science data. Following a post-launch checkout period, all science data will be made available to the user community.

  • The MAAP will make available all standard products along with the source code for algorithm software, coefficients, and ancillary data used to generate these products.

  • The MAAP will enforce a principle of non-discriminatory data access so that all users will be treated equally.

  • The MAAP will engage in ongoing partnerships with other institutions to increase the effectiveness of the MAAP. This interagency cooperation shall include: sharing of data from satellites and other sources, mutual validation and calibration data, and consolidation of duplicative capabilities and functions.

  • The MAAP will negotiate and implement arrangements with its international partners, with an emphasis on meeting the data acquisition, distribution, and archival needs of the MAAP.

3 Platform Implementation

ESA and NASA will jointly develop technical and data requirements for the MAAP. Based on these requirements, ESA and NASA will develop and manage independent systems that will be interoperable with each other and present users with a unified user interface. This joint approach aligns ground system development activities without impacting each agency’s independent efforts and timelines (Albinet et al. 2018).

3.1 Phasing Approach

The development of the MAAP will consist of two phases: a Pilot Phase (phase 1) and a Full Phase (phase 2).

The Pilot Phase will focus on the use of AfriSAR campaign data (Wasik et al. 2018; Pourshamsi et al. 2018), other key campaign data, including INDREX-2 (Hajnsek et al. 2009), BioSAR (Tebaldini 2009), BioSAR-2 (Tebaldini and Rocca 2012), TropiSAR (Dubois-Fernandez et al. 2012), and BioSAR-3 (Sandberg et al. 2014), and relevant ancillary data. During this phase, pre-defined functionalities, including deployment of Level-2 (CEOS) algorithms for biomass based on the airborne campaign data, will be added sequentially to the MAAP. Additionally, user requirements will be defined during the Pilot Phase that will feed into the Full Phase of platform deployment. Initially, use of the Pilot platform will be open only to a limited community of science users.

During the Full Phase, additional, related airborne, field, and satellite data from both ESA and NASA will be incorporated into the MAAP. During development of the Full platform, data from the BIOMASS, GEDI, and NISAR missions will be deployed in the MAAP and access to the platform will be extended beyond the science users to a broader user community.

3.2 Guiding Principles

Collaborative development of the ESA and NASA MAAP system will be based on the following core principles (Albinet et al. 2018):

  1. ESA and NASA will develop separate infrastructures of the MAAP system according to jointly developed requirements and interface specifications.

  2. An Application Programming Interface (API) will be used to provide interoperability.

  3. The MAAP will provide an entry point for users that will enable access to the ESA and the NASA infrastructures.

  4. All data, information, and results, including satellite, airborne, field, and Cal/Val data from Level 1A to Level 4, will be fully open and shared between the two implementations.

  5. All source code will be developed as open source software. Official algorithms that are archived by NASA and/or ESA and included in the MAAP Algorithm Store (MAS) will be subject to the open source requirement. MAAP user-developed algorithms that are shared and published to the MAAP will also be subject to the open source requirement.

  6. Requirements and system architectures will be jointly developed and approved to ensure interoperability. This includes protocols and standards between different components of the system. Future modifications of requirements will be agreed to by both agencies.

  7. Organizations, researchers, and developers will be given attribution for contributions.

  8. Metrics on users, use, and performance will be captured and shared.
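The idea of separate infrastructures reached through a common entry point can be illustrated with a toy federation layer. Everything in this sketch is hypothetical: the `Granule` record, the `make_backend` and `federated_search` helpers, and the granule identifiers are invented for illustration and do not describe the actual MAAP API or its holdings.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Granule:
    """A hypothetical catalogue record returned by a search."""
    granule_id: str
    mission: str
    platform: str  # which agency's infrastructure holds the data


# A back end is modelled as a function from a mission name to its matches.
SearchFn = Callable[[str], List[Granule]]


def make_backend(platform: str, holdings: Dict[str, List[str]]) -> SearchFn:
    """Build an in-memory stand-in for one agency's catalogue."""
    def search(mission: str) -> List[Granule]:
        return [Granule(gid, mission, platform) for gid in holdings.get(mission, [])]
    return search


def federated_search(mission: str, backends: List[SearchFn]) -> List[Granule]:
    """Single entry point: query every back end and merge the results."""
    results: List[Granule] = []
    for search in backends:
        results.extend(search(mission))
    return results


# Made-up holdings, used only to exercise the sketch.
esa = make_backend("ESA", {"BIOMASS": ["biomass-0001"]})
nasa = make_backend("NASA", {"GEDI": ["gedi-0001"], "NISAR": ["nisar-0001"]})

if __name__ == "__main__":
    for mission in ("BIOMASS", "GEDI", "NISAR"):
        print(mission, federated_search(mission, [esa, nasa]))
```

The point of the design is that the caller names only the mission; which infrastructure holds the data is resolved behind the shared interface, so each agency can evolve its own system as long as the interface stays stable.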

4 Benefits

4.1 Concept of Product Algorithm Laboratory

The Product Algorithm Laboratory (PAL) is a new type of governance for the evolution of EO mission algorithms. The MAAP will make this concept possible, and it will be trialled at least for the BIOMASS mission.

As described in Fig. 2, the Product Algorithm Laboratory begins with a traditional initial definition and implementation of a mission algorithm. Any user can modify this algorithm in their working environment, generate the corresponding dataset, and validate it using the validation tools and in situ measurements provided within the MAAP. The user can then submit the new version of the algorithm to ESA/NASA for verification and approval, and the agencies will consider whether it should become the new baseline of the mission algorithm.

Fig. 2 The concept of product algorithm laboratory (PAL) to support algorithm evolution
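The PAL cycle can be read as a small state machine, sketched below. The stage names are hypothetical labels for the steps described in the text, and the rule that a rejected submission returns to modification is an assumption added for completeness; the source only describes the path from baseline to agency review.

```python
from enum import Enum, auto


class Stage(Enum):
    """Hypothetical labels for the steps of the PAL cycle."""
    BASELINE = auto()           # current official mission algorithm
    MODIFIED = auto()           # user has changed the algorithm
    DATASET_GENERATED = auto()  # user has produced the corresponding dataset
    VALIDATED = auto()          # dataset checked against in situ measurements
    SUBMITTED = auto()          # sent to ESA/NASA for verification and approval
    REJECTED = auto()           # agencies declined the new version


# Allowed moves between stages. Approval is modelled as a return to
# BASELINE, because the submitted version becomes the new baseline.
TRANSITIONS = {
    Stage.BASELINE: {Stage.MODIFIED},
    Stage.MODIFIED: {Stage.DATASET_GENERATED},
    Stage.DATASET_GENERATED: {Stage.VALIDATED},
    Stage.VALIDATED: {Stage.SUBMITTED},
    Stage.SUBMITTED: {Stage.BASELINE, Stage.REJECTED},
    Stage.REJECTED: {Stage.MODIFIED},  # assumption: the user may try again
}


def advance(current: Stage, target: Stage) -> Stage:
    """Move to the target stage, refusing any step the cycle does not allow."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.name} to {target.name}")
    return target


if __name__ == "__main__":
    stage = Stage.BASELINE
    for step in (Stage.MODIFIED, Stage.DATASET_GENERATED,
                 Stage.VALIDATED, Stage.SUBMITTED, Stage.BASELINE):
        stage = advance(stage, step)
        print("now at", stage.name)
```

Making the transitions explicit shows why a submission cannot skip dataset generation or validation: those steps are the only route to the review stage.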

The evolution of processing algorithms is easier within the MAAP because development and implementation take place in the same environment. This rapid evolution should allow scientists to develop stable algorithms for R&D missions more quickly by leveraging this cooperative approach. In addition, users outside the core science team will be able to contribute to the product improvement cycle. Finally, this approach aims at “breaking the wall between the world of the Science and the world of the operations.”

4.2 Benefits to Users

The goal of this analysis platform for EO science missions is to better address the needs of the scientific community in terms of:

  • Ease of data access.

    The users will have direct access to all the remote sensing data from BIOMASS, NISAR, and GEDI missions (and complementary/similar missions), together with relevant ground data from ESA/NASA campaigns (field data and Cal/Val).

  • Ease of data sharing.

    The users will have the possibility of easily sharing large amounts of official data from ESA/NASA missions and from communities/projects that may have complementary data.

  • Ease of data transport.

  • Joint code/algorithm development (Concept of PAL), addressing intellectual property rights issues.

  • Interoperability of data, code, and algorithms.

  • Transparency in research, development, and validation.

In addition, a platform providing EO data, algorithms, processing resources, and documentation could be a valuable tool to support education through training materials, webinars, summer schools, etc.

4.3 Benefits to ESA and NASA

To ensure joint development, implementation, and platform interoperability, ESA and NASA are creating a framework for collaboration. This activity will enable both organizations to gain a better understanding of each other’s programmatic processes as well as identify and develop best practices relating to interagency collaboration. Thus, it will help lay a foundation for joint projects in the future.

In addition to assisting the global scientific community, this project supports many of the strategic goals of both NASA and ESA. The MAAP supports the NASA strategic goal to “advance knowledge of Earth as a system to meet the challenges of environmental change and to improve life on our planet,” by providing an opportunity to advance new technology related to data sharing and analysis (footnote 6) that focuses especially on improving our understanding of global carbon dynamics. The MAAP also meets the goals of the ESA Space 4.0 concept and the strategic goals of evolving the ground segment and data management of Earth Observation (EO) missions towards increased possibilities for users (footnote 7). Through the coordination between the two agencies for both development and management of the platform, it will also facilitate the expansion of “partnerships with international, intergovernmental, academic, industrial, and entrepreneurial communities, recognizing them as important contributors of skill and creativity to our missions and for the propagation of our results” (footnote 8).

Current and future efforts of ESA’s Directorate of Earth Observation Programmes (EOP) and NASA’s Earth Science Data Systems Program (ESDS) will directly benefit from the work required to complete the development and management of the MAAP. Due to the diversity of EO missions under ESA management, EOP must adapt to be able to serve a wide variety of user types (research, operational, general public). Specifically, for R&D EO missions which focus on innovative sensors (Earth Explorers), there is a need to further support the collaboration between remote sensing scientists. Although the ESA/NASA MAAP focuses specifically on global aboveground terrestrial carbon dynamics, these efforts could help identify other geophysical areas and EO missions where ESA and NASA could collaborate along the same lines. This could include, but is not limited to, expanding to other institutions within the CEOS framework.

The NASA Earth Science Data Systems (ESDS) archives provide a long-term, continuously updated global environmental record and currently contain over 24 PB of Earth observation data. As new missions are launched and instruments commissioned, especially over the next 5 years, the archive is anticipated to grow to over 150 PB of data, expanding at an annual rate of around 50 PB of data per year. This immense growth in data presents unique science opportunities while at the same time posing significant challenges for data management and access. To continue providing support for NASA researchers and data users worldwide, ESDS is conducting a variety of activities, including developing prototype infrastructure and analysis platforms. To help further these activities, the ESA-NASA MAAP will leverage and evaluate components and infrastructure through its development and implementation process (Albinet et al. 2018).

5 Conclusion

The MAAP is being developed in preparation for the BIOMASS, NISAR, and GEDI missions. Prior to the launch of these missions, scientists are collecting airborne and field data to support Cal/Val analyses and develop algorithms (e.g., the AfriSAR campaign). Initial MAAP development activities will be focused on supporting the science teams’ preparatory efforts by developing capabilities to share and evaluate airborne, field and Cal/Val data. These early activities will help inform platform development and pave the way for an operating platform in the BIOMASS, NISAR, and GEDI mission timeframe.

This platform will make it possible, for the first time, to build a community of users of these new Earth Observation missions around this innovative concept and to explore new concepts of data and information access by scientists across agencies, without the need to mirror data across institutions.