Bringing Structure to Research Data Management Through a Pervasive, Scalable and Sustainable Research Data Infrastructure

The Art of Structuring

Abstract

One of the key fields of digitalization at universities is the management of an ever-increasing amount of digital research data. Several surveys among researchers show that demands and knowledge on the subject vary widely. Services for research data management and the underlying infrastructures are called for and are currently a very actively discussed subject. To create demand-oriented, future-proof, scalable, and financially and operationally sustainable infrastructures and services, a structured approach to demand assessment and infrastructure architecture is key. Based on user surveys and conceptual (technical) design workshops of university IT infrastructure providers starting in 2016, a consortium of five universities in North Rhine-Westphalia (NRW) has formed to jointly pursue an open source and joint operations approach to creating a multisite integrated storage and compute platform (primarily using the open source/freeware community standards Ceph and OpenStack) as a research data infrastructure. This platform provides the operational basis for the actual research data services and software (envisioned to be containerized or virtualized appliances). A joint funding proposal, set in the context of the German National Research Data Infrastructure Initiative (NFDI), has been submitted, aiming at the creation of this 33 Petabyte storage and 4,500 CPU core compute environment; the joint operations team has already been formed. Additionally, tools for data management and curation shall be made available on this infrastructure. The development of these tools is progressing under the project title sciebo.RDS (Research Data Services), which aims at adding research data management workflows to the well-established sciebo sync and share cloud storage platform. The sciebo platform is already widely used for collaboration on research data and will, in the future, also be operated in an OpenStack/Ceph setting. With DFG providing funding for this development, the aim is a wider adoption of these tools in the German research community. Work packages within this project for the empirical analysis of user demands have been designed to ensure that these research data services will find users beyond the project partners’ institutions.

Notes

  1. Genomic Data Commons Data Pool (https://portal.gdc.cancer.gov).
  2. The OCC Environmental Data Commons (http://edc.occ-data.org).
  3. Jetstream (http://jetstream-cloud.org).

Author information

Corresponding author

Correspondence to Raimund Vogl.

Copyright information

© 2019 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Vogl, R., Rudolph, D., Thoring, A. (2019). Bringing Structure to Research Data Management Through a Pervasive, Scalable and Sustainable Research Data Infrastructure. In: Bergener, K., Räckers, M., Stein, A. (eds) The Art of Structuring. Springer, Cham. https://doi.org/10.1007/978-3-030-06234-7_47
