Bringing Structure to Research Data Management Through a Pervasive, Scalable and Sustainable Research Data Infrastructure

  • Raimund Vogl
  • Dominik Rudolph
  • Anne Thoring


Managing an ever-increasing amount of digital research data is one of the key fields of digitalization at universities. Several surveys amongst researchers show that demands and knowledge on the subject vary widely. Services for research data management and their underlying infrastructures are called for and are currently a very actively discussed subject. To create demand-oriented, future-proof, scalable, and financially and operationally sustainable infrastructures and services, a structured approach to demand assessment and infrastructure architecture is key. Based on user surveys and conceptual (technical) design workshops of university IT infrastructure providers starting in 2016, a consortium of five universities in North Rhine-Westphalia (NRW) has formed to jointly pursue an open source and joint operations approach to creating a multisite integrated storage and compute platform (primarily using the open source/freeware community standards Ceph and OpenStack) as a research data infrastructure. This platform provides the operational basis for the actual research data services and software, which are envisioned as containerized or virtualized appliances. A joint funding proposal, set in the context of the German National Research Data Infrastructure Initiative (NFDI), has been submitted, aiming at the creation of this 33-petabyte storage and 4,500-CPU-core compute environment; the joint operations team has already been formed. Additionally, tools for data management and curation shall be made available on this infrastructure. These tools are being developed under the project title sciebo.RDS (Research Data Services), which aims at adding research data management workflows to the well-established sciebo sync and share cloud storage platform. The sciebo platform is already widely used for collaboration on research data and will, in the future, also be operated in an OpenStack/Ceph setting.
With DFG providing funding for this development, the aim is a wider adoption of these tools in the German research community. Work packages within this project for the empirical analysis of user demands have been designed to ensure that these research data services will find users beyond the project partners' institutions.


Research data management; Cloud computing; Open source



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. University of Münster, Münster, Germany
