Project Details

Data storage system for NFDI (NFDI-Storage 2025)

Term Funded in 2026
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 579652764
 
This proposal is part of a set coordinated by the National High Performance Computing (NHR) centers. The collaborative effort builds on the NHR’s established expertise in providing nationwide, cross-university services and managing large-scale research infrastructures. We aim to support NFDI with an integrated and reliable storage landscape that, like the existing NHR services, will be operated sustainably and can thus mark the beginning of a coherent research data infrastructure in Germany. The NHR centers collectively propose to build a nationally orchestrated data backbone comprising both physical storage resources and a layer that transforms data for research data services directly targeting the NFDI consortia. The storage resources will be closely interlinked with existing high-performance computing (HPC) systems to allow more data-intensive processing, e.g. as required for Artificial Intelligence (AI) training and applications. The infrastructure will be consistently supported by identity and access management, distribution, and provisioning processes that were developed together with NFDI consortia and will meet the guidelines for the EOSC Federation. This coordinated approach leverages the diverse technological strengths of the individual NHR sites to meet the data storage requirements expressed by the NFDI consortia.
This proposal addresses the long-term storage requirements of ten NFDI consortia associated with Lower Saxony: DAPHNE4NFDI, KonsortSWD, NFDI4Biodiversity, NFDI4Culture, NFDI4Energy, NFDI4Health, NFDI4Objects, NFDIxCS, PUNCH4NFDI and Text+. These consortia have articulated their needs in terms of three complementary storage classes (hot, warm, and cold), each defined by distinct performance characteristics and usage patterns. For the hot and warm classes, we propose deploying a cost-efficient Ceph-based storage infrastructure. This solution is designed to provide the necessary performance, capacity, and functionality while incorporating geo-replication to ensure data security, resilience, and availability. For the cold storage class, we propose a geo-replicated tape-based system, a cost-effective and energy-efficient approach particularly well suited to the long-term preservation of large-scale scientific datasets. The proposed architecture not only integrates seamlessly with existing NHR HPC resources through high-bandwidth interconnects but also provides native compatibility with GWDG’s virtualization and Kubernetes platforms. By delivering cost-effective, technically robust, and future-oriented storage solutions, this proposal directly supports the establishment of a sustainable and federated data infrastructure within the NFDI.
DFG Programme Major Research Instrumentation
Major Instrumentation Data storage system for NFDI (NFDI-Storage 2025)
Instrumentation Group 7000 Data processing systems, central computing systems
Applicant Institution Georg-August-Universität Göttingen
 
 
