The NIRD Central Data Library (CDL) is a complimentary service for users of Sigma2’s national research infrastructure, in particular users of NIRD and the HPC resources. CDL is offered on top of the NIRD Data Lake.
The CDL serves as a centralized repository for research projects. It stores data that is intended to be shared across multiple projects, such as input datasets, libraries, AI models, and AI training data.
Please refer to the Central Data Library Policy and the Data Policy for more information on the CDL and data classification.
Service description
With CDL, you decide the access level and role of each user. The default access protocol to the CDL is S3, while POSIX access is granted to specific roles, such as data owners and data depositors.
The CDL is not meant for storing permanent or persistent data, and it shall not be confused with or considered a replacement for the NIRD Research Data Archive.
The Central Data Library is particularly suited to users who need:
- Long-term storage of non-persistent input datasets enriched with metadata
- Sharing input data and data libraries with multiple projects
- Sharing AI models and AI training data with multiple projects and collaborators
- Access to data via the S3 protocol and S3 API
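As an illustration of S3 access, the sketch below lists the objects stored under a prefix in a bucket. It assumes the third-party boto3 library; the environment variable names (CDL_S3_ENDPOINT, CDL_S3_ACCESS_KEY, CDL_S3_SECRET_KEY) and the bucket and prefix names are hypothetical placeholders, not values defined by the CDL service. The actual endpoint and credentials are described in the user documentation.

```python
import os


def cdl_client_settings(endpoint_url, access_key, secret_key):
    """Collect the keyword arguments passed to boto3.client("s3", ...)."""
    return {
        "endpoint_url": endpoint_url,
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def list_dataset_keys(client, bucket, prefix=""):
    """Return the keys of all objects in `bucket` whose key starts with `prefix`."""
    keys = []
    # list_objects_v2 returns results in pages, so use a paginator.
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page without matching objects has no "Contents" entry.
        for obj in page.get("Contents", []):
            keys.append(obj["Key"])
    return keys


def connect_from_env():
    """Create an S3 client from environment variables (hypothetical names)."""
    import boto3  # third-party dependency: pip install boto3

    settings = cdl_client_settings(
        os.environ["CDL_S3_ENDPOINT"],
        os.environ["CDL_S3_ACCESS_KEY"],
        os.environ["CDL_S3_SECRET_KEY"],
    )
    return boto3.client("s3", **settings)
```

With the environment variables set, list_dataset_keys(connect_from_env(), "example-bucket", prefix="training-data/") would return the matching object keys. Both names in that call are placeholders.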
Interfaces
CDL is available on the NIRD Data Lake and can be accessed through multiple protocols, such as POSIX, NFS, and S3. You can access your data directly from the NIRD login nodes and Sigma2’s national HPC systems, and you can also connect it to third-party storage systems, computing facilities, or your desktop.
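Where POSIX access is available, a dataset can be read like any other directory tree. The minimal sketch below counts the files in a dataset directory and sums their sizes using only the Python standard library. The function name and any path passed to it are illustrative assumptions. The actual location of CDL data on the login nodes is given in the user documentation.

```python
from pathlib import Path


def summarise_dataset(root):
    """Return (file_count, total_bytes) for all regular files under `root`."""
    root = Path(root)
    file_count = 0
    total_bytes = 0
    # rglob("*") walks the directory tree recursively.
    for path in root.rglob("*"):
        if path.is_file():
            file_count += 1
            total_bytes += path.stat().st_size
    return file_count, total_bytes
```

Calling summarise_dataset on the directory of a shared input dataset gives a quick check of its size before it is copied to an HPC system.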
Data Integrity
Data integrity is maintained through daily snapshots, built-in redundancy, and error-correcting mechanisms, ensuring efficient, reliable, and cost-effective storage.
While CDL currently does not include a backup option, this may be introduced in the future depending on user demand and infrastructure developments.
How to get access
You can request the CDL service at any time of the year by selecting it in the application form for storage resources.
For details on eligibility and governance, please see the Central Data Library policy.
Technical implementation details are described in the user documentation.
You might also need
Sigma2 offers a range of high-performance computing and storage services designed to support scientific research at every step of the research data lifecycle. A few selected services are shown below, and our services overview lists everything we offer.