The NIRD Central Data Library (CDL) is a complimentary service for users of Sigma2’s national research infrastructure, in particular users of NIRD and HPC resources. The CDL is built on top of the NIRD Data Lake.
The CDL serves as a centralised repository for research projects. It stores data of various types that are intended to be shared across multiple projects, such as input datasets, libraries, AI models, and AI training data.
Please refer to the Central Data Library Policy and the Data Policy for more information on the CDL and data classification.
Service description
With the CDL, you decide the level of access and the roles of users. The default access protocol to the CDL is S3, but POSIX access is granted to specific roles, such as data owners and data depositors.
The CDL is not meant for storing permanent or persistent data, and it shall not be confused with or considered a replacement for the NIRD Research Data Archive.
The Central Data Library is particularly suited to users who need:
- Long-term storage of non-persistent input datasets enriched with metadata
- Sharing input data and data libraries with multiple projects
- Sharing AI models and AI training data with multiple projects and collaborators
- Access to data via the S3 protocol and S3 API
Interfaces
CDL is available on the NIRD Data Lake and can be accessed easily through multiple protocols, such as POSIX, NFS, and S3. You can directly access your data from the NIRD login nodes, Sigma2’s national HPC systems, and even connect it seamlessly to third-party storage systems, computing facilities, or your desktop.
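As an illustration of S3 access, the following is a minimal Python sketch that lists the objects under a prefix in a bucket. It uses the third-party boto3 library, which is an assumption here and not something this page prescribes. The environment variable names, bucket name, and prefix are hypothetical placeholders; the actual endpoint and credentials are described in the user documentation.

```python
import os


def make_client(endpoint_url, access_key, secret_key):
    """Create an S3 client for a custom (non-AWS) endpoint.

    boto3 is a third-party package (pip install boto3); it is imported
    here so the rest of the module can be used without it.
    """
    import boto3

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def list_keys(client, bucket, prefix=""):
    """Return all object keys under a prefix, following pagination."""
    keys = []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page without matching objects has no "Contents" entry.
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


def main():
    # Hypothetical variable names; set them to the values you were given.
    client = make_client(
        os.environ["CDL_S3_ENDPOINT"],
        os.environ["CDL_S3_ACCESS_KEY"],
        os.environ["CDL_S3_SECRET_KEY"],
    )
    # Hypothetical bucket and prefix, for illustration only.
    for key in list_keys(client, "example-cdl-bucket", prefix="training-data/"):
        print(key)
```

Calling main() with the three environment variables set would print one object key per line. Reading credentials from the environment rather than hard-coding them keeps secrets out of scripts that may be shared between projects.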
Data Integrity
Data integrity is maintained through daily snapshots, built-in redundancy, and error-correcting mechanisms, ensuring efficient, reliable, and cost-effective storage.
While CDL currently does not include a backup option, this may be introduced in the future depending on user demand and infrastructure developments.
How to get access
You can request the CDL service at any time of the year by selecting it in the application form for storage resources.
For details about eligibility and governance, please see the Central Data Library policy.
Technical implementation details are described in the user documentation.
You might also need
Sigma2 offers a range of high-performance computing and storage services designed to support scientific research at every step of the research data lifecycle. A few selected services are shown below, and our services overview lists everything we offer.
The Central Data Library going forward
Take a look at our services roadmaps to follow development and progress.
Now
- Support the Data Rescue initiative
We will assist the Norwegian research community in safeguarding datasets currently stored abroad, which may be at risk of deletion or loss.
- Open, confidential and restricted data services
The data landscape is becoming more complex with the explosion of new technologies and research frontiers, and AI/ML is one prominent example of this shift. Research now produces a variety of data with different access and protection requirements. We will design and adapt the services for data that require confidentiality but are not subject to the regulations on sensitive personal health data.
Next
- Streamlined quota management
We will simplify and automate storage quota management, improving transparency and ensuring better control over resource usage on NIRD services (for example, Data Peak and Data Lake).
- Achieving the ISO 27001 certification
We will strengthen our information security practices and build customer trust by obtaining ISO 27001 certification, an internationally recognized standard for Information Security Management Systems.
- Introduction of Research Objects
We will address data management needs arising from steadily increasing data volumes in close collaboration with user communities and key stakeholders. We will focus on adopting Research Objects (ROs) as a foundation for improved, future-oriented data management. ROs will first be introduced on NIRD for selected pilot projects and later expanded to all projects.
Later
- Introduction of a data management solution for the Central Data Library
We will enhance the NIRD Central Data Library (CDL) service by integrating a robust data management solution that improves data accessibility, consistency, and lifecycle governance. This effort will extend the service’s capabilities, streamline data operations, and provide a foundation for future scalability and advanced analytics.