NIRD Central Data Library

The NIRD Central Data Library (CDL) is a complementary service offered to users of Sigma2’s national research infrastructure, in particular users of NIRD and HPC resources. The CDL is built on top of the NIRD Data Lake.

The CDL serves as a centralised repository for research projects. It stores data intended to be shared across multiple projects, such as input datasets, libraries, AI models, and AI training data.

Please refer to the Central Data Policy and the Data Policy for more information on the CDL and data classification.

Who

Students and researchers at Norwegian universities and research institutions.

How

Apply at any time of the year.

Price

The price depends on the funding that supports your research project.

Service description

With the CDL, you decide the access level and role of each user. The default access protocol to the CDL is S3, but POSIX access is granted to specific roles, such as data owners and data depositors.
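The role-based rule above can be illustrated with a minimal sketch. It assumes only what the text states: S3 is the default protocol, and data owners and data depositors are examples of roles that also get POSIX access. The `Role` enum, the `DATA_CONSUMER` role name, and the `allowed_protocols` function are hypothetical names used for illustration and are not part of any CDL interface.

```python
from enum import Enum


class Role(Enum):
    """Illustrative roles; only the first two are named in the text."""
    DATA_OWNER = "data owner"
    DATA_DEPOSITOR = "data depositor"
    DATA_CONSUMER = "data consumer"  # hypothetical role, assumed for contrast


# Roles that the text gives as examples of POSIX access (the list may be longer).
POSIX_ROLES = {Role.DATA_OWNER, Role.DATA_DEPOSITOR}


def allowed_protocols(role: Role) -> list[str]:
    """Return the access protocols available to a role.

    S3 is the default for everyone; POSIX is added for selected roles.
    """
    protocols = ["S3"]
    if role in POSIX_ROLES:
        protocols.append("POSIX")
    return protocols


if __name__ == "__main__":
    for role in Role:
        print(f"{role.value}: {', '.join(allowed_protocols(role))}")
```

The actual set of roles and their permissions is defined by the CDL policy and user documentation, not by this sketch.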

The CDL is not meant for storing permanent or persistent data, and it must not be confused with or considered a replacement for the NIRD Research Data Archive.

The Central Data Library is particularly suited to users who need:

  • Long-term storage of non-persistent input datasets enriched with metadata
  • Sharing of input data and data libraries with multiple projects
  • Sharing of AI models and AI training data with multiple projects and collaborators
  • Access to data via the S3 protocol and S3 API

Interfaces

CDL is available on the NIRD Data Lake and can be accessed easily through multiple protocols, such as POSIX, NFS, and S3. You can directly access your data from the NIRD login nodes, Sigma2’s national HPC systems, and even connect it seamlessly to third-party storage systems, computing facilities, or your desktop.
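As a rough illustration of S3 access from a desktop or compute node, the sketch below lists object keys with the third-party `boto3` library. The endpoint URL, bucket name, and `CDL_S3_*` environment variable names are placeholders assumed for this example; the real endpoint and credentials come from the user documentation and your project.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Settings:
    """Connection settings; every default below is a placeholder."""
    endpoint_url: str
    bucket: str
    access_key: str
    secret_key: str


def settings_from_env(env=os.environ) -> S3Settings:
    """Read settings from hypothetical CDL_S3_* environment variables."""
    return S3Settings(
        endpoint_url=env.get("CDL_S3_ENDPOINT", "https://s3.example.invalid"),
        bucket=env.get("CDL_S3_BUCKET", "example-bucket"),
        access_key=env.get("CDL_S3_ACCESS_KEY", ""),
        secret_key=env.get("CDL_S3_SECRET_KEY", ""),
    )


def list_keys(settings: S3Settings, prefix: str = "") -> list[str]:
    """List object keys under a prefix using the standard S3 API."""
    import boto3  # third-party; imported lazily so the module loads without it

    client = boto3.client(
        "s3",
        endpoint_url=settings.endpoint_url,
        aws_access_key_id=settings.access_key,
        aws_secret_access_key=settings.secret_key,
    )
    keys = []
    # Paginate so that listings of more than 1000 objects are complete.
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=settings.bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys
```

Calling `list_keys(settings_from_env(), prefix="datasets/")` would then return the keys under that prefix, provided the placeholders have been replaced with real values and `boto3` is installed.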

Data Integrity

Data integrity is maintained through daily snapshots, built-in redundancy, and error-correcting mechanisms, ensuring efficient, reliable, and cost-effective storage.

While CDL currently does not include a backup option, this may be introduced in the future depending on user demand and infrastructure developments.

How to get access

You can request the CDL service at any time of the year by selecting it in the application form for storage resources.

For details about eligibility and governance, please see the Central Data Library policy.

Technical implementation details are described in the user documentation.

You might also need

Sigma2 offers a range of high-performance computing and storage services designed to support scientific research at every step of the research data lifecycle. Our services overview lists everything we offer.

The Central Data Library going forward

Take a look at our service roadmaps to follow development and progress.

Now

  • Support the Data Rescue initiative

    We will assist the Norwegian research community in safeguarding datasets currently stored abroad, which may be at risk of deletion or loss.

  • Open, confidential and restricted data services

    The data landscape is becoming more complex with the explosion of new technologies and research frontiers, and AI/ML is a prime example of this shift. Research now produces a variety of data with different access and protection requirements. We will design and adapt the services for data that require confidentiality but are not subject to the regulations governing sensitive personal health data.

Next

  • Streamlined quota management

    We will simplify and automate storage quota management, improving transparency and ensuring better control over resource usage on NIRD services (for example, Data Peak and Data Lake).

  • Achieving the ISO 27001 certification

    We will strengthen our information security practices and build customer trust by obtaining ISO 27001 certification, an internationally recognized standard for Information Security Management Systems.

  • Introduction of Research Objects

    We will address data management needs arising from steadily increasing data volumes in close collaboration with user communities and key stakeholders. We will focus on adopting Research Objects (ROs) as a foundation for improved, future-oriented data management. ROs will first be introduced on NIRD for selected pilot projects and later expanded to all projects.

Later

  • Introduction of a data management solution for the Central Data Library

    We will enhance the NIRD Central Data Library (CDL) service by integrating a robust data management solution that improves data accessibility, consistency, and lifecycle governance. This effort will extend the service’s capabilities, streamline data operations, and provide a foundation for future scalability and advanced analytics.