Data-Centric Model

The data-centric model is Sigma2's strategic response to the convergence of HPC, cloud, and AI workloads, and it follows from Sigma2's strategic vision of keeping the user in focus. Rather than attaching dedicated storage to each compute facility, NIRD (Norwegian Infrastructure for Research Data) becomes the shared, persistent data plane for all compute systems.

This eliminates data silos, removes the need for manual copying between systems, and ensures that data produced anywhere is immediately consumable everywhere. A researcher running a simulation on next-generation HPC can consume the same dataset from the same isolated namespace as a cloud workflow or an AI pipeline, transparently and at scale.

This architecture aligns with industry best practice for disaggregated, multi-tenant, shared storage infrastructure, where compute and storage scale independently and all workloads access a common, policy-governed data layer.

Single unified namespace

One logical namespace spans all storage tiers (NVMe, HDD, tape) and all physical locations, and is presented identically to all attached compute systems. There are no per-system or per-tier file system boundaries. Data placement, migration, and recall across tiers are transparent to tenants and governed by automated information lifecycle management policies defined at the project, dataset, file, or object level.
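As an illustration of how policy-driven tiering could work, the following minimal Python sketch picks a target tier from the time since last access and resolves which policy applies when policies exist at several levels. The class and function names, the age-based rule, and the "most specific level wins" precedence are assumptions made for this example; the text does not specify how NIRD policies are expressed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LifecyclePolicy:
    """Hypothetical policy: demotion thresholds in days since last access."""
    to_hdd_after_days: int
    to_tape_after_days: int


def target_tier(days_since_access: int, policy: LifecyclePolicy) -> str:
    """Return the tier (nvme, hdd, or tape) where the data should live."""
    if days_since_access >= policy.to_tape_after_days:
        return "tape"
    if days_since_access >= policy.to_hdd_after_days:
        return "hdd"
    return "nvme"


def effective_policy(
    file_policy: Optional[LifecyclePolicy],
    dataset_policy: Optional[LifecyclePolicy],
    project_policy: Optional[LifecyclePolicy],
) -> LifecyclePolicy:
    """Assumed precedence: file/object level, then dataset, then project."""
    for policy in (file_policy, dataset_policy, project_policy):
        if policy is not None:
            return policy
    raise ValueError("no lifecycle policy defined at any level")


# Example: a dataset-level policy overrides the project default.
project_default = LifecyclePolicy(to_hdd_after_days=30, to_tape_after_days=365)
dataset_override = LifecyclePolicy(to_hdd_after_days=7, to_tape_after_days=90)
policy = effective_policy(None, dataset_override, project_default)
print(target_tier(100, policy))
```

Because the namespace is unified, a tier change in a scheme like this would alter only where the data physically resides, not the path or object key that tenants use.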

Multi-tenancy and secure tenant segregation

Hundreds of tenants, thousands of projects, and user groups share the same physical infrastructure. The platform enforces per-tenant isolation, per-project storage quotas, tiering and data placement policies, access control boundaries, and chargeback/reporting at the storage layer, not through administrative workarounds.

Noisy-neighbor effects are mitigated through QoS mechanisms (I/O bandwidth and metadata operation rate limiting per tenant or project).
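One common way to implement this kind of rate limiting is a token bucket per tenant. The sketch below is illustrative only: the text does not say which algorithm the platform uses, and the class name, tenant names, and rate values are invented for the example. Time is passed in explicitly so that the behavior is deterministic.

```python
class TokenBucket:
    """Token bucket limiter: `rate` tokens refill per second, up to `capacity`."""

    def __init__(self, rate: float, capacity: float) -> None:
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.last = 0.0         # timestamp of the previous request

    def allow(self, cost: float, now: float) -> bool:
        """Admit a request costing `cost` tokens at time `now` (seconds)."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


# Hypothetical per-tenant limits on metadata operations per second.
limits = {
    "tenant-a": TokenBucket(rate=100, capacity=200),
    "tenant-b": TokenBucket(rate=10, capacity=20),
}

# A burst from tenant-a drains only its own bucket; tenant-b is unaffected.
print(limits["tenant-a"].allow(200, now=0.0))
print(limits["tenant-a"].allow(1, now=0.0))
print(limits["tenant-b"].allow(5, now=0.0))
```

Keeping one bucket per tenant or project is what provides the isolation: a burst from one tenant exhausts only that tenant's tokens.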

The platform also provides a secure environment for sensitive and classified data, with encryption at rest and in transit, fine-grained access control (POSIX permissions, ACLs, RBAC/ABAC at file, object, directory, dataset, and project level), and delegated management for project owners within their allocated scope.

Unified high-speed fabric

All compute facilities (HPC, cloud, AI) connect to NIRD over a shared high-speed (high-bandwidth, low-latency) network fabric, removing the boundary between compute and storage.

Disaggregating storage from compute means network throughput and latency become first-class design parameters. The fabric must sustain the aggregate I/O demands of thousands of concurrent HPC clients (POSIX/parallel I/O), cloud workloads (S3), and AI pipelines simultaneously, without head-of-line blocking or noisy-neighbor interference between protocol classes.

Unified multi-protocol access

The same dataset is simultaneously accessible via parallel POSIX (parallel I/O from HPC jobs across thousands of compute nodes), NFS (mounted access from login nodes and interactive services), and S3-compatible object storage (cloud workflows, AI pipelines, web portals).

Consistency across protocol boundaries is maintained by the storage platform.

No copies, no conversion, no protocol-specific silos.
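To make the "one dataset, several protocols" idea concrete, the sketch below translates between a POSIX path and an S3 URI that would refer to the same object. The mount point `/nird/projects`, the one-bucket-per-project layout, and the project ID `NS1234K` are all hypothetical; the actual mapping is determined by the storage platform and is not described in the text.

```python
from pathlib import PurePosixPath

# Hypothetical mount point under which every project appears as a directory.
MOUNT_ROOT = PurePosixPath("/nird/projects")


def posix_to_s3(path: str) -> str:
    """Map /nird/projects/<project>/<key> to s3://<project>/<key>."""
    rel = PurePosixPath(path).relative_to(MOUNT_ROOT)  # ValueError if outside
    project, *rest = rel.parts
    return f"s3://{project}/{'/'.join(rest)}"


def s3_to_posix(uri: str) -> str:
    """Inverse mapping: s3://<project>/<key> to the mounted POSIX path."""
    prefix = "s3://"
    if not uri.startswith(prefix):
        raise ValueError(f"not an S3 URI: {uri}")
    bucket, _, key = uri[len(prefix):].partition("/")
    return str(MOUNT_ROOT / bucket / key)


path = "/nird/projects/NS1234K/run1/output.nc"
uri = posix_to_s3(path)
print(uri)               # s3://NS1234K/run1/output.nc
print(s3_to_posix(uri))  # /nird/projects/NS1234K/run1/output.nc
```

Under a mapping like this, an HPC job writing to the path and a cloud workflow reading the URI would touch the same bytes, with no export or import step in between.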

Unified identity and access management

A single IAM layer enforces consistent access control across all protocols and all compute and storage systems. This spans national federated identity providers (e.g. Feide), institutional directories, and local accounts, with OAuth2/OIDC for modern service-to-service authentication alongside traditional POSIX UID/GID semantics. Access decisions are made once at the platform level and honored uniformly regardless of whether a client arrives via POSIX, NFS, or S3.
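The principle of deciding once and honoring the decision on every protocol can be sketched as follows. This is a simplified model, not the platform's implementation: the `Principal` and `AccessPlatform` names, the credential types per protocol (a UID for POSIX and NFS, an OIDC subject for S3), and the project-membership rule are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Union


@dataclass(frozen=True)
class Principal:
    """One identity, known both by POSIX UID and by OIDC subject."""
    name: str
    uid: int
    oidc_subject: str
    projects: frozenset


class AccessPlatform:
    """Resolves any protocol credential to a Principal, then decides once."""

    def __init__(self, principals: Iterable[Principal]) -> None:
        principals = list(principals)
        self._by_uid = {p.uid: p for p in principals}
        self._by_subject = {p.oidc_subject: p for p in principals}

    def resolve(self, protocol: str, credential: Union[int, str]) -> Optional[Principal]:
        if protocol in ("posix", "nfs"):
            return self._by_uid.get(credential)
        if protocol == "s3":
            return self._by_subject.get(credential)
        raise ValueError(f"unknown protocol: {protocol}")

    def allowed(self, protocol: str, credential: Union[int, str], project: str) -> bool:
        """The same membership check is applied regardless of protocol."""
        principal = self.resolve(protocol, credential)
        return principal is not None and project in principal.projects


# Hypothetical user and project identifiers.
alice = Principal("alice", 1001, "alice@idp.example", frozenset({"NS1234K"}))
platform = AccessPlatform([alice])
print(platform.allowed("posix", 1001, "NS1234K"))
print(platform.allowed("s3", "alice@idp.example", "NS1234K"))
print(platform.allowed("nfs", 1001, "NS9999K"))
```

The protocol only determines how the credential is resolved to a principal; the authorization rule itself lives in one place, so the answer cannot differ between POSIX, NFS, and S3.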

Today’s Infrastructure

Physical infrastructure at Lefdal Mine Datacenter

17°C inlet water
21°C ambient temperature
2N redundant power
5000 kg/m² floor load

2nd generation NIRD reference specifications

NIRD Data Peak: 24 PB
NIRD Data Lake: 51 PB
~1 billion files
Protocol support: POSIX, NFS and S3
Interconnect: 100 Gbit/s Ethernet

NIRD Documentation

National HPC systems (overview of all systems)

Olivia

HPE Cray EX system. 252 CPU nodes (64 512 AMD EPYC Turin cores) and 112 accelerated nodes with 448 NVIDIA Grace Hopper GH200 superchips. HPE Slingshot interconnect. For HPC and AI workloads. Hosted at Lefdal Mine Datacenter.
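The per-node figures implied by these totals can be checked with a few lines of Python. The calculation assumes that all CPU nodes are identical and that all accelerated nodes are identical, which the text does not state explicitly.

```python
# Totals quoted for Olivia.
cpu_nodes = 252
cpu_cores = 64_512
accelerated_nodes = 112
superchips = 448

# divmod returns (quotient, remainder); a zero remainder means the totals
# divide evenly across nodes.
cores_per_cpu_node, core_remainder = divmod(cpu_cores, cpu_nodes)
chips_per_node, chip_remainder = divmod(superchips, accelerated_nodes)

print(cores_per_cpu_node, core_remainder)  # 256 0
print(chips_per_node, chip_remainder)      # 4 0
```

Both divisions come out even, giving 256 cores per CPU node and 4 GH200 superchips per accelerated node.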

Olivia Documentation

Betzy

BullSequana XH2000 system. 1344 CPU nodes with 172 032 cores. InfiniBand HDR100 interconnect in a Dragonfly+ topology.

Betzy Documentation

Saga

HPE Apollo 2000/6500 Gen10. 16 064 cores across 364 nodes. Mid-range system for sequential and smaller parallel jobs, with large-memory nodes (up to 6 TiB) and GPU nodes.

Saga Documentation