HPC and storage systems

Access to the national supercomputing system empowers Norwegian research and innovation across various scientific fields. These high-performance machines accelerate progress in climate modelling, pharmaceutical research, astrophysics, materials science, and more. Supercomputers (HPC systems) not only perform tasks faster but also tackle once-impossible problems. They enable deeper exploration of cosmic mysteries, enhance climate prediction accuracy, facilitate innovative materials design, and advance life-saving drug development. In essence, these machines drive the frontiers of knowledge and technology.

Systems

Norway's e-infrastructure comprises three robust supercomputer systems and a state-of-the-art storage solution. Additionally, as part-owner of Europe's most powerful supercomputer, LUMI, Sigma2 enriches the landscape for the benefit of Norwegian researchers.

Each of the HPC facilities consists of a compute resource (several compute nodes, each with several processors and internal shared memory, plus an interconnect that connects the nodes), a central storage resource that is accessible by all the nodes, and a secondary storage resource for back-up (and in a few cases also for archiving). All facilities use variants of the UNIX operating system (Linux, AIX, etc.).

Storage systems

NIRD provides data storage capacity for research projects with a data-centric architecture. It is therefore also used as storage capacity for the HPC systems, for the national data archive, and for other services requiring a storage backend. It consists of two geographically separated storage systems. Leading storage technology, combined with a powerful underlying network backbone, allows the two systems to be geo-replicated asynchronously, ensuring high availability and security of the data.

Unlike its predecessor NorStore, the NIRD storage facility is strongly integrated with the HPC systems, which facilitates computation on large datasets. Furthermore, NIRD offers high-performance storage for post-processing, visualization, GPU computing and other services on the NIRD Service Platform.

Services overview

Upcoming systems

Olivia — HPE Cray EX4000 system

Olivia represents a significant advancement in the national high-performance computing infrastructure, specifically designed to drive Artificial Intelligence (AI) advancements.

The system is located at the Lefdal Mine Datacenter and is named after the mineral Olivine, which was historically extracted from the mine that now houses Olivia.

This state-of-the-art system, built on the HPE Cray EX supercomputer platform, integrates advanced hardware components to manage massive HPC and AI workloads. Olivia features 304 NVIDIA GH200 Grace Hopper Superchips, each of which tightly couples the Arm-based NVIDIA Grace CPU with an NVIDIA H100 GPU.

In addition to these powerful accelerators, the system features AMD Turin CPUs, providing 64 512 physical CPU cores across 252 compute nodes. All nodes are interconnected by an HPE Slingshot network, a high-speed, low-latency interconnect utilising a Dragonfly topology.

Olivia is well suited to handle diverse workloads, including distributed-memory jobs using up to 2048 cores and GPU-accelerated HPC and AI/machine learning workloads. The system is integrated with the NIRD storage system located within the same data centre, facilitating efficient handling of large datasets for computation.

Olivia will open for projects in autumn 2025.

Systems in production

Betzy — BullSequana XH2000

The supercomputer is named after Mary Ann Elizabeth (Betzy) Stephansen, the first Norwegian woman with a PhD in mathematics. Betzy is a BullSequana XH2000, provided by Atos, with a theoretical peak performance of 6.2 PetaFlops.

Betzy is located at NTNU in Trondheim and has been in production since 24 November 2020.

Betzy offers mainly CPU compute capacity and some GPU compute capacity. The latest GPU and AI compute capacity is provided through NVIDIA accelerators. While the system is mainly suited for highly parallel MPI jobs, utilising from 512 up to 65 536 cores, it also offers smaller pre- and post-processing capabilities through dedicated nodes. Compared with Saga and Fram, Betzy is the system best suited for highly parallel jobs.

Technical specifications

Some common applications running on Betzy:

OpenFOAM
NorESM
Bifrost
FluTAS
MGLET
ABINIT

The most common users on the machine (in descending order) are from the following fields of science:

Geosciences
Computational Fluid Dynamics (CFD)
Physics
Chemistry, and
Marine technology

Betzy use cases

Some examples of use cases conducted on Betzy:

Fram — Lenovo NeXtScale nx360

Named after the Norwegian Arctic expedition ship Fram, this machine started production on 1 November 2017. The computer is hosted at UiT The Arctic University of Norway and is provided by Lenovo.

For technical details, please refer to our technical documentation.

This distributed-memory system offers CPU compute capacity interconnected with a high-bandwidth, low-latency InfiniBand network. The interconnect network is organized in an island topology, with 9216 cores on each island. The machine also has some nodes with more memory, enabling support for jobs demanding up to 512 GiB per core.

The machine is well suited for distributed-memory MPI jobs using between 32 and 512 cores. Fram serves as our “mid-range” system in terms of recommended job size.

Technical specifications

Some common applications running on Fram:

VASP
CESM
ROMS
WRF
Python scripts
LAMMPS
Gaussian

The most common users on the machine (in descending order) are from the following fields of science:

Geosciences
Materials science
Chemistry
Physics
Computational Fluid Dynamics (CFD), and
Marine technology

Saga — Apollo 2000/6500 Gen10

The national supercomputer Saga is named after the goddess in Norse mythology associated with wisdom. Saga is also a term for Icelandic epic prose literature.

This supercomputer opened to users in 2019 and is located at NTNU in Trondheim.

Technical specifications

Saga offers the latest GPU and AI compute capacity through NVIDIA accelerators. This machine has several large memory nodes and can serve jobs requiring up to 6 TiB of RAM per core. This machine is well suited for running single-core applications, shared memory applications (OpenMP) and applications utilising up to 256 cores.
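The recommended job sizes quoted on this page for the three production systems (Saga: up to 256 cores, Fram: 32 to 512 cores, Betzy: 512 to 65 536 cores) can be collected into a small lookup. The sketch below is only an illustration of those ranges; the names `RECOMMENDED_CORES` and `candidate_systems` are hypothetical, and the function is not an official allocation rule, since memory, GPU and application requirements also matter.

```python
# Recommended core-count ranges as quoted in the system descriptions.
# Names are illustrative; this is not an official allocation policy.
RECOMMENDED_CORES = {
    "Saga": (1, 256),
    "Fram": (32, 512),
    "Betzy": (512, 65_536),
}


def candidate_systems(cores: int) -> list[str]:
    """Return the systems whose recommended range includes `cores`."""
    return [
        name
        for name, (low, high) in RECOMMENDED_CORES.items()
        if low <= cores <= high
    ]


if __name__ == "__main__":
    for cores in (1, 128, 512, 4096):
        print(cores, candidate_systems(cores))
```

The ranges overlap, so a job of 128 cores matches both Saga and Fram; in such cases the choice would depend on factors that this sketch does not model.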

Saga serves the majority of our HPC projects. Some common applications which run on Saga are:

Orca
Python scripts
VASP
Gaussian
LAMMPS

The primary users of Saga are within the following fields of science:

Chemistry
Materials Science
Biosciences
Geosciences
Physics
Medical Science

Saga use cases

Some examples of research activities conducted on Saga:

LUMI — HPE Cray EX supercomputer

LUMI (Large Unified Modern Infrastructure) is the first of three pre-exascale supercomputers built to ensure that Europe is among the world leaders in computing capacity. Norway, through Sigma2, owns part of the LUMI supercomputer which is funded by the EU and consortium countries.

Key figures (Sigma2's share):

CPU-core hours	34 003 333
GPU-hours	1 771 000
TB-hours	16 862 500
Central disk	A share of the total 117 PB
Theoretical Performance (Rpeak)	~11 PFLOPS

Using 1 core of LUMI-C for 1 hour costs 1 CPU-core-hour, and using 1 GPU of the LUMI-G partition for 1 hour costs 1 GPU-hour. Storing 1 terabyte consumes 10 terabyte-hours per hour on LUMI-F, 1 terabyte-hour per hour on LUMI-P, and 0.5 terabyte-hours per hour on LUMI-O.
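As a rough illustration of these billing rules, the following sketch computes the units consumed by a job or by stored data. The rates come from the paragraph above; the function and constant names are hypothetical and do not correspond to any LUMI or Sigma2 tool.

```python
# Terabyte-hours consumed per terabyte stored per hour, as quoted above.
# All names here are illustrative, not part of any official tooling.
STORAGE_RATE_TB_HOURS = {
    "LUMI-F": 10.0,
    "LUMI-P": 1.0,
    "LUMI-O": 0.5,
}


def cpu_core_hours(cores: int, hours: float) -> float:
    """CPU-core-hours used on LUMI-C: 1 core for 1 hour costs 1 unit."""
    return cores * hours


def gpu_hours(gpus: int, hours: float) -> float:
    """GPU-hours used on LUMI-G: 1 GPU for 1 hour costs 1 unit."""
    return gpus * hours


def storage_tb_hours(system: str, terabytes: float, hours: float) -> float:
    """Terabyte-hours consumed by storing `terabytes` for `hours`."""
    return STORAGE_RATE_TB_HOURS[system] * terabytes * hours


if __name__ == "__main__":
    print(cpu_core_hours(128, 24))            # 128 cores for one day
    print(gpu_hours(8, 12))                   # 8 GPUs for half a day
    print(storage_tb_hours("LUMI-F", 2, 24))  # 2 TB on LUMI-F for one day
```

For example, keeping 2 TB on LUMI-F for 24 hours consumes 10 x 2 x 24 = 480 terabyte-hours, whereas the same data on LUMI-O would consume 24.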

NIRD — National Infrastructure for Research Data

NIRD offers storage services, archiving services, cloud services, and processing capabilities for stored data. It provides these services and capacities to scientific disciplines requiring access to advanced, large-scale, or high-end resources for storage, data processing, research data publication, or digital database and collection searches. NIRD is a high-performance storage system capable of supporting AI and analytics workloads, enabling simultaneous multi-protocol access to the same data.

NIRD provides storage resources with yearly capacity upgrades, data security through backup services and adaptable application services, multiple storage protocol support, migration to third-party cloud providers and much more. Alongside the national high-performance computing resources, NIRD forms the backbone of the national e-infrastructure for research and education in Norway, connecting data and computing resources for efficient provisioning of services.

Technical specifications

Hardware

NIRD consists of two separate storage systems, namely Tiered Storage (NIRD TS) and Data Lake (NIRD DL). The total capacity of the system is 49 PB (24 PB on NIRD TS and 25 PB on NIRD DL).

NIRD TS has several tiers spanned by a single filesystem. It is designed for performance and is used mainly for active project data.

NIRD DL has a flat structure and is designed mainly for less active data. NIRD DL provides unified access, i.e. both file and object storage, for sharing data across multiple projects and for interfacing with external storage systems.

NIRD is based on the IBM Elastic Storage System, built using ESS3200, ESS3500 and ESS5000 building blocks. I/O performance is ensured by IBM POWER9 servers for I/O operations, with dedicated data movers, protocol nodes and more.

System information

Category	Description	Details
System	Building blocks	IBM ESS3200, IBM ESS3500, IBM ESS5000, IBM POWER9
Clusters	Two physically separated clusters	NIRD TS, NIRD DL
Storage media	Per cluster	NIRD TS: NVMe SSD & NL-SAS; NIRD DL: NL-SAS
Capacity	Total capacity: 49 PB	NIRD TS: 24 PB; NIRD DL: 25 PB
Performance	Aggregated I/O throughput	NIRD TS: 209 GB/s; NIRD DL: 66 GB/s
Interconnect	100 Gbit/s Ethernet	NIRD TS: balanced 400 Gbit/s; NIRD DL: balanced 200 Gbit/s
Protocol nodes	NFS and S3	NFS: 4 x 200 Gbit/s; S3: 5 x 50 Gbit/s

Software

IBM Storage Scale (GPFS) is deployed on NIRD, providing a software-defined high-performance file- and object storage for AI and data-intensive workloads.
Insight into data is ensured by IBM Storage Discover.
Backup services and data integrity are ensured with IBM Storage Protect.

NIRD Service Platform Hardware

The NIRD Service Platform is a Kubernetes-based cloud platform providing persistent services such as web services, domain- and community specific portals, as well as on-demand services through the NIRD Toolkit.

The cloud solution on the NIRD Service Platform enables researchers to run microservices for pre/post-processing, data discovery and analysis as well as data sharing, regardless of dataset sizes stored on NIRD.

The NIRD Service Platform was designed with high-performance computing and artificial intelligence capabilities in mind: it is meant to be robust, scalable and capable of running AI and ML workloads.

The technical specifications of the NIRD Service Platform are listed below:

Workers	12
CPUs	2368 cores	8 workers with 256 cores; 4 workers with 80 cores
GPUs	30	NVIDIA V100
RAM	9 TiB	4 workers with 512 GiB; 4 workers with 1024 GiB; 4 workers with 768 GiB
Interconnect	Ethernet	8 workers with 2 x 100 Gbit/s; 4 workers with 2 x 10 Gbit/s
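The totals in the table follow from the per-worker figures. A minimal sketch of that arithmetic is shown below; the variable names are illustrative, and the grouping of workers is taken directly from the table rows.

```python
# (number of workers, resource per worker), as listed in the table above.
cpu_groups = [(8, 256), (4, 80)]                   # cores per worker
ram_groups_gib = [(4, 512), (4, 1024), (4, 768)]   # GiB per worker


def total(groups):
    """Sum count * per-worker amount over all worker groups."""
    return sum(count * per_worker for count, per_worker in groups)


workers = sum(count for count, _ in cpu_groups)
total_cores = total(cpu_groups)
total_ram_tib = total(ram_groups_gib) / 1024  # 1 TiB = 1024 GiB

if __name__ == "__main__":
    print(workers, total_cores, total_ram_tib)
```

This reproduces the 12 workers, 2368 cores and 9 TiB of RAM stated in the table.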

To the NIRD Service Platform service description

Betzy and Saga are located at NTNU in Trondheim, and Fram at UiT in Tromsø. LUMI is located in a data centre facility in Kajaani, Finland. NIRD (National Infrastructure for Research Data) is located in the Lefdal Mine Datacenter, where Sigma2's future HPC systems will also be placed.

Live hardware status

Decommissioned systems

Gardar (2012-2015)

Gardar was an HP BladeCenter cluster consisting of one frontend (management, head) node, 2 login nodes and 288 compute nodes running CentOS Linux managed by Rocks. Each node contained two Intel Xeon processors and 24 GB of memory. The compute nodes were located in HP racks. Each HP rack contained three c7000 Blade enclosures and each enclosure contained 16 compute nodes.

Gardar had a separate storage system. The X9320 network storage system was available to the entire cluster and used the IBRIX Fusion software. The total usable storage of the X9320 was 71.6 TB. The storage system was connected to the cluster with an InfiniBand QDR network.

Technical details

System	HP BL280c G6 servers
Number of cores	3456
Number of nodes	288
CPU type	Intel Xeon E5649 (2.53 GHz), Westmere-EP
Peak performance	35 teraflops
Total storage capacity	71.6 TB

Hexagon (2008-2017)

Hexagon was traditionally used heavily in areas of science such as computational chemistry, computational physics, computational biology, geosciences and mathematics. Hexagon was installed at the High Technological Center in Bergen (HiB) and was managed and operated by the University of Bergen. Hexagon was upgraded from Cray XT4 in March 2012.

Technical details

System	Cray XE6-200
Number of cores	22272
Number of nodes	696
Cores per node	32
Interconnect	Cray Gemini
Peak performance	204.9 teraflops
Operating system	Cray Linux Environment

Abel (2012-2020)

Named after the famous Norwegian mathematician Niels Henrik Abel, the Linux cluster at the University of Oslo was a shared resource for research computing capable of 258 TFLOP/s theoretical peak performance. At the time of installation on 1 October 2012, Abel reached position 96 on the Top500 list of the most powerful systems in the world.

Abel was an all-round, general-purpose cluster designed to handle multiple concurrent jobs and users with varying requirements. Rather than massively parallel applications, its primary application profile was small to moderately sized parallel applications with high I/O and/or memory demands.

Technical details

System	MEGWARE MiriQuid 2600
Number of cores	10000+
Number of nodes	650+
CPU type	Intel E5 2670
Max floating-point performance, double	258 teraflops
Total memory	40 TiB
Total disc capacity	400 TiB

Stallo (2007-2021)

The Linux Cluster Stallo was a compute cluster at the University of Tromsø, which was installed on 1 December 2007, and included in NOTUR on 1 January 2008. The supercomputer was upgraded in 2013.

Stallo was intended for distributed-memory MPI applications with low communication requirements between the processors, shared-memory OpenMP applications using up to eight processor cores, parallel applications with moderate memory requirements (2-4 GB per core), and embarrassingly parallel applications.

Technical details

System	HP BL 460c Gen 8
Number of cores	14116
Number of nodes	518
CPU type	Intel E5 2670
Peak performance	104 teraflops
Total memory	12.8 TB
Total disc capacity	2.1 PB

Vilje (2012-2021)

Vilje was a cluster system procured by NTNU in cooperation with the Norwegian Meteorological Institute and Sigma in 2012. Vilje was used for numerical weather prediction in operational forecasting by the Meteorological Institute, as well as for research in a broad range of topics at NTNU and other Norwegian universities, colleges and research institutes. The name Vilje was taken from Norse mythology.

Vilje was a distributed-memory system that consisted of 1440 nodes interconnected with a high-bandwidth, low-latency switch network (FDR InfiniBand). Each node had two 8-core Intel Sandy Bridge processors (2.6 GHz) and 32 GB of memory. The total number of cores was 23040.

The system was well-suited (and intended) for large-scale parallel MPI applications. Access to Vilje was in principle only allowed for projects that had parallel applications that used a relatively large number of processors (≥ 128).

Technical details

System	SGI Altix 8600
Number of cores	22464
Number of nodes	1404
CPU type	Intel Sandy Bridge
Total memory	44 TB
Total disc capacity

Olivia, Norway's new supercomputer: Olivia is Norway's most powerful supercomputer.

Olivia becomes the name of Norway's next supercomputer

10.09.2024

Sigma2 is pleased to announce that Norway's next supercomputer now has its official name: Olivia. Olivia will play a significant role in developing artificial intelligence (AI), especially in developing and improving Norwegian language models.

All you need to know about Olivia, Norway's most powerful supercomputer yet:

Meet Olivia