The full system architecture of supercomputer LUMI is now revealed
– The AMD MI250X GPU is in a class of its own now and for a long time to come. The technical supremacy and performance per watt were the primary reasons why AMD’s MI250X GPUs were selected for LUMI, explains Pekka Manninen, Director of LUMI Leadership Computing Facility.
The full system architecture of LUMI
- The LUMI system is supplied by Hewlett Packard Enterprise (HPE), based on an HPE Cray EX supercomputer.
- The GPU partition will consist of 2560 nodes, each with one 64-core AMD Trento CPU and four AMD MI250X GPUs.
- Each GPU node features four 200 Gbit/s network interconnect cards, i.e. has 800 Gbit/s injection bandwidth.
- Each MI250X GPU consists of two compute dies with 110 compute units each, and each compute unit has 64 stream processors, for a total of 14080 stream processors per GPU.
- The committed Linpack performance of LUMI-G is 375 Pflop/s.
- The MI250X GPU comes with a total of 128 GB of HBM2e memory offering over 3.2 TB/s of memory bandwidth.
- A single MI250X card is capable of delivering 42.2 TFLOP/s of performance in the HPL benchmarks. More in-depth performance results for the card can be found on AMD’s website.
- In addition to the GPU partition, LUMI has another partition (LUMI-C) of CPU-only nodes, featuring 64-core 3rd-generation AMD EPYC™ CPUs and between 256 GB and 1024 GB of memory. There are 1,536 dual-socket CPU nodes in total. LUMI-C was #5 on the November 2021 Graph500 list and #76 on the November 2021 Top500 list.
- LUMI also has a partition of large-memory nodes, with a total of 32 TB of memory in the partition.
- For visualization workloads, LUMI has 64 Nvidia A40 GPUs.
- LUMI’s storage system will consist of three components. First, there will be a 7 petabyte all-flash Lustre system for short-term fast access. Next, there is a longer-term, more traditional 80 petabyte Lustre system based on mechanical hard drives. Finally, for easy data sharing and project lifetime storage, LUMI has 30 petabytes of Ceph-based storage.
- LUMI will also have an OpenShift/Kubernetes container cloud platform for running microservices.
- All the different compute and storage partitions are connected to the very fast Cray Slingshot interconnect of 200 Gbit/s.
- When completed, LUMI will take up over 150 m² of space, which is about the size of a tennis court. The weight of the system is nearly 150 000 kilograms (150 metric tons).
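As a quick sanity check, the per-GPU and per-node figures in the list above can be recomputed from their components. The Python sketch below is illustrative only: the constant and function names are made up for this example, and the derived totals (overall GPU count, LUMI-C core count, combined storage) are plain products and sums of the quoted numbers, not figures stated in the announcement.

```python
# Figures quoted in the architecture list above.
GPU_NODES = 2560
GPUS_PER_NODE = 4
NICS_PER_GPU_NODE = 4
NIC_GBIT_PER_S = 200
DIES_PER_GPU = 2
COMPUTE_UNITS_PER_DIE = 110
STREAM_PROCESSORS_PER_CU = 64
CPU_NODES = 1536
SOCKETS_PER_CPU_NODE = 2
CORES_PER_CPU = 64
STORAGE_PB = {"all_flash_lustre": 7, "disk_lustre": 80, "ceph": 30}


def lumi_totals():
    """Recompute stated totals and derive a few others (hypothetical helper)."""
    return {
        # Stated in the list: 14080 stream processors per MI250X.
        "stream_processors_per_gpu": (
            DIES_PER_GPU * COMPUTE_UNITS_PER_DIE * STREAM_PROCESSORS_PER_CU
        ),
        # Stated in the list: 800 Gbit/s injection bandwidth per GPU node.
        "injection_gbit_per_s": NICS_PER_GPU_NODE * NIC_GBIT_PER_S,
        # Derived here, not quoted in the announcement.
        "total_gpus": GPU_NODES * GPUS_PER_NODE,
        "lumi_c_cores": CPU_NODES * SOCKETS_PER_CPU_NODE * CORES_PER_CPU,
        "total_storage_pb": sum(STORAGE_PB.values()),
    }


if __name__ == "__main__":
    for name, value in lumi_totals().items():
        print(f"{name}: {value}")
```

The first two entries reproduce numbers given in the list; the remaining three are back-of-the-envelope values that follow from it.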
Two Norwegian Pilots
Half of the LUMI resources belong to the EuroHPC Joint Undertaking, and the other half of the resources belong to the participating countries. For Norway, this means around 2 % of the machine capacity.
We currently have two Norwegian pilot projects testing the LUMI CPU system, with two more on the way. The pilot testing of LUMI-G is expected to start in March 2022.
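To put the roughly 2 % national share mentioned above into perspective, it can be translated into approximate hardware terms. This is a rough sketch under two assumptions that the article does not confirm: that the share applies uniformly across partitions, and that capacity scales linearly with node count and committed Linpack performance. The function name is hypothetical.

```python
GPU_NODES = 2560                 # LUMI-G node count quoted above
COMMITTED_LINPACK_PFLOPS = 375   # committed LUMI-G Linpack figure quoted above
NORWAY_SHARE_PERCENT = 2         # "around 2 %", so results are approximate


def share_of(total, share_percent):
    """Return the portion of `total` that corresponds to `share_percent`."""
    return total * share_percent / 100


if __name__ == "__main__":
    nodes = share_of(GPU_NODES, NORWAY_SHARE_PERCENT)
    pflops = share_of(COMMITTED_LINPACK_PFLOPS, NORWAY_SHARE_PERCENT)
    print(f"Approximate GPU-node equivalent: {nodes:.0f} nodes")
    print(f"Approximate Linpack equivalent: {pflops:.1f} Pflop/s")
```

In practice allocations are granted as compute time rather than as dedicated nodes, so these values only indicate scale.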
(Image by: Juha Torvinen, CSC).