National supercomputing cluster "Alem.Cloud" included in the ranking of the world’s most powerful computing systems

The "Alem.Cloud" national supercomputer ranked 86th in the TOP500, the international ranking of the world’s most powerful computing systems. The ranking includes only those clusters that undergo independent testing based on the HPL (High-Performance Linpack) standard, the global benchmark for computing performance.

The "Alem.Cloud" national supercomputing cluster was established on the instruction of the Head of State, based on the infrastructure of NIT JSC under the Ministry of Artificial Intelligence and Digital Development. The supercomputer is intended for implementing projects in artificial intelligence, high-performance computing, and big data analytics.

The cluster’s architecture is built on 64 HPE Cray servers integrated into a single high-density GPU complex. Each compute node is equipped with NVIDIA H200 accelerators. Inter-node communication runs over a high-speed 400 GbE / RoCE v2 network that provides low latency and high bandwidth, which is particularly important for large-scale HPL workloads and distributed model training.
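
To illustrate why link bandwidth matters for distributed training, here is a rough sketch of a bandwidth-only lower bound for a ring all-reduce, the collective commonly used to synchronize gradients. It assumes one 400 Gbit/s link per node and ignores latency, protocol overhead, and intra-node traffic. The per-node link count and the `efficiency` parameter are illustrative assumptions, not published figures for this cluster.

```python
def ring_allreduce_seconds(message_bytes, nodes, link_gbit_per_s=400.0, efficiency=1.0):
    """Bandwidth-only lower bound for a ring all-reduce across `nodes` peers.

    Assumption: each node has a single link of `link_gbit_per_s` and achieves
    `efficiency` (0..1] of line rate. Latency terms are deliberately ignored.
    """
    if nodes < 2:
        return 0.0  # nothing to exchange with a single participant
    link_bytes_per_s = link_gbit_per_s * 1e9 / 8 * efficiency
    # In a ring all-reduce every node sends (and receives) 2*(n-1)/n of the message:
    # (n-1)/n during reduce-scatter plus (n-1)/n during all-gather.
    traffic_bytes = 2 * (nodes - 1) / nodes * message_bytes
    return traffic_bytes / link_bytes_per_s


# Example: 1 GB of gradients across 64 nodes at full line rate.
estimate = ring_allreduce_seconds(1e9, nodes=64)
print(f"{estimate * 1e3:.1f} ms")
```

Real timings would be higher, since RoCE v2 throughput depends on congestion control, message sizes, and how many network adapters each node actually has.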

Ranking 86th in the TOP500 confirms that Kazakhstan possesses world-class scientific and technological infrastructure and has significant potential for developing its own AI models. This opens up opportunities for the country to conduct breakthrough research, as well as to develop the export of digital services and technologies in the future.

The system delivers performance in the tens of petaflops, making Alem.Cloud the most powerful computing platform in Kazakhstan and the Eurasian region.

The supercomputer’s software stack uses enterprise-grade technologies:

  • SUSE Harvester – a hyperconverged infrastructure (HCI) platform that virtualizes GPU nodes on KVM;
  • SUSE Rancher – a centralized platform for managing Kubernetes clusters and AI workloads;
  • Kubernetes – the foundation for containerization and orchestration of machine learning computational tasks;
  • Support for AI frameworks and distributed stacks: PyTorch, TensorFlow, JAX, NCCL, UCX/UCC, MPI.

In preparing the cluster for testing, the following work was carried out:

  • optimization of NUMA and CPU/GPU affinity;
  • configuration of RoCE v2 RDMA channels;
  • optimization of inter-node communication and HPL parameters;
  • calibration of GPU parallelism for maximum performance.
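
As an illustration of what tuning HPL parameters involves, the sketch below applies two widely used rules of thumb: size the problem so the matrix fills a fixed share of accelerator memory, and pick a process grid P × Q that is as close to square as possible with P ≤ Q. The figure of 8 GPUs per node, the 80% memory share, the block size of 1024, and the one-rank-per-GPU layout are all assumptions made for this example; the source does not state the values actually used. The 141 GB per H200 comes from NVIDIA's public specification.

```python
import math


def hpl_problem_size(total_mem_bytes, mem_fraction=0.8, block_size=1024):
    """Largest N (a multiple of block_size) whose N x N FP64 matrix fits
    in `mem_fraction` of the given memory. Each FP64 element takes 8 bytes."""
    n = int(math.sqrt(mem_fraction * total_mem_bytes / 8))
    return n - n % block_size  # round down to a whole number of blocks


def process_grid(ranks):
    """Return (P, Q) with P * Q == ranks, P <= Q, and P as large as possible."""
    p = math.isqrt(ranks)
    while ranks % p:
        p -= 1
    return p, ranks // p


# Hypothetical layout: 64 nodes, 8 GPUs per node (assumed), 141 GB per GPU.
NODES, GPUS_PER_NODE, GPU_MEM_BYTES = 64, 8, 141e9
total_mem = NODES * GPUS_PER_NODE * GPU_MEM_BYTES

n = hpl_problem_size(total_mem)
p, q = process_grid(NODES * GPUS_PER_NODE)
print(f"N = {n}, P x Q = {p} x {q}")
```

In practice these values are only a starting point; the final N, block size, and grid are found by running HPL repeatedly and comparing the measured results.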

The supercomputer’s security is built on modern Zero-Trust principles:

  • SUSE NeuVector – container environment protection, network policies, and runtime monitoring;
  • Palo Alto Networks – network segmentation, next-generation firewall, and protection of the perimeter and inter-zone communications;
  • Thales – cryptographic solutions and hardware modules for secure key management and data encryption.

Earlier, NIT JSC’s technical specialists, together with international partners, conducted comprehensive testing using the HPL (High-Performance Linpack) methodology. The purpose of the test was to measure the system’s performance and confirm its readiness for submission to the TOP500 ranking. The results enabled Kazakhstan to officially apply for inclusion in the global TOP500 ranking.