HPC Installation at ZIH
ZIH offers several HPC components tailored to different usage scenarios:
- Data-intensive and compute-intensive HPC applications
- HPC Data Analytics and Machine Learning
- Processing of extremely large data sets
Access is free of charge for academic researchers from TU Dresden, Saxony, and all over Germany as well as their project partners. HPC project proposals and resource allocations are subject to a scientific review.
Computing center and HPC components
LZR computing center
The LZR computing center at TU Dresden hosts the main campus IT services as well as the HPC installations. It combines state-of-the-art infrastructure with a highly energy-efficient operation and cooling concept, including warm water cooling and energy reuse for heating nearby campus buildings.
It was inaugurated in 2015 and was awarded first place in the Deutscher Rechenzentrumspreis 2014 (German Award for Computing Centers) in the category for energy- and resource-efficient computing centers.
The High Performance Computing and Storage Complex (HRSK-II) by Bull/Atos provides the major part of the computing capacity available at ZIH, especially for highly parallel, data-intensive and compute-intensive HPC applications.
Typical applications: FEM simulations, CFD simulations with Ansys or OpenFOAM, molecular dynamics with GROMACS or NAMD, computations with MATLAB or R
- over 40 000 CPU cores, mostly Intel Haswell and Broadwell
- 2.6 to 36 GB main memory per core
HPC Data Analytics and Machine Learning partition
The HPC-DA partition offers different hardware components that can be combined into tailored research environments. For machine learning and deep learning, it offers 192 powerful Nvidia V100 GPUs. For data analytics, it has a CPU partition with high memory bandwidth. For efficient access to large active data sets, 2 petabytes of NVMe storage with a nominal bandwidth of 2 terabytes/s are available.
In May 2021, we installed and set up the new multi-GPU cluster "AlphaCentauri". It mainly addresses the AI-related research projects of ScaDS/AI. The cluster consists of 34 nodes, each equipped with 8 Nvidia A100 GPUs, 2 AMD EPYC processors, and 1 TB of RAM.
Typical applications: training of neural networks with TensorFlow, data analytics with Big Data frameworks such as Apache Spark
- 32 IBM POWER9 nodes with 6 Nvidia V100 GPUs each for machine learning
- 192 AMD Rome nodes, each with 128 cores, 512 GB RAM
- 2 PB fast NVMe storage
- 10 PB warm archive storage
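The aggregate figures implied by the numbers above can be checked with a short calculation. This is only a sketch: the variable names are illustrative, and the per-core memory value assumes that each AMD Rome node's RAM is shared evenly across its cores.

```python
# Figures taken from the description above; variable names are illustrative only.
power9_nodes = 32
v100_per_power9_node = 6
rome_nodes = 192
cores_per_rome_node = 128
ram_gb_per_rome_node = 512
alphacentauri_nodes = 34
a100_per_alphacentauri_node = 8

# Total V100 GPUs across the IBM nodes (matches the 192 GPUs quoted in the text).
total_v100 = power9_nodes * v100_per_power9_node

# Total CPU cores and RAM per core on the AMD Rome nodes
# (assumption: RAM is divided evenly among the cores of a node).
total_rome_cores = rome_nodes * cores_per_rome_node
ram_gb_per_rome_core = ram_gb_per_rome_node / cores_per_rome_node

# Total A100 GPUs in the AlphaCentauri cluster.
total_a100 = alphacentauri_nodes * a100_per_alphacentauri_node

print(f"V100 GPUs:         {total_v100}")
print(f"AMD Rome cores:    {total_rome_cores}")
print(f"RAM per Rome core: {ram_gb_per_rome_core} GB")
print(f"A100 GPUs:         {total_a100}")
```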
The shared memory system HPE Superdome Flex is especially well suited for data-intensive application scenarios, for example to process extremely large data sets completely in main memory or in very fast NVMe storage.
Typical applications: applications that require a very large shared memory, such as genome analysis
- Shared memory system with 32 Intel Cascade Lake CPUs, 896 cores in total
- 48 TB main memory in a shared address space
- 400 TB NVMe cards as very fast local storage
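To put these figures into perspective, the following sketch derives the cores per CPU and the main memory available per CPU and per core. The per-core value is a rough estimate: it assumes 1 TB = 1024 GB and an even split of memory across cores, neither of which is stated above.

```python
# Figures taken from the list above; variable names are illustrative only.
cpus = 32
total_cores = 896
main_memory_tb = 48

# Cores per CPU (integer division; 896 is an exact multiple of 32).
cores_per_cpu = total_cores // cpus

# Main memory per CPU in TB.
memory_tb_per_cpu = main_memory_tb / cpus

# Main memory per core in GB (assumption: 1 TB = 1024 GB, evenly distributed).
memory_gb_per_core = main_memory_tb * 1024 / total_cores

print(f"Cores per CPU:   {cores_per_cpu}")
print(f"Memory per CPU:  {memory_tb_per_cpu} TB")
print(f"Memory per core: {memory_gb_per_core:.1f} GB")
```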