This audit examined strong scaling of HemeLB on the SuperMUC-NG Lenovo ThinkSystem with up to 309,696 MPI processes (on 6,452 compute nodes) using a 21.15 GiB "circle of Willis" test case. Although the code always ran correctly when sufficient memory was available, several compute nodes with notably inferior memory access performance were identified and had to be explicitly avoided. Compared to the smallest configuration that could be run, using 864 processes (on 18 'fat' compute nodes, each with 768 GiB), a 190x speed-up was delivered, with 80% scaling efficiency maintained to over 100,000 processes for the simulation phase. Non-blocking MPI point-to-point message-passing ensures excellent communication efficiency, whereas load balance is so...
Many/multi-core supercomputers provide a natural programming paradigm for hybrid MPI/OpenMP scientif...
This work focuses on tools for investigating algorithm performance at extreme scale with mi...
This online course organised in cooperation with NHR@FAU covers performance engineering approaches o...
Performance measurement and analysis of parallel applications is often challenging, despite many exc...
The CompBioMed HPC CoE flagship application HemeLB was run with a 6.4 micron resolution "circle of W...
This audit examined strong scaling of a GPU-enabled development version of the HemeLB application on...
In recent years, it has become increasingly common for high performance computers (HPC) to possess s...
We investigate the performance of the HemeLB lattice-Boltzmann simulator for cerebrovascular blood ...
The CompBioMed HPC CoE flagship application HemeLB (prototype GPU version) was run with a (patched) ...
We investigate the performance of the HemeLB lattice-Boltzmann simulator for cerebrovascular...
HPC application developers encounter significant challenges getting their codes to run correctly on ...
Finely tuning MPI applications (number of processes, granularity, collectiveop...
As supercomputers scale to 1,000 PFlop/s over the next decade, investigating the performance of par...