We study codes deploying multiple MPI ranks to one node where each rank is parallelised with TBB. A static assignment of cores to ranks here is disadvantageous if the load is not perfectly balanced, the runtime is subject to fluctuations or one MPI rank runs through phases with low concurrency. We propose an extension to TBB where developers manually annotate which code parts could exploit further cores. The cores are then dynamically associated with ranks. Our approach is decentralised, lightweight and minimally invasive w.r.t. code modifications. Some brief performance studies suggest that a flexible, permanently changing assignment of cores to compute ranks can outperform a static distribution, while greedily haggling over cor...
To amortize the cost of MPI collective operations, nonblocking collectives hav...
Clustered microarchitectures are an attractive alternative to large monolithic super...
High Performance Computing (HPC) has always been a key foundation for scientific simulation and disc...
time library [1] is a popular C++ parallelization environment [2][3] that offers a set of methods an...
Modern computers are based on manycore architectures, with multiple processors on a single silicon ...
The computational resources required in scientific research for key areas, such as medicine, physics...
Heterogeneous processors such as Arm’s big.LITTLE have become popular as they offer a choice betwee...
Supercomputing applications rely on strong scaling to achieve faster results on a larger number of p...
Computing hardware, from mobile devices to supercomputer clusters, is undergoi...
During the past 10 years, the clock frequency of high-end superscalar processo...
This paper presents COMPROF and COMPLACE, a novel profiling tool and thread placement technique for ...
As high-performance computing (HPC) systems advance towards exascale (10^18 operations per second), ...
Operating Systems have been considered as a cornerstone of the modern computer system, and the con-...
Many-core computing has surfaced as a promising solution to satisfy the rapidly increasing computati...
We study the performance behaviour of a seismic simulation using the ExaHyPE engine with a specific ...