The increasing computation capability of servers comes with a dramatic increase in their complexity, through many cores, multiple levels of cache, and NUMA architectures. Exploiting this computing power is increasingly hard, and programmers need ways to understand performance behavior. We present an approach for predicting the performance of memory-bound multi-threaded applications. It relies on micro-benchmarks and a compositional model that combines micro-benchmark measurements in order to model larger codes. Our memory model takes into account cache sizes and cache coherence protocols, both of which have a large impact on the performance of multi-threaded codes. Applying this model to real world HPC kernels shows that it...
High performance computing (HPC) demands huge memory bandwidth and computing resources to achieve ma...
Benchmarking high performance computing systems is crucial to optimize memory consumption and maximi...
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-96983-1_10 Des...
The performance of OpenMP applications executed in multisocket multicore processors can be l...
To increase performance, modern processors employ complex techniques such as out-of-order pipelines ...
To amortize the cost of MPI communications, distributed parallel HPC applicati...
Performance modeling, the science of understanding and predicting application performance, is import...
Systems for high performance computing are getting increasingly complex. On the one hand, the number...
A method is presented for modeling application performance on parallel computers in terms of the per...
Sparse scientific codes face grave performance challenges as memory bandwidth limitations gr...
HPC applications usually run at a low fraction of the computer's peak performance. Empirical perform...
As the number of compute cores per chip continues to rise faster than the total amount of available ...
We have developed a hierarchical performance bounding methodology that attempts to explain the perfo...
In recent years the High Performance Computing (HPC) industry has benefited from the development of ...
The complexity of memory systems has increased considerably over the past deca...