Efficient implementations of parallel applications on heterogeneous hybrid architectures require a careful balance between computations and communications with accelerator devices. Even if most of the communication time can be overlapped by computations, it is essential to reduce the total volume of communicated data. The literature therefore abounds with ad hoc methods to reach that balance, but these are architecture and application dependent. We propose here a generic mechanism to automatically optimize the scheduling between CPUs and GPUs, and compare two strategies within this mechanism: the classical Heterogeneous Earliest Finish Time (HEFT) algorithm and our new, parametrized, Distributed Affinity Dual Appro...
Aiming to fully exploit the computing power of all CPUs and all graphics processing units (GPUs) on ...
In this paper, we consider task-based dense linear algebra applications on a single heterogeneous no...
In many sciences, processing costly computations has become frequent and the execution time of an ap...
Efficient implementations of parallel applications on heterogeneous hybrid ar...
Abstract. Efficient implementations of parallel applications on heterogeneous hybrid architectures ...
More and more computers use hybrid architectures combining multi-core processo...
In this paper, we present a comparison of scheduling strategies for heterogene...
Most recent HPC platforms have heterogeneous nodes composed of multi-core CPUs...
Best Paper. More and more computers use hybrid architectures combining multi-co...
Most recent HPC platforms have heterogeneous nodes composed of a combination...
Summary. Multi-core architectures comprising several GPUs have become mainstrea...
To help shrink the programmability-performance efficiency gap, we discuss that adaptive runtime syst...
In a parallel computing context, peak performance is hard to reach with irregu...
With the emergence of General Purpose computation on GPU (GPGPU) and corresponding programming fram...
In High Performance Computing, heterogeneity is now the norm with specialized ...