In this paper, we present a comparison of scheduling strategies for heterogeneous multi-CPU and multi-GPU architectures. We designed and evaluated four scheduling strategies on top of the XKaapi runtime: work stealing, data-aware work stealing, locality-aware work stealing, and Heterogeneous Earliest-Finish-Time (HEFT). On a heterogeneous architecture with 12 CPUs and 8 GPUs, we analysed our scheduling strategies with four benchmarks: a BLAS-1 AXPY vector operation, a Jacobi 2D iterative computation, and two linear algebra algorithms, Cholesky and LU. We conclude that work stealing may be efficient if task annotations are given along with a data locality strategy. Furthermore, our experimental results suggest th...
GPUs (Graphics Processing Units) have become one of the main co-processors that contributed to deskt...
Efficient implementations of parallel applications on heterogeneous hybrid ar...
With the emergence of General Purpose computation on GPU (GPGPU) and corresponding programming fram...
Most recent HPC platforms have heterogeneous nodes composed of a combination...
Efficient implementations of parallel applications on heterogeneous hybrid architectures ...
In this study, we provide an extensive survey of a wide spectrum of scheduling methods for multitaskin...
The race for Exascale computing has naturally led the current technologies to ...
Heterogeneous many-core computing resources are increasingly popular among users due to their improv...
Heterogeneous systems consisting of multiple CPUs and GPUs are increasingly attractive as platforms ...
The use of accelerators such as GPUs has become mainstream to achieve high per...
In this paper, we consider task-based dense linear algebra applications on a single heterogeneous no...
Modern high-performance computers engage a variety of computing devices. Underutilization and oversu...
Most recent HPC platforms have heterogeneous nodes composed of multi-core CPUs...