This paper is submitted for review to the Parallel Computing special issue for the HCW and HeteroPar 16 workshops. Hybrid computing platforms are now commonplace, featuring a large number of CPU cores and accelerators. This trend makes balancing computations between these heterogeneous resources performance-critical. In this paper we propose aggregating several CPU cores in order to execute larger parallel tasks and improve load balancing between CPUs and accelerators. Additionally, we present our approach to exploiting internal parallelism within tasks by combining two runtime system schedulers: a global runtime system to schedule the main task graph and a local one to cope with internal task parallelism. We demonstrate ...
Load balancing increases the efficient usage of existing resources for parallel and distributed appl...
Programming paradigms in High-Performance Computing have been shifting towards...
To help shrink the programmability-performance efficiency gap, we discuss that adaptive runtime syst...
Hybrid computing platforms are now commonplace, featuring a large number of CP...
This paper is submitted for review to the Parallel Computing special issue for HCW and HeteroPar 16 ...
Computing platforms are now extremely complex providing an increasing number o...
In this paper, we consider task-based dense linear algebra applications on a single heterogeneous no...
In this paper we introduce a model for the total energy consumption of the Cholesky factorization on...
We consider the problem of allocating and scheduling dense linear application on fully heterogeneous...
In a general-purpose computing system, several parallel applications run simultaneously on the same ...
Individual processor frequencies have reached an upper physical and practical limit. Processor desig...
Although the hardware has dramatically changed in the last few years, nodes of...
To face the advent of multicore processors and the ever increasing complexity...
Our goal is to provide an analysis and comparison of static and dynamic strategies for task graph sc...
Computing systems have undergone a fundamental transformation from single core devices to devices wi...