Recent studies have shown the potential of task-based programming paradigms for implementing robust, scalable sparse direct solvers for modern computing platforms. Yet, designing task flows that efficiently exploit heterogeneous architectures remains highly challenging. In this paper, we first tackle the issue of data partitioning using a method suited for heterogeneous platforms. On the one hand, we design tasks of sufficiently large granularity to obtain a good acceleration factor on GPUs. On the other hand, we limit that size in order to both fit within the GPU memory constraints and generate enough parallelism in the task graph. Second, we handle task scheduling with a strategy capable of taking into account workload and architecture heterog...
The ever growing complexity and scale of parallel architectures imposes to rew...
Whereas most parallel High Performance Computing (HPC) numerical libraries have been written as highly...
Modern computers can no longer rely on increasing CPU speed to improve their performance as further ...
To face the advent of multicore processors and the ever increasing complexity ...
The advent of multicore processors requires reconsidering the design of high p...
The advent of multicore processors represents a disruptive event in the histor...
One of the major trends in the design of exascale architectures is the use of ...
In order to adapt to multicore architectures and increasingly complex machines, the model...
Task parallelism is omnipresent these days; whether in data mining or machine learning, for matrix f...
Most recent HPC platforms have heterogeneous nodes composed of multi-core CPUs...
Accelerator-enhanced computing platforms have drawn a lot of attention due to ...
We have ported the numerical factorization and triangular solve phases of the sparse direct solver S...
This paper is submitted for review to the Parallel Computing special issue for HCW and HeteroPar 16 ...