Best paper. International audience. Recent studies have shown the potential of task-based programming paradigms for implementing robust, scalable sparse direct solvers for modern computing platforms. Yet designing task flows that efficiently exploit heterogeneous architectures remains highly challenging. In this paper we first tackle the issue of data partitioning using a method suited to heterogeneous platforms. On the one hand, we design tasks of sufficiently large granularity to obtain a good acceleration factor on the GPU. On the other hand, we limit that size in order to both fit the GPU memory constraints and generate enough parallelism in the task graph. Secondly, we handle the task scheduling with a strategy capable of taking into accoun...
International audience. To face the advent of multicore processors and the ever increasing complexity ...
In this paper we investigate the use of distributed graphics processing unit (GPU)-based architectur...
Abstract: This paper analyzes the use of a multicore+multiGPU system for solving Simultaneous Equation...
Task parallelism is omnipresent these days; whether in data mining or machine learning, for matrix f...
International audience. The advent of multicore processors represents a disruptive event in the histor...
To face the advent of multicore processors and the ever increasing complexity of hardware architectu...
We have ported the numerical factorization and triangular solve phases of the sparse direct solver S...
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A in...
GPUs (Graphics Processing Units) have become one of the main co-processors that contributed to deskt...
International audience. The advent of multicore processors requires reconsidering the design of high p...
For many finite element problems, when represented as sparse matrices, iterative solvers are found t...
Whereas most parallel High Performance Computing (HPC) numerical libraries have been written as highly...
International audience. One of the major trends in the design of exascale architectures is the use of ...