The polyhedral model makes it possible to automatically improve data locality and enable parallelism in regular linear algebra kernels. In previous work we proposed a new data structure, the 2d-packed layout, to store only the nonzero elements of dynamically allocated regular sparse (triangular and banded) matrices for different basic linear algebra operations, and used Pluto to parallelize and optimize them. To our surprise, there were huge discrepancies in our measurements of these kernels' execution times that were due to the allocation mode: as statically declared arrays or as dynamically allocated arrays of pointers. In this paper we compare the performance of various linear algebra kernels, including some linear algebra kerne...
In this paper we conduct a detailed analysis of the sources of power dissipation and energy consumpt...
This paper discusses some algorithmic issues when computing with a heterogeneo...
The tremendous increase in the size and heterogeneity of supercomputers makes it very difficult to p...
We study the implementation of dense linear algebra computations, such as matrix multiplicatio...
In this paper, we deal with redistribution issues for dense linear algebra kernels on heterogeneous ...
We consider the problem of data allocation when performing matrix multiplication on a heterogeneous ...
The performance portability of OpenCL kernel implementations for common memory bandwidth limited li...
We study the implementation of dense linear algebra computations, such as matr...
In this paper, we study the implementation of dense linear algebra kernels, such as matrix multiplic...
In this work the behavior of the multithreaded implementation of some LAPACK routines on PLA...
This dissertation incorporates two research projects: performance modeling and prediction for dense ...
Few realize that, for large matrices, many dense matrix computations achieve nearly the sa...