This paper presents CALU, a Communication Avoiding algorithm for the LU factorization of dense matrices distributed in a two-dimensional (2D) cyclic layout. The algorithm is based on a new pivoting strategy, referred to as ca-pivoting, that is shown to be stable in practice. The ca-pivoting strategy leads to a significant decrease in the number of messages exchanged during the factorization of a block-column relative to conventional algorithms, and thus CALU overcomes the latency bottleneck of the LU factorization found in current implementations such as ScaLAPACK and HPL. The experimental part of this paper focuses on the evaluation of the performance of CALU on two computational systems, an IBM POWER 5 system with 888 compute processors distr...
This paper presents some work on the LU factorization from the ScaLAPACK library. First, a complexi...
We propose two novel techniques for overcoming load-imbalance encountered when implementing so-calle...
We present block LU factorization with panel rank revealing pivoting (block LU...
This paper presents CALU, a Communication Avoiding algorithm for the LU factorization of dense matri...
This thesis deals with a linear algebra routine widely used for solving system...
This dissertation focuses on a widely used linear algebra kernel to solve linear systems, that is th...
Since the cost of communication (moving data) greatly exceeds the cost of doing arithmetic...
Since the cost of communication (moving data) greatly exceeds the cost of doin...
The impact of the communication on the performance of numerical algorithms increases with the number...
The impact of the communication on the performance of numerical algorithms increases with the number...
We present the LU decomposition with panel rank revealing pivoting (LU_PRRP), an LU factorization al...
There is a growing performance gap between computation and communication on modern computers, making...
We illustrate how linear algebra calculations can be enhanced by statistical t...
We propose two novel techniques for overcoming load-imbalance encountered when implementing so-calle...