In this paper, we make efficient use of asynchronous communications in the LU decomposition algorithm with pivoting and a column-scattered data decomposition to derive precise computational complexities. We then compare these results with experiments on the Intel iPSC/860 and Paragon machines and show that very good performance can be obtained on a ring with asynchronous communications.
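To make the data layout concrete, here is a minimal sequential sketch of LU decomposition with partial pivoting under a column-scattered (cyclic) distribution. It is an illustration under stated assumptions, not the paper's implementation: the names `owner`, `lu_column_scattered`, and `links_used` are hypothetical, the `p` processors are simulated by a loop, column `j` is assumed to live on processor `j mod p`, and the ring is assumed to forward each pivot message over `p - 1` links. Asynchronous overlap of communication and computation is not modeled.

```python
def owner(j, p):
    """Assumed column-scattered mapping: column j lives on processor j mod p."""
    return j % p


def lu_column_scattered(a, p):
    """In-place LU with partial pivoting on a list-of-rows matrix `a`.

    Simulates p processors that each own the columns q, q + p, q + 2p, ...
    Returns (perm, links_used): perm[i] is the original index of the row
    that ends up in position i, and links_used counts ring links traversed
    by pivot messages under the forwarding assumption described above.
    """
    n = len(a)
    perm = list(range(n))
    links_used = 0
    for k in range(n):
        # Step 1 (owner of column k only): the pivot search is local,
        # because the owner holds the entire column.
        r = max(range(k, n), key=lambda i: abs(a[i][k]))
        if a[r][k] == 0:
            raise ValueError("matrix is singular")
        a[k][k], a[r][k] = a[r][k], a[k][k]
        for i in range(k + 1, n):
            a[i][k] /= a[k][k]  # multipliers, stored in place of L
        perm[k], perm[r] = perm[r], perm[k]

        # Step 2: the pivot row index and the multipliers travel along the ring.
        message = (r, [a[i][k] for i in range(k + 1, n)])
        links_used += p - 1

        # Step 3: every processor applies the swap and update to its own columns.
        for q in range(p):
            pivot_row, mult = message
            for j in range(q, n, p):  # columns with owner(j, p) == q
                if j == k:
                    continue  # already handled by the owner in step 1
                a[k][j], a[pivot_row][j] = a[pivot_row][j], a[k][j]
                if j > k:
                    for i in range(k + 1, n):
                        a[i][j] -= mult[i - k - 1] * a[k][j]
    return perm, links_used
```

After the call, the strict lower triangle of `a` holds the multipliers of L (unit diagonal implied) and the upper triangle holds U, so that row `i` of the product L·U equals row `perm[i]` of the original matrix. The numerical result does not depend on `p`; only the simulated message count does.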
Since the cost of communication (moving data) greatly exceeds the cost of doing arithmetic...
We study several solvers for the solution of general linear systems where the main objective is to r...
We present an out-of-core sparse nonsymmetric LU-factorization algorithm with partial pivoting. We h...
In this paper, we make efficient use of pipelining on LU decomposition with pivoting and a column-sc...
This paper presents CALU, a Communication Avoiding algorithm for the LU factorization of dense matri...
This paper presents some works on the LU factorization from the ScaLAPACK library. First, a complexi...
The paper proposes an analytical model for estimating the performance of Pipelined Ring algorithm fo...
This paper presents a parallel LU factorization algorithm designed to take advantage of physical bro...
This paper considers key ideas in the design of out-of-core dense LU factorization routines. A left...
The solution of dense systems of linear equations is at the heart of numerical computations. Such sy...
LU factorization with partial pivoting is a canonical numerical procedure and the main comp...
Most supercomputers are shipped with both a CPU and a GPU. With the powerful parallel computing capa...
Due to the evolution of massively parallel computers towards deeper levels of parallelism and memory...