Abstract

Using a recently proposed communication-optimal variant of TSQR, the weak scalability of the least squares (LS) solver with multiple right-hand sides is studied. Communication for the TSQR-based LS solver with multiple right-hand sides remains optimal in the sense that no additional messages are necessary compared to TSQR. However, LS incurs additional communication volume and flops compared to TSQR. The additional flops and words sent for LS are derived. A PGAS model, namely the global address space programming framework (GPI), is used for inter-nodal one-sided communication. Within NUMA sockets, the C++11 threading model is used. Scalability results of the proposed method on up to a few thousand cores are shown.
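The following is a minimal serial sketch of how a TSQR-style reduction can carry multiple right-hand sides without extra messages. It is not the authors' implementation: there is no GPI or threading here, the "processes" are just list entries, the binary reduction tree is an assumption, and Givens rotations are swapped in for the Householder kernels a production code would normally use. The names `triangularize`, `back_substitute`, and `tsqr_ls` are hypothetical. Each block holds rows of the augmented matrix `[A | B]`, so every reduction step passes on the triangular factor together with the transformed right-hand sides: the message count matches plain TSQR, while each message grows by the extra columns.

```python
import math


def triangularize(M, n):
    """Apply Givens rotations to the rows of M so that its first n columns
    become upper triangular. Returns the top n rows, i.e. [R | Q^T B]."""
    M = [row[:] for row in M]  # work on a copy
    for j in range(n):
        for i in range(len(M) - 1, j, -1):
            a, b = M[j][j], M[i][j]
            if b == 0.0:
                continue  # entry is already zero
            r = math.hypot(a, b)
            c, s = a / r, b / r
            # Rotate rows j and i; columns left of j are already zero.
            for k in range(j, len(M[0])):
                top, bot = M[j][k], M[i][k]
                M[j][k] = c * top + s * bot
                M[i][k] = -s * top + c * bot
    return M[:n]


def back_substitute(RC, n):
    """Solve R X = C, where RC = [R | C] and R is n-by-n upper triangular."""
    k = len(RC[0]) - n
    X = [[0.0] * k for _ in range(n)]
    for i in range(n - 1, -1, -1):
        for c in range(k):
            acc = RC[i][n + c]
            for j in range(i + 1, n):
                acc -= RC[i][j] * X[j][c]
            X[i][c] = acc / RC[i][i]  # assumes A has full column rank
    return X


def tsqr_ls(blocks, n):
    """blocks: row blocks of [A | B], one per simulated process.
    Returns X minimizing ||A X - B|| column by column."""
    # Local factorization on every block: no communication needed.
    level = [triangularize(blk, n) for blk in blocks]
    # Assumed binary tree: each merge stands for one message in plain TSQR.
    while len(level) > 1:
        merged = []
        for p in range(0, len(level) - 1, 2):
            merged.append(triangularize(level[p] + level[p + 1], n))
        if len(level) % 2:
            merged.append(level[-1])  # odd block out moves up unchanged
        level = merged
    return back_substitute(level[0], n)


# Illustrative data: fit intercept and slope for two right-hand sides.
noise = [1, -1, -1, 1, 1, -1, -1, 1]
rows = [[1.0, float(t), 1.0 + 2.0 * t, 3.0 - 0.5 * t + noise[t]]
        for t in range(8)]
print(tsqr_ls([rows[0:2], rows[2:4], rows[4:6], rows[6:8]], 2))
```

In this example the first right-hand side lies exactly on the line 1 + 2t. The perturbation added to the second is orthogonal to both columns of A, so up to rounding the solver should return intercept and slope (1, 2) for the first column of X and (3, -0.5) for the second.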
Due to the evolution of massively parallel computers towards deeper levels of parallelism and memory...
We present a parallel algorithm for the QR factorization with column pivoting of a sparse matrix by ...
The Tall-Skinny QR (TSQR) algorithm is more communication efficient than the s...
For matrix with full column rank, QR algorithm is among the best approach to solve wider class of le...
A new algorithm is presented for the efficient solution of large least squares problems in w...
We present parallel and sequential dense QR factorization algorithms that are ...
The increasing complexity of modern computer architectures has greatly influenced algorithm design. ...
This study focuses on the performance of two classical dense linear algebra algorithms, the LU and t...
In this paper we study the performance of two classical dense linear algebra algorithms, the LU and ...
Least squares problems occur in many branches of science. Typically there may be a large number of d...
The solution of dense systems of linear equations is at the heart of numerical computations. Such sy...
The impact of the communication on the performance of numerical algorithms increases with the number...
In this paper we study how to update the solution of the linear system Ax = b after the matrix A is ...