Using a recently proposed communication-optimal variant of TSQR, the weak scalability of a least squares (LS) solver with multiple right-hand sides is studied. Communication for the TSQR-based LS solver with multiple right-hand sides remains optimal in the sense that no additional messages are necessary compared to TSQR. However, LS incurs additional communication volume and flops compared to TSQR. The additional flops and words sent for LS are derived. A PGAS model, namely the global address space programming framework (GPI), is used for one-sided inter-node communication. Within NUMA sockets, the C++11 threading model is used. Scalability results for the proposed method on up to a few thousand cores are shown.
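The idea of carrying the right-hand sides through the TSQR reduction can be illustrated with a minimal single-process sketch in Python with NumPy. It is not the paper's implementation: the function name `tsqr_least_squares` and the parameter `num_blocks` are hypothetical, row blocks stand in for processes, and a flat one-level reduction replaces both the reduction tree and the GPI communication. Each block contributes its local `R_i` together with `Q_i^T B_i`, which is one way to read the claim that LS needs no more messages than TSQR but sends more words per message.

```python
import numpy as np


def tsqr_least_squares(A, B, num_blocks=4):
    """Solve min ||A X - B||_F for a tall-skinny A with full column rank.

    Illustrative sketch only: row blocks play the role of processes and
    the reduction is a single flat step, not a communication tree.
    """
    m, _ = A.shape
    # Partition the rows; each index set is one "process-local" block.
    row_blocks = np.array_split(np.arange(m), num_blocks)

    R_parts, C_parts = [], []
    for rows in row_blocks:
        # Local reduced QR factorization: A_i = Q_i R_i.
        Q_i, R_i = np.linalg.qr(A[rows], mode="reduced")
        R_parts.append(R_i)
        # Apply Q_i^T to the local right-hand sides; this block would
        # travel in the same message as R_i.
        C_parts.append(Q_i.T @ B[rows])

    # Reduction step: factor the stacked R_i and update the stacked
    # right-hand sides with the same orthogonal transformation.
    Q_top, R = np.linalg.qr(np.vstack(R_parts), mode="reduced")
    C = Q_top.T @ np.vstack(C_parts)

    # Solve R X = C. A general solver is used here for brevity; a
    # triangular solve would exploit the structure of R.
    return np.linalg.solve(R, C)
```

For a full-column-rank `A` of shape `(m, n)` and `B` of shape `(m, k)`, the result has shape `(n, k)` and should agree with `np.linalg.lstsq(A, B, rcond=None)[0]` up to rounding.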
The multilinear least-squares (MLLS) problem is an extension of the linear least-squares problem. Th...
nonnegative least squares, length-constraints, constrained optimization, alternating least squares, ...
method to compute LLRs with low complexity for soft-input channel decoding. Analysis shows that the ...
For a matrix with full column rank, the QR algorithm is among the best approaches to solve a wider class of le...
A new algorithm is presented for the efficient solution of large least squares problems in w...
This study focuses on the performance of two classical dense linear algebra algorithms, the LU and t...
In this paper we study the parallelization of PCGLS, a basic iterative method whose main idea is to ...
In this paper we study the performance of two classical dense linear algebra algorithms, the LU and ...
We present parallel and sequential dense QR factorization algorithms that are ...
In this paper we mainly focus on the study of the parallelization of PCGLS, a basic iterative meth...
In this paper we study the parallelization of CGLS, a basic iterative method for large and sparse le...
In this paper we study the parallel aspects of PCGLS, a basic iterative method whose main idea is to...
This paper initiates the study of communication complexity when the processors have limited work spa...
High-dimensional simulations pose a challenge even for next-generation high-performance computers. H...