The LAPACK routines \(\texttt{GEQRT2}\) and \(\texttt{GEQRT3}\) compute the QR decomposition of an \(m \times n\) matrix together with the storage-efficient representation \(Q=I-VTV^T\) of the orthogonal factor. A GPU-accelerated algorithm is presented that extends a blocked CPU-GPU hybrid QR decomposition so that it also computes the triangular matrix \(T\). The storage-efficient representation is used in particular to access blocks of \(Q\) without generating the full matrix. The algorithm runs on a single GPU and aims to use memory efficiently so that matrices as large as possible can be processed. Reusing intermediate results significantly reduces the number of operations required. As a result the algorithm out...
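To make the representation \(Q=I-VTV^T\) concrete, the following is a minimal pure-Python sketch, not the GPU algorithm of the abstract. It assumes a dense matrix with \(m \ge n\) stored as nested lists, builds \(V\) and \(T\) column by column with unblocked Householder reflections (the same recurrence LAPACK uses when forming \(T\)), and then evaluates a block of \(Q\) from \(V\) and \(T\) alone. The function names `householder_qr_compact_wy` and `q_block` are hypothetical and chosen only for this illustration.

```python
import math


def householder_qr_compact_wy(A):
    """Unblocked Householder QR (illustrative only, assumes m >= n).

    Returns (V, T, R) with A = Q R and Q = I - V T V^T, where V is
    m x n unit lower trapezoidal and T is n x n upper triangular.
    """
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]
    V = [[0.0] * n for _ in range(m)]
    T = [[0.0] * n for _ in range(n)]
    for j in range(n):
        V[j][j] = 1.0  # reflectors are normalised so that v[j] = 1
        norm_x = math.sqrt(sum(R[i][j] ** 2 for i in range(j, m)))
        if norm_x == 0.0:
            continue  # zero column: tau = 0, so H_j = I and T[:, j] stays 0
        alpha = R[j][j]
        beta = -math.copysign(norm_x, alpha)
        tau = (beta - alpha) / beta
        for i in range(j + 1, m):
            V[i][j] = R[i][j] / (alpha - beta)
        # Apply H_j = I - tau * v v^T to the trailing columns of R.
        for k in range(j, n):
            w = sum(V[i][j] * R[i][k] for i in range(j, m))
            for i in range(j, m):
                R[i][k] -= tau * V[i][j] * w
        # Extend T: T[0:j, j] = -tau * T[0:j, 0:j] @ (V[:, 0:j]^T v_j).
        z = [sum(V[i][p] * V[i][j] for i in range(j, m)) for p in range(j)]
        for p in range(j):
            T[p][j] = -tau * sum(T[p][q] * z[q] for q in range(p, j))
        T[j][j] = tau
    return V, T, R


def q_block(V, T, rows, cols):
    """Return Q[rows, cols] = I[rows, cols] - V[rows, :] T V[cols, :]^T."""
    n = len(T)
    block = []
    for i in rows:
        # Row i of V times the upper triangular T.
        w = [sum(V[i][p] * T[p][q] for p in range(q + 1)) for q in range(n)]
        block.append([
            (1.0 if i == c else 0.0) - sum(w[q] * V[c][q] for q in range(n))
            for c in cols
        ])
    return block


if __name__ == "__main__":
    A = [[2.0, -1.0], [1.0, 3.0], [0.0, 4.0], [5.0, 1.0]]
    V, T, R = householder_qr_compact_wy(A)
    # Only rows 2..3 and columns 0..1 of Q are evaluated here.
    print(q_block(V, T, [2, 3], [0, 1]))
```

Evaluating a block this way touches only the selected rows of \(V\) and the small matrix \(T\), which is the property the abstract relies on when it accesses parts of \(Q\) without forming the full \(m \times m\) matrix.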
QR factorization is a ubiquitous operation in many engineering and scientific applications. In this ...
The QR algorithm is one of the three phases in the process of computing the eigenvalues and the eige...
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A in...
Hardware accelerators are getting increasingly important in heterogeneous systems for many applicati...
The least squares problem is an extremely useful device to represent an approximate solution to over...
Low-rank matrices arise in many scientific and engineering computations. Both computational and stor...
In this paper we propose new stable parallel algorithms based on Householder transformations and comp...
We show how both the tridiagonal and bidiagonal QR algorithms can be restructured so that they be- ...
Linear least squares problems are commonly solved by QR factorization. When multiple solutions have ...
Sparse matrix–vector multiplication (SpMV) is of singular importance in sparse linear algebra, which...
We propose and benchmark a modified time evolution block decimation (TEBD) algorithm that uses a tru...
The processing of digital sound signals often requires the computation of the QR factorization of a ...