GPU-accelerated implementation of the storage-efficient QR decomposition

Benner, Peter
Köhler, Martin
Penke, Carolin

Publication date

July 2017

Abstract

The LAPACK routines \(\texttt{GEQRT2}\) and \(\texttt{GEQRT3}\) compute the QR decomposition of an \(m \times n\) matrix together with the storage-efficient representation of the orthogonal factor, \(Q=I-VTV^T\). A GPU-accelerated algorithm is presented that extends a blocked CPU-GPU hybrid QR decomposition to also compute the triangular matrix \(T\). The storage-efficient representation is used in particular to access blocks of \(Q\) without generating the whole matrix. The algorithm runs on a single GPU and aims to use memory efficiently so that matrices as large as possible can be processed. Reusing intermediate results significantly reduces the number of operations required. As a result the algorithm out...
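To make the representation \(Q=I-VTV^T\) concrete, the following is a minimal CPU-only sketch in plain Python. It is not the GPU algorithm of the paper. It is an unblocked Householder QR that follows the LAPACK reflector convention \(H_j = I - \tau_j v_j v_j^T\) with a unit leading entry in \(v_j\), and builds \(T\) column by column. The function names `geqrt2_like` and `q_block` are hypothetical, and the sketch assumes \(m \ge n\) and matrices stored as nested lists.

```python
import math


def geqrt2_like(A):
    """Unblocked Householder QR of an m x n matrix (m >= n), as nested lists.

    Returns (V, T, R) with A = Q R and Q = I - V T V^T, where V is unit
    lower trapezoidal (m x n), T is upper triangular (n x n) and R is
    m x n with zeros below the diagonal.
    """
    m, n = len(A), len(A[0])
    R = [row[:] for row in A]            # working copy, overwritten by R
    V = [[0.0] * n for _ in range(m)]
    T = [[0.0] * n for _ in range(n)]
    for j in range(n):
        # Reflector H_j = I - tau v v^T with v[0] = 1 that maps R[j:, j]
        # onto a multiple of the first unit vector.
        alpha = R[j][j]
        norm = math.sqrt(sum(R[i][j] ** 2 for i in range(j, m)))
        if norm == 0.0:
            tau, v = 0.0, [1.0] + [0.0] * (m - j - 1)
        else:
            beta = -math.copysign(norm, alpha)
            tau = (beta - alpha) / beta
            v = [1.0] + [R[i][j] / (alpha - beta) for i in range(j + 1, m)]
        # Apply H_j to columns j..n-1 of R.
        for k in range(j, n):
            w = sum(v[i - j] * R[i][k] for i in range(j, m))
            for i in range(j, m):
                R[i][k] -= tau * v[i - j] * w
        for i in range(j + 1, m):
            R[i][j] = 0.0                # remove rounding residue
        for i in range(j, m):
            V[i][j] = v[i - j]
        # New column of T: T[:j, j] = -tau * T[:j, :j] @ (V[:, :j]^T v_j).
        z = [sum(V[r][i] * V[r][j] for r in range(j, m)) for i in range(j)]
        for i in range(j):
            T[i][j] = -tau * sum(T[i][k] * z[k] for k in range(i, j))
        T[j][j] = tau
    return V, T, R


def q_block(V, T, rows, cols):
    """Return the entries Q[r][c] for r in rows and c in cols.

    Only the selected rows of V enter the product, so the full m x m
    matrix Q is never formed.
    """
    n = len(T)
    # W = V[rows, :] @ T, using that T is upper triangular.
    W = [[sum(V[r][k] * T[k][l] for k in range(l + 1)) for l in range(n)]
         for r in rows]
    return [[(1.0 if r == c else 0.0)
             - sum(W[a][l] * V[c][l] for l in range(n))
             for c in cols]
            for a, r in enumerate(rows)]
```

A call such as `q_block(V, T, [1, 3], [0, 2])` returns a \(2 \times 2\) block of \(Q\) from the \(m \times n\) matrix \(V\) and the \(n \times n\) matrix \(T\) alone, which is the kind of access the storage-efficient representation allows.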
