Block p-cyclic matrices are an important class of block-structured matrices with broad applications in the numerical analysis of differential equations, Markov chain theory, computational quantum chemistry, and beyond. We present a method for structured orthogonal inversion of p-cyclic matrices, together with a set of algorithmic improvements that yield better performance on multicore systems with GPU accelerators. Benchmarks show that our codes for hybrid CPU+GPU systems maintain relatively stable performance across different values of the parameter n and attain up to 90% of peak performance for sufficiently large matrices. The solutions proposed here can be easily extended to target architectures with multiple GPUs, and to the problems with more...
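To make the terminology concrete, the following is a minimal sketch of a block p-cyclic matrix and of inverting it through an orthogonal (QR) factorization. It assumes one common form of such matrices, with n-by-n blocks A_i on the block diagonal, B_i on the block subdiagonal, and B_p in the top-right corner. The helper names (`block_p_cyclic`, `qr_mgs`, `inverse_via_qr`) are hypothetical, and the inversion shown is a plain dense Gram-Schmidt QR that ignores the block structure. It is not the structured algorithm or the CPU+GPU implementation described above.

```python
import random


def block_p_cyclic(A, B):
    """Assemble a dense block p-cyclic matrix (assumed form, p >= 2).

    A[i] goes on the block diagonal, B[i] on the block subdiagonal for
    i < p-1, and B[p-1] in the top-right corner.
    """
    p, n = len(A), len(A[0])
    H = [[0.0] * (p * n) for _ in range(p * n)]

    def put(bi, bj, blk):
        for r in range(n):
            for c in range(n):
                H[bi * n + r][bj * n + c] = blk[r][c]

    for i in range(p):
        put(i, i, A[i])
        put((i + 1) % p, i, B[i])  # wraps to block (0, p-1) when i = p-1
    return H


def qr_mgs(M):
    """Modified Gram-Schmidt QR; Q is returned as a list of columns."""
    m = len(M)
    cols = [[M[r][c] for r in range(m)] for c in range(m)]
    Q, R = [], [[0.0] * m for _ in range(m)]
    for j in range(m):
        v = cols[j][:]
        for i in range(j):
            R[i][j] = sum(q * x for q, x in zip(Q[i], v))
            v = [x - R[i][j] * q for x, q in zip(v, Q[i])]
        R[j][j] = sum(x * x for x in v) ** 0.5
        Q.append([x / R[j][j] for x in v])
    return Q, R


def inverse_via_qr(M):
    """Compute M^{-1} = R^{-1} Q^T by back substitution on R."""
    m = len(M)
    Q, R = qr_mgs(M)
    X = [[0.0] * m for _ in range(m)]
    for k in range(m):
        for i in reversed(range(m)):
            # Q[i][k] is entry (i, k) of Q^T because Q is stored by columns.
            s = Q[i][k] - sum(R[i][j] * X[j][k] for j in range(i + 1, m))
            X[i][k] = s / R[i][i]
    return X


# Small illustrative instance: p = 4 blocks of size n = 2, identity diagonal
# blocks and small random off-diagonal blocks (a well-conditioned test case).
random.seed(0)
p, n = 4, 2
identity = [[float(r == c) for c in range(n)] for r in range(n)]
A = [identity for _ in range(p)]
B = [[[random.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(n)]
     for _ in range(p)]
H = block_p_cyclic(A, B)
X = inverse_via_qr(H)
residual = max(
    abs(sum(H[r][k] * X[k][c] for k in range(p * n)) - float(r == c))
    for r in range(p * n) for c in range(p * n)
)
print(residual)  # largest entry of |H X - I|
```

Only 2p of the p² blocks are nonzero, which is the sparsity a structured method can exploit. The dense sketch above costs O((pn)³) operations regardless of that structure.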
We study the use of massively parallel architectures for computing a matrix inverse. Two different ...
If multicore is a disruptive technology, try to imagine hybrid multicore systems enhanced ...
An iterative inversion algorithm for a class of square matrices is derived and tested. The inverted ...
In this paper, we tackle the inversion of large-scale dense matrices via conventional matrix factori...
Block-structured matrices arise in several contexts in circuit simulation problems. These matrice...
Dense matrix inversion is a basic procedure in many linear algebra algorithms. Any factorization-bas...
One-sided dense matrix factorizations are important computational kernels in many scientific...
The objective of this paper is to extend and redesign the block matrix reduction applied for the fam...
The LAPACK routines \( \texttt{GEQRT2}\) and \(\texttt{GEQRT3}\) can be used to compute the QR decom...
We present a new approach to utilizing all CPU cores and all GPUs on heterogeneous multicore and mul...
QR factorization is a ubiquitous operation in many engineering and scientific applications. In this ...
As users and developers, we are witnessing the opening of a new computing scenario: the introduction...
Low-rank matrices arise in many scientific and engineering computations. Both computational and stor...