On cache-based computer architectures using current standard algorithms, Householder bidiagonalization requires a significant portion of the execution time for computing matrix singular values and vectors. In this paper we reorganize the sequence of operations for Householder bidiagonalization of a general m × n matrix, so that two (_GEMV) matrix-vector multiplications can be done with one pass of the unreduced trailing part of the matrix through cache. Two new BLAS 2.5 operations approximately cut in half the transfer of data from main memory to cache. We give detailed algorithm descriptions and compare timings with the current LAPACK bidiagonalization algorithm.
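The fused pass described in this abstract can be illustrated with a minimal NumPy sketch. The function name and interface below are hypothetical, and a real BLAS 2.5 kernel would be blocked and vectorized; the point is only that each row of A is read once while contributing to both a GEMV with A and a GEMV with its transpose:

```python
import numpy as np

def fused_gemv(A, u, v):
    # Illustrative sketch (not the paper's kernel): compute
    # w = A @ v and z = A.T @ u while streaming each row of A
    # through cache only once, instead of two separate GEMV passes.
    m, n = A.shape
    w = np.zeros(m)
    z = np.zeros(n)
    for i in range(m):
        row = A[i, :]      # one read of this row serves both products
        w[i] = row @ v     # row i's contribution to A v
        z += u[i] * row    # row i's contribution to A^T u
    return w, z
```

The cache benefit comes from the single traversal of A: on large matrices the dominant cost is moving A from main memory, so combining the two products roughly halves that traffic, which is the effect the abstract attributes to the new BLAS 2.5 operations.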
The sparse matrix–vector (SpMV) multiplication is an important kernel in many applications. When the...
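For context, the SpMV kernel referred to in several of these abstracts can be sketched in a few lines over the common CSR (compressed sparse row) storage format. This is an illustrative Python loop, not a tuned implementation:

```python
import numpy as np

def csr_spmv(data, indices, indptr, x, m):
    # y = A @ x for an m-row sparse matrix A stored in CSR format:
    # data holds the nonzero values, indices their column positions,
    # and indptr[i]:indptr[i+1] delimits the nonzeros of row i.
    y = np.zeros(m)
    for i in range(m):
        for k in range(indptr[i], indptr[i + 1]):
            y[i] += data[k] * x[indices[k]]
    return y
```

Because the nonzeros of x are accessed indirectly through `indices`, SpMV has irregular memory access, which is why cache behavior dominates its performance on modern architectures.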
Abstract. The objective of this paper is to extend, in the context of multicore architectures, the c...
The goal of the LAPACK project is to provide efficient and portable software for dense numerical lin...
In this thesis we introduce a cost measure to compare the cache-friendliness of different permutati...
This Master Thesis examines if a matrix multiplication program that combines the two efficiency stra...
Abstract -- In this work, the performance of basic and Strassen's matrix multiplication algorithms ar...
We study tiled algorithms for going from a "full" matrix to a condensed "ba...
A current trend in high-performance computing is to decompose a large linear algebra problem into ba...
Sparse matrix-vector multiplication (SpMV for short) is one of the most common subroutines in the numerica...
This report deals with the efficient calculation of matrix-matrix multiplication, without using explici...
A simple but highly effective approach for transforming high-performance implementations on cachebas...
In this article, we introduce a cache-oblivious method for sparse matrix–vector multiplication. Our ...
In this paper, a new methodology for computing the Dense Matrix Vector Multipl...