On cache-based computer architectures using current standard algorithms, Householder bidiagonalization requires a significant portion of the execution time for computing matrix singular values and vectors. In this paper we reorganize the sequence of operations for Householder bidiagonalization of a general m × n matrix, so that two (GEMV) vector-matrix multiplications can be done with one pass of the unreduced trailing part of the matrix through cache. Two new BLAS 2.5 operations approximately cut in half the transfer of data from main memory to cache. We give detailed algorithm descriptions and compare timings with the current LAPACK bidiagonalization algorithm.
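The single-pass idea in this abstract can be illustrated with a minimal sketch (the function name is hypothetical; the paper's actual BLAS 2.5 routines are not reproduced here): one sweep over the rows of A serves both a transposed and an untransposed matrix-vector product, so A crosses the memory hierarchy once instead of twice.

```python
import numpy as np

def fused_gemv_pair(A, v, z):
    """Compute u = A^T v and w = A z in a single sweep over A.

    Illustrative sketch only: in a cache-based implementation the
    row blocks of A would be loaded into cache once and reused for
    both products, roughly halving main-memory traffic compared
    with two separate GEMV calls.
    """
    m, n = A.shape
    u = np.zeros(n)
    w = np.zeros(m)
    for i in range(m):       # each row of A is read exactly once ...
        row = A[i]
        u += v[i] * row      # ... contributing to the A^T v accumulation
        w[i] = row @ z       # ... and to the A z product
    return u, w
```

In a compiled BLAS 2.5 kernel the same reuse would be done block-wise rather than row-wise, but the data-movement argument is the same.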
The sparse matrix–vector (SpMV) multiplication is an important kernel in many applications. When the...
Abstract. The objective of this paper is to extend, in the context of multicore architectures, the c...
In this paper, a new methodology for computing the Dense Matrix Vector Multipl...
In this thesis we introduce a cost measure to compare the cache-friendliness of different permutati...
A current trend in high-performance computing is to decompose a large linear algebra problem into ba...
This Master's thesis examines whether a matrix multiplication program that combines the two efficiency stra...
We study tiled algorithms for going from a "full" matrix to a condensed "ba...
Sparse matrix-vector multiplication (shortly SpMV) is one of the most common subroutines in the numerica...
Abstract. In this work, the performance of the basic and Strassen's matrix multiplication algorithms ar...
A simple but highly effective approach for transforming high-performance implementations on cachebas...
This report deals with the efficient calculation of matrix-matrix multiplication, without using explici...
The goal of the LAPACK project is to provide efficient and portable software for dense numerical lin...
In this article, we introduce a cache-oblivious method for sparse matrix–vector multiplication. Our ...