Sparse matrix–vector (SpMV) multiplication is an important kernel in many applications. When the sparse matrix is unstructured, however, standard SpMV multiplication implementations are typically inefficient in terms of cache usage, sometimes working at only a fraction of peak performance. Cache-aware algorithms take specifics of the cache architecture as parameters to derive an efficient SpMV multiplication. In contrast, cache-oblivious algorithms strive to obtain efficiency regardless of cache specifics. In earlier work in this latter area, Haase et al. (2007) use the Hilbert curve to order nonzeroes in the sparse matrix. They obtain speedup mainly when multiplying against multiple (up to eight) right-hand sides simulta...
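The idea of ordering nonzeroes along a Hilbert curve can be illustrated with a minimal sketch. It is not the implementation of Haase et al. or of any paper listed here: the function names (`hilbert_index`, `sort_nonzeroes_hilbert`, `spmv_coo`), the coordinate (triplet) storage, and the choice of mapping the row index to the curve's first coordinate are all assumptions made for illustration. The curve index uses the standard bitwise coordinate-to-distance conversion.

```python
def hilbert_index(order, x, y):
    """Distance of cell (x, y) along a Hilbert curve on a 2^order x 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect the quadrant so the sub-curve has canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def sort_nonzeroes_hilbert(entries, order):
    """Sort (row, col, value) triplets by Hilbert index (assumed mapping: x=row, y=col)."""
    return sorted(entries, key=lambda e: hilbert_index(order, e[0], e[1]))


def spmv_coo(n_rows, entries, x):
    """Compute y = A x for A stored as (row, col, value) triplets, in the given order."""
    y = [0.0] * n_rows
    for i, j, a in entries:
        y[i] += a * x[j]
    return y


if __name__ == "__main__":
    # Hypothetical 4x4 example matrix; order=2 because 2^2 = 4.
    entries = [(0, 0, 2.0), (0, 3, 1.0), (1, 1, 3.0),
               (2, 0, 4.0), (3, 2, 5.0), (3, 3, 6.0)]
    x = [1.0, 2.0, 3.0, 4.0]
    print(spmv_coo(4, sort_nonzeroes_hilbert(entries, 2), x))
```

Reordering the triplets changes only the order in which entries of the input and output vectors are touched, not the product itself. Consecutive nonzeroes along the curve tend to be close in both row and column index, which is the property such orderings rely on for cache reuse.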
We consider the problem of building high-performance implementations of sparse matrix-vector multipl...
We investigate a new storage format for unstructured sparse matrices, based on the space filling Hi...
In this thesis we introduce a cost measure to compare the cache-friendliness of different permutati...
The thesis introduces a cache-oblivious method for the sparse matrix-vector (SpMV) multiplication, w...
In this article, we introduce a cache-oblivious method for sparse matrix–vector multiplication. Our ...
Sparse matrix-vector multiplication (SpMV for short) is one of the most common subroutines in the numerica...
The sparse matrix is one of the most important data storage formats for large amounts of data. Sparse ...
We present new performance models and a new, more compact data structure for cache blocking when ap...
Data movements between different levels of the memory hierarchy (I/O-transitions, or simply I/Os) a...
In earlier work we have introduced the “Recursive Sparse Blocks” (RSB) sparse matrix storage scheme...
We improve the performance of sparse matrix-vector multiply (SpMV) on modern cache-based superscalar...
Algorithms for sparse matrix-vector multiplication (SpMxV for short) are important building blocks...
Sparse matrix-vector multiplication (SpMxV) is a kernel operation widely used in iterative linear so...