We present new performance models and a new, more compact data structure for cache blocking when applied to the sparse matrix-vector multiply (SpMV) operation, y ← y + Ax. Prior work indicates that cache-blocked SpMV performs very well for some matrix and machine combinations, yielding speedups as high as 3x. We look at the general question of when and why performance improves, finding that cache blocking is most effective when simultaneously 1) x does not fit in cache, 2) y fits in cache, 3) the non-zeros are distributed throughout the matrix, and 4) the non-zero density is sufficiently high. We extend our prior performance models, which bounded performance by assuming x and y fit in cache, to consider these classes of matrices. Unl...
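As a rough illustration of the operation y ← y + Ax and of what cache blocking means, the following is a minimal sketch, not the data structure proposed in the abstract above. It assumes the matrix is stored in compressed sparse row (CSR) form and splits it into a grid of smaller CSR sub-matrices, so that each block touches only a bounded slice of x and y. The function names and the `block_rows`/`block_cols` parameters are hypothetical and chosen only for this example.

```python
def csr_spmv(n_rows, row_ptr, col_idx, vals, x, y):
    """Unblocked reference: y <- y + A x for A stored in CSR form."""
    for i in range(n_rows):
        acc = y[i]
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += vals[k] * x[col_idx[k]]
        y[i] = acc
    return y


def split_into_cache_blocks(n_rows, n_cols, row_ptr, col_idx, vals,
                            block_rows, block_cols):
    """Split a CSR matrix into a grid of CSR sub-matrices (cache blocks).

    Each block is a tuple (r0, c0, n_block_rows, ptr, idx, val) whose
    column indices are stored relative to the block's first column c0.
    Blocks without non-zeros are skipped.
    """
    blocks = []
    for r0 in range(0, n_rows, block_rows):
        r1 = min(r0 + block_rows, n_rows)
        for c0 in range(0, n_cols, block_cols):
            c1 = min(c0 + block_cols, n_cols)
            ptr, idx, val = [0], [], []
            for i in range(r0, r1):
                for k in range(row_ptr[i], row_ptr[i + 1]):
                    if c0 <= col_idx[k] < c1:
                        idx.append(col_idx[k] - c0)
                        val.append(vals[k])
                ptr.append(len(idx))
            if idx:
                blocks.append((r0, c0, r1 - r0, ptr, idx, val))
    return blocks


def cache_blocked_spmv(blocks, x, y):
    """y <- y + A x, processing one cache block at a time.

    Within a block, only x[c0 : c0 + block_cols] and
    y[r0 : r0 + block_rows] are accessed.
    """
    for r0, c0, n_block_rows, ptr, idx, val in blocks:
        for i in range(n_block_rows):
            acc = y[r0 + i]
            for k in range(ptr[i], ptr[i + 1]):
                acc += val[k] * x[c0 + idx[k]]
            y[r0 + i] = acc
    return y
```

In this sketch the block dimensions would be chosen so that the slices of x and y used by one block fit in cache together; how to pick them, and when blocking pays off at all, is the question the abstract addresses.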
Sparse matrix-vector multiplication (SpMxV) is a kernel operation widely used in iterative linear so...
The sparse matrix–vector (SpMV) multiplication is an important kernel in many applications. When the...
We improve the performance of sparse matrix-vector multiply (SpMV) on modern cache-based superscalar...
We present new performance models and more compact data structures for cache blocking when...
We consider the problem of building high-performance implementations of sparse matrix-vector multipl...
Algorithms for sparse matrix-vector multiplication (SpMxV for short) are important building blocks...
While there are many studies on the locality of dense codes, few deal with the locality of sparse co...
Sparse matrix-vector multiplication (SpMV for short) is one of the most common subroutines in the numerica...
In previous work it was found that cache blocking of sparse matrix-vector multiplication yielded sig...
In this thesis we introduce a cost measure to compare the cache-friendliness of different permutati...
In this article, we introduce a cache-oblivious method for sparse matrix–vector multiplication. Our ...